Quickstart

Quickstart for running the evaluations and generating a report with the Library or CLI

Evaluation

Evaluation with Library

The following code is a simple example that evaluates the SOLAR-10.7B-Instruct-v1.0 model from upstage on h6_en (Open LLM Leaderboard).

import evalverse as ev

evaluator = ev.Evaluator()

model = "upstage/SOLAR-10.7B-Instruct-v1.0"
benchmark = "h6_en"

evaluator.run(model=model, benchmark=benchmark)
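If you want to cover the other benchmarks mentioned in the Report section below, a minimal sketch is to loop over them with the same run() call shown above, one benchmark name per call:

import evalverse as ev

evaluator = ev.Evaluator()

model = "upstage/SOLAR-10.7B-Instruct-v1.0"

# Benchmark names taken from the Report section below;
# each run() call works exactly like the single-benchmark example above.
for benchmark in ["h6_en", "mt_bench", "ifeval", "eq_bench"]:
    evaluator.run(model=model, benchmark=benchmark)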

Evaluation with CLI

Here is a script that produces the same result as the above code:

cd evalverse

python3 evaluator.py \
--h6_en \
--ckpt_path upstage/SOLAR-10.7B-Instruct-v1.0
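If your model is stored on disk rather than on the Hugging Face Hub, the same flag should also accept a local checkpoint directory; this is an assumption based on the flag name, and only the Hub-style usage above is shown on this page:

cd evalverse

python3 evaluator.py \
--h6_en \
--ckpt_path /path/to/local/checkpoint  # hypothetical local path, replace with your own checkpoint directory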

Report

Currently, generating a report is only available through the library. We will work on a Command Line Interface (CLI) version as soon as possible.

import evalverse as ev

db_path = "./db"
output_path = "./results"
reporter = ev.Reporter(db_path=db_path, output_path=output_path)

reporter.update_db(save=True)

model_list = ["SOLAR-10.7B-Instruct-v1.0"]
benchmark_list = ["h6_en", "mt_bench", "ifeval", "eq_bench"]
reporter.run(model_list=model_list, benchmark_list=benchmark_list)