Evaluation
You can run various evaluations by simply adding a few arguments.
These arguments are common across all benchmarks. Note that you must select at least one benchmark from the supported list.
| Argument | Default | Description |
| --- | --- | --- |
| `ckpt_path` | `upstage/SOLAR-10.7B-Instruct-v1.0` | Model name or checkpoint path to be evaluated |
| `output_path` | `./results` | Path to save evaluation results |
| `model_name` | `SOLAR-10.7B-Instruct-v1.0` | Model name used when saving evaluation results |
| `use_fast_tokenizer` | `False` | Flag to use the fast tokenizer |
| `use_flash_attention_2` | `False` | Flag to use FlashAttention-2 (highly recommended) |
By running the code below, the `h6_en`, `mt_bench`, `ifeval`, and `eq_bench` benchmarks will execute.
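The original command block is not preserved here, so the following is a minimal sketch of such an invocation. The `evaluator.py` entry-point name and the per-benchmark flags (`--h6_en`, `--mt_bench`, `--ifeval`, `--eq_bench`) are assumptions, not the verified CLI; only the common arguments are taken from the table above.

```bash
# Minimal sketch (assumed entry point and benchmark-flag syntax).
# Common arguments are from the table above; benchmark flags are hypothetical.
python3 evaluator.py \
    --ckpt_path upstage/SOLAR-10.7B-Instruct-v1.0 \
    --output_path ./results \
    --model_name SOLAR-10.7B-Instruct-v1.0 \
    --use_flash_attention_2 True \
    --h6_en \
    --mt_bench \
    --ifeval \
    --eq_bench
```

Each selected benchmark evaluates the model given by `ckpt_path`, and the results are saved under `output_path`, keyed by `model_name`.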