FAQs

About the project

What is Evalverse?

Evalverse is a freely accessible, open-source project designed to support your LLM (Large Language Model) evaluations. It provides a simple, standardized, and user-friendly way to run and manage LLM evaluations for AI research engineers and scientists. You can use Evalverse easily even if you are not very familiar with LLMs.

Why should I use Evalverse?

Unified evaluation with submodules: Evalverse uses Git submodules to integrate external evaluation frameworks such as lm-evaluation-harness and FastChat, which keeps evaluation unified and extensible. You can add new submodules to support more external evaluation frameworks, and you can fetch upstream changes to the existing submodules to keep up with evaluation practice in the fast-paced LLM field.
No-code evaluation request: Evalverse supports no-code evaluation via Slack requests. The user types Request! in a direct message or in a Slack channel where the Evalverse Slack bot is active. The bot asks for the model's name on the Hugging Face Hub or the path to a local model directory, then runs the evaluation.
LLM evaluation report: Evalverse also provides reports on finished evaluations in a no-code manner. To receive a report, the user first types Report!. Once the user selects the models and evaluation criteria, Evalverse calculates average scores and rankings from the evaluation results stored in the database and returns a report with a performance table and a visualized graph.
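
As a rough illustration of the averaging and ranking step behind a report, here is a minimal Python sketch. It is not Evalverse's actual implementation: the `build_report` function, the model names, the criteria keys, and all scores are hypothetical and exist only for this example.

```python
from statistics import mean

# Hypothetical scores for illustration only; these are not real evaluation results.
results = {
    "model-a": {"arc": 60.0, "hellaswag": 80.0, "mmlu": 55.0},
    "model-b": {"arc": 65.0, "hellaswag": 78.0, "mmlu": 61.0},
    "model-c": {"arc": 58.0, "hellaswag": 82.0, "mmlu": 52.0},
}


def build_report(results, criteria):
    """Average each model's scores over the selected criteria, then rank the models."""
    averages = {
        model: mean(scores[c] for c in criteria)
        for model, scores in results.items()
    }
    # A higher average gives a better (lower-numbered) rank.
    ordered = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
    return [
        {"rank": i, "model": model, "average": round(avg, 2)}
        for i, (model, avg) in enumerate(ordered, start=1)
    ]


report = build_report(results, ["arc", "hellaswag", "mmlu"])
for row in report:
    print(f"{row['rank']}. {row['model']}: {row['average']}")
```

Selecting a different subset of criteria changes the averages and can change the ranking, which is why the report asks for the evaluation criteria up front.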

How do I use Evalverse?

We suggest starting with the Quickstart. If you have any questions along the way, feel free to share them on Discord.

Support

How do I cite the Evalverse project?

If you want to cite the Evalverse project, feel free to use the following BibTeX entry.

@misc{evalverse,
  title = {Evalverse},
  author = {Jihoo Kim and Wonho Song and Dahyun Kim and Yoonsoo Kim and Yungi Kim and Chanjun Park},
  year = {2024},
  eprint={2404.00943},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

I found a bug.

Please report it on GitHub Issues.

You can typically expect a response within 1 to 2 business days.

