FAQs

About the project

What is Evalverse?

Evalverse is a freely accessible, open-source project designed to support your LLM (Large Language Model) evaluations. We provide a simple, standardized, and user-friendly solution for the processing and management of LLM evaluations, catering to the needs of AI research engineers and scientists. Even if you are not very familiar with LLMs, you can easily use Evalverse.

Why should I use Evalverse?
  • Unified evaluation with submodules: For unified and extensible evaluation, Evalverse uses Git submodules to integrate external evaluation frameworks such as lm-evaluation-harness and FastChat. New submodules can be added to support additional frameworks, and upstream changes can be fetched at any time to keep pace with evaluation practice in the fast-moving LLM field.

  • No-code evaluation request: Evalverse supports no-code evaluation via Slack requests. The user types Request! in a direct message or in a Slack channel where the Evalverse Slack bot is active. The bot then asks for the model name on the Hugging Face Hub or a local model directory path and runs the evaluation.

  • LLM evaluation report: Evalverse can also produce reports on finished evaluations in a no-code manner. To receive a report, the user first types Report!. Once the user selects the model and evaluation criteria, Evalverse calculates average scores and rankings from the evaluation results stored in the database and returns a report with a performance table and a visualized graph.
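
The "average scores and rankings" step of the report can be pictured with a minimal Python sketch. This is not Evalverse's implementation: the `results` dictionary, the model and benchmark names, and the `build_report` helper are all hypothetical placeholders, and the real tool reads its results from its own database.

```python
from statistics import mean

# Hypothetical per-benchmark scores. Model names, benchmark names, and
# values are placeholders, not real Evalverse results.
results = {
    "model-a": {"bench_1": 60.0, "bench_2": 80.0},
    "model-b": {"bench_1": 70.0, "bench_2": 90.0},
    "model-c": {"bench_1": 50.0, "bench_2": 60.0},
}


def build_report(results, benchmarks):
    """Return (rank, model, average) rows, highest average first.

    Models missing a score for any selected benchmark are skipped so
    that every average is computed over the same set of benchmarks.
    """
    averages = {
        model: mean(scores[b] for b in benchmarks)
        for model, scores in results.items()
        if all(b in scores for b in benchmarks)
    }
    ordered = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return [(rank, model, avg) for rank, (model, avg) in enumerate(ordered, start=1)]


# Print a simple ranked performance table for the selected benchmarks.
for rank, model, avg in build_report(results, ["bench_1", "bench_2"]):
    print(f"{rank}. {model}: {avg:.1f}")
```

Averaging only over benchmarks that every listed model has completed keeps the ranking comparable; whether Evalverse handles missing results this way is an assumption of the sketch.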

How do I use Evalverse?

We suggest starting with the Quickstart. If you have any questions along the way, feel free to share them on Discord.

Support

How do I cite the Evalverse project?

If you want to cite the Evalverse project, feel free to use the following BibTeX entry.

@misc{evalverse,
  title = {Evalverse},
  author = {Jihoo Kim and Wonho Song and Dahyun Kim and Yoonsoo Kim and Yungi Kim and Chanjun Park},
  year = {2024},
  eprint={2404.00943},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

I have a question or something to share.

For general inquiries or assistance, head to the Discord channel. For bugs, please report them directly on GitHub Issues.

Typically, you can anticipate a response within 1 to 2 business days.

I found a bug.

Please report it on GitHub Issues.

Typically, you can anticipate a response within 1 to 2 business days.
