UltraEval is a lightweight, user-friendly, and comprehensive framework for evaluating the capabilities of large language models, featuring a modular design, efficient inference, and extensive benchmark coverage.
FreeEval is a modular and extensible framework for trustworthy and efficient automatic evaluation of large language models (LLMs). It provides a unified implementation of diverse evaluation methods, incorporates meta-evaluation techniques, and leverages high-performance inference backends.