Use the EleutherAI LM Evaluation Harness to run evaluations on the lm_eval engine. The supported evals for each engine are as follows:

  1. lm_eval: mmlu, gsm8k, hellaswag, arc, truthfulqa, winogrande

Models larger than 8B parameters and context lengths greater than 8k tokens are not currently supported. Support will be added shortly.
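The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of this API: the function name `validate_eval_request` and its parameters are hypothetical, and "8k" is assumed to mean 8192 tokens. The task names and the 8B limit are taken from the text above.

```python
# Supported evals per engine, as listed in this page.
SUPPORTED_EVALS = {
    "lm_eval": {"mmlu", "gsm8k", "hellaswag", "arc", "truthfulqa", "winogrande"},
}

MAX_PARAMS_BILLIONS = 8.0   # models larger than 8B are not supported
MAX_CONTEXT_TOKENS = 8192   # assumption: "8k" is read as 8192 tokens


def validate_eval_request(engine, tasks, model_params_billions, context_tokens):
    """Return a list of problems found; an empty list means the request looks valid.

    Hypothetical helper for illustration only.
    """
    problems = []

    supported = SUPPORTED_EVALS.get(engine)
    if supported is None:
        problems.append(f"unsupported engine: {engine}")
    else:
        # Report every task the engine does not list as supported.
        for task in tasks:
            if task not in supported:
                problems.append(f"unsupported eval for {engine}: {task}")

    if model_params_billions > MAX_PARAMS_BILLIONS:
        problems.append(f"model too large: {model_params_billions}B > {MAX_PARAMS_BILLIONS}B")

    if context_tokens > MAX_CONTEXT_TOKENS:
        problems.append(f"context too long: {context_tokens} > {MAX_CONTEXT_TOKENS}")

    return problems
```

For example, `validate_eval_request("lm_eval", ["mmlu", "gsm8k"], 7.0, 4096)` returns an empty list, while a 13B model or an unlisted task name produces one message per violated constraint. Returning a list rather than raising on the first error lets a caller show every issue at once.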
