Evaluations Supported Models
The following models are supported for use as both judge models and models to be evaluated in the Bytecompute AI Evaluations API. You can specify any of these models in the model_name field of your evaluation configuration.
- Judge models: For best results, we recommend using a larger, more capable model (such as DeepSeek-V3-0324 or DeepSeek-R1-0528) as the judge.
- Evaluated models: You may use any of these models as the subject of your evaluation, either by referencing a column in your dataset or by providing a model configuration object.
Note: The list below is updated regularly as new models become available.
| Organization | Model Name | API Model String | Context Length |
|---|---|---|---|
| Moonshot | Kimi K2 Instruct | moonshotai/Kimi-K2-Instruct | 128,000 |
| DeepSeek | DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3 | 163,839 |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | 163,839 |
| Meta | Llama 3.1 405B Instruct Turbo | meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | 130,815 |
| Qwen | Qwen3 235B A22B Throughput | Qwen/Qwen3-235B-A22B-fp8-tput | 40,960 |
| Qwen | Qwen 2.5 72B Instruct Turbo | Qwen/Qwen2.5-72B-Instruct-Turbo | 32,768 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 131,072 |
| Meta | Llama 3.3 70B Instruct Turbo | meta-llama/Llama-3.3-70B-Instruct-Turbo | 131,072 |
| Meta | Llama 3.1 70B Instruct Turbo | meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | 131,072 |
| Qwen | QwQ-32B | Qwen/QwQ-32B | 32,768 |
| Qwen | Qwen 2.5 Coder 32B Instruct | Qwen/Qwen2.5-Coder-32B-Instruct | 32,768 |
| Qwen | Qwen3 235B-A22B Thinking 2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | 262,144 |
| Qwen | Qwen3 235B-A22B Instruct 2507 | Qwen/Qwen3-235B-A22B-Instruct-2507-tput | 262,144 |
| Google | Gemma 2 27B | google/gemma-2-27b-it | 8,192 |
| Mistral AI | Mistral Small 3 Instruct (24B) | mistralai/Mistral-Small-24B-Instruct-2501 | 32,768 |
| DeepSeek | DeepSeek R1 Distill Qwen 14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 131,072 |
| Marin Community | Marin 8B Instruct | marin-community/marin-8b-instruct | 4,096 |
| Meta | Llama 3.1 8B Instruct Turbo | meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | 131,072 |
| Qwen | Qwen 2.5 7B Instruct Turbo | Qwen/Qwen2.5-7B-Instruct-Turbo | 32,768 |
| Meta | Llama 3.2 3B Instruct Turbo | meta-llama/Llama-3.2-3B-Instruct-Turbo | 131,072 |
| Meta | Llama 4 Maverick (17Bx128E) | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 1,048,576 |
| Meta | Llama 4 Scout (17Bx16E) | meta-llama/Llama-4-Scout-17B-16E-Instruct | 1,048,576 |
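
As a rough illustration of how an API Model String from the table might be placed in the model_name field, here is a minimal Python sketch. It only builds and checks plain dictionaries and makes no API calls. The helper `model_config`, the `SUPPORTED_MODELS` mapping (a hand-copied subset of the table), and the outer keys `judge` and `model_to_evaluate` are assumptions made for this example; only the `model_name` field comes from this page, so consult the Evaluations API reference for the actual request schema.

```python
# Hand-copied subset of the table above: API model string -> context length.
SUPPORTED_MODELS = {
    "deepseek-ai/DeepSeek-V3": 163_839,
    "deepseek-ai/DeepSeek-R1-0528": 163_839,
    "meta-llama/Llama-3.3-70B-Instruct-Turbo": 131_072,
    "Qwen/Qwen2.5-72B-Instruct-Turbo": 32_768,
    "google/gemma-2-27b-it": 8_192,
}


def model_config(model_string: str) -> dict:
    """Return a minimal config dict after checking the model string.

    Hypothetical helper: it rejects strings that are not in the local
    SUPPORTED_MODELS subset, so typos fail before any request is sent.
    """
    if model_string not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model string: {model_string!r}")
    # `model_name` is the field named on this page; other fields are omitted.
    return {"model_name": model_string}


# Assumed layout: the keys "judge" and "model_to_evaluate" are illustrative only.
evaluation = {
    "judge": model_config("deepseek-ai/DeepSeek-V3"),
    "model_to_evaluate": model_config("meta-llama/Llama-3.3-70B-Instruct-Turbo"),
}

print(evaluation["judge"]["model_name"])
```

Validating against a local copy of the table catches misspelled model strings early, but the copy has to be refreshed whenever the supported list changes.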
