Evaluations Supported Models

The following models are supported for use as both judge models and models to be evaluated in the Bytecompute AI Evaluations API. You can specify any of these models in the model_name field of your evaluation configuration, using the value shown in the API Model String column below.

  • Judge models: For best results, we recommend using a larger, more capable model (such as DeepSeek-V3-0324 or DeepSeek-R1-0528) as the judge.
  • Evaluated models: You may use any of these models as the subject of your evaluation, either by referencing a column in your dataset or by providing a model configuration object (see the sketch after this list).
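
For illustration, here is a minimal sketch of an evaluation configuration that names one model as the judge and another as the model being evaluated. Only the model_name field is documented above; the surrounding field names (judge, evaluated_model, generation settings) and the endpoint URL are assumptions for this example and may differ from the actual API.

```python
import os
import requests

# Hypothetical evaluation configuration. Only `model_name` is documented above;
# the other field names and the endpoint path are assumptions for this sketch.
evaluation_config = {
    "judge": {
        # A larger, more capable model is recommended as the judge (see list above).
        "model_name": "deepseek-ai/DeepSeek-R1-0528",
    },
    "evaluated_model": {
        # The model under evaluation, given as a model configuration object.
        "model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        "max_tokens": 512,     # assumed generation settings for this example
        "temperature": 0.7,
    },
}

# Hypothetical request; replace the URL and authentication with your actual values.
response = requests.post(
    "https://api.bytecompute.example/v1/evaluations",
    headers={"Authorization": f"Bearer {os.environ.get('BYTECOMPUTE_API_KEY', '')}"},
    json=evaluation_config,
    timeout=30,
)
print(response.json())
```

The model_name values in the sketch are taken from the API Model String column in the table below.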

Note: The list below is updated regularly as new models become available.

| Organization | Model Name | API Model String | Context Length (tokens) |
| --- | --- | --- | --- |
| Moonshot | Kimi K2 Instruct | moonshotai/Kimi-K2-Instruct | 128,000 |
| DeepSeek | DeepSeek-V3-0324 | deepseek-ai/DeepSeek-V3 | 163,839 |
| DeepSeek | DeepSeek-R1-0528 | deepseek-ai/DeepSeek-R1-0528 | 163,839 |
| Meta | Llama 3.1 405B Instruct Turbo | meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | 130,815 |
| Qwen | Qwen3 235B A22B Throughput | Qwen/Qwen3-235B-A22B-fp8-tput | 40,960 |
| Qwen | Qwen 2.5 72B Instruct Turbo | Qwen/Qwen2.5-72B-Instruct-Turbo | 32,768 |
| DeepSeek | DeepSeek R1 Distill Llama 70B | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 131,072 |
| Meta | Llama 3.3 70B Instruct Turbo | meta-llama/Llama-3.3-70B-Instruct-Turbo | 131,072 |
| Meta | Llama 3.1 70B Instruct Turbo | meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | 131,072 |
| Qwen | QwQ-32B | Qwen/QwQ-32B | 32,768 |
| Qwen | Qwen 2.5 Coder 32B Instruct | Qwen/Qwen2.5-Coder-32B-Instruct | 32,768 |
| Qwen | Qwen3 235B-A22B Thinking 2507 | Qwen/Qwen3-235B-A22B-Thinking-2507 | 262,144 |
| Qwen | Qwen3 235B-A22B Instruct 2507 | Qwen/Qwen3-235B-A22B-Instruct-2507-tput | 262,144 |
| Google | Gemma 2 27B | google/gemma-2-27b-it | 8,192 |
| Mistral AI | Mistral Small 3 Instruct (24B) | mistralai/Mistral-Small-24B-Instruct-2501 | 32,768 |
| DeepSeek | DeepSeek R1 Distill Qwen 14B | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 131,072 |
| Marin Community | Marin 8B Instruct | marin-community/marin-8b-instruct | 4,096 |
| Meta | Llama 3.1 8B Instruct Turbo | meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | 131,072 |
| Qwen | Qwen 2.5 7B Instruct Turbo | Qwen/Qwen2.5-7B-Instruct-Turbo | 32,768 |
| Meta | Llama 3.2 3B Instruct Turbo | meta-llama/Llama-3.2-3B-Instruct-Turbo | 131,072 |
| Meta | Llama 4 Maverick (17Bx128E) | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 1,048,576 |
| Meta | Llama 4 Scout (17Bx16E) | meta-llama/Llama-4-Scout-17B-16E-Instruct | 1,048,576 |