openai/whisper-large-v3-turbo

Whisper Large-Turbo

A Whisper model pre-trained with weak supervision and optimized for high-speed Automatic Speech Recognition (ASR) and speech translation. The 'Turbo' variant reduces the decoder to 4 layers while keeping the large-v3 encoder architecture, which yields an 8.8x speedup over large-v3 with minimal degradation in Word Error Rate (WER). It is designed as a high-efficiency alternative for low-latency production environments.

Price: $0.00066 per minute

API Documentation: Whisper Audio Transcription 🎙️

This document specifies the API for transcribing audio files with the hosted Whisper large-v3-turbo model. The API follows the OpenAI-compatible schema. The "Turbo" variant is a weakly supervised pre-trained model with a 4-layer decoder; it offers 8.8x faster inference than the standard large-v3 while maintaining near-identical accuracy.


Endpoint

| Method | URL | Summary |
| --- | --- | --- |
| POST | /v1/audio/transcriptions | Transcribe an audio or video file using the Whisper model. |

Authentication

The API uses Bearer token authentication. Every request must include a valid API key in the Authorization header.

| Header | Example | Description |
| --- | --- | --- |
| Authorization | Bearer YOUR_API_KEY | Your server-provided API key. |
| x-request-id | UUID_string | Optional. Unique identifier for request tracking. Generated by the server if omitted. |
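
As a minimal sketch of the two headers described above, the helper below assembles them into a dictionary. The function name `build_headers` is hypothetical, and checking that the request ID parses as a UUID is an assumption based on the `UUID_string` example; the server may accept other identifier formats.

```python
import uuid


def build_headers(api_key, request_id=None):
    """Return the HTTP headers for an authenticated request.

    `build_headers` is an illustrative helper, not part of any SDK.
    """
    if not api_key:
        raise ValueError("an API key is required")
    headers = {"Authorization": f"Bearer {api_key}"}
    if request_id is not None:
        # Assumption: the tracking ID should be a well-formed UUID.
        # uuid.UUID raises ValueError if the string does not parse.
        headers["x-request-id"] = str(uuid.UUID(request_id))
    return headers


# Without a request ID, only the Authorization header is sent and the
# server generates the tracking identifier itself.
default_headers = build_headers("YOUR_API_KEY")

# With a client-generated ID, the same value can be logged on both sides.
traced_headers = build_headers("YOUR_API_KEY", str(uuid.uuid4()))
```

Generating the ID on the client is mainly useful when a failed request needs to be correlated with server-side logs.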

Request Parameters

Requests must be sent as multipart/form-data.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | File / URL | Yes | The audio file object, or a direct URL to an audio/video file. Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm, aac. |
| model | string | Yes | Use "openai/whisper-large-v3-turbo" for high-speed ASR. |
| response_format | string | No | One of json, text, srt, vtt, verbose_json. Default: json. |
| temperature | number | No | 0.0 to 1.0. Controls randomness; higher values increase variability. Default: 0.0. |
| language | string | No | ISO 639-1 code of the spoken language (e.g., en, zh, ja). Supplying it improves transcription accuracy. |
| prompt | string | No | Optional text to guide the model's style or vocabulary. |
| timestamp_granularities | array | No | Only valid with verbose_json. Can include word or segment. |
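
The parameters above can be checked and packed into a multipart/form-data request along the following lines. This is a sketch using only the Python standard library, not an official client. The base URL `https://api.example.com` is a placeholder, the helper names are hypothetical, and sending the array parameter as repeated `timestamp_granularities[]` fields is an assumption borrowed from the usual OpenAI-style form encoding. Only the file-upload form of `file` is shown, not the URL form.

```python
import urllib.request
import uuid

RESPONSE_FORMATS = {"json", "text", "srt", "vtt", "verbose_json"}
GRANULARITIES = {"word", "segment"}


def build_fields(model, response_format="json", temperature=0.0,
                 language=None, prompt=None, timestamp_granularities=None):
    """Validate parameters per the table and return (name, value) pairs."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if timestamp_granularities:
        if response_format != "verbose_json":
            raise ValueError("timestamp_granularities requires verbose_json")
        unknown = set(timestamp_granularities) - GRANULARITIES
        if unknown:
            raise ValueError(f"unknown granularities: {sorted(unknown)}")
    fields = [("model", model),
              ("response_format", response_format),
              ("temperature", str(temperature))]
    if language:
        fields.append(("language", language))
    if prompt:
        fields.append(("prompt", prompt))
    # Assumption: array values travel as repeated "name[]" form fields.
    for granularity in timestamp_granularities or []:
        fields.append(("timestamp_granularities[]", granularity))
    return fields


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one file part; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_head.encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(base_url, api_key, fields, filename, file_bytes):
    """Build (but do not send) the POST request for the transcription endpoint."""
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": content_type},
        method="POST",
    )


fields = build_fields("openai/whisper-large-v3-turbo",
                      response_format="verbose_json",
                      language="en",
                      timestamp_granularities=["segment"])
# b"RIFF" stands in for real audio bytes read from disk.
request = build_request("https://api.example.com", "YOUR_API_KEY",
                        fields, "sample.wav", b"RIFF")
# urllib.request.urlopen(request) would actually send it.
```

Validating on the client keeps obviously invalid combinations, such as `timestamp_granularities` with a non-verbose format, from costing a round trip.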
