Higgs Speech V3

Higgs-Audio-v3-Speech-to-Text is a high-performance automatic speech recognition (ASR) model developed by BosonAI. Built on a 1.7B-parameter architecture, it delivers accurate transcription across 60+ languages through an OpenAI Whisper-compatible API.

$0.006/MINUTE

Higgs Audio V3 — Speech to Text

Overview

Higgs Audio V3 is a multilingual speech recognition model designed for high-accuracy transcription and speech translation.

It converts spoken audio into text across a wide range of languages and supports automatic language detection.

Key Capabilities

  • Automatic Speech Recognition (ASR):
    Convert audio into text with high accuracy.

  • Multilingual Support:
    Supports 90+ languages.

  • Automatic Language Detection:
    Detects the spoken language when not specified.

  • Speech Translation (AST):
    Supports translating speech into another language.


Request Parameters

To use the Higgs Audio V3 STT model, send a POST request to the /v1/audio/transcriptions endpoint with the following parameters.

Parameter   Type     Required   Description
file        binary   Yes        The audio file to transcribe (mp3, wav, flac, m4a). Maximum size: 25MB.
model       string   Yes        Use "bosonai-higgs-audio-v3-stt".
language    string   No         ISO 639-1 code of the spoken language (for example, "en"). If omitted, the language is auto-detected.
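
The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client. It assumes the endpoint accepts multipart/form-data (as Whisper-style transcription endpoints typically do) and a Bearer token in the Authorization header; neither is stated on this page. The base_url and api_key arguments, the helper names, and the reading of "25MB" as 25 * 1024 * 1024 bytes are likewise assumptions. The response format is not documented here, so the sketch returns the raw response body.

```python
import mimetypes
import os
import urllib.request
import uuid

MODEL_ID = "bosonai-higgs-audio-v3-stt"        # from the parameter table
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a"}
MAX_BYTES = 25 * 1024 * 1024                   # assumption: "25MB" read as 25 MiB


def build_multipart(filename, audio_bytes, language=None, boundary=None):
    """Return (body, content_type) for a multipart/form-data request.

    Hypothetical helper: the multipart encoding is an assumption based on
    the Whisper-compatible interface, not something this page confirms.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or '(none)'}")
    if len(audio_bytes) > MAX_BYTES:
        raise ValueError("audio file exceeds the 25MB limit")

    boundary = boundary or uuid.uuid4().hex
    fields = {"model": MODEL_ID}
    if language is not None:
        # Omit the field entirely to rely on automatic language detection.
        fields["language"] = language

    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: {mime}\r\n\r\n'.encode("utf-8")
    )
    parts.append(audio_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(base_url, api_key, path, language=None):
    """POST an audio file to /v1/audio/transcriptions and return the raw body.

    base_url and api_key are placeholders supplied by the caller; Bearer
    authentication is an assumption.
    """
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(os.path.basename(path), audio, language)
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call might look like transcribe("https://api.example.com", "YOUR_API_KEY", "meeting.wav", language="en"), where the host and key are placeholders. Validating the extension and size locally, before uploading, avoids sending a request that the documented limits would reject.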
