Amazon Transcribe
A fully managed automatic speech recognition (ASR) service that converts speech ...
Amazon $1.44000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
AssemblyAI Universal 2
Universal-2 is a state-of-the-art model built on Universal-1, offering enhanced ...
v2
AssemblyAI $0.37000/hr SRT VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection Confidence
AssemblyAI Universal 3 Pro
Universal-3 Pro is the first production-quality speech model that adapts its beh...
v3
AssemblyAI $0.21000/hr None
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection Confidence
Azure AI Speech-to-Text
Azure's default, general-purpose speech-to-text model, trained on a vast amount ...
Azure $0.18000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Cloudflare - Whisper
A general-purpose speech recognition model based on OpenAI's Whisper, trained on...
Cloudflare $0.02700/hr VTT
Punctuation Language Detection
Cloudflare - Whisper Large V3 Turbo
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech...
Cloudflare $0.03060/hr VTT
Punctuation Word Timestamps Language Detection
Cloudflare - Whisper Tiny (EN)
This is the English-only version of the Whisper Tiny model which was trained on ...
Cloudflare $0.02700/hr VTT
Punctuation Word Timestamps
Deepgram - Base
Standard base model for speech recognition
v2024-01-26.8851
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced
Improved accuracy model for speech recognition
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced Finance
Enhanced model optimized for finance terminology
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced General
Enhanced model for general-purpose transcription
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced Meeting
Enhanced model optimized for meetings and conferences
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced Phonecall
Enhanced model optimized for phone conversations
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Nova
Advanced, high-performance speech recognition model
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Nova 2
High-accuracy, next-generation speech recognition model
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Automotive
Nova 2 model optimized for the automotive industry
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 ConversationalAI
Nova 2 model optimized for conversational AI
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Drivethru
Nova 2 model optimized for drive-through scenarios
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Finance
Nova 2 model optimized for finance terminology
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 General
Nova 2 model for general-purpose transcription
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Medical
Nova 2 model optimized for medical terminology
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Meeting
Nova 2 model optimized for meetings and conferences
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Video
Nova 2 model optimized for video content
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Voicemail
Nova 2 model optimized for voicemail transcription
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova General
Nova model for general-purpose transcription
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Nova Phonecall
Nova model optimized for phone conversations
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram Nova 3
Great accuracy in a broader range of real-world enterprise use cases and challen...
Deepgram $0.31200/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram Nova 3 General
Nova 3 model for general-purpose transcription
Deepgram $0.31200/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram Nova 3 Medical
Nova 3 model optimized for medical terminology
Deepgram $0.31200/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
ElevenLabs Scribe
Scribe is a speech-to-text model built for accuracy and handling real-world audi...
v1
ElevenLabs $0.40000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection
FalAI - Cohere Transcribe
Cohere Transcribe turns your business audio into accurate text, ready for search...
FalAI $0.25000/hr None
Punctuation
FalAI - Whisper
Whisper model hosted on FalAI platform
v3
FalAI $0.06900/hr None
Punctuation Speaker Labels Word Timestamps
FalAI - Wizper
Optimized version of Whisper for improved performance
v3
FalAI $0.03000/hr None
Punctuation Speaker Labels
Gemini 2.5 Flash
Best model for price-performance, ideal for high-throughput tasks like large-sca...
v2.5
Gemini $0.12222/hr None
Punctuation Language Detection
Gemini 2.5 Flash-Lite
Most cost-efficient and fastest model, optimized for high-volume, latency-sensit...
v2.5
Gemini $0.01215/hr None
Punctuation Language Detection
Gemini 2.5 Pro
Most advanced model for complex tasks, excelling at coding and complex prompts.
v2.5
Gemini $0.26100/hr None
Punctuation Language Detection
Gladia Solaria
Gladia's cutting-edge, next-generation ASR model, launched in April 2025. Design...
v1
Gladia $0.61200/hr VTT
Punctuation Diarization Streaming Speaker Labels Language Detection
Google Cloud - Enhanced
Enhanced speech recognition model by Google
Google $0.96000/hr SRT VTT
Punctuation Diarization Word Timestamps Language Detection
Google Cloud - Standard
Standard speech recognition model by Google
Google $0.96000/hr SRT VTT
Punctuation Diarization Word Timestamps Language Detection
Groq - Whisper Large V3
A multilingual ASR model offering high accuracy and speed for transcription and ...
v3
Groq $0.11100/hr None
Punctuation Word Timestamps Language Detection
Groq - Whisper Turbo Large V3
A pruned and fine-tuned version of Whisper Large v3, designed for faster and les...
v3 Turbo
Groq $0.04000/hr None
Punctuation Word Timestamps Language Detection
IBM Watson Speech to Text
A cloud-based speech recognition service from IBM Watson that converts audio int...
IBM $1.20000/hr None
Punctuation Diarization Speaker Labels Word Timestamps Confidence
OpenAI - GPT-4o mini Transcribe
Speech-to-text model powered by GPT-4o mini. Offers improvements in word error r...
OpenAI $0.18000/hr SRT VTT
Punctuation Streaming Language Detection
OpenAI - GPT-4o Transcribe
Speech-to-text model powered by GPT-4o. Offers improvements in word error rate, ...
OpenAI $0.36000/hr SRT VTT
Punctuation Streaming Language Detection
OpenAI - GPT-4o Transcribe Diarize
Speech-to-text model powered by GPT-4o. Offers improvements in word error rate, ...
OpenAI $0.36000/hr SRT VTT
Punctuation Streaming Language Detection
OpenAI - Whisper
General-purpose speech recognition model. Based on the open-source Whisper large...
large-v2
OpenAI $0.36000/hr SRT VTT
Punctuation Streaming Word Timestamps Language Detection
Rev AI Enhanced
Rev AI's high-accuracy general-purpose speech-to-text model, trained on a divers...
v2.0
RevAI $0.30000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Rev AI Reverb ASR
Rev AI's open-source derived English Automatic Speech Recognition (ASR) model. T...
v1.0
RevAI $0.30000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Speechmatics Enhanced
Speechmatics' Enhanced ASR model offers very good accuracy, though processing is...
Speechmatics $0.40000/hr SRT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Speechmatics Standard
Speechmatics' Standard ASR model offers faster results with good accuracy.
Speechmatics $0.24000/hr SRT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Voxtral Mini Transcribe
Voxtral is Mistral’s audio model family designed for powerful speech understandi...
Mistral $0.06000/hr None
Punctuation Language Detection
Voxtral Mini Transcribe v2
Voxtral is Mistral’s audio model family designed for powerful speech understandi...
Mistral $0.17999/hr None
Punctuation Language Detection
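As a rough illustration of how the per-hour prices and feature tags in this listing might be compared, here is a minimal Python sketch. The `Model` class, `estimate_cost`, and `cheapest_with` are hypothetical names introduced for this example. The five entries are copied from the cards above and the rest are omitted. The sketch assumes billing is linear in audio duration with no minimum charge or rounding, which the listing does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """One card from the listing (hypothetical structure for illustration)."""
    name: str
    usd_per_hour: float
    features: frozenset


# A handful of entries copied from the listing; prices are USD per audio hour.
CATALOG = [
    Model("Amazon Transcribe", 1.44, frozenset({
        "Punctuation", "Diarization", "Speaker Labels",
        "Word Timestamps", "Language Detection", "Confidence"})),
    Model("AssemblyAI Universal 2", 0.37, frozenset({
        "Punctuation", "Diarization", "Streaming", "Speaker Labels",
        "Word Timestamps", "Language Detection", "Confidence"})),
    Model("Deepgram - Nova 2", 0.258, frozenset({
        "Punctuation", "Diarization", "Streaming", "Speaker Labels",
        "Word Timestamps", "Language Detection"})),
    Model("Cloudflare - Whisper", 0.027, frozenset({
        "Punctuation", "Language Detection"})),
    Model("Gemini 2.5 Flash-Lite", 0.01215, frozenset({
        "Punctuation", "Language Detection"})),
]


def estimate_cost(model: Model, audio_seconds: float) -> float:
    """Estimated USD cost, assuming purely linear per-second billing."""
    return model.usd_per_hour * audio_seconds / 3600


def cheapest_with(required, catalog=CATALOG):
    """Models offering every required feature, cheapest first."""
    required = frozenset(required)
    matches = [m for m in catalog if required <= m.features]
    return sorted(matches, key=lambda m: m.usd_per_hour)


if __name__ == "__main__":
    # Rank the sample models that support both streaming and diarization,
    # and show what 90 minutes of audio would cost on each.
    for m in cheapest_with({"Streaming", "Diarization"}):
        print(f"{m.name}: ${estimate_cost(m, 90 * 60):.4f} for 90 min")
```

Filtering on feature tags before sorting by price keeps a cheap model that lacks a required capability, such as streaming, from ranking first.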