OpenAI Whisper

OpenAI

103678

Offline speech-to-text and translation using the OpenAI Whisper CLI. Ideal for converting audio/video files to text or subtitles without API keys.

Content

OpenAI Whisper 3

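OpenAI Whisper MCP Server: offline speech recognition and translation based on the local Whisper CLI. No API key is required. Supports mp3/mp4/m4a/wav and other formats, and can output plain text or SRT/VTT subtitles.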

whisper_transcribe

Submit an audio/video transcription job using the local Whisper CLI. Returns a job_id immediately; use whisper_get_result(job_id) to poll for the result. No API key is required: the job runs entirely on the local machine.

Supported formats: mp3, mp4, m4a, wav, flac, ogg, webm, etc.

Args:
  audio_url: URL of the audio/video file to transcribe. Must be publicly accessible.
  model: Whisper model. ONLY "tiny" or "base" are allowed (server memory limit).
    - tiny (default): ~2x faster, ~390MB RAM, recommended for most use cases
    - base: slightly better accuracy, ~740MB RAM, ~2x slower than tiny
    DO NOT use turbo/small/medium/large: they require 3-4GB RAM and are disabled.
  language: Source language code (e.g. "zh", "en", "ja"). Auto-detected if not provided.
  output_format: Output format. Options: txt (default) / srt / vtt / json / tsv.

Parameters (4)

audio_url string Required

model string Optional

language string Optional

output_format string Optional
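As an illustration of the parameters above, the following sketch builds the request an MCP client might send for whisper_transcribe. It assumes the standard MCP JSON-RPC 2.0 `tools/call` envelope; the helper name `build_transcribe_request` and the example URL are hypothetical and not part of this server. The validation simply mirrors the constraints stated in the tool description.

```python
import json

# Constraints taken from the tool description; larger models are disabled.
ALLOWED_MODELS = {"tiny", "base"}
ALLOWED_FORMATS = {"txt", "srt", "vtt", "json", "tsv"}


def build_transcribe_request(audio_url, model="tiny", language=None,
                             output_format="txt", request_id=1):
    """Return a JSON-RPC 2.0 `tools/call` request for whisper_transcribe.

    Hypothetical helper: it only assembles and validates the payload,
    it does not talk to any server.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}, got {model!r}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")

    arguments = {
        "audio_url": audio_url,
        "model": model,
        "output_format": output_format,
    }
    if language is not None:
        # Leave the key out entirely so the server auto-detects the language.
        arguments["language"] = language

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "whisper_transcribe", "arguments": arguments},
    }


if __name__ == "__main__":
    # Placeholder URL; a real call needs a publicly accessible file.
    request = build_transcribe_request(
        "https://example.com/talk.mp3", language="en", output_format="srt"
    )
    print(json.dumps(request, indent=2))
```

Rejecting a disabled model on the client side avoids submitting a job the server would refuse anyway.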

whisper_translate

Submit an audio/video translation job (speech → English text) using the local Whisper CLI. Returns a job_id immediately; use whisper_get_result(job_id) to poll for the result. No API key is required: the job runs entirely on the local machine.

Args:
  audio_url: URL of the audio/video file to translate. Must be publicly accessible.
  model: Whisper model. ONLY "tiny" or "base" are allowed (server memory limit).
    - tiny (default): ~2x faster, recommended
    - base: slightly better accuracy, ~2x slower
    DO NOT use turbo/small/medium/large: they are disabled.
  output_format: Output format. Options: txt (default) / srt / vtt / json / tsv.

Parameters (3)

audio_url string Required

model string Optional

output_format string Optional
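Unlike whisper_transcribe, this tool lists no language parameter, since the output is always English. A minimal sketch of an argument builder that enforces this is shown below; `build_translate_arguments` is a hypothetical helper name, and the checks only restate the limits given in the description.

```python
# Optional parameters listed for whisper_translate (note: no "language").
TRANSLATE_OPTIONS = {"model", "output_format"}
ALLOWED_MODELS = {"tiny", "base"}
ALLOWED_FORMATS = {"txt", "srt", "vtt", "json", "tsv"}


def build_translate_arguments(audio_url, **options):
    """Return the arguments dict for a whisper_translate call.

    Hypothetical helper: rejects parameters the tool does not list,
    such as `language`, before anything is sent to the server.
    """
    unknown = set(options) - TRANSLATE_OPTIONS
    if unknown:
        raise TypeError(f"whisper_translate does not accept: {sorted(unknown)}")

    model = options.get("model", "tiny")
    output_format = options.get("output_format", "txt")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be 'tiny' or 'base', got {model!r}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")

    return {"audio_url": audio_url, "model": model, "output_format": output_format}


if __name__ == "__main__":
    # Placeholder URL; request English subtitles in VTT form.
    print(build_translate_arguments("https://example.com/interview.mp4",
                                    output_format="vtt"))
```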

whisper_get_result

Wait for and return the result of a Whisper transcription or translation job. Internally polls until the job is done or wait_seconds is exceeded, so there is no need to call it repeatedly.

Args:
  job_id: The job ID returned by whisper_transcribe or whisper_translate.
  wait_seconds: Maximum seconds to wait internally before returning (default: 50). Set to 0 to return immediately without waiting. The Whisper tiny model typically finishes in 15~30s. Keep ≤ 50 to stay within the 60s MCP client timeout.

Parameters (2)

job_id string Required

wait_seconds integer Optional
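For jobs that may outlast a single 50-second wait, a client can repeat the call under an overall time budget. The sketch below shows one way to do that. Everything about the transport is an assumption: `call_tool` is a hypothetical callable `(tool_name, arguments) -> result` supplied by the caller, and because the result format is not documented here, the caller also supplies an `is_done` predicate instead of the code guessing a field name.

```python
import time

MAX_WAIT_SECONDS = 50  # stay under the 60s MCP client timeout noted above


def wait_for_result(call_tool, job_id, is_done, total_budget=180, per_call_wait=50):
    """Call whisper_get_result until is_done(result) is true or the budget runs out.

    call_tool and is_done are caller-provided (hypothetical interfaces);
    this function only handles clamping and the retry loop.
    """
    per_call_wait = max(0, min(per_call_wait, MAX_WAIT_SECONDS))
    deadline = time.monotonic() + total_budget

    while True:
        remaining = deadline - time.monotonic()
        # Never ask the server to wait longer than the time we have left.
        wait = int(max(0, min(per_call_wait, remaining)))
        result = call_tool("whisper_get_result",
                           {"job_id": job_id, "wait_seconds": wait})
        if is_done(result):
            return result
        if remaining <= 0:
            raise TimeoutError(f"job {job_id} not finished within {total_budget}s")
        if wait == 0:
            # The server returned immediately; pause briefly to avoid a busy loop.
            time.sleep(min(1.0, remaining))
```

Clamping `per_call_wait` keeps each individual call inside the client timeout, while `total_budget` bounds the whole wait.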
