# MCP Video Digest (Video Content Extraction and Summarization)
<div align="right">
<a href="README_EN.md">English</a> | <b>中文</b>
</div>
## Project Introduction
MCP Video Digest is a video content processing service that extracts audio from videos on YouTube, Bilibili, TikTok, Twitter, and other sites, and converts it to text. It supports multiple transcription providers (Deepgram, Gladia, Speechmatics, and AssemblyAI) and chooses among them based on which API keys are configured. (This is the author's first MCP practice project, built mainly to get familiar with the MCP development and operation workflow.)
## Features
- Supports downloading and audio extraction of streaming content from over 1000 websites
- Multiple transcription service provider support:
  - Deepgram
  - Gladia
  - Speechmatics
  - AssemblyAI
- Flexible service selection mechanism, automatically selects services based on available API keys
- Asynchronous processing design to improve concurrent performance
- Complete error handling and logging
- Supports speaker diarization
- × Local model processing with CPU/GPU acceleration (not yet supported)
## Directory Structure
```
.
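├── src/                    # Source code
│   ├── services/           # Service implementations
│   │   ├── download/       # Download service
│   │   └── transcription/  # Transcription service
│   ├── main.py             # Main program logic
│   └── __init__.py         # Package initialization file
├── config/                 # Configuration files
├── test.py                 # Test script
├── run.py                  # Service startup script
├── pyproject.toml          # Project configuration and dependency management
├── uv.lock                 # uv dependency lock file
└── .env                    # Environment variable configuration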
```
## Installation Instructions
### 1. Install uv (or use Python directly)
If uv is not already installed, you can install it using the following command:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### 2. Clone the project:
```bash
git clone https://github.com/R-lz/mcp-video-digest.git
cd mcp-video-digest
```
### 3. Create and activate a virtual environment:
```bash
uv venv
source .venv/bin/activate # Linux/Mac
# or
.venv\Scripts\activate # Windows
```
### 4. Install dependencies:
```bash
uv pip install -e .
```
> Note: calling Speechmatics through raw `requests` ran into various problems during debugging (due to the author's inexperience, not a fault of Speechmatics), so the Speechmatics SDK is used instead.
## Configuration Instructions
1. Create a `.env` file in the project root directory or rename `.env.example`, and configure the required API keys:
```
mv .env.example .env
# Then edit the values below
DEEPGRAM_API_KEY=your_deepgram_key
GLADIA_API_KEY=your_gladia_key
SPEECHMATICS_API_KEY=your_speechmatics_key
ASSEMBLYAI_API_KEY=your_assemblyai_key
```
Note: At least one service's API key needs to be configured
2. Service priority order:
   - Deepgram (recommended for Chinese content)
   - Gladia
   - Speechmatics
   - AssemblyAI
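
The priority rule above can be sketched as a small lookup over the configured environment variables. This is a minimal illustration, not the project's actual selection code: the function name `select_provider` and the `PROVIDER_PRIORITY` table are hypothetical, and only the environment variable names come from the `.env` example.

```python
import os
from typing import Mapping, Optional

# Priority order as listed above; variable names match the .env example.
# The table and function names are hypothetical, for illustration only.
PROVIDER_PRIORITY = [
    ("deepgram", "DEEPGRAM_API_KEY"),
    ("gladia", "GLADIA_API_KEY"),
    ("speechmatics", "SPEECHMATICS_API_KEY"),
    ("assemblyai", "ASSEMBLYAI_API_KEY"),
]


def select_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first provider in priority order that has a non-empty key."""
    for name, key_var in PROVIDER_PRIORITY:
        if env.get(key_var, "").strip():
            return name
    return None  # no key configured: the caller should report an error


if __name__ == "__main__":
    print(select_provider() or "No transcription API key configured")
```

With only `GLADIA_API_KEY` and `ASSEMBLYAI_API_KEY` set, this sketch would pick Gladia, since it comes first in the list.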
## Usage
1. Start the service:
```bash
uv run src/main.py
```
Or use debug mode:
```bash
UV_DEBUG=1 uv run src/main.py
```
2. Call the service:
```python
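import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def process_video():
    # Uses the official MCP Python SDK; adjust the host and port to your server
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "get_video_content",
                {"url": "https://www.youtube.com/watch?v=video_id"},
            )
            print(result)

asyncio.run(process_video())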
```
3. Client SSE example
```json
{
  "mcpServers": {
    "video_digest": {
      "url": "http://<ip>:8000/sse"
    }
  }
}
```
API keys can also be passed from the client through an `env` entry, for example `"env": {"DEEPGRAM_API_KEY": "your_deepgram_key"}`.
> STDIO mode requires modifying the startup command; this has not been verified or tested. See the [MCP Documentation](https://modelcontextprotocol.io/).
## Testing
Run the test script:
```bash
uv run test.py
# or
python test.py
```
The test script will:
- Verify environment variable configuration
- Test YouTube download functionality
- Test each transcription service
- Test the complete video processing flow
## Development Guide
1. Add a new transcription service:
   - Create a new service class in the `src/services/transcription/` directory
   - Inherit from the `BaseTranscriptionService` class
   - Implement the `transcribe` method
2. Customize the download service:
   - Modify or add a new downloader in the `src/services/download/` directory
   - Inherit from or modify the `YouTubeDownloader` class
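
As a rough illustration of step 1, a new provider might look like the sketch below. The `BaseTranscriptionService` defined here is only a stand-in so the example runs on its own; the project's real base class, its constructor arguments, and the exact `transcribe` signature may differ. `EchoTranscriptionService` is a hypothetical provider.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseTranscriptionService(ABC):
    """Stand-in for the project's base class; the real interface may differ."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    @abstractmethod
    async def transcribe(self, audio_path: str) -> str:
        """Return the transcript text for the given audio file."""


class EchoTranscriptionService(BaseTranscriptionService):
    """Hypothetical provider that returns a placeholder transcript."""

    async def transcribe(self, audio_path: str) -> str:
        # A real implementation would send the audio file to the provider's
        # API here and return the recognized text.
        return f"[transcript of {audio_path}]"


if __name__ == "__main__":
    service = EchoTranscriptionService(api_key="dummy")
    print(asyncio.run(service.transcribe("audio.mp3")))
```

Keeping `transcribe` asynchronous matches the asynchronous design listed under Features, so a slow provider call does not block other requests.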
## Dependency Management
- Use `uv pip install package_name` to install new dependencies
- Use `uv pip freeze > requirements.txt` to export the dependency list
- Use `pyproject.toml` to manage dependencies, `uv.lock` to lock dependency versions
## Error Handling
The service will handle the following situations:
- API key missing or invalid
- Video download failed
- Audio transcription failed
- Network connection issues
- Service limits and quotas
## Precautions
1. Ensure sufficient disk space for temporary files
2. Pay attention to the API usage limits of each service provider
3. It is recommended to use Python 3.11 or higher
4. Temporary files will be cleaned up automatically
5. Using uv gives faster dependency installation and better dependency management
6. YouTube downloads may require authentication. You can export your browser cookies to a `cookies.txt` file in the project root directory ([this browser extension](https://chromewebstore.google.com/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc) can generate it quickly), or use another authentication method supported by [yt-dlp](https://github.com/yt-dlp/yt-dlp), such as `--cookies-from-browser`
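
For reference, the cookie file from item 6 can be passed to yt-dlp's Python API through the `cookiefile` option. The sketch below is an assumption about how a downloader could be configured, not the project's actual `YouTubeDownloader`; the helper names `build_ydl_opts` and `download_audio` are hypothetical, and audio extraction requires FFmpeg to be installed.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def build_ydl_opts(output_dir: str, cookie_file: Optional[str] = "cookies.txt") -> Dict[str, Any]:
    """Build yt-dlp options for audio-only download (hypothetical helper)."""
    opts: Dict[str, Any] = {
        "format": "bestaudio/best",
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        # Convert the downloaded stream to MP3 (needs FFmpeg on PATH)
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
        "quiet": True,
    }
    # Only pass the cookie file if it actually exists
    if cookie_file and Path(cookie_file).is_file():
        opts["cookiefile"] = cookie_file
    return opts


def download_audio(url: str, output_dir: str = "downloads") -> None:
    """Download and extract audio from a single video URL."""
    import yt_dlp  # imported lazily so the options helper works without it

    with yt_dlp.YoutubeDL(build_ydl_opts(output_dir)) as ydl:
        ydl.download([url])
```

Calling `download_audio("https://www.youtube.com/watch?v=video_id")` would then use `cookies.txt` automatically whenever the file is present in the working directory.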
## STT Key Application and Free Quota
- [Speechmatics](https://www.speechmatics.com/) 8 hours free per month - [Pricing](https://www.speechmatics.com/pricing)
- [Gladia](https://app.gladia.io/) 10 hours free per month - [Pricing](https://app.gladia.io/billing)
- [AssemblyAI](https://www.assemblyai.com/) Total $50 free credit - [Pricing](https://www.assemblyai.com/pricing)
- [Deepgram](https://deepgram.com/) Total $200 free credit - [Pricing](https://deepgram.com/pricing)
> The quotas above are for reference only; check each provider's pricing page for current terms.
## License
Licensed under the MIT License.