# voicevox-mcp
This project is an MCP (Model Context Protocol) server that integrates with the VOICEVOX engine to enable voice synthesis and speaker information retrieval. It is implemented in TypeScript and utilizes the MCP SDK.
<a href="https://glama.ai/mcp/servers/@Yuki10Kobayashi/voicevox-mcp">
<img width="380" height="200" src="https://glama.ai/mcp/servers/@Yuki10Kobayashi/voicevox-mcp/badge" alt="VOICEVOX Server MCP server" />
</a>
# Features
- Retrieve speaker information from the VOICEVOX engine (/speakers)
- Synthesize text into speech using a specified speaker and play it locally (/speak)
- Supports macOS only (local playback relies on the `afplay` command)
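The two features above map onto the VOICEVOX engine's HTTP API roughly as follows. This is a minimal sketch, not this project's actual implementation: the helper names (`listSpeakers`, `synthesize`) and the injected `HttpFn` type are hypothetical, and only the engine endpoints (`GET /speakers`, `POST /audio_query`, `POST /synthesis`) and the `speedScale` query field come from the engine itself.

```typescript
// Minimal fetch-like signature (hypothetical), so a stub can be injected for testing.
type HttpResponse = {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  arrayBuffer(): Promise<ArrayBuffer>;
};
type HttpFn = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<HttpResponse>;

// GET /speakers returns the available speakers and their styles.
async function listSpeakers(http: HttpFn, baseUrl: string): Promise<unknown> {
  const res = await http(`${baseUrl}/speakers`);
  if (!res.ok) throw new Error(`GET /speakers failed: ${res.status}`);
  return res.json();
}

// Synthesis takes two steps: /audio_query builds a query, /synthesis renders it to WAV.
async function synthesize(
  http: HttpFn,
  baseUrl: string,
  text: string,
  speaker: number,
  speedScale = 1.0
): Promise<ArrayBuffer> {
  const params = `text=${encodeURIComponent(text)}&speaker=${speaker}`;
  const queryRes = await http(`${baseUrl}/audio_query?${params}`, { method: "POST" });
  if (!queryRes.ok) throw new Error(`POST /audio_query failed: ${queryRes.status}`);

  // Adjust the speaking rate before rendering.
  const query = (await queryRes.json()) as Record<string, unknown>;
  query.speedScale = speedScale;

  const wavRes = await http(`${baseUrl}/synthesis?speaker=${speaker}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(query),
  });
  if (!wavRes.ok) throw new Error(`POST /synthesis failed: ${wavRes.status}`);
  return wavRes.arrayBuffer();
}
```

In Node 18 or later, the global `fetch` can be passed as the `http` argument, for example `synthesize(fetch, "http://localhost:50021", "こんにちは", 8, 1.2)`. The returned bytes can then be written to a temporary `.wav` file for playback.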
# Setup
## Starting the VOICEVOX Engine (Docker recommended)
```sh
docker compose up -d
```
This will start the VOICEVOX engine at localhost:50021.
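The container can take a while to accept requests after `docker compose up -d` returns. One way to confirm the engine is ready is to poll its `GET /version` endpoint. The sketch below assumes a hypothetical helper name (`waitForEngine`) and an injected fetch-like function; it is not part of this repository.

```typescript
// Minimal fetch-like signature (hypothetical) covering only what this check needs.
type VersionFetch = (url: string) => Promise<{ ok: boolean; text(): Promise<string> }>;

// Polls GET /version until the engine answers or the attempts run out.
async function waitForEngine(
  http: VersionFetch,
  baseUrl: string,
  attempts = 10,
  delayMs = 1000
): Promise<string> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await http(`${baseUrl}/version`);
      // The body is the engine version as returned by the server.
      if (res.ok) return (await res.text()).trim();
    } catch {
      // Engine is not accepting connections yet; retry below.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`VOICEVOX engine did not respond at ${baseUrl}`);
}
```

With Node 18 or later, `waitForEngine(fetch, "http://localhost:50021")` resolves once the engine is up and rejects if it never responds.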
## Install Dependencies & Build
```sh
npm install
npm run build
```
# Usage
## Example Cursor Configuration
```.cursor/mcp.json
{
  "mcpServers": {
    "voicevox-mcp": {
      "command": "node",
      "args": ["${Path to Repository}/dist/index.js"],
      "env": {
        "SPEAKER_ID": "8",
        "SPEED_SCALE": "1.2",
        "VOICEVOX_API_URL": "http://localhost:50021"
      }
    }
  }
}
```
Set the VOICEVOX_API_URL as needed.
- You can retrieve the list of speakers using the speakers tool from the MCP client.
- You can synthesize text into speech and play it locally using the speak tool. Playback uses the `afplay` command, so this works out of the box on macOS only.
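Environment variables always reach the server process as strings, so the three settings in the configuration above need parsing before use. The following is a sketch of one way to do that. The `loadConfig` name, the `VoicevoxConfig` shape, and the fallback values (speaker `1`, speed `1.0`) are assumptions for illustration; the actual defaults used by this project are not documented here.

```typescript
interface VoicevoxConfig {
  speakerId: number;
  speedScale: number;
  apiUrl: string;
}

// Parses the variables from the "env" block; fallback values are assumptions.
function loadConfig(env: Record<string, string | undefined>): VoicevoxConfig {
  const speakerId = Number(env.SPEAKER_ID ?? "1");
  const speedScale = Number(env.SPEED_SCALE ?? "1.0");
  // Drop trailing slashes so paths can be appended as `${apiUrl}/speakers`.
  const apiUrl = (env.VOICEVOX_API_URL ?? "http://localhost:50021").replace(/\/+$/, "");

  if (!Number.isInteger(speakerId) || speakerId < 0) {
    throw new Error(`SPEAKER_ID must be a non-negative integer, got "${env.SPEAKER_ID}"`);
  }
  if (!Number.isFinite(speedScale) || speedScale <= 0) {
    throw new Error(`SPEED_SCALE must be a positive number, got "${env.SPEED_SCALE}"`);
  }
  return { speakerId, speedScale, apiUrl };
}
```

In the server entry point this would be called as `loadConfig(process.env)`, which fails fast on a malformed value instead of sending it to the engine.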
Main dependencies:
- `@modelcontextprotocol/sdk`
- `zod`
- `typescript`
# Notes
- Future improvements
- Voice synthesis is unavailable unless the VOICEVOX engine is running and reachable (localhost:50021 in the setup above).
- On platforms other than macOS, replace the `afplay` call with an equivalent audio player command.
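One possible way to adapt playback to other platforms is to choose the player command from the platform name. This is a sketch under assumptions: `playerFor` is a hypothetical helper, `aplay` requires ALSA utilities to be installed on Linux, and the Windows branch relies on PowerShell's `System.Media.SoundPlayer`. None of this is part of the current project, which calls `afplay` only.

```typescript
type PlayerCommand = { command: string; args: string[] };

// Maps a Node.js platform string to a command that plays a WAV file.
function playerFor(platform: string, wavPath: string): PlayerCommand {
  switch (platform) {
    case "darwin":
      return { command: "afplay", args: [wavPath] };
    case "linux":
      // Assumes ALSA utilities are installed.
      return { command: "aplay", args: [wavPath] };
    case "win32": {
      // Double any single quotes so the path stays inside the PowerShell string literal.
      const quoted = wavPath.replace(/'/g, "''");
      return {
        command: "powershell",
        args: ["-NoProfile", "-Command", `(New-Object System.Media.SoundPlayer '${quoted}').PlaySync()`],
      };
    }
    default:
      throw new Error(`No known audio player for platform: ${platform}`);
  }
}
```

The result can be passed to `spawn(command, args)` from `node:child_process`, with `process.platform` as the first argument to `playerFor`.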
# License
MIT License