# Gemini MCP - Intelligent Image Analysis Service Based on Gemini
## Project Overview
Gemini MCP is a Model Context Protocol (MCP) server built on the Google Gemini 2.0 Flash model and designed for image analysis and processing. It integrates with AI assistants that support MCP, such as Claude Desktop and Cursor, and gives them visual understanding capabilities.
## Core Features
### 🎯 Main Functions
- **Multimodal Analysis**: Supports image content understanding, scene recognition, text extraction, and more.
- **Flexible Input**: Supports various image input methods including local file paths, network URLs, and Base64 encoding.
- **Streaming Response**: Streams analysis results as they are generated, so output appears before the full response is complete.
- **Intelligent Storage**: Automatically saves processing results and generated images.
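The three input forms listed under Flexible Input can be told apart with a little string inspection. The sketch below is illustrative only: `classify_image_input` is a hypothetical helper, not part of gemini-mcp, and the server may detect input types differently.

```python
import base64
import binascii
from urllib.parse import urlsplit


def classify_image_input(image_input: str) -> str:
    """Return "url", "base64", or "path" for an image input string.

    Hypothetical helper for illustration; not the project's actual logic.
    """
    parts = urlsplit(image_input)
    if parts.scheme in ("http", "https") and parts.netloc:
        return "url"

    # Data URIs such as "data:image/png;base64,..." carry a Base64 payload.
    if image_input.startswith("data:image/") and ";base64," in image_input:
        return "base64"

    # Heuristic: a long string that decodes strictly is treated as raw Base64.
    # Typical file paths fail here because "." and "\" are outside the
    # Base64 alphabet; short strings are left as paths to limit ambiguity.
    if len(image_input) >= 64:
        try:
            base64.b64decode(image_input, validate=True)
            return "base64"
        except (binascii.Error, ValueError):
            pass

    return "path"
```

The length threshold of 64 characters is an arbitrary assumption; an extension-less path made only of Base64 characters could still be misclassified, so an explicit data URI is the safer way to pass encoded images.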
### 🚀 Technical Advantages
- **Zero Dependency Installation**: Runs directly with `uvx`, so the package does not need to be installed beforehand.
- **Cross-Platform Compatibility**: Supports mainstream operating systems such as macOS, Windows, and Linux.
- **Proxy Support**: Built-in SOCKS5 proxy support to adapt to various network environments.
- **Standard Protocol**: Fully compliant with MCP specifications, can be integrated with any MCP client.
## Quick Start
### Method 1: Run with uvx (Recommended)
No package installation is required. With `uvx` available (it ships with uv), run the server directly:
```bash
# Set API key and start the service
GEMINI_API_KEY=your-api-key uvx gemini-mcp
```
### Method 2: Install via pip
```bash
# Install the package
pip install gemini-mcp
# Run the service
GEMINI_API_KEY=your-api-key gemini-mcp
```
### Method 3: Run from Source
```bash
# Clone the repository
git clone https://github.com/chengfeng2025/gemini-mcp-python.git
cd gemini-mcp-python
# Install dependencies
pip install -r requirements.txt
# Run the service (the API key is required here as well)
GEMINI_API_KEY=your-api-key python -m gemini_mcp
```
## Client Configuration
### Claude Desktop Configuration
1. Open the configuration file:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
2. Add the following configuration:
```json
{
"mcpServers": {
"gemini": {
"command": "uvx",
"args": ["gemini-mcp"],
"env": {
"GEMINI_API_KEY": "your-api-key"
}
}
}
}
```
### Cursor Configuration
Edit `~/.cursor/mcp.json`:
```json
{
"mcpServers": {
"gemini": {
"command": "uvx",
"args": ["gemini-mcp"],
"env": {
"GEMINI_API_KEY": "your-api-key"
}
}
}
}
```
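Both client files use the same `mcpServers` shape, so the entry can also be added programmatically without overwriting servers that are already configured. This is a minimal sketch; `add_gemini_server` is a hypothetical helper, not something shipped with gemini-mcp, and it assumes the config file contains plain JSON.

```python
import json
from pathlib import Path


def add_gemini_server(config_path: Path, api_key: str) -> dict:
    """Merge a "gemini" entry into an MCP client config file and return the result.

    Hypothetical helper for illustration only.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Keep any servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["gemini"] = {
        "command": "uvx",
        "args": ["gemini-mcp"],
        "env": {"GEMINI_API_KEY": api_key},
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_gemini_server(Path.home() / ".cursor" / "mcp.json", "your-api-key")` would update the Cursor file named above. Restart the client afterwards so it reloads the configuration.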
## Usage Examples
Once Claude Desktop or Cursor is configured, you can send prompts such as:
```
# Analyze a local image
Please analyze this image: /Users/name/Pictures/photo.jpg
# Analyze an online image
Describe the content of this image: https://example.com/image.png
# Extract text from an image
Extract all text from the image: /path/to/document.png
# Scene understanding
What scene was this image taken in? /path/to/scene.jpg
```
## Advanced Configuration
### Environment Variables
| Variable Name | Description | Default Value |
|---------------|-------------|---------------|
| `GEMINI_API_KEY` | Gemini API key (required) | - |
| `OUTPUT_DIR` | Output file save directory | `./outputs` |
| `ALL_PROXY` | SOCKS5 proxy address | - |
| `LOG_LEVEL` | Log level | `INFO` |
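To make the defaults in the table concrete, the sketch below shows one way such variables could be read in Python. The `Settings` class and `load_settings` function are hypothetical names used for illustration; they are not taken from the gemini-mcp source.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Illustrative container mirroring the environment variable table."""

    api_key: str
    output_dir: str = "./outputs"
    proxy: Optional[str] = None
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, applying the documented defaults."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # GEMINI_API_KEY is the only required variable.
        raise RuntimeError("GEMINI_API_KEY is required but not set")
    return Settings(
        api_key=api_key,
        output_dir=env.get("OUTPUT_DIR", "./outputs"),
        proxy=env.get("ALL_PROXY") or None,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.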
### Command Line Parameters
```bash
# View all available parameters
gemini-mcp --help
# Run in HTTP service mode
gemini-mcp --mode http --port 8080
# Enable debug mode
gemini-mcp --debug
# Specify output directory
gemini-mcp --output-dir /custom/path
```
## API Reference
### Supported Tools
#### `analyze_image`
Analyze image content and return a description.
**Parameters:**
- `image_input`: Image input (file path, URL, or Base64)
- `prompt`: Analysis prompt (optional)
**Example:**
```json
{
"tool": "analyze_image",
"arguments": {
"image_input": "/path/to/image.jpg",
"prompt": "Describe the main content of this image"
}
}
```
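MCP clients normally build the wire message for you. For reference, MCP uses JSON-RPC 2.0, and a tool invocation is sent as a `tools/call` request. The sketch below builds such a message for `analyze_image`; the function name `build_analyze_image_request` is a hypothetical helper, and transport details (stdio or HTTP) are left out.

```python
import json
from typing import Optional


def build_analyze_image_request(
    request_id: int, image_input: str, prompt: Optional[str] = None
) -> str:
    """Serialize a JSON-RPC 2.0 "tools/call" request for analyze_image.

    Hypothetical helper for illustration only.
    """
    arguments = {"image_input": image_input}
    if prompt is not None:
        # The prompt parameter is optional, so omit it when not given.
        arguments["prompt"] = prompt

    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "analyze_image", "arguments": arguments},
    }
    return json.dumps(message)
```

A client would send this string only after the usual MCP initialization handshake has completed.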
## Development Guide
### Local Development
```bash
# Clone the project
git clone https://github.com/chengfeng2025/gemini-mcp-python.git
cd gemini-mcp-python
# Create a virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
```
### Contributing Code
1. Fork the project
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Create a Pull Request
## Troubleshooting
### Common Issues
**Q: "API key not found" message**
A: Ensure that the `GEMINI_API_KEY` environment variable is set correctly.
**Q: Connection timeout error**
A: Check your network connection or configure the proxy:
```bash
ALL_PROXY=socks5://127.0.0.1:1080 gemini-mcp
```
**Q: Claude Desktop cannot recognize the service**
A: Restart the Claude Desktop application to reload the configuration.
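A malformed `ALL_PROXY` value is an easy mistake to make when configuring the proxy mentioned in the timeout answer above. The sketch below checks that a value looks like a SOCKS5 URL before the server is started. `check_socks5_proxy` is a hypothetical helper, and accepting the `socks5h` scheme is an assumption rather than documented gemini-mcp behavior.

```python
from typing import Tuple
from urllib.parse import urlsplit


def check_socks5_proxy(value: str) -> Tuple[str, int]:
    """Return (host, port) for a SOCKS5 proxy URL, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(value)
    if parts.scheme not in ("socks5", "socks5h"):
        raise ValueError(f"expected a socks5:// URL, got: {value!r}")
    # Accessing .port raises ValueError itself if the port is not numeric.
    if not parts.hostname or parts.port is None:
        raise ValueError(f"proxy URL needs a host and a port: {value!r}")
    return parts.hostname, parts.port
```

This only validates the format of the value; it does not confirm that a proxy is actually listening at that address.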
## Project Information
- **Author**: chengfeng2025
- **License**: MIT
- **Version**: 1.0.0
- **Last Updated**: January 2025
- **GitHub**: [gemini-mcp-python](https://github.com/chengfeng2025/gemini-mcp-python)
## Related Links
- [MCP Protocol Specification](https://modelcontextprotocol.io/)
- [Gemini API Documentation](https://ai.google.dev/gemini-api/docs)
- [Issue Feedback](https://github.com/chengfeng2025/gemini-mcp-python/issues)
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
---
**Note**: Using this project requires a valid Gemini API key.
### How to Obtain an API Key
1. **Official Channel**: Visit [Google AI Studio](https://makersuite.google.com/app/apikey) to obtain an official key (may require a VPN in regions where the site is not directly accessible).
2. **Rabbit API**: Visit the [Rabbit API Recharge Platform](https://api.tu-zi.com/topup) to purchase a third-party API service that advertises compatibility with the official Gemini API format and direct access without a VPN.