# Unreal Engine Documentation MCP Server
[English](README_EN.md) | 中文
This project provides an MCP (Model Context Protocol) server for the official Unreal Engine documentation, supporting intelligent document querying and access features based on **vector semantic search**.
## Project Background
While learning Unreal Engine development, developers often turn to large language models (LLMs) for technical guidance and solutions. However, LLMs are prone to hallucination and can return inaccurate or outdated information that misleads learners.
To address this, the model needs an accurate, reliable index of the official Unreal Engine documentation. By searching that index, it can retrieve real links and descriptions from the official documentation and ground its technical guidance in them.
## Solution
This project offers an MCP (Model Context Protocol) server designed for intelligent search and indexing of the official Unreal Engine documentation. Because the official website loads its navigation menus dynamically, the collector expands those menus programmatically to capture the complete document structure, covering all 2400+ document pages.
## Features
- 🔍 **Intelligent Document Search**: Supports mixed Chinese and English searches, quickly finding relevant official Unreal Engine documentation.
- 🎯 **Exact Matching**: Keyword exact matching ensures the accuracy of search results.
- 🧠 **Semantic Search**: Intelligent search based on vector embeddings understands query intent.
- 📚 **Complete Document Coverage**: Includes 2400+ official document pages, covering various aspects of Unreal Engine.
- 🔀 **Hybrid Search Strategy**: Combines keyword matching and semantic search to provide the best search results.
## How to Use in an MCP Client
### Prerequisites
To use the semantic search feature, you need to install Ollama and the vector embedding model:
```bash
# 1. Install Ollama (based on your operating system)
curl -fsSL https://ollama.ai/install.sh | sh
# 2. Start the Ollama service
ollama serve
# 3. Download the embedding model
ollama pull bge-m3
```
### Cursor Configuration
Create or edit the `.cursor/mcp.json` configuration file in the project root directory:
```json
{
  "mcpServers": {
    "unreal-engine-docs-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "unreal-engine-docs-mcp"
      ],
      "env": {
        "MAX_KEYWORD_RESULTS": "20",
        "MAX_SEMANTIC_RESULTS": "20",
        "OLLAMA_BASE_URL": "http://localhost:11434"
      }
    }
  }
}
```
### VSCode Configuration
Create or edit the `.vscode/mcp.json` configuration file in the project root directory:
```json
{
  "servers": {
    "unreal-engine-docs-mcp": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "-y",
        "unreal-engine-docs-mcp"
      ],
      "env": {
        "MAX_KEYWORD_RESULTS": "20",
        "MAX_SEMANTIC_RESULTS": "20",
        "OLLAMA_BASE_URL": "http://localhost:11434"
      }
    }
  }
}
```
After completing the configuration, restart your IDE to use the Unreal Engine documentation search feature in the AI assistant.
### Environment Variable Explanation
You can adjust the search behavior by setting environment variables. These variables can be configured in the `env` field of `.cursor/mcp.json` or `.vscode/mcp.json`.
| Environment Variable | Meaning | Default Value |
| ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------- |
| `MAX_KEYWORD_RESULTS` | Maximum number of results returned for exact keyword matching. | `10` |
| `MAX_SEMANTIC_RESULTS` | Maximum number of results returned for vector semantic search. | `10` |
| `OLLAMA_BASE_URL` | Address of the Ollama service used for generating vector embeddings. | `http://localhost:11434` |
**Performance and Token Consumption Explanation:**
The values of `MAX_KEYWORD_RESULTS` and `MAX_SEMANTIC_RESULTS` determine the number of returned results.
- **Higher Values**: Return more relevant documents, providing richer context for large language models, thus improving answer accuracy.
- **Token Consumption**: Each returned document link (link item) consumes approximately 50 tokens. If you set both `MAX_KEYWORD_RESULTS` and `MAX_SEMANTIC_RESULTS` to 100, the theoretical maximum token consumption will be close to `(100 + 100) * 50 = 10,000` tokens.
- **Recommended Value**: Exact keyword matching usually returns fewer results than its limit, so with both values set to `100` the actual consumption is typically around 5,000 tokens. We recommend setting both values to `100` for optimal results, but you can adjust them based on your token costs and accuracy requirements.
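
The upper bound described above is simple arithmetic, sketched here as a small helper. The function name `estimateMaxTokens` is hypothetical; the figure of 50 tokens per link is the approximation given in this section, not a measured constant.

```typescript
// Approximate tokens consumed by one returned link item (figure from this README).
const TOKENS_PER_LINK = 50;

// Theoretical maximum token cost of one search response:
// every keyword slot and every semantic slot is filled.
function estimateMaxTokens(
  maxKeywordResults: number,
  maxSemanticResults: number,
  tokensPerLink: number = TOKENS_PER_LINK,
): number {
  return (maxKeywordResults + maxSemanticResults) * tokensPerLink;
}

const atDefaults = estimateMaxTokens(10, 10);   // (10 + 10) * 50 = 1000
const atHundred = estimateMaxTokens(100, 100);  // (100 + 100) * 50 = 10000
```

This is a worst-case bound only; real responses are smaller whenever either search returns fewer results than its limit.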
## MCP Tool Functions
### search_docs_list
Queries and returns a list of links to the official Unreal Engine documentation, supporting **hybrid search** technology that combines vector semantic search and exact keyword matching.
**Parameters:**
- `search` (required): Semantic search query object with English and Chinese fields, used for vector semantic search
  - `en` (required): English semantic search keyword
  - `cn` (required): Chinese semantic search keyword
- `keyword` (required): Array of exact-match keywords, each with English and Chinese fields. Matching is a lowercase text comparison. Results are **prioritized by array order** (results matching earlier keywords are ranked higher) and rank above semantic search results.
  - Structure of each element in the array:
    - `en` (required): English exact matching keyword
    - `cn` (required): Chinese exact matching keyword
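
As a rough illustration of the parameter shape, the arguments can be modeled with two TypeScript interfaces. The type and function names (`BilingualTerm`, `SearchDocsListArgs`, `isSearchDocsListArgs`) are hypothetical; only the field names `search`, `keyword`, `en`, and `cn` come from the list above. The project lists Zod for parameter validation; this sketch uses hand-written type guards only to stay dependency-free.

```typescript
// An English/Chinese pair, used for both the semantic query and each keyword.
interface BilingualTerm {
  en: string;
  cn: string;
}

// Arguments accepted by search_docs_list, as described in the parameter list.
interface SearchDocsListArgs {
  search: BilingualTerm;
  keyword: BilingualTerm[];
}

function isBilingualTerm(value: unknown): value is BilingualTerm {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.en === "string" && typeof v.cn === "string";
}

// Runtime check that an untyped value has the required fields.
function isSearchDocsListArgs(value: unknown): value is SearchDocsListArgs {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    isBilingualTerm(v.search) &&
    Array.isArray(v.keyword) &&
    v.keyword.every((item) => isBilingualTerm(item))
  );
}

const exampleArgs: SearchDocsListArgs = {
  search: { en: "animation", cn: "角色动画" },
  keyword: [{ en: "blueprint", cn: "蓝图" }],
};
```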
**Return Count Limit:**
- Exact keyword matching: Controlled by the environment variable `MAX_KEYWORD_RESULTS`, defaulting to 10 results.
- Semantic search: Controlled by the environment variable `MAX_SEMANTIC_RESULTS`, defaulting to 10 results.
**Return Data Format:**
```json
{
  "total": 2415,
  "search": {
    "en": "animation",
    "cn": "角色动画制作"
  },
  "keyword": [
    {
      "en": "blueprint",
      "cn": "蓝图"
    }
  ],
  "combinedSearchTerm": "animation 角色动画制作",
  "searchMethod": "hybrid_search",
  "maxKeywordResults": 10,
  "maxSemanticResults": 10,
  "keywordResultCount": 3,
  "semanticResultCount": 2,
  "vectorSearchAvailable": true,
  "error": null,
  "links": [
    {
      "navTitle": "物体和角色动画制作",
      "pageTitle": "在虚幻引擎中制作角色和物体动画",
      "pageDescription": "学习如何在虚幻引擎中创建和管理角色与物体的动画系统,包括动画蓝图、状态机等高级功能。",
      "link": "https://dev.epicgames.com/documentation/zh-cn/unreal-engine/animating-characters-and-objects-in-unreal-engine",
      "searchSource": "keyword"
    },
    ...
  ]
}
```
**Return Data Field Explanation:**
- `navTitle`: Navigation title (from the document navigation menu)
- `pageTitle`: Page title (from the page content)
- `pageDescription`: Page description (from the page content summary)
- `link`: Document link
- `searchSource`: Type of search source, possible values:
  - `"keyword"`: From exact keyword matching
  - `"semantic"`: From vector semantic search
**Search Mode Explanation (`searchMethod` field):**
- `hybrid_search`: Hybrid search (exact keyword matching + vector semantic search)
- `hybrid_search_partial`: Partial hybrid search (only keyword matching, vector search unavailable or failed)
- `error`: Search execution failed
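
A client consuming the response might model the fields it needs and branch on `searchMethod`, as in the sketch below. The interface and function names are hypothetical. Field names and the three `searchMethod` values come from this section; the type `string | null` for `error` is an assumption based on the `null` shown in the sample.

```typescript
type SearchSource = "keyword" | "semantic";

interface DocLink {
  navTitle: string;
  pageTitle: string;
  pageDescription: string;
  link: string;
  searchSource: SearchSource;
}

// Only the fields this sketch reads; the real response carries more.
interface SearchDocsListResult {
  searchMethod: "hybrid_search" | "hybrid_search_partial" | "error";
  vectorSearchAvailable: boolean;
  error: string | null; // assumed type; the sample only shows null
  links: DocLink[];
}

// Split links by origin and report whether the semantic half was skipped.
function summarize(result: SearchDocsListResult) {
  return {
    ok: result.searchMethod !== "error",
    degraded: result.searchMethod === "hybrid_search_partial",
    keywordLinks: result.links.filter((l) => l.searchSource === "keyword"),
    semanticLinks: result.links.filter((l) => l.searchSource === "semantic"),
  };
}
```

Checking `degraded` lets a caller tell the user that only keyword matches were returned, for example when Ollama is not running.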
**Usage Examples:**
- Hybrid search for animation: `search_docs_list(search={en:"animation", cn:"角色动画"}, keyword=[{en:"blueprint", cn:"蓝图"}])`
- Search for blueprint materials: `search_docs_list(search={en:"blueprint", cn:"蓝图编程"}, keyword=[{en:"material", cn:"材质"}])`
- Find installation guide: `search_docs_list(search={en:"installation", cn:"安装虚幻引擎"}, keyword=[{en:"guide", cn:"指南"}])`
- Physics collision search: `search_docs_list(search={en:"physics", cn:"物理仿真"}, keyword=[{en:"collision", cn:"碰撞"}])`
- Lighting shadow features: `search_docs_list(search={en:"lighting", cn:"光照设置"}, keyword=[{en:"shadow", cn:"阴影"}])`
- Multi-keyword priority search: `search_docs_list(search={en:"game development", cn:"游戏开发"}, keyword=[{en:"blueprint", cn:"蓝图"}, {en:"material", cn:"材质"}, {en:"animation", cn:"动画"}])` (blueprint matches first, followed by material, and finally animation)
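
The examples above use shorthand call notation. On the wire, an MCP client sends a JSON-RPC 2.0 `tools/call` request; a minimal sketch of building that message follows. The helper name `buildSearchRequest` is hypothetical, and most clients construct this message for you.

```typescript
interface BilingualTerm {
  en: string;
  cn: string;
}

// Build the JSON-RPC 2.0 message an MCP client would send to invoke the tool.
function buildSearchRequest(id: number, search: BilingualTerm, keyword: BilingualTerm[]) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: {
      name: "search_docs_list",
      arguments: { search, keyword },
    },
  };
}

// Equivalent of the first usage example above.
const request = buildSearchRequest(
  1,
  { en: "animation", cn: "角色动画" },
  [{ en: "blueprint", cn: "蓝图" }],
);
const payload = JSON.stringify(request);
```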
**Note:** The maximum return quantity of search results is controlled by environment variables:
- Exact keyword matching is controlled by `MAX_KEYWORD_RESULTS`, defaulting to 10 results.
- Semantic search is controlled by `MAX_SEMANTIC_RESULTS`, defaulting to 10 results.
## Data Statistics
### Collection Results
- **Original Navigation Links**: 87
- **Dynamically Retrieved Links**: 2415
- **New Links Count**: 2328
- **Expanded Menu Items**: 492
- **Expansion Rounds**: 7
### Data Integrity
- **Page Title Coverage**: ~98.5%
- **Page Description Coverage**: ~97.2%
- **Number of Vectorized Documents**: 2415
- **Vector Dimensions**: 1024 (bge-m3 model)
## Technical Implementation
### Vector Search Engine
Based on the following tech stack:
- **LanceDB**: High-performance vector database
- **Ollama**: Local embedding model service
- **bge-m3**: Multilingual embedding model, supporting mixed Chinese and English queries
### Hybrid Search Workflow
1. **Parameter Parsing**:
   - Receive the `search` and `keyword` parameters as objects
   - Extract the English (`en`) and Chinese (`cn`) fields separately
2. **Exact Keyword Matching**:
   - Convert English and Chinese keywords to lowercase
   - Perform substring matching against navigation titles, page titles, and page descriptions
   - Match on both the English and the Chinese keyword to widen the matching range
   - Return up to the configured number of matching results
3. **Vector Semantic Search**:
   - Merge English and Chinese search terms into a single query (`search.cn + " " + search.en`)
   - Convert the merged query into a vector embedding
   - Execute a similarity search in LanceDB
   - Return the most semantically relevant results
4. **Result Merging and Deduplication**:
   - Order keyword matching results by the order of the keyword array (results matching earlier keywords are ranked higher)
   - Deduplicate keyword matching results, keeping the highest-priority occurrence
   - Add keyword matching results first (they take priority over semantic search)
   - Add semantic search results, deduplicated on the `link` field
   - Ensure the final list contains no duplicate links
5. **Graceful Degradation**: If vector search is unavailable, keyword matching results are still returned.
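
Steps 2 and 4 of the workflow above can be sketched as two small functions. This is an illustration under stated assumptions, not the project's actual code: the names `keywordMatch` and `mergeResults` are hypothetical, a document is assumed to match when either the English or the Chinese keyword appears in any of the three text fields, and the semantic results are taken as an already-computed input because producing them requires Ollama and LanceDB.

```typescript
interface BilingualTerm { en: string; cn: string }
interface DocEntry {
  navTitle: string;
  pageTitle: string;
  pageDescription: string;
  link: string;
}
type RankedDoc = DocEntry & { searchSource: "keyword" | "semantic" };

// Step 2: lowercase substring matching, walking keywords in priority order
// so that matches for earlier keywords come first.
function keywordMatch(docs: DocEntry[], keywords: BilingualTerm[], limit: number): DocEntry[] {
  const seen = new Set<string>();
  const hits: DocEntry[] = [];
  for (const kw of keywords) {
    const terms = [kw.en, kw.cn]
      .map((t) => t.trim().toLowerCase())
      .filter((t) => t.length > 0);
    for (const doc of docs) {
      if (hits.length >= limit) return hits;
      if (seen.has(doc.link)) continue; // already matched by an earlier keyword
      const fields = [doc.navTitle, doc.pageTitle, doc.pageDescription].map((f) => f.toLowerCase());
      if (terms.some((t) => fields.some((f) => f.includes(t)))) {
        seen.add(doc.link);
        hits.push(doc);
      }
    }
  }
  return hits;
}

// Step 4: keyword hits first, then semantic hits whose link is not yet present.
function mergeResults(keywordHits: DocEntry[], semanticHits: DocEntry[]): RankedDoc[] {
  const seen = new Set<string>();
  const merged: RankedDoc[] = [];
  for (const doc of keywordHits) {
    if (seen.has(doc.link)) continue;
    seen.add(doc.link);
    merged.push({ ...doc, searchSource: "keyword" });
  }
  for (const doc of semanticHits) {
    if (seen.has(doc.link)) continue;
    seen.add(doc.link);
    merged.push({ ...doc, searchSource: "semantic" });
  }
  return merged;
}
```

Passing an empty array as `semanticHits` gives the degraded behavior of step 5: the merged list then contains keyword matches only.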
### Automated Expansion Strategy
```javascript
// Find unexpanded menu buttons
const expandButtons = await page.$$('.btn-expander .icon-arrow-forward-ios:not(.is-rotated)');
```
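
One way to use this selector is to keep clicking unexpanded buttons until none remain, which is consistent with the multiple expansion rounds reported under Data Statistics. The sketch below is an assumption about how such a loop could look, not the project's script. `PageLike`, `ClickableLike`, and `expandAll` are hypothetical names; the two interfaces describe only the calls the loop needs, and a Puppeteer `Page` is expected (not verified here) to fit that shape.

```typescript
// Minimal shapes for the two calls the loop uses.
interface ClickableLike {
  click(): Promise<void>;
}
interface PageLike {
  $$(selector: string): Promise<ClickableLike[]>;
}

const UNEXPANDED_SELECTOR = ".btn-expander .icon-arrow-forward-ios:not(.is-rotated)";

// Click every unexpanded menu button, round after round, until a query
// finds none or the round limit is reached.
async function expandAll(
  page: PageLike,
  maxRounds = 20,
): Promise<{ rounds: number; clicked: number }> {
  let rounds = 0;
  let clicked = 0;
  while (rounds < maxRounds) {
    const buttons = await page.$$(UNEXPANDED_SELECTOR);
    if (buttons.length === 0) break; // everything is expanded
    rounds += 1;
    for (const button of buttons) {
      await button.click();
      clicked += 1;
    }
  }
  return { rounds, clicked };
}
```

The `maxRounds` cap guards against an endless loop if a button never reports itself as expanded. A real collector would also scroll each button into view and wait for the DOM to update after each click.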
### Error Handling Mechanism
- Automatic retry mechanism
- Scroll to the visible area
- Wait for DOM updates
- Exception capture and logging
- Vector service connection detection
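
As an illustration of the first item, an automatic retry can be written as a generic wrapper around any async task. The name `withRetry`, the attempt count, and the fixed delay are assumptions for this sketch; the project's actual retry policy is not described here.

```typescript
// Run an async task, retrying after a fixed delay when it throws.
// The last error is rethrown once all attempts are used up.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  delayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err; // remember the failure for logging or rethrow
      if (attempt < attempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

A click on a menu button or a page fetch can then be wrapped as `withRetry(() => doWork())`, where `doWork` stands for whatever operation may fail transiently.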
## Generated Data Format
### enhanced-list.json - Enhanced Link Data
```json
{
  "total": 2415,
  "generated": "2025-01-12T10:30:15.387Z",
  "stats": {
    "totalLinks": 2415,
    "withPageTitle": 2380,
    "withPageDescription": 2347,
    "completionRate": {
      "pageTitle": "98.5%",
      "pageDescription": "97.2%"
    }
  },
  "links": [
    {
      "navTitle": "New Content",
      "pageTitle": "New Features in Unreal Engine",
      "pageDescription": "Learn about the new features and improvements in Unreal Engine 5.6.",
      "link": "https://dev.epicgames.com/documentation/zh-cn/unreal-engine/whats-new"
    }
  ]
}
```
## Performance Metrics
### Build Performance
- Browser Startup: ~2-3 seconds
- Page Load: ~5-10 seconds
- Navigation Expansion: ~30-60 seconds
- Data Parsing: ~1-2 seconds
- Vector Processing: ~5-10 minutes (2415 documents)
- Total Build Time: ~15-20 minutes
### Query Performance
- Vector Search Response: <200 milliseconds
- Database Connection: <100 milliseconds
- Embedding Vector Generation: ~50-100 milliseconds
## Tech Stack
### Core Technologies
- **Node.js**: Runtime environment
- **TypeScript**: Type-safe development language
- **MCP SDK**: Implementation of Model Context Protocol
### Data Collection
- **Puppeteer**: Headless browser control
- **JSDOM**: HTML parsing and processing
- **Playwright**: Browser installation management
### Search Technologies
- **LanceDB**: Vector database
- **Ollama**: Local AI model service
- **bge-m3**: Multilingual embedding model
- **Apache Arrow**: High-performance data processing
### Development Tools
- **Vitest**: Unit testing framework
- **tsx**: TypeScript executor
- **Zod**: Parameter validation
- **Rimraf**: Cross-platform file deletion
## Environment Variables
```bash
# Maximum return quantity for exact keyword matching
MAX_KEYWORD_RESULTS=10
# Maximum return quantity for semantic search
MAX_SEMANTIC_RESULTS=10
# Ollama service address
OLLAMA_BASE_URL=http://localhost:11434
```
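
A server could read these variables along the following lines. This is a sketch, not the project's code: `loadConfig` and `parsePositiveInt` are hypothetical names, and falling back to the default when a value is missing or not a positive integer is an assumption. The defaults themselves are the ones documented above.

```typescript
interface SearchConfig {
  maxKeywordResults: number;
  maxSemanticResults: number;
  ollamaBaseUrl: string;
}

// Parse a positive integer, falling back when the value is missing or invalid.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Takes the environment as a plain object so it can be tested without
// touching the real process environment.
function loadConfig(env: Record<string, string | undefined>): SearchConfig {
  return {
    maxKeywordResults: parsePositiveInt(env.MAX_KEYWORD_RESULTS, 10),
    maxSemanticResults: parsePositiveInt(env.MAX_SEMANTIC_RESULTS, 10),
    ollamaBaseUrl: env.OLLAMA_BASE_URL || "http://localhost:11434",
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`.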
## Development and Testing
### File Structure
```
├── scripts/ # Build scripts (TypeScript)
│ ├── fetch-nav.ts # Dynamically fetch navigation structure
│ ├── parse-nav.ts # Parse HTML and generate JSON
│ ├── fetch-descriptions.ts # Fetch page titles and description information
│ ├── merge-data.ts # Merge navigation and page data
│ └── build-vector-db.ts # Build vector database
├── src/ # Source code
│ ├── index.ts # MCP server implementation
│ └── vector-search.ts # Vector search engine
├── sources/ # Data files
│ ├── list.json # Basic link list
│ ├── descriptions.json # Page description data
│ ├── enhanced-list.json # Enhanced link data (merged)
│ └── db/ # Vector database
├── tests/ # Test files
│ └── mcp-client.test.ts # MCP client tests
├── dist/ # Compiled JavaScript files
├── nav-dist.html # Dynamically fetched complete navigation (2415 links)
├── tsconfig.json # TypeScript configuration
├── tsconfig.build.json # Build configuration
└── package.json # Project configuration
```
### Installation and Configuration
#### Prerequisites
1. **Node.js**: Version >= 18.0.0
2. **Ollama**: For generating vector embeddings
```bash
# Install Ollama (based on your operating system)
curl -fsSL https://ollama.ai/install.sh | sh
# Start the Ollama service
ollama serve
# Download the embedding model
ollama pull bge-m3
```
#### Clone the Project Locally
```bash
git clone https://github.com/your-username/unreal-engine-docs-mcp.git
cd unreal-engine-docs-mcp
```
#### Install Dependencies
```bash
npm install
```
#### Build the Project
```bash
npm run build
```
### Using Existing Data
The `sources/` directory of the project already contains pre-processed metadata:
- `enhanced-list.json`: Complete data containing 2415 links to Unreal Engine documentation
- `db/`: Pre-built vector database files
You can directly use this data without rebuilding.
### Rebuild Document Data (Optional)
If you need to obtain the latest document data, you can rebuild:
#### Complete Build Process
```bash
# Complete build process (fetch navigation → parse → fetch descriptions → merge data)
npm run build-docs
```
#### Step-by-Step Execution
```bash
# 1. Fetch dynamic navigation structure
npm run fetch-nav
# 2. Parse HTML to generate link list
npm run parse-nav
# 3. Fetch page titles and descriptions
npm run fetch-descriptions
# 4. Merge data to generate enhanced link list
npm run merge-data
```
### Build Vector Database
If you need to rebuild the vector database:
```bash
# Ensure the Ollama service is running
ollama serve
# Build the vector database
npm run build-vector-db
```
### Testing
Test cases live in the `tests/` directory. Run them with:
```bash
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
```
## Troubleshooting
### Vector Search Unavailable
1. Make sure the Ollama service is running (start it with `ollama serve` if it is not)
2. Confirm that the model is installed: `ollama list`
3. Check if the vector database exists: `sources/db/`
4. Rebuild the vector database: `npm run build-vector-db`
### Data Retrieval Failure
1. Check network connection
2. Confirm that the Unreal Engine documentation website is accessible
3. Check browser installation: `npm run install-browsers`
## Future Optimization Plans
- [ ] Support incremental data updates
## Contribution Guidelines
Contributions are welcome! Please submit Issues and Pull Requests to improve this project.
### Development Process
1. Fork the project
2. Create a feature branch
3. Submit changes
4. Run tests to ensure they pass
5. Submit a Pull Request
## License
MIT License