# Document Operations MCP Server
**Language / 语言**: [English](README.md) | [中文](README_zh.md)
> **Document Operations MCP Server** - A universal MCP server for document processing, conversion, and automation. Handle PDF, DOCX, HTML, Markdown, and more through a unified API and toolset.
## Demo
### Video
<https://github.com/user-attachments/assets/43dfeeec-8097-413e-8519-a7de98e31136>
In this demo, we showcase how to:
- Configure doc-ops-mcp in MCP clients
- Convert DOCX documents to PDF format
- Add default watermarks to converted PDF files
## Table of Contents
1. [Quick Start](#1-quick-start)
2. [System Architecture](#2-system-architecture)
3. [Optional Integration](#3-optional-integration)
4. [Features](#4-features)
5. [Open Source Licenses](#5-open-source-licenses)
6. [Future Roadmap](#6-future-roadmap)
7. [Docker Deployment](#7-docker-deployment)
8. [Development Guide](#8-development-guide)
9. [Troubleshooting](#9-troubleshooting)
10. [Contributing](#10-contributing)
## 1. Quick Start
First, add the Document Operations MCP server to your MCP client.
**Standard config** works in most MCP clients:
```json
{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory"
      }
    }
  }
}
```
<details>
<summary>Claude Desktop</summary>
Follow the MCP install [guide](https://modelcontextprotocol.io/quickstart/user) and use the standard config above.
</details>
<details>
<summary>VS Code</summary>
Follow the MCP install [guide](https://code.visualstudio.com/docs/copilot/chat/mcp-servers#_add-an-mcp-server) and use the standard config above.
</details>
<details>
<summary>Cursor</summary>
Go to `Cursor Settings` -> `MCP` -> `Add new MCP Server`. Give the server any name you like, then select the `command` type and enter the command `npx -y doc-ops-mcp`.
</details>
<details>
<summary>Other MCP Clients</summary>
For other MCP clients, use the standard config above and refer to your client's documentation for MCP server installation.
</details>
### Configuration
The Document Operations MCP server supports configuration through environment variables. These can be provided in the MCP client configuration as part of the `"env"` object:
```json
{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory",
        "WATERMARK_IMAGE": "/path/to/watermark.png",
        "QR_CODE_IMAGE": "/path/to/qrcode.png"
      }
    }
  }
}
```
### Supported Document Operations
| Format | Convert to PDF | Convert to DOCX | Convert to HTML | Convert to Markdown | Content Rewriting | Watermark/QR Code |
|--------|----------------|-----------------|-----------------|---------------------|-------------------|-------------------|
| **PDF** | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| **DOCX** | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| **HTML** | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
| **Markdown** | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
**Rewriting Features:**
- **Content Replacement**: Support batch text replacement and regular expression replacement
- **Format Adjustment**: Modify document structure, heading levels, and style formatting
- **Smart Rewriting**: Content optimization while preserving original document format
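The batch and regular-expression replacement described above can be pictured with a small sketch. The `ReplaceRule` shape and the `applyRules` function are hypothetical names chosen for illustration; the server's actual rewriting parameters may differ.

```typescript
// Hypothetical rule shape, not the server's documented API.
type ReplaceRule =
  | { kind: "text"; find: string; replaceWith: string }
  | { kind: "regex"; pattern: string; flags?: string; replaceWith: string };

// Apply every rule in order and return the rewritten content.
function applyRules(content: string, rules: ReplaceRule[]): string {
  return rules.reduce((text, rule) => {
    if (rule.kind === "text") {
      // An empty search string would match everywhere, so skip it.
      if (rule.find === "") return text;
      // split/join replaces every literal occurrence without regex escaping.
      return text.split(rule.find).join(rule.replaceWith);
    }
    // Force the global flag so that all matches are replaced.
    const base = rule.flags ?? "";
    const flags = base.includes("g") ? base : base + "g";
    return text.replace(new RegExp(rule.pattern, flags), rule.replaceWith);
  }, content);
}

const updated = applyRules("Acme Corp v1.2.3 (Acme Corp, 2023)", [
  { kind: "text", find: "Acme Corp", replaceWith: "Globex Inc" },
  { kind: "regex", pattern: "v\\d+\\.\\d+\\.\\d+", replaceWith: "v2.0.0" },
]);
console.log(updated); // Globex Inc v2.0.0 (Globex Inc, 2023)
```

Rules run in the order given, so a later rule sees the output of the earlier ones.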
### Usage Examples
**Format Conversion:**
```
Convert /Users/docs/report.docx to PDF
Convert /Users/docs/article.md to HTML
Convert /Users/docs/presentation.html to DOCX
Convert /Users/docs/readme.md to PDF (with theme styling)
```
**Document Rewriting:**
```
Rewrite company names in /Users/docs/contract.md
Batch replace terminology in /Users/docs/manual.docx
Adjust heading levels in /Users/docs/article.html
Update dates and version numbers in /Users/docs/policy.md
```
**PDF Enhancement:**
```
Add watermark to /Users/docs/document.pdf
Add QR code to /Users/docs/report.pdf
Add company logo watermark to /Users/docs/invoice.pdf
```
### Environment Variables
The server supports environment variables for controlling output paths and PDF enhancement features:
#### Core Directories
- **`OUTPUT_DIR`**: Controls where all generated files are saved (default: `~/Documents`)
- **`CACHE_DIR`**: Directory for temporary and cache files (default: `~/.cache/doc-ops-mcp`)
#### PDF Enhancement Features
- **`WATERMARK_IMAGE`**: Default watermark image path for PDF files
  - Automatically added to all PDF conversions
  - Supported formats: PNG, JPG
  - If not set, default text watermark "doc-ops-mcp" will be used
- **`QR_CODE_IMAGE`**: Default QR code image path for PDF files
  - Added to PDFs only when explicitly requested (`addQrCode=true`)
  - Supported formats: PNG, JPG
  - If not set, QR code functionality will be unavailable
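As a rough illustration of how these variables and their defaults fit together, the sketch below reads them into a single config object. `DocOpsConfig` and `loadConfig` are hypothetical names, and the home directory is passed in explicitly as an assumption to keep the function pure; this is not the server's actual implementation.

```typescript
// Hypothetical config shape built from the variables listed above.
interface DocOpsConfig {
  outputDir: string;
  cacheDir: string;
  watermarkImage?: string; // unset: the text watermark "doc-ops-mcp" is used
  qrCodeImage?: string;    // unset: QR code functionality is unavailable
}

// `env` and `home` are parameters so the function has no hidden inputs.
function loadConfig(
  env: Record<string, string | undefined>,
  home: string,
): DocOpsConfig {
  return {
    outputDir: env.OUTPUT_DIR || `${home}/Documents`,
    cacheDir: env.CACHE_DIR || `${home}/.cache/doc-ops-mcp`,
    watermarkImage: env.WATERMARK_IMAGE || undefined,
    qrCodeImage: env.QR_CODE_IMAGE || undefined,
  };
}

const config = loadConfig({ OUTPUT_DIR: "/srv/out" }, "/home/alice");
console.log(config.outputDir); // /srv/out
console.log(config.cacheDir);  // /home/alice/.cache/doc-ops-mcp
```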
**Output Path Rules:**
1. If `outputPath` is not provided → files saved to `OUTPUT_DIR` with auto-generated names
2. If `outputPath` is relative → resolved relative to `OUTPUT_DIR`
3. If `outputPath` is absolute → used as-is, ignoring `OUTPUT_DIR`
See [OUTPUT_PATH_CONTROL.md](./OUTPUT_PATH_CONTROL.md) for detailed documentation.
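The three output path rules can be sketched as a single function. This is a minimal illustration that assumes POSIX-style paths; `resolveOutputPath` and its `fallbackName` parameter are hypothetical, since the scheme for auto-generated names is not described here.

```typescript
import { posix as path } from "node:path";

// fallbackName stands in for the auto-generated file name (assumption).
function resolveOutputPath(
  outputDir: string,
  fallbackName: string,
  outputPath?: string,
): string {
  // Rule 1: no outputPath, so save into OUTPUT_DIR under a generated name.
  if (!outputPath) return path.join(outputDir, fallbackName);
  // Rule 3: an absolute outputPath is used as-is and OUTPUT_DIR is ignored.
  if (path.isAbsolute(outputPath)) return outputPath;
  // Rule 2: a relative outputPath is resolved against OUTPUT_DIR.
  return path.resolve(outputDir, outputPath);
}

console.log(resolveOutputPath("/data/out", "report.pdf"));                    // /data/out/report.pdf
console.log(resolveOutputPath("/data/out", "report.pdf", "pdf/final.pdf"));   // /data/out/pdf/final.pdf
console.log(resolveOutputPath("/data/out", "report.pdf", "/tmp/final.pdf"));  // /tmp/final.pdf
```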
## 2. System Architecture
Document Operations MCP Server uses a pure JavaScript architecture that covers document reading, conversion, and PDF enhancement without external system tools:
```
┌─────────────────────────────────────────────────────────────┐
│ MCP Client Layer │
│ (Claude Desktop, Cursor, VS Code, etc.) │
└─────────────────────┬───────────────────────────────────────┘
│ JSON-RPC 2.0
┌─────────────────────┴───────────────────────────────────────┐
│ Doc-Ops-MCP Server │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │
│ │ Tool Router │ │ Request │ │ Response │ │
│ │ & Handler │ │ Validator │ │ Formatter │ │
│ └────────┬────────┘ └────────┬────────┘ └──────┬──────┘ │
│ │ │ │ │
│ ┌────────┴────────────────────┴──────────────────┴─────┐ │
│ │ Document Processing Engine │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Document │ │ Format │ │ Style │ │ │
│ │ │ Reader │ │ Converter │ │ Processor │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ PDF │ │ Watermark/ │ │ Conversion │ │ │
│ │ │ Enhancement │ │ QR Code │ │ Planner │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
└────┴───────────────────────────────────────────────────────┴─┘
│
┌───────────────────────────┴─────────────────────────────────┐
│ Core Dependencies Layer │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ pdf-lib │ │word-extractor│ │ marked │ │
│ │ (PDF Tools) │ │(DOCX Reader)│ │ (Markdown) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ cheerio │ │ jszip │ │ docx │ │
│ │(HTML Parser)│ │(ZIP Handler)│ │(DOCX Gen.) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ xml2js │ │Custom OOXML │ │
│ │(XML Parser) │ │ Parser │ │
│ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Architecture Overview
**Core Features**:
- Pure JavaScript implementation with no external system dependencies
- Complete document reading, conversion, and style processing capabilities
- Built-in PDF watermark and QR code addition functionality
- Intelligent conversion planning and path optimization
**Conversion Flow**:
- **Direct Conversion**: Supports direct conversion between most formats
- **Multi-step Conversion**: Complex conversions achieved through intermediate formats
- **Style Preservation**: Uses OOXML parser to ensure complete style integrity
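One way to picture direct versus multi-step conversion is a shortest-path search over a graph of formats. The sketch below is an illustration only: the `DIRECT` map is derived from the "Supported Document Operations" table (same-format entries omitted), and `planPath` is a hypothetical helper, not the logic behind `plan_conversion`.

```typescript
type Format = "pdf" | "docx" | "html" | "markdown";

// Direct conversions, read off the support table (assumption: ✅ means direct).
const DIRECT: Record<Format, Format[]> = {
  pdf: [],
  docx: ["pdf", "html", "markdown"],
  html: ["pdf", "markdown"],
  markdown: ["pdf", "docx", "html"],
};

// Breadth-first search: returns the shortest chain of formats, or null.
function planPath(source: Format, target: Format): Format[] | null {
  const queue: Format[][] = [[source]];
  const seen = new Set<Format>([source]);
  while (queue.length > 0) {
    const chain = queue.shift()!;
    const last = chain[chain.length - 1];
    if (last === target) return chain;
    for (const next of DIRECT[last]) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push([...chain, next]);
      }
    }
  }
  return null; // no route exists between the two formats
}

console.log(planPath("docx", "pdf")?.join(" -> "));  // docx -> pdf
console.log(planPath("html", "docx")?.join(" -> ")); // html -> markdown -> docx
console.log(planPath("pdf", "docx"));                // null
```

Under these assumptions, HTML to DOCX has no direct edge, so the search routes it through Markdown as the intermediate format.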
## 3. Optional Integration
This server can work with `playwright-mcp` for enhanced PDF conversion capabilities. Please refer to the official `playwright-mcp` documentation for detailed configuration.
### 🔧 PDF Conversion Workflow
This server supports complete PDF conversion functionality:
1. **Document Parsing**: Use OOXML parser to ensure complete style preservation
2. **Format Conversion**: Convert documents to high-quality HTML format
3. **PDF Generation**: Built-in converter or optionally work with `playwright-mcp`
4. **Enhancement Processing**: Automatically add watermarks and QR codes (if configured)
### How It Works
This server uses an intelligent conversion architecture:
1. **Smart Planning**: `plan_conversion` analyzes conversion requirements and selects optimal paths
2. **Format Conversion**: Use specialized converters to handle various document formats
3. **Style Preservation**: Ensure style integrity through OOXML parser
4. **Enhancement Processing**: Automatically add watermarks, QR codes and other enhancements
5. **Optional Integration**: Support working with `playwright-mcp` for enhanced capabilities
## 4. Features
### MCP Tools
#### Core Document Tools
| Tool Name | Description | Input Parameters | External Dependencies |
|-----------|-------------|------------------|----------------------|
| `read_document` | Read document content | `filePath`: Document path<br>`extractMetadata`: Extract metadata<br>`preserveFormatting`: Preserve formatting | None |
| `write_document` | Write document content | `content`: Document content<br>`outputPath`: Output file path<br>`encoding`: File encoding | None |
| `convert_document` | Smart document conversion | `inputPath`: Input file path<br>`outputPath`: Output file path<br>`preserveFormatting`: Preserve formatting | None |
| `plan_conversion` | Conversion planner | `sourceFormat`: Source format<br>`targetFormat`: Target format<br>`preserveStyles`: Preserve styles<br>`quality`: Conversion quality | None |
##### **read_document**
Read various document formats including PDF, DOCX, DOC, HTML, MD, and more.
**Parameters:**
- `filePath` (string, required) - Document path to read
- `extractMetadata` (boolean, optional) - Extract document metadata, defaults to `false`
- `preserveFormatting` (boolean, optional) - Preserve formatting (HTML output), defaults to `false`
##### **write_document**
Write content to document files in specified formats.
**Parameters:**
- `content` (string, required) - Content to write
- `outputPath` (string, optional) - Output file path (auto-generated if not provided)
- `encoding` (string, optional) - File encoding, defaults to `utf-8`
##### **convert_document**
Convert documents between formats with enhanced style preservation.
**Parameters:**
- `inputPath` (string, required) - Input file path
- `outputPath` (string, optional) - Output file path (auto-generated if not provided)
- `preserveFormatting` (boolean, optional) - Preserve formatting, defaults to `true`
- `useInternalPlaywright` (boolean, optional) - Use built-in Playwright for PDF conversion, defaults to `false`
##### **convert_docx_to_pdf**
Convert DOCX to PDF with automatic watermark addition (if configured).
**Parameters:**
- `docxPath` (string, required) - DOCX file path
- `outputPath` (string, optional) - Output PDF path (auto-generated if not provided)
- `addQrCode` (boolean, optional) - Whether to add QR code, defaults to `false`
- `preserveFormatting` (boolean, optional) - Preserve original formatting, defaults to `true`
- `chineseFont` (string, optional) - Chinese font, defaults to `Microsoft YaHei`
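MCP clients normally build tool calls for you, but it can help to see what goes over the wire. The sketch below constructs a JSON-RPC 2.0 `tools/call` request for `convert_docx_to_pdf`; the `buildToolCall` helper and the request `id` are illustrative assumptions, and the file path is only an example.

```typescript
// Shape of an MCP tool invocation carried over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall(1, "convert_docx_to_pdf", {
  docxPath: "/Users/docs/report.docx",
  addQrCode: false,
});
console.log(JSON.stringify(request, null, 2));
```

Omitted optional parameters such as `outputPath` fall back to the defaults listed above.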
##### **convert_markdown_to_pdf**
Convert Markdown to PDF with automatic watermark addition (if configured).
**Parameters:**
- `markdownPath` (string, required) - Markdown file path
- `outputPath` (string, optional) - Output PDF path (auto-generated if not provided)
- `theme` (string, optional) - Theme style, defaults to `"github"`
- `includeTableOfContents` (boolean, optional) - Include table of contents, defaults to `false`
- `addQrCode` (boolean, optional) - Whether to add QR code, defaults to `false`
##### **convert_markdown_to_html**
Convert Markdown to HTML.
**Parameters:**
- `markdownPath` (string, required) - Markdown file path
- `outputPath` (string, optional) - Output HTML path (auto-generated if not provided)
- `theme` (string, optional) - Theme style, defaults to `"github"`
- `includeTableOfContents` (boolean, optional) - Include table of contents, defaults to `false`
##### **convert_markdown_to_docx**
Convert Markdown to DOCX.
**Parameters:**
- `markdownPath` (string, required) - Markdown file path
- `outputPath` (string, optional) - Output DOCX path (auto-generated if not provided)
##### **convert_html_to_markdown**
Convert HTML to Markdown.
**Parameters:**
- `htmlPath` (string, required) - HTML file path
- `outputPath` (string, optional) - Output Markdown path (auto-generated if not provided)
##### **plan_conversion**
🎯 Smart Conversion Planner - Analyze conversion requirements and generate optimal conversion plans.
**Parameters:**
- `sourceFormat` (string, required) - Source file format (pdf, docx, html, markdown, md, txt, doc)
- `targetFormat` (string, required) - Target file format (pdf, docx, html, markdown, md, txt, doc)
- `sourceFile` (string, optional) - Source file path (for generating specific conversion parameters)
- `preserveStyles` (boolean, optional) - Whether to preserve style formatting, defaults to `true`
- `includeImages` (boolean, optional) - Whether to include images, defaults to `true`
- `theme` (string, optional) - Conversion theme, defaults to `github`
- `quality` (string, optional) - Conversion quality requirement (fast, balanced, high), defaults to `balanced`
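The parameter list above maps naturally onto a typed object with defaults. The following sketch shows one way a caller might validate formats and fill in the documented defaults before invoking `plan_conversion`; the type and function names are hypothetical.

```typescript
const FORMATS = ["pdf", "docx", "html", "markdown", "md", "txt", "doc"] as const;
type Format = (typeof FORMATS)[number];
type Quality = "fast" | "balanced" | "high";

interface PlanConversionParams {
  sourceFormat: Format;
  targetFormat: Format;
  sourceFile?: string;
  preserveStyles?: boolean;
  includeImages?: boolean;
  theme?: string;
  quality?: Quality;
}

// Narrow an arbitrary string to one of the documented formats.
function parseFormat(value: string): Format {
  const match = FORMATS.find((f) => f === value.toLowerCase());
  if (!match) throw new Error(`Unsupported format: ${value}`);
  return match;
}

// Fill in the defaults documented for plan_conversion.
function resolvePlanParams(p: PlanConversionParams) {
  return {
    sourceFormat: p.sourceFormat,
    targetFormat: p.targetFormat,
    sourceFile: p.sourceFile,
    preserveStyles: p.preserveStyles ?? true,
    includeImages: p.includeImages ?? true,
    theme: p.theme ?? "github",
    quality: p.quality ?? ("balanced" as Quality),
  };
}

const plan = resolvePlanParams({
  sourceFormat: parseFormat("docx"),
  targetFormat: parseFormat("pdf"),
  quality: "high",
});
console.log(plan.quality, plan.theme, plan.preserveStyles); // high github true
```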
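##### **process_pdf_post_conversion**
Post-process a PDF that has already been generated (for example by `playwright-mcp`), optionally adding a watermark and QR code.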
**Parameters:**
- `playwrightPdfPath` (string, required) - Generated PDF file path
- `targetPath` (string, optional) - Target PDF file path (auto-generated if not provided)
- `addWatermark` (boolean, optional) - Whether to add watermark, defaults to `false`
- `addQrCode` (boolean, optional) - Whether to add QR code, defaults to `false`
- `watermarkImage` (string, optional) - Watermark image path
- `qrCodePath` (string, optional) - QR code image path
#### PDF Enhancement Tools
##### **add_watermark**
🎨 PDF Watermark Addition Tool - Add image or text watermarks to PDF documents.
**Parameters:**
- `pdfPath` (string, required) - PDF file path
- `watermarkImage` (string, optional) - Watermark image path (PNG/JPG)
- `watermarkText` (string, optional) - Watermark text content
- `watermarkImageScale` (number, optional) - Image scale ratio, defaults to `0.25`
- `watermarkImageOpacity` (number, optional) - Image opacity, defaults to `0.6`
- `watermarkImagePosition` (string, optional) - Image position, defaults to `fullscreen`
##### **add_qrcode**
📱 PDF QR Code Addition Tool - Add QR codes to PDF documents.
**Parameters:**
- `pdfPath` (string, required) - PDF file path
- `qrCodePath` (string, optional) - QR code image path
- `qrScale` (number, optional) - QR code scale ratio, defaults to `0.15`
- `qrOpacity` (number, optional) - QR code opacity, defaults to `1.0`
- `qrPosition` (string, optional) - QR code position, defaults to `bottom-center`
- `addText` (boolean, optional) - Whether to add explanatory text, defaults to `true`
## System Requirements
- **Node.js** ≥ 18.0.0
- **Zero external system dependencies** - All processing via npm packages
- **Optional Integration**: playwright-mcp for enhanced PDF conversion
### Core Technology Stack
- **pdf-lib** - PDF operations and enhancement
- **word-extractor** - DOCX document text extraction
- **marked** - Markdown parsing and rendering
- **cheerio** - HTML parsing and manipulation
- **docx** - DOCX document generation
- **jszip** - ZIP file processing
- **xml2js** - XML parsing and conversion
- **Custom OOXML Parser** - Advanced DOCX style preservation
### Installation
```bash
# Global installation
npm install -g doc-ops-mcp
# Or using pnpm
pnpm add -g doc-ops-mcp
# Or using bun
bun add -g doc-ops-mcp
```
### Architecture Components
- **MCP Server Core**: Handles JSON-RPC 2.0 communication and tool registration
- **Smart Router**: Routes requests to optimal processing modules
- **Conversion Engine**: Contains specialized converters for different document types
- **Style Processor**: Ensures style preservation during format conversion
- **Security Module**: Provides path validation and content security handling
## 5. Open Source Licenses
### Project License
- **This Project**: MIT License
- **Compatibility**: Available for commercial and non-commercial use
### Third-Party Dependencies
| Library | Version | License | Purpose |
|---------|---------|---------|----------|
| **pdf-lib** | ^1.17.1 | MIT | PDF document manipulation |
| **word-extractor** | ^1.0.4 | MIT | DOCX document text extraction |
| **marked** | ^15.0.12 | MIT | Markdown parsing and rendering |
| **cheerio** | ^1.0.0-rc.12 | MIT | HTML parsing and manipulation |
| **docx** | ^9.5.1 | Apache-2.0 | DOCX document generation |
| **jszip** | ^3.10.1 | MIT | ZIP file processing |
| **xml2js** | ^0.6.2 | MIT | XML parsing and conversion |
### License Compatibility
- ✅ **Commercial Use**: All dependencies support commercial use
- ✅ **Distribution**: Free to distribute and modify
- ✅ **Patent Protection**: Apache-2.0 provides patent protection
- ⚠️ **Notice**: Original license notices must be retained
## 6. Future Roadmap
### Core Features
- 🔄 **Enhanced Conversion Quality**: Improve style preservation for complex documents
- 📊 **Excel Support**: Complete Excel read/write and conversion functionality
- 🎨 **Template System**: Support for custom document templates
- 🔍 **OCR Integration**: Image text recognition capabilities
### System Improvements
- 🌐 **Multi-language Support**: Internationalization and localization
- 🔐 **Security Enhancements**: Document encryption and access control
- ⚡ **Performance Optimization**: Large file handling and memory optimization
- 🔌 **Plugin System**: Extensible processor architecture
### Version Roadmap
- **v2.0**: Complete Excel support and template system
- **v3.0**: OCR integration and multi-language support
- **v4.0**: Advanced security features and plugin system
## 7. Docker Deployment
### Quick Start
#### Using Pre-built Image
```bash
# Pull the latest image
docker pull docops/doc-ops-mcp:latest
# Run with default configuration
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
docops/doc-ops-mcp:latest
```
#### Building from Source
```bash
# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp
# Build the Docker image
docker build -t doc-ops-mcp .
# Run the container
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-v $(pwd)/documents:/app/documents \
doc-ops-mcp
```
### Docker Compose Deployment
Create a `docker-compose.yml` file:
```yaml
version: '3.8'
services:
  doc-ops-mcp:
    image: docops/doc-ops-mcp:latest
    container_name: doc-ops-mcp
    ports:
      - "3000:3000"
    volumes:
      - ./documents:/app/documents
      - ./config:/app/config
    environment:
      - NODE_ENV=production
      - PORT=3000
    restart: unless-stopped
  # Optional: Add Nginx for reverse proxy
  nginx:
    image: nginx:alpine
    container_name: doc-ops-nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - doc-ops-mcp
    restart: unless-stopped
```
### Environment Variables
| Variable | Description | Default |
|----------|-------------|----------|
| `PORT` | Server port | `3000` |
| `NODE_ENV` | Environment mode | `production` |
| `LOG_LEVEL` | Logging level | `info` |
| `MAX_FILE_SIZE` | Maximum file size (MB) | `50` |
### Volume Mounts
Mount local directories for persistent storage:
```bash
# Documents directory for file processing
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-v $(pwd)/documents:/app/documents \
-v $(pwd)/output:/app/output \
doc-ops-mcp
```
### Docker Configuration Examples
#### Production Deployment
```bash
# Production setup with Docker Swarm
docker swarm init
docker stack deploy -c docker-compose.yml doc-ops
# Scale the service
docker service scale doc-ops_doc-ops-mcp=3
```
### Health Checks
The container includes built-in health checks:
```bash
# Check container health
docker ps
# View health check status
docker inspect --format='{{.State.Health.Status}}' doc-ops-mcp
# Manual health check
docker exec doc-ops-mcp curl -f http://localhost:3000/health || exit 1
```
## 8. Development Guide
### Local Development
```bash
# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build the project
npm run build
# Run tests
npm test
```
### Project Structure
```
src/
├── index.ts # MCP server entry point
├── tools/ # Tool implementations
│ ├── documentConverter.ts
│ ├── pdfTools.ts
│ └── ...
├── types/ # Type definitions
└── utils/ # Utility functions
```
### Adding New Tools
1. Create a new tool file in `src/tools/`
2. Implement the tool logic
3. Register the tool in `src/index.ts`
4. Add test cases
5. Update documentation
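The steps above might look roughly like this in code. The sketch is self-contained and does not use the project's real registration code: `registerTool`, `callTool`, and the `count_words` tool are hypothetical, while the `name` / `description` / `inputSchema` shape follows the MCP tool definition format.

```typescript
// MCP tool definition: a name, a description, and a JSON Schema for inputs.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

// Hypothetical in-memory registry standing in for registration in src/index.ts.
const registry = new Map<string, { definition: ToolDefinition; handler: ToolHandler }>();

function registerTool(definition: ToolDefinition, handler: ToolHandler): void {
  if (registry.has(definition.name)) {
    throw new Error(`Duplicate tool: ${definition.name}`);
  }
  registry.set(definition.name, { definition, handler });
}

async function callTool(name: string, args: Record<string, unknown>): Promise<string> {
  const entry = registry.get(name);
  if (!entry) throw new Error(`Unknown tool: ${name}`);
  return entry.handler(args);
}

// Steps 1-3: define the tool, implement its logic, and register it.
registerTool(
  {
    name: "count_words",
    description: "Count the words in a piece of text",
    inputSchema: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
  },
  async (args) => {
    const words = String(args.text ?? "").trim().split(/\s+/).filter(Boolean);
    return `${words.length} words`;
  },
);

callTool("count_words", { text: "hello brave new world" }).then(console.log); // 4 words
```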
## 9. Troubleshooting
### Common Issues
1. **Port conflicts**: Change the host port in docker-compose.yml
2. **Permission issues**: Ensure volume mounts have correct permissions
3. **Memory issues**: Increase Docker memory allocation
### Debug Mode
```bash
# Run with debug logging
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-e LOG_LEVEL=debug \
doc-ops-mcp
# View logs
docker logs -f doc-ops-mcp
```
## 10. Contributing
### How to Contribute
1. **Fork the Project**
2. **Create a Feature Branch** (`git checkout -b feature/AmazingFeature`)
3. **Commit Your Changes** (`git commit -m 'Add some AmazingFeature'`)
4. **Push to the Branch** (`git push origin feature/AmazingFeature`)
5. **Open a Pull Request**
#### Intellectual Property License
**By submitting a Pull Request, you agree that all contributions submitted through Pull Requests will be licensed under the MIT License.** This means:
- You grant the project maintainers and users the right to use, modify, and distribute your contributions under the MIT License
- You confirm that you have the right to make these contributions
- You understand that your contributions will become part of the open source project
- You waive any claims to exclusive ownership of the contributed code
If you cannot agree to these terms, please do not submit a Pull Request.
### Code Standards
- Use TypeScript
- Follow ESLint configuration
- Add appropriate tests
- Update relevant documentation
### Reporting Issues
- Use [GitHub Issues](https://github.com/JefferyMunoz/doc-ops-mcp/issues)
- Provide detailed error information and reproduction steps
- Include system environment information
### License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.