This sample embeds and vectorizes Markdown documents so that an MCP client can search them and answer questions about them via retrieval-augmented generation (RAG).
Vectorization uses [Plamo-Embedding-1B](https://tech.preferred.jp/ja/blog/plamo-embedding-1b/).
## Features
- Text extraction and vectorization from markdown files
- Vector search using DuckDB
- Persistence of vector data via Parquet files
- Vector search from MCP
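To make the vector search feature concrete, here is a minimal sketch of the ranking step in plain Python. It assumes cosine similarity as the metric and uses tiny hand-made vectors. The function names (`cosine_similarity`, `top_k`) are hypothetical, and the sample itself runs its search inside DuckDB rather than in Python, so treat this as an illustration of the idea and not as the sample's implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat zero vectors as unrelated instead of dividing by zero.
    return dot / (norm_a * norm_b)


def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 3):
    """Rank (text, embedding) rows by similarity to the query, best first."""
    scored = [(text, cosine_similarity(query, emb)) for text, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings (assumed data); real embeddings are much larger.
rows = [
    ("DuckDB setup", [1.0, 0.0, 0.0]),
    ("Parquet export", [0.0, 1.0, 0.0]),
    ("DuckDB queries", [0.9, 0.1, 0.0]),
]
results = top_k([1.0, 0.0, 0.0], rows, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

DuckDB provides a built-in `list_cosine_similarity` function, so the same ordering could be expressed as an `ORDER BY` over an embedding column read from the Parquet file. Whether the sample uses exactly that function is not stated in this README.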
## Usage
### Generating Vector Data
First, place the Markdown files you want to search in a directory, then convert them to a Parquet file with the following command.
```bash
uv run main.py --directory ~/path/to/markdown/files --parquet vectors.parquet
```
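How `main.py` extracts text is not shown in this README. As a rough sketch of what the step before embedding might look like, the following collects Markdown files and splits each one into chunks at headings. Both function names and the heading-based chunking strategy are assumptions made for illustration.

```python
from pathlib import Path


def iter_markdown_files(directory: str):
    """Yield every .md file under the directory, in a stable order."""
    root = Path(directory).expanduser()
    yield from sorted(root.rglob("*.md"))


def split_by_heading(text: str) -> list[str]:
    """Split Markdown text into chunks, starting a new chunk at each heading."""
    chunks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in text.splitlines():
        # Track fenced code blocks so that "# comment" lines are not
        # mistaken for headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        is_heading = not in_fence and line.startswith("#")
        if is_heading and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


sample = "# Intro\nHello.\n## Usage\n```bash\n# not a heading\n```\nDone."
chunks = split_by_heading(sample)
print(len(chunks))
```

Each chunk would then be passed to the embedding model, and the resulting vectors written to the Parquet file alongside the chunk text.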
### MCP Configuration
#### Build
The following command generates a single binary at `dist/server`.
```bash
uv run pyinstaller --clean --strip --noconfirm --onefile server.py
```
#### MCP Client Configuration
Configure the server according to the client you want to use.
For Claude Desktop, run the following command, setting `VECTOR_PARQUET` to the Parquet file you generated earlier:
```bash
uv run mcp install server.py -v VECTOR_PARQUET=/path/to/vectors.parquet
```
It will be configured as follows in `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "DuckDB-RAG-MCP-Sample": {
      "command": "/path/to/dist/server",
      "env": {
        "VECTOR_PARQUET": "/path/to/vectors.parquet"
      }
    }
  }
}
```
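If you prefer to edit the client configuration from a script, the entry above can be merged into an existing config without disturbing other servers. This is a minimal sketch using only the standard library; `add_server_entry` and the `other-server` entry are hypothetical names used for illustration, and the paths are the same placeholders as above.

```python
import json


def add_server_entry(config: dict, name: str, command: str, parquet_path: str) -> dict:
    """Return a copy of the config with one MCP server entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {
        "command": command,
        "env": {"VECTOR_PARQUET": parquet_path},
    }
    # Build a new dict so the caller's config is left unchanged.
    return {**config, "mcpServers": servers}


# An assumed pre-existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "/path/to/other"}}}
updated = add_server_entry(
    existing,
    name="DuckDB-RAG-MCP-Sample",
    command="/path/to/dist/server",
    parquet_path="/path/to/vectors.parquet",
)
print(json.dumps(updated, indent=2))
```

Writing `updated` back to the config file and restarting the client would be the remaining steps; they are left out here because the file location depends on the client and platform.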
### Starting Development Server
```bash
uv run mcp dev server.py
```
## License
DuckDB RAG MCP Sample is provided under the Apache License, Version 2.0.