# LLM-MCP-RAG Experimental Project
> This project is a Python implementation based on [KelvinQiu802/llm-mcp-rag](https://github.com/KelvinQiu802/llm-mcp-rag) for learning and practicing LLM, MCP, and RAG technologies.
>
> The author of the original project has a demonstration video at https://www.bilibili.com/video/BV1dcRqYuECf/
>
> It is highly recommended to read the original repository's README first, since this repository adjusts some of its logic and naming!
## Project Overview
This is an experimental project built around Large Language Models (LLMs), the Model Context Protocol (MCP), and Retrieval-Augmented Generation (RAG). It demonstrates how to build an AI assistant that can interact with external tools and ground its answers with retrieved context.
### Core Features
- Large language model calls based on OpenAI API
- Interaction between LLM and external tools via MCP (Model Context Protocol)
- Implementation of a RAG (Retrieval-Augmented Generation) system based on vector retrieval
- Support for file system operations and web content retrieval
## System Architecture
```mermaid
graph TD
A[User] -->|Ask| B[Agent]
B -->|Call| C[LLM]
C -->|Generate Answer/Tool Call| B
B -->|Tool Call| D[MCP Client]
D -->|Execute| E[MCP Server]
E -->|File System Operation| F[File System]
E -->|Web Retrieval| G[Web Content]
H[Documents/Knowledge Base] -->|Embed| I[Vector Store - In-Memory]
B -->|Query| I
I -->|Relevant Context| B
```
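The heart of this diagram is the Agent's tool-call loop: the LLM either answers directly or requests a tool call, which the Agent routes to an MCP client and feeds back as a tool message. Below is a minimal sketch of that loop, assuming the `openai` SDK and a `call_tool(name, args)` coroutine supplied by an MCP client; the names here are illustrative, not the project's exact code:
```python
# Illustrative sketch of the Agent loop above (not the project's exact code).
import json

from openai import AsyncOpenAI


async def agent_invoke(prompt: str, tools: list[dict], call_tool) -> str:
    client = AsyncOpenAI()  # reads OPENAI_API_KEY / OPENAI_BASE_URL from the environment
    messages: list = [{"role": "user", "content": prompt}]
    while True:
        resp = await client.chat.completions.create(
            model="gpt-4o-mini",  # assumed default; see DEFAULT_MODEL_NAME below
            messages=messages,
            **({"tools": tools} if tools else {}),
        )
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content or ""  # plain answer: the loop is done
        for tc in msg.tool_calls:  # route each requested call to an MCP client
            result = await call_tool(tc.function.name, json.loads(tc.function.arguments))
            messages.append(
                {"role": "tool", "tool_call_id": tc.id, "content": str(result)}
            )
```
In this project, `Agent.invoke()` additionally seeds the conversation with the system prompt and any retrieved RAG context (see the `system_prompt` and `context` attributes in the class diagram below).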
## Main Components
```mermaid
classDiagram
class Agent {
+mcp_clients: list[MCPClient]
+model: str
+llm: AsyncChatOpenAI
+system_prompt: str
+context: str
+init()
+cleanup()
+invoke(prompt: str)
}
class MCPClient {
+name: str
+command: str
+args: list[str]
+version: str
+init()
+cleanup()
+get_tools()
+call_tool(name: str, params: dict)
}
class AsyncChatOpenAI {
+model: str
+messages: list
+tools: list[Tool]
+system_prompt: str
+context: str
+chat(prompt: str, print_llm_output: bool)
+get_tools_definition()
+append_tool_result(tool_call_id: str, tool_output: str)
}
class EmbeddingRetriever {
+embedding_model: str
+vector_store: VectorStore
+embed_query(query: str)
+embed_documents(document: str)
+retrieve(query: str, top_k: int)
}
class VectorStore {
+items: list[VectorStoreItem]
+add(item: VectorStoreItem)
+search(query_embedding: list[float], top_k: int)
}
class ALogger {
+prefix: str
+title(text: str, rule_style: str)
}
Agent --> MCPClient
Agent --> AsyncChatOpenAI
Agent ..> EmbeddingRetriever
EmbeddingRetriever --> VectorStore
Agent ..> ALogger
AsyncChatOpenAI ..> ALogger
```
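For reference, an `MCPClient` like the one above maps naturally onto the official `mcp` Python SDK: the server is spawned as a subprocess from `command` and `args` and spoken to over stdio. This is only a reference sketch, not the project's `mcp_client.py`; the filesystem server is an example choice:
```python
# Hedged sketch of an MCP client over stdio using the official `mcp` SDK;
# the filesystem server and its args are example choices, not fixed config.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    params = StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "."],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # handshake, like MCPClient.init()
            tools = await session.list_tools()  # like MCPClient.get_tools()
            print([tool.name for tool in tools.tools])
            # like MCPClient.call_tool(name, params):
            # result = await session.call_tool("read_file", {"path": "README.md"})


asyncio.run(main())
```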
## Quick Start
### Environment Setup
1. Ensure Python 3.12 or higher is installed
2. Clone this repository
3. Copy `.env.example` to `.env` and fill in the necessary configuration information:
- `OPENAI_API_KEY`: OpenAI API key
- `OPENAI_BASE_URL`: OpenAI API base URL; be sure to keep the trailing '/v1' (default is 'https://api.openai.com/v1')
- `DEFAULT_MODEL_NAME`: (optional) Default model name to use (default is "gpt-4o-mini")
- `EMBEDDING_KEY`: (optional) Embedding model API key (default is $OPENAI_API_KEY)
- `EMBEDDING_BASE_URL`: (optional) Embedding model API base URL, e.g., the SiliconFlow API or any other OpenAI-compatible API (default is $OPENAI_BASE_URL)
- `USE_CN_MIRROR`: (optional) Whether to use a China mirror; set it to any value (e.g., '1') to enable (default is false)
- `PROXY_URL`: (optional) Proxy URL (e.g., "http(s)://xxx") so that the `fetch` MCP tool goes through the proxy
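Taken together, a minimal `.env` might look like the sketch below; all values are placeholders, the optional lines can be omitted, and the SiliconFlow URL is only an example:
```bash
# .env — placeholder values; adjust to your own keys and endpoints
OPENAI_API_KEY=sk-your-key-here
OPENAI_BASE_URL=https://api.openai.com/v1

# Optional overrides
DEFAULT_MODEL_NAME=gpt-4o-mini
EMBEDDING_KEY=sk-your-embedding-key
EMBEDDING_BASE_URL=https://api.siliconflow.cn/v1
USE_CN_MIRROR=1
PROXY_URL=http://127.0.0.1:7890
```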
### Install Dependencies
```bash
# Use uv to install dependencies
uv sync
```
### Run Examples
This project uses the `just` command runner to run the different examples:
```bash
# View available commands
just help
```
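The recipes live in the `justfile` at the repository root. A hypothetical sketch of what such recipes look like (the recipe names here are assumptions; run `just help` to see the real list):
```just
# Hypothetical justfile sketch; the actual recipe names may differ.
help:
    @just --list

rag:
    uv run python src/augmented/rag_example.py
```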
## RAG Example Flow
```mermaid
sequenceDiagram
participant User as User
participant Agent as Agent
participant LLM as LLM
participant ER as EmbeddingRetriever
participant VS as VectorStore
participant MCP as MCP Client
participant Logger as ALogger
User->>Agent: Provide query
Agent->>Logger: Log operation
Agent->>ER: Retrieve relevant documents
ER->>VS: Query vector store
VS-->>ER: Return relevant documents
ER-->>Agent: Return context
Agent->>LLM: Send query and context
LLM-->>Agent: Generate answer or tool call
Agent->>Logger: Log tool call
Agent->>MCP: Execute tool call
MCP-->>Agent: Return tool result
Agent->>LLM: Send tool result
LLM-->>Agent: Generate final answer
Agent-->>User: Return answer
```
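The retrieval half of this sequence boils down to: embed the documents once, embed the query, rank by cosine similarity, and hand the top chunks to the LLM as context. A self-contained sketch, assuming an OpenAI-compatible embeddings endpoint (the embedding model name is an assumption; the project's `EmbeddingRetriever` and `VectorStore` wrap the same idea):
```python
# Self-contained sketch of the retrieve step (illustrative). Rank stored
# chunks by cosine similarity against the query embedding and use the
# top hits as the LLM's context.
import math

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY / OPENAI_BASE_URL from the environment


def embed(text: str) -> list[float]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


docs = [
    "MCP lets an LLM call external tools through a standard protocol.",
    "RAG retrieves relevant documents and injects them as context.",
]
store = [(doc, embed(doc)) for doc in docs]  # in-memory vector store

query = "What does RAG do?"
q = embed(query)
top = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)[:1]
context = "\n".join(doc for doc, _ in top)  # becomes the Agent's context
print(context)
```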
## Project Structure
- `src/augmented/`: Main source code directory
- `agent.py`: Implementation of Agent, responsible for coordinating LLM and tools
- `chat_openai.py`: OpenAI API client wrapper
- `mcp_client.py`: Implementation of MCP client
- `embedding_retriever.py`: Implementation of embedding retriever
- `vector_store.py`: Implementation of vector store
- `mcp_tools.py`: Definition of MCP tools
- `utils/`: Utility functions
- `info.py`: Project information and configuration
- `pretty.py`: Unified logging output system
- `rag_example.py`: RAG example program
- `justfile`: Task running configuration file
## Learning Resources
- [Model Context Protocol (MCP)](https://modelcontextprotocol.io/): Learn about the MCP protocol
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference): OpenAI API reference
- [RAG (Retrieval-Augmented Generation)](https://arxiv.org/abs/2005.11401): RAG technical paper