# LLM-MCP-RAG Experimental Project
> This project is a Python implementation based on [KelvinQiu802/llm-mcp-rag](https://github.com/KelvinQiu802/llm-mcp-rag) for learning and practicing LLM, MCP, and RAG technologies.
>
> The author of this project has a demonstration video available at https://www.bilibili.com/video/BV1dcRqYuECf/
>
> It is highly recommended to browse the original README first, as this repository has made some logical adjustments and naming changes!
## Project Overview
This is an experimental project built around Large Language Models (LLMs), the Model Context Protocol (MCP), and Retrieval-Augmented Generation (RAG). It demonstrates how to build an AI assistant that can interact with external tools and use retrieval-augmented generation.
### Core Features
- Large Language Model calls based on OpenAI API
- Interaction between LLM and external tools via MCP (Model Context Protocol)
- Implementation of a retrieval-augmented generation (RAG) system based on vector retrieval (sketched below)
- Support for file system operations and web content retrieval
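At its core, the vector-retrieval feature comes down to embedding text and ranking stored documents by cosine similarity against the query embedding. A minimal, self-contained sketch (the names `VectorStore`/`VectorStoreItem` follow the class diagram further below; the real implementation lives in `src/augmented/vector_store.py` and may differ in detail):
```python
# Minimal sketch of in-memory vector retrieval via cosine similarity.
# Names mirror the class diagram; details of the real code may differ.
import math
from dataclasses import dataclass, field


@dataclass
class VectorStoreItem:
    embedding: list[float]
    document: str


@dataclass
class VectorStore:
    items: list[VectorStoreItem] = field(default_factory=list)

    def add(self, item: VectorStoreItem) -> None:
        self.items.append(item)

    def search(self, query_embedding: list[float], top_k: int = 3) -> list[str]:
        # Rank all stored items by cosine similarity to the query embedding.
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_embedding, item.embedding),
            reverse=True,
        )
        return [item.document for item in ranked[:top_k]]
```
An embedding retriever then only needs to call an embedding API to turn text into these vectors before adding to or searching the store.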
## System Architecture
```mermaid
graph TD
A[User] -->|Ask| B[Agent]
B -->|Call| C[LLM]
C -->|Generate Answer/Tool Call| B
B -->|Tool Call| D[MCP Client]
D -->|Execute| E[MCP Server]
E -->|File System Operations| F[File System]
E -->|Web Retrieval| G[Web Content]
H[Documents/Knowledge Base] -->|Embed| I[Vector Store - In-Memory]
B -->|Query| I
I -->|Relevant Context| B
```
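The Agent-to-tool leg of this diagram is a stdio conversation with an MCP server subprocess. For reference, this is roughly what that handshake looks like with the official `mcp` Python SDK (the filesystem server used here is only an example; this project wraps the equivalent calls in its own `MCPClient`):
```python
# Sketch: one round trip to an MCP server over stdio, using the official
# `mcp` Python SDK. The filesystem server below is just an example.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    server = StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "."],
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # protocol handshake
            tools = await session.list_tools()  # discover what the server exposes
            print([tool.name for tool in tools.tools])


asyncio.run(main())
```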
## Main Components
```mermaid
classDiagram
class Agent {
+mcp_clients: list[MCPClient]
+model: str
+llm: AsyncChatOpenAI
+system_prompt: str
+context: str
+init()
+cleanup()
+invoke(prompt: str)
}
class MCPClient {
+name: str
+command: str
+args: list[str]
+version: str
+init()
+cleanup()
+get_tools()
+call_tool(name: str, params: dict)
}
class AsyncChatOpenAI {
+model: str
+messages: list
+tools: list[Tool]
+system_prompt: str
+context: str
+chat(prompt: str, print_llm_output: bool)
+get_tools_definition()
+append_tool_result(tool_call_id: str, tool_output: str)
}
class EmbeddingRetriever {
+embedding_model: str
+vector_store: VectorStore
+embed_query(query: str)
+embed_documents(document: str)
+retrieve(query: str, top_k: int)
}
class VectorStore {
+items: list[VectorStoreItem]
+add(item: VectorStoreItem)
+search(query_embedding: list[float], top_k: int)
}
class ALogger {
+prefix: str
+title(text: str, rule_style: str)
}
Agent --> MCPClient
Agent --> AsyncChatOpenAI
Agent ..> EmbeddingRetriever
EmbeddingRetriever --> VectorStore
Agent ..> ALogger
AsyncChatOpenAI ..> ALogger
```
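Put together, the intended usage pattern looks roughly like this (a sketch only: import paths and constructor signatures are inferred from the class diagram and the project layout, and may not match the code exactly):
```python
# Sketch of wiring the components together, following the class diagram.
# Import paths and constructor signatures are assumptions, not the exact code.
import asyncio

from augmented.agent import Agent
from augmented.mcp_client import MCPClient


async def main() -> None:
    fetch = MCPClient(name="fetch", command="uvx", args=["mcp-server-fetch"])
    agent = Agent(mcp_clients=[fetch], model="gpt-4o-mini")
    await agent.init()         # start MCP subprocesses, collect tool definitions
    try:
        answer = await agent.invoke("Summarize https://example.com in one line")
        print(answer)
    finally:
        await agent.cleanup()  # shut the MCP subprocesses down cleanly


asyncio.run(main())
```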
## Quick Start
### Environment Setup
1. Ensure Python 3.12 or higher is installed
2. Clone this repository
3. Copy `.env.example` to `.env` and fill in the necessary configuration (a sample `.env` follows this list):
   - `OPENAI_API_KEY`: OpenAI API key
   - `OPENAI_BASE_URL`: OpenAI API base URL; keep the trailing `/v1` (default: `https://api.openai.com/v1`)
   - `DEFAULT_MODEL_NAME`: (optional) Default model name to use (default: `gpt-4o-mini`)
   - `EMBEDDING_KEY`: (optional) Embedding model API key (defaults to `OPENAI_API_KEY`)
   - `EMBEDDING_BASE_URL`: (optional) Embedding model API base URL, e.g., the SiliconFlow API or any other OpenAI-compatible endpoint (defaults to `OPENAI_BASE_URL`)
   - `USE_CN_MIRROR`: (optional) Use a China mirror; set to any value (e.g., `1`) to enable (default: disabled)
   - `PROXY_URL`: (optional) Proxy URL (e.g., `http(s)://xxx`), used so the `fetch` MCP tool can go through the proxy
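Put together, a minimal `.env` might look like this (every value below is a placeholder):
```bash
# .env — placeholder values, fill in your own
OPENAI_API_KEY=sk-xxxx
OPENAI_BASE_URL=https://api.openai.com/v1
DEFAULT_MODEL_NAME=gpt-4o-mini
# Optional overrides:
# EMBEDDING_KEY=sk-xxxx
# EMBEDDING_BASE_URL=https://api.siliconflow.cn/v1
# USE_CN_MIRROR=1
# PROXY_URL=http://127.0.0.1:7890
```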
### Install Dependencies
```bash
# Install dependencies using uv
uv sync
```
### Run Examples
This project uses the `just` command runner to run the different examples:
```bash
# View available commands
just help
```
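`just` reads its recipes from the `justfile` at the repository root; each recipe is a named list of shell commands. For orientation, a recipe might look like this (names and paths here are illustrative, not the project's actual recipes):
```just
# Hypothetical recipes for illustration — see this repository's justfile
# for the real ones (listed by `just help`).
help:
    @just --list

rag:
    uv run python src/augmented/rag_example.py
```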
## RAG Example Flow
```mermaid
sequenceDiagram
participant User as User
participant Agent as Agent
participant LLM as LLM
participant ER as EmbeddingRetriever
participant VS as VectorStore
participant MCP as MCP Client
participant Logger as ALogger
User->>Agent: Provide Query
Agent->>Logger: Log Operation
Agent->>ER: Retrieve Relevant Documents
ER->>VS: Query Vector Store
VS-->>ER: Return Relevant Documents
ER-->>Agent: Return Context
Agent->>LLM: Send Query and Context
LLM-->>Agent: Generate Answer or Tool Call
Agent->>Logger: Log Tool Call
Agent->>MCP: Execute Tool Call
MCP-->>Agent: Return Tool Result
Agent->>LLM: Send Tool Result
LLM-->>Agent: Generate Final Answer
Agent-->>User: Return Answer
```
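In code, this sequence boils down to "retrieve, then chat with the retrieved context". A hedged sketch using the interfaces from the class diagram (import paths, signatures, and return types are assumptions):
```python
# Sketch of the retrieve-then-answer loop from the sequence diagram.
# Import paths, signatures, and return types are assumptions.
import asyncio

from augmented.chat_openai import AsyncChatOpenAI
from augmented.embedding_retriever import EmbeddingRetriever


async def answer(query: str, documents: list[str]) -> str:
    retriever = EmbeddingRetriever(embedding_model="text-embedding-3-small")
    for doc in documents:
        await retriever.embed_documents(doc)            # index the knowledge base
    context = await retriever.retrieve(query, top_k=3)  # assumed: list[str] of chunks

    llm = AsyncChatOpenAI(model="gpt-4o-mini", context="\n".join(context))
    return await llm.chat(query)  # assumed to return the reply text


if __name__ == "__main__":
    docs = [
        "MCP is an open protocol that connects LLMs to external tools.",
        "RAG augments a prompt with retrieved context before generation.",
    ]
    print(asyncio.run(answer("What is MCP?", docs)))
```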
## Project Structure
- `src/augmented/`: Main source code directory
  - `agent.py`: Implementation of Agent, responsible for coordinating LLM and tools
  - `chat_openai.py`: OpenAI API client wrapper
  - `mcp_client.py`: Implementation of MCP client
  - `embedding_retriever.py`: Implementation of embedding retriever
  - `vector_store.py`: Implementation of vector store
  - `mcp_tools.py`: Definition of MCP tools
  - `utils/`: Utility functions
    - `info.py`: Project information and configuration
    - `pretty.py`: Unified logging output system
  - `rag_example.py`: RAG example program
- `justfile`: Task runner configuration file
## Learning Resources
- [Model Context Protocol (MCP)](https://modelcontextprotocol.io/): Learn about the MCP protocol
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference): OpenAI API reference
- [RAG (Retrieval-Augmented Generation)](https://arxiv.org/abs/2005.11401): RAG technical paper