<p align="center">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="./docs/ultrarag_dark.svg">
<source media="(prefers-color-scheme: light)" srcset="./docs/ultrarag.svg">
<img alt="UltraRAG" src="./docs/ultrarag.svg" width="55%">
</picture>
</p>
<h3 align="center">
Less Code, Lower Threshold, Faster Implementation
</h3>
<p align="center">
|
<a href="https://openbmb.github.io/UltraRAG"><b>Project Homepage</b></a>
|
<a href="https://ultrarag.openbmb.cn"><b>Tutorial Documentation</b></a>
|
<a href="https://huggingface.co/datasets/UltraRAG/UltraRAG_Benchmark"><b>Dataset</b></a>
|
<b>Simplified Chinese</b>
|
<a href="./docs/README-English.md"><b>English</b></a>
|
</p>
---
*Changelog* 🔥
- [2025.09.01] We recorded a hands-on video to guide you through installing UltraRAG and running a complete RAG 👉[📺 bilibili](https://www.bilibili.com/video/BV1B9apz4E7K/?share_source=copy_web&vd_source=7035ae721e76c8149fb74ea7a2432710)
- [2025.08.28] 🎉 Released UltraRAG 2.0! This upgrade lets you build high-performance RAG with just a few lines of code, so researchers can focus on innovative ideas!
- [2025.01.23] Released UltraRAG! Enabling large models to understand and effectively utilize knowledge bases! The UltraRAG 1.0 code is preserved on the [v1](https://github.com/OpenBMB/UltraRAG/tree/v1) branch.
---
## UltraRAG 2.0: A "RAG Experiment" Accelerator for Research
Retrieval-Augmented Generation (RAG) systems are evolving from the early, simple concatenation of "retrieval + generation" into complex knowledge systems that integrate **Adaptive Knowledge Organization**, **Multi-turn Reasoning**, and **Dynamic Retrieval** (typical examples include *DeepResearch* and *Search-o1*). This growing complexity, however, imposes high engineering costs on researchers for **method reproduction** and **rapid iteration of new ideas**.
To address this pain point, Tsinghua University's [THUNLP](https://nlp.csai.tsinghua.edu.cn/) lab, Northeastern University's [NEUIR](https://neuir.github.io) lab, [OpenBMB](https://www.openbmb.cn/home), and [AI9stars](https://github.com/AI9Stars) have jointly launched UltraRAG 2.0 (UR-2.0) — the first RAG framework designed based on the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/overview). This design allows researchers to declare complex logic such as serial, loop, and conditional branches directly by writing YAML files, enabling rapid implementation of multi-stage reasoning systems with minimal code.
The core idea is:
- **Componentized Encapsulation**: the core components of RAG are encapsulated as standardized, independent **MCP Servers**;
- **Flexible Invocation and Extension**: function-level **Tool** interfaces support flexible invocation and extension of functionality;
- **Lightweight Process Orchestration**: an **MCP Client** orchestrates the pipeline top-down with minimal glue code.
Compared to traditional frameworks, UltraRAG 2.0 significantly lowers the **technical threshold and learning costs** of complex RAG systems, allowing researchers to focus more on **experimental design and algorithm innovation** rather than getting bogged down in lengthy engineering implementations.
## 🌟 Core Highlights
- 🚀 **Low-Code Construction of Complex Pipelines**
Natively supports reasoning control structures such as **serial, loop, and conditional branches**. Developers only need to write YAML files to create an **iterative RAG process** with just a few lines of code (e.g., *Search-o1*).
- ⚡ **Rapid Reproduction and Functional Extension**
Based on the **MCP architecture**, all modules are encapsulated as independent, reusable **Servers**.
- Users can customize Servers as needed or directly reuse existing modules;
- The functionality of each Server is registered as function-level **Tools**; adding a new feature only requires writing a function, which is then integrated into the overall pipeline;
- Supports calling **external MCP Servers**, easily extending Pipeline capabilities and application scenarios.
- 📊 **Unified Evaluation and Comparison**
Built-in **standardized evaluation processes and metric management**, ready to use with 17 mainstream research benchmarks.
- Continuously integrates the latest baselines;
- Provides leaderboard results;
- Facilitates systematic comparison and optimization experiments for researchers.
## The Secret: MCP Architecture and Native Process Control
In different RAG systems, core capabilities such as retrieval and generation have high functional similarity, but due to varying implementation strategies by developers, modules often lack unified interfaces, making cross-project reuse difficult. The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/overview) serves as an open protocol that standardizes the way context is provided to large language models (LLMs) and employs a **Client–Server** architecture, allowing Server components developed under this protocol to be seamlessly reused across different systems.
Inspired by this, UltraRAG 2.0 is built on the **MCP architecture**, abstracting and encapsulating core functions such as retrieval, generation, and evaluation in the RAG system as mutually independent **MCP Servers**, and implementing calls through standardized function-level **Tool interfaces**. This design ensures both flexibility in module functionality expansion and allows new modules to be integrated in a "hot-plug" manner without invasive modifications to the global code. In research scenarios, this architecture enables researchers to quickly adapt new models or algorithms with minimal code while maintaining the overall system's stability and consistency.
<p align="center">
<picture>
<img alt="UltraRAG" src="./docs/architecture.png" width="90%">
</picture>
</p>
Developing complex RAG inference frameworks presents significant challenges; UltraRAG 2.0 can support the construction of complex systems under **low-code** conditions fundamentally because it natively supports multi-structure **Pipeline process control**. Serial, loop, and conditional-branch logic can all be defined and scheduled at the YAML level, covering the various process expressions required by complex reasoning tasks. At execution time, the reasoning process is scheduled by the built-in **Client**, whose logic is entirely described by the external, user-written **Pipeline YAML script**, decoupling it from the underlying implementation. Developers can invoke instructions such as `loop` and `step` as if they were programming-language keywords, declaratively composing multi-stage reasoning processes.
By deeply integrating the **MCP architecture** with **native process control**, UltraRAG 2.0 makes building complex RAG systems as natural and efficient as "orchestrating processes." Additionally, the framework includes 17 mainstream benchmark tasks and various high-quality baselines, along with a unified evaluation system and knowledge base support, further enhancing the efficiency of system development and the reproducibility of experiments.
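As a rough illustration of what such a declarative pipeline can look like, the sketch below writes a small YAML file with serial steps, a bounded loop, and a conditional branch, and could then be executed with the `ultrarag run` command shown in the Installation section. Note that the server names (`retriever`, `generation`, `evaluation`), tool names, and branch syntax here are assumptions for illustration only, not the framework's exact schema; consult the tutorial documentation for the real keywords.

```shell
# Hypothetical pipeline sketch -- server/tool names and branch syntax are
# illustrative assumptions, not UltraRAG's exact schema.
cat > pipeline_sketch.yaml <<'EOF'
# serial steps run top-down; `loop` bounds an iterative retrieve-generate cycle
pipeline:
  - retriever.retriever_search        # serial step: fetch initial passages
  - loop:                             # iterate retrieval + generation
      times: 3
      steps:
        - generation.generate         # draft an answer from the current context
        - branch:                     # conditional: retrieve again only if needed
            router: generation.check_answer
            branches:
              need_more: retriever.retriever_search
              done: exit
  - evaluation.evaluate               # unified metric computation
EOF
echo "wrote $(wc -l < pipeline_sketch.yaml) lines"
# Then (per the Installation section): ultrarag run pipeline_sketch.yaml
```

The point of the sketch is the shape, not the exact keywords: control flow lives entirely in YAML, while each line refers to a Tool exposed by an MCP Server.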
## Installation
Create a virtual environment using Conda:
```shell
conda create -n ultrarag python=3.11
conda activate ultrarag
```
Clone the project to your local machine or server using git:
```shell
git clone https://github.com/OpenBMB/UltraRAG.git
cd UltraRAG
```
We recommend using uv for package management, providing a faster and more reliable Python dependency management experience:
```shell
pip install uv
uv pip install -e .
```
If you prefer pip, you can run:
```shell
pip install -e .
```
[Optional] UR-2.0 supports a rich set of Server components; developers can install only the dependencies their tasks actually require:
```shell
# If you need to use faiss for vector indexing:
# You need to manually compile and install the CPU or GPU version of FAISS based on your hardware environment:
# CPU version:
uv pip install faiss-cpu
# GPU version (example: CUDA 12.x)
uv pip install faiss-gpu-cu12
# For other CUDA versions, please install the corresponding package (e.g., use faiss-gpu-cu11 for CUDA 11.x)
# If you need to use infinity_emb for corpus encoding and indexing:
uv pip install -e ".[infinity_emb]"
# If you need to use lancedb vector database:
uv pip install -e ".[lancedb]"
# If you need to use vLLM service to deploy models:
uv pip install -e ".[vllm]"
# If you need to use corpus document parsing functionality:
uv pip install -e ".[corpus]"
# ====== Install all dependencies (except faiss) ======
uv pip install -e ".[all]"
```
Run the following command to verify that the installation succeeded:
```shell
# Successful execution displays the welcome message 'Hello, UltraRAG 2.0!'
ultrarag run examples/sayhello.yaml
```
## Quick Start
We provide a complete set of teaching examples from beginner to advanced. Feel free to visit the [tutorial documentation](https://ultrarag.openbmb.cn) to quickly get started with UltraRAG 2.0!
Read the [Quick Start](https://ultrarag.openbmb.cn/pages/cn/getting_started/quick_start) to understand the usage process of UltraRAG. The overall process is divided into three steps: **① Compile the Pipeline file to generate parameter configuration; ② Modify the parameter file; ③ Run the Pipeline file**.
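The three steps above can be sketched as a shell session. The `ultrarag run` command appears earlier in this README; the compile subcommand name (`build` below) and the generated parameter file path are assumptions based on the tutorial, so verify the exact interface with `ultrarag --help`.

```shell
# ① Compile the Pipeline file to generate its parameter configuration.
#    (`build` is an assumed subcommand name; check `ultrarag --help`.)
ultrarag build examples/rag.yaml

# ② Edit the generated parameter file (path is an assumed example) to point
#    at your model endpoint, retriever index, dataset, and so on.
${EDITOR:-vi} examples/parameter/rag_parameter.yaml

# ③ Run the Pipeline file with the edited parameters.
ultrarag run examples/rag.yaml
```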
Additionally, we have compiled an index of functions commonly used in research; click a link to jump directly to the module you need:
- [Using the retriever for corpus encoding and indexing](https://ultrarag.openbmb.cn/pages/cn/tutorials/part_3/emb_and_index)
- [Deploying the retriever](https://ultrarag.openbmb.cn/pages/cn/tutorials/part_4/deploy_retriever_serve)
- [Deploying an LLM](https://github.com/OpenBMB/UltraRAG/blob/main/script/vllm_serve.sh)
- [Baseline reproduction](https://ultrarag.openbmb.cn/pages/cn/tutorials/part_3/reproduction)
- [Experimental result case analysis](https://ultrarag.openbmb.cn/pages/cn/tutorials/part_4/case_study)
- [Debugging tutorial](https://ultrarag.openbmb.cn/pages/cn/tutorials/part_4/debug)
## Support
UltraRAG 2.0 is ready to use out of the box, with built-in support for the most commonly used **public evaluation datasets**, **large-scale corpora**, and **typical baseline methods** in the current RAG field, making it easy for researchers to quickly reproduce and extend experiments. You can also refer to the [data format description](https://ultrarag.openbmb.cn/pages/cn/tutorials/part_3/prepare_dataset) to flexibly customize and add any dataset or corpus. The complete [dataset](https://huggingface.co/datasets/UltraRAG/UltraRAG_Benchmark) can be accessed and downloaded via this link.
### 1. Supported Datasets
| Task Type | Dataset Name | Original Data Count | Evaluation Sample Count |
|------------------|-----------------------|----------------------------------------------|-------------------------|
| QA | [NQ](https://huggingface.co/datasets/google-research-datasets/nq_open) | 3,610 | 1,000 |
| QA | [TriviaQA](https://nlp.cs.washington.edu/triviaqa/) | 11,313 | 1,000 |
| QA | [PopQA](https://huggingface.co/datasets/akariasai/PopQA) | 14,267 | 1,000 |
| QA | [AmbigQA](https://huggingface.co/datasets/sewon/ambig_qa) | 2,002 | 1,000 |
| QA | [MarcoQA](https://huggingface.co/datasets/microsoft/ms_marco/viewer/v2.1/validation) | 55,636 | 1,000 |
| QA | [WebQuestions](https://huggingface.co/datasets/stanfordnlp/web_questions) | 2,032 | 1,000 |
| Multi-hop QA | [HotpotQA](https://huggingface.co/datasets/hotpotqa/hotpot_qa) | 7,405 | 1,000 |
| Multi-hop QA | [2WikiMultiHopQA](https://www.dropbox.com/scl/fi/heid2pkiswhfaqr5g0piw/data.zip?e=2&file_subpath=%2Fdata&rlkey=ira57daau8lxfj022xvk1irju) | 12,576 | 1,000 |
| Multi-hop QA | [Musique](https://drive.google.com/file/d/1tGdADlNjWFaHLeZZGShh2IRcpO6Lv24h/view) | 2,417 | 1,000 |
| Multi-hop QA | [Bamboogle](https://huggingface.co/datasets/chiayewken/bamboogle) | 125 | 125 |
| Multi-hop QA | [StrategyQA](https://huggingface.co/datasets/tasksource/strategy-qa) | 2,290 | 1,000 |
| Multiple-choice | [ARC](https://huggingface.co/datasets/allenai/ai2_arc) | 3,548 | 1,000 |
| Multiple-choice | [MMLU](https://huggingface.co/datasets/cais/mmlu) | 14,042 | 1,000 |
| Long-form QA | [ASQA](https://huggingface.co/datasets/din0s/asqa) | 948 | 948 |
| Fact-verification | [FEVER](https://fever.ai/dataset/fever.html) | 13,332 | 1,000 |
| Dialogue | [WoW](https://huggingface.co/datasets/facebook/kilt_tasks) | 3,054 | 1,000 |
| Slot-filling | [T-REx](https://huggingface.co/datasets/facebook/kilt_tasks) | 5,000 | 1,000 |
---
### 2. Supported Corpora
| Corpus Name | Document Count |
|-------------|----------------|
| [wiki-2018](https://huggingface.co/datasets/RUC-NLPIR/FlashRAG_datasets/tree/main/retrieval-corpus) | 21,015,324 |
| wiki-2024 | In preparation, coming soon |
---
### 3. Supported Baseline Methods (continuously updated)
| Baseline Name | Script |
|---------------|------------|
| Vanilla LLM | examples/vanilla.yaml |
| Vanilla RAG | examples/rag.yaml |
| [IRCoT](https://arxiv.org/abs/2212.10509) | examples/IRCoT.yaml |
| [IterRetGen](https://arxiv.org/abs/2305.15294) | examples/IterRetGen.yaml |
| [RankCoT](https://arxiv.org/abs/2502.17888) | examples/RankCoT.yaml |
| [R1-searcher](https://arxiv.org/abs/2503.05592) | examples/r1_searcher.yaml |
| [Search-o1](https://arxiv.org/abs/2501.05366) | examples/search_o1.yaml |
| [Search-r1](https://arxiv.org/abs/2503.09516) | examples/search_r1.yaml |
| WebNote | examples/webnote.yaml |
## Contribution
Thanks to the following contributors for their efforts in code submissions and testing. We also welcome new members to join us in building a comprehensive RAG ecosystem!
You can contribute through the following standard process: **Fork this repository → Submit an Issue → Initiate a Pull Request (PR)**.
<a href="https://github.com/OpenBMB/UltraRAG/contributors">
<img src="https://contrib.rocks/image?repo=OpenBMB/UltraRAG&nocache=true" />
</a>
## Support Us
If you find this project helpful for your research, feel free to give us a ⭐ to support us!
## Contact Us
- For technical issues and feature requests, please use the [GitHub Issues](https://github.com/OpenBMB/UltraRAG/issues) feature.
- For usage questions, feedback, and any discussions about RAG technology, feel free to join our [WeChat group](https://github.com/OpenBMB/UltraRAG/blob/main/docs/wechat_qr.png), [Feishu group](https://github.com/OpenBMB/UltraRAG/blob/main/docs/feishu_qr.png), and [Discord](https://discord.gg/yRFFjjJnnS) to communicate with us.
<table>
<tr>
<td align="center">
<img src="docs/wechat_qr.png" alt="WeChat Group QR Code" width="220"/><br/>
<b>WeChat Group</b>
</td>
<td align="center">
<img src="docs/feishu_qr.png" alt="Feishu Group QR Code" width="220"/><br/>
<b>Feishu Group</b>
</td>
<td align="center">
<a href="https://discord.gg/yRFFjjJnnS">
<img src="https://img.shields.io/badge/Discord-5865F2?logo=discord&logoColor=white" alt="Join Discord"/>
</a><br/>
<b>Discord</b>
</td>
</tr>
</table>