**LeiAI Agent** is an intelligent tourism assistant built with Spring Boot 3 and Spring AI. It uses RAG (Retrieval-Augmented Generation), tool calling, and MCP (Model Context Protocol) to provide personalized travel planning, information lookup, and conversational services.
## 📚 Table of Contents
- [Project Overview](#project-overview)
- [System Architecture](#system-architecture)
- [Core Functional Modules](#core-functional-modules)
- [Technology Selection](#technology-selection)
- [Project Structure](#project-structure)
- [Quick Start](#quick-start)
- [API Documentation](#api-documentation)
- [Project Highlights](#project-highlights)
- [Latest Optimizations](#latest-optimizations)
- [Future Plans](#future-plans)
- [License](#license)
## 🌟 Project Overview
LeiAI Agent is an intelligent assistant for the tourism field. It combines a large language model (such as Alibaba Cloud Tongyi Qianwen), retrieval-augmented knowledge, tool calling, and multi-round dialogue management to support travel planning and consultation. The system not only answers travel-related questions but also calls tools on its own initiative (web search, PDF generation, file operations, etc.) to complete complex tasks, so it acts as an agent rather than as a chat interface alone.
### Project Background
Large language models (LLMs) have opened new possibilities for building intelligent dialogue systems. However, an LLM on its own has limitations: its knowledge goes out of date, it cannot use tools, and it does not manage context across a conversation. This project addresses these pain points by building a system that can think, plan, and act autonomously to serve the tourism field.
### Project Value
- **Enhance User Experience**: Simplify the travel planning process through natural language interaction
- **Reduce Information Acquisition Costs**: Integrate multi-source data and provide one-stop travel information services
- **Personalized Recommendations**: Provide customized travel plans based on user preferences
- **Automated Execution**: Capable of autonomously calling tools to complete complex tasks, such as generating travel plan PDFs, booking queries, etc.
## 🏗️ System Architecture
LeiAI Agent adopts a modular and layered architecture design to ensure the system's scalability and maintainability.
```mermaid
graph TD
Client[Client] --> API[API Layer]
API --> Service[Service Layer]
Service --> Agent[Agent Layer]
Agent --> LLM[Large Language Model]
Agent --> ToolCalling[Tool Calling]
Agent --> RAG[Knowledge Retrieval Enhancement]
LLM --> DashScope[Alibaba Cloud Tongyi Qianwen]
ToolCalling --> WebSearch[Web Search]
ToolCalling --> PDFGen[PDF Generation]
ToolCalling --> FileOps[File Operations]
ToolCalling --> WebScraping[Web Scraping]
ToolCalling --> Terminal[Terminal Operations]
RAG --> Embedding[Vector Embedding]
RAG --> VectorDB[Vector Database]
RAG --> Retriever[Retriever]
Service --> Memory[Session Memory]
Memory --> Redis[Redis Cache]
VectorDB --> PostgreSQL[PostgreSQL+pgvector]
```
### Architecture Description
1. **API Layer**: Responsible for handling HTTP requests and providing RESTful API interfaces
2. **Service Layer**: Implements business logic and coordinates the work of various components
3. **Agent Layer**: Core AI agent layer, implementing the thinking-action cycle
4. **Large Language Model**: Provides natural language understanding and generation capabilities
5. **Tool Calling**: Various tools that extend the capabilities of the agent
6. **Knowledge Retrieval Enhancement**: Enhances the quality of model answers through vector databases
7. **Session Memory**: Manages multi-round dialogue context
## 🧩 Core Functional Modules
### 1. Agent System (Agent)
The agent is the core of the system. It follows the ReAct (Reasoning and Acting) pattern, solving complex problems through a "thinking-acting" cycle.
- **BaseAgent**: Abstract base agent class, providing state management and execution flow control
- **ReActAgent**: Abstract class implementing the thinking-acting cycle
- **ToolCallAgent**: Agent implementation capable of calling external tools
- **LiManus**: A super agent integrating all capabilities, able to plan and execute tasks autonomously
The core advantage of the agent system is its autonomy and extensibility: it dynamically selects appropriate tools and execution paths according to user needs.
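
The thinking-acting cycle described above can be sketched in plain Java. This is a minimal illustration, not the project's code: `SketchReActAgent`, its `State` enum, and the `maxSteps` limit are hypothetical stand-ins for whatever `BaseAgent`, `ReActAgent`, and `AgentState` actually define.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a think-act loop; names are illustrative only.
abstract class SketchReActAgent {
    enum State { IDLE, RUNNING, FINISHED }

    private State state = State.IDLE;
    private final int maxSteps; // assumed safety limit so the loop always ends

    SketchReActAgent(int maxSteps) {
        this.maxSteps = maxSteps;
    }

    /** Reason about the current situation; return true if an action is needed. */
    protected abstract boolean think();

    /** Perform the action chosen during think() and return its result. */
    protected abstract String act();

    State state() {
        return state;
    }

    /** Runs think/act steps until think() declines or maxSteps is reached. */
    List<String> run() {
        if (state != State.IDLE) {
            throw new IllegalStateException("Agent cannot run from state " + state);
        }
        state = State.RUNNING;
        List<String> results = new ArrayList<>();
        for (int step = 1; step <= maxSteps && state == State.RUNNING; step++) {
            if (think()) {
                results.add("Step " + step + ": " + act());
            } else {
                state = State.FINISHED; // nothing left to do
            }
        }
        state = State.FINISHED; // also covers hitting the step limit
        return results;
    }
}
```

A concrete agent would implement `think()` by asking the model what to do next and `act()` by executing the chosen tool; the step limit keeps a confused model from looping forever.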
### 2. Tool Calling System (Tools)
The tool calling system extends the ability boundary of the agent, enabling it to interact with the external world.
- **WebSearchTool**: Web search tool, obtaining real-time information
- **PDFGenerationTool**: PDF generation tool, creating travel plan documents
- **FileOperationTool**: File operation tool, managing local files
- **WebScrapingTool**: Web scraping tool, extracting web content
- **TerminalOperationTool**: Terminal operation tool, executing system commands
- **ResourceDownloadTool**: Resource download tool, obtaining remote resources
- **TerminateTool**: Termination tool, ending current task execution
The tool system uses a unified registration and calling mechanism, making it easy to add new tools.
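
As a rough illustration of what a unified registration and calling mechanism can look like, the sketch below keeps tools in a name-keyed map. It is plain Java and does not use Spring AI's tool APIs; `SketchToolRegistry` and the string-in, string-out tool signature are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical registry: every tool is a named String -> String function.
final class SketchToolRegistry {
    private final Map<String, Function<String, String>> tools = new LinkedHashMap<>();

    /** Registers a tool under a unique name; returns this for chaining. */
    SketchToolRegistry register(String name, Function<String, String> tool) {
        if (tools.putIfAbsent(name, tool) != null) {
            throw new IllegalArgumentException("Duplicate tool: " + name);
        }
        return this;
    }

    /** Names that could be advertised to the model, in registration order. */
    Set<String> names() {
        return tools.keySet();
    }

    /** Calls a tool by name; unknown names yield a message instead of an exception. */
    String call(String name, String input) {
        Function<String, String> tool = tools.get(name);
        if (tool == null) {
            return "Unknown tool: " + name;
        }
        return tool.apply(input);
    }
}
```

Returning a message for an unknown tool, rather than throwing, lets the agent feed the error back to the model so it can pick a different tool.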
### 3. RAG Knowledge Enhancement System (RAG)
The RAG system improves model answers by retrieving relevant knowledge first, which is especially useful for domain-specific questions.
- **Vector Storage**: Using PostgreSQL+pgvector to store document vectors
- **Document Processing**: Supporting multiple format document processing and chunking
- **Retriever**: Retrieving relevant document fragments based on semantic similarity
- **Embedding Model**: Converting text into vector representations
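
To make the retriever's "semantic similarity" concrete, here is a small in-memory sketch that ranks chunks by cosine similarity between embedding vectors. In the project this ranking is presumably delegated to pgvector; `SketchRetriever` and its methods are hypothetical and only illustrate the idea.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-memory retriever; a vector database would do this server-side.
final class SketchRetriever {

    /** Cosine similarity of two equal-length vectors; 0 if either has zero length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vector dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of the k chunks most similar to the query, best first. */
    static List<String> topK(Map<String, double[]> chunks, double[] query, int k) {
        return chunks.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```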
### 4. Dialogue Management System
- **Session State Management**: Tracking and maintaining dialogue state
- **Context Memory**: Using Redis to store short-term memory
- **Multi-Round Dialogue Support**: Maintaining coherent dialogue experience
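
Short-term context memory usually means keeping only the most recent messages of each session. The sketch below shows that sliding-window idea with an in-memory map; the project stores this data in Redis, so `SketchSessionMemory` and its `maxMessages` limit are assumptions used purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a Redis-backed session memory.
final class SketchSessionMemory {
    private final int maxMessages; // assumed window size per session
    private final Map<String, Deque<String>> sessions = new ConcurrentHashMap<>();

    SketchSessionMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and drops the oldest ones beyond the window. */
    void append(String sessionId, String message) {
        Deque<String> history = sessions.computeIfAbsent(sessionId, id -> new ArrayDeque<>());
        synchronized (history) {
            history.addLast(message);
            while (history.size() > maxMessages) {
                history.removeFirst();
            }
        }
    }

    /** Returns a copy of the retained messages, oldest first. */
    List<String> history(String sessionId) {
        Deque<String> history = sessions.get(sessionId);
        if (history == null) {
            return List.of();
        }
        synchronized (history) {
            return new ArrayList<>(history);
        }
    }
}
```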
### 5. API Interface System
- **AgentController**: Providing agent interaction interface
- **AppController**: Providing application core function interface
- **RagController**: Providing knowledge retrieval-related interface
## 🔧 Technology Selection
### Backend Technology
- **Core Framework**: Spring Boot 3.5.0
- **AI Framework**: Spring AI 1.0.0
- **Large Language Model**: Alibaba Cloud Tongyi Qianwen (Qwen-Plus)
- **Vector Database**: PostgreSQL + pgvector
- **Cache System**: Redis
- **ORM Framework**: MyBatis-Plus 3.5.12
- **API Documentation**: Knife4j 4.5.0 (based on OpenAPI 3)
- **Tool Library**: Hutool 5.8.37, Lombok 1.18.38
- **PDF Processing**: iText 9.1.0
- **Web Scraping**: Jsoup 1.19.1
- **Exception Handling**: Global unified exception handling mechanism
- **Response Encapsulation**: Unified API response format
### Frontend Technology
The frontend uses a modern web technology stack to provide an intuitive user interface (note: this project focuses mainly on the backend, and the frontend is an optional component).
- **Framework**: Vue.js/React (can be selected according to needs)
- **UI Components**: Element-Plus/Ant Design
- **HTTP Client**: Axios
- **State Management**: Pinia/Redux
## 📂 Project Structure
The project uses a layered structure in which each module has a single clear responsibility, making it easy to maintain and extend:
```
src/main/java/com/lilei/leiaiagent/
├── agent/ # Core implementation of the agent
│ ├── BaseAgent.java # Abstract basic agent class
│ ├── ReActAgent.java # ReAct mode agent
│ ├── ToolCallAgent.java # Tool calling agent
│ ├── LiManus.java # Comprehensive agent implementation
│ └── model/ # Agent model
│ └── AgentState.java # Agent state
├── config/ # Configuration class
│ ├── SwaggerConfig.java # Swagger document configuration
│ ├── WebMvcConfig.java # Web MVC configuration
│ └── ResponseAdvice.java # Unified response processing
├── constant/ # Constant definition
│ ├── ApiConstants.java # API-related constants
│ └── ErrorCode.java # Error code constants
├── controller/ # API controller
│ ├── AgentController.java # Agent controller
│ └── ...
├── domain/ # Domain model
│ ├── entity/ # Entity class
│ │ ├── ChatSession.java # Session entity
│ │ └── ChatMessage.java # Message entity
│ └── dto/ # Data transfer object
│ ├── ChatSessionDTO.java # Session DTO
│ └── ChatMessageDTO.java # Message DTO
├── exception/ # Exception handling
│ ├── GlobalExceptionHandler.java # Global exception handler
│ └── custom/ # Custom exception
│ ├── BusinessException.java # Business exception
│ ├── ResourceNotFoundException.java # Resource not found exception
│ └── UnauthorizedException.java # Unauthorized exception
├── llm/ # LLM integration
├── pojo/ # Plain Java object
│ └── vo/ # View object
│ ├── AgentRequestVO.java # Agent request
│ ├── AgentResponseVO.java # Agent response
│ └── ApiResponse.java # Unified API response
├── prompt/ # Prompt template
├── rag/ # RAG implementation
├── service/ # Service layer
│ ├── api/ # Service interface
│ │ └── AgentService.java # Agent service interface
│ └── impl/ # Service implementation
│ └── AgentServiceImpl.java # Agent service implementation
├── tools/ # Tool implementation
├── utils/ # Utility classes
└── LeiAiAgentApplication.java # Application entry
```
## 🚀 Quick Start
### Environment Requirements
- JDK 21+
- Maven 3.8+
- PostgreSQL 14+ (pgvector extension installed)
- Redis 6+
### Installation Steps
1. **Clone repository**
```bash
git clone https://github.com/lilei/lei-ai-agent.git
cd lei-ai-agent
```
2. **Configure environment variables**
Create a `.env` file and set the following environment variables:
```properties
AI_DASHSCOPE_API_KEY=your_dashscope_api_key
SEARCH_API_KEY=your_search_api_key
```
3. **Compile project**
```bash
mvn clean package
```
4. **Run application**
```bash
java -jar target/lei-ai-agent-0.0.1-SNAPSHOT.jar
```
5. **Access Swagger documentation**
Open a browser and go to `http://localhost:8082/api/swagger-ui.html`
### Configuration Description
The main configuration files are `src/main/resources/application.yml` and `src/main/resources/application-dev.yml`. They contain the following key settings:
- Database connection information
- Redis configuration
- AI model configuration
- RAG system parameters
- Tool calling parameters
- Log configuration
## 📖 API Documentation
The project integrates Knife4j to provide a browsable API documentation interface. The main APIs fall into the following categories:
### 1. Agent API
- `POST /api/agent/execute`: Execute agent task (synchronous mode)
- `POST /api/agent/execute/advanced`: Execute agent task (advanced synchronous mode)
- `GET /api/agent/stream`: Execute agent task (streaming mode)
- `POST /api/agent/stream/advanced`: Execute agent task (advanced streaming mode)
- `GET /api/agent/status`: Get agent status
- `POST /api/agent/reset`: Reset agent status
- `POST /api/agent/stream/close/{sessionId}`: Close streaming connection
### 2. Application API
- `GET /api/ai/chat`: Chat interface
### 3. RAG API
- `POST /api/rag/query`: Knowledge base query interface
## 💡 Project Highlights
### 1. Agent Architecture Innovation
I paid special attention to the scalability and flexibility of the agent architecture design. By abstracting the basic agent class (BaseAgent) and implementing the ReAct mode (ReActAgent), I built an autonomous thinking and acting agent system. This architecture is not only suitable for the tourism field but can also be easily extended to other vertical fields.
The agent's state management mechanism (AgentState) makes task execution more reliable, while the step-based execution cycle decomposes complex tasks into small, manageable steps. This design lets the system handle long chains of reasoning and action, extending what the agent can accomplish.
### 2. Flexible Design of Tool Calling System
The tool calling system uses a unified registration and calling mechanism, making it easy to add new tools. Each tool implements a standard interface and is managed through ToolRegistration. This design lets the agent dynamically select appropriate tools according to the task at hand.
I also implemented an error recovery mechanism for tool calls. When a tool call fails, the system retries it or falls back to an alternative, which improves robustness.
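
One common way to structure this kind of recovery is "retry the primary tool a few times, then fall back". The sketch below shows that pattern under stated assumptions: `SketchToolRetry`, the attempt limit, and the use of `Supplier` are hypothetical, and the project's actual mechanism may differ.

```java
import java.util.function.Supplier;

// Hypothetical retry-then-fallback helper for tool calls.
final class SketchToolRetry {

    /**
     * Calls primary up to maxAttempts times; if every attempt throws,
     * returns the result of the fallback instead.
     */
    static <T> T callWithRetry(Supplier<T> primary, Supplier<T> fallback, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return primary.get();
            } catch (RuntimeException e) {
                // Swallow and retry; a real system would log and possibly back off here.
            }
        }
        return fallback.get();
    }
}
```

For example, a web search call could be retried three times before falling back to an answer built from the knowledge base only.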
### 3. High-Performance RAG Implementation
In the RAG system design, I used PostgreSQL+pgvector as the vector database. Compared with purely in-memory vector stores, this solution supports persistent storage and behaves more stably on large datasets.
To improve retrieval efficiency, I added an index based on HNSW (Hierarchical Navigable Small World), which speeds up approximate nearest-neighbor vector search. I also tuned the chunking strategy (the chunk-size and chunk-overlap parameters) to balance retrieval accuracy against system performance.
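
The effect of the chunk-size and chunk-overlap parameters is easiest to see in a small example. The sketch below splits text by character count, which is an assumption made for brevity (real splitters often work on tokens or sentences); `SketchChunker` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical character-based chunker illustrating size and overlap.
final class SketchChunker {

    /** Splits text into chunks of chunkSize characters that share overlap characters. */
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require chunkSize > overlap >= 0");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }
}
```

With `chunk("abcdefghij", 4, 1)` the result is `[abcd, defg, ghij]`: each chunk repeats the last character of the previous one. A larger overlap preserves more context across chunk boundaries at the cost of storing and searching more vectors.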
### 4. Streaming Response Mechanism
To improve user experience, I designed a streaming response mechanism based on SSE (Server-Sent Events), enabling users to see the agent's thinking and acting process in real-time without waiting for the entire task to complete. This design is particularly suitable for long-running complex tasks, providing a better interactive experience.
The streaming response mechanism also includes error handling and resource cleanup logic, so that connections are released when a task ends or fails.
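
On the wire, each SSE message is a block of `field: value` lines ended by a blank line. The sketch below formats one such message; it illustrates the protocol framing only and is not the project's streaming code. `SketchSse` and the event name used in the note are hypothetical.

```java
// Hypothetical helper that frames one Server-Sent Events message.
final class SketchSse {

    /**
     * Builds an SSE message: an optional "event:" line, one "data:" line
     * per line of payload, and a terminating blank line.
     */
    static String format(String event, String data) {
        StringBuilder sb = new StringBuilder();
        if (event != null && !event.isEmpty()) {
            sb.append("event: ").append(event).append('\n');
        }
        // Multi-line payloads must be sent as several data lines.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }
}
```

An agent could emit one message per step, for example `format("thought", "Searching for hotels")`, so that the client can render thinking and acting events differently.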
## 🔄 Latest Optimizations
Recently, a series of optimizations and improvements were made to the project, mainly including:
### 1. Code Architecture Optimization
- **Interface and implementation separation**: Separating service layer interface and implementation, improving code testability and maintainability
- **Domain model layering**: Introducing separate entity (Entity) and data transfer object (DTO) layers, making data flow clearer
- **Constant extraction**: Extracting API paths, error codes, and other constants into dedicated constant classes, avoiding hardcoding
### 2. Exception Handling Mechanism
- **Global exception handling**: Implementing unified global exception handler, standardizing exception response format
- **Custom exception system**: Designing business exception, resource not found exception, unauthorized exception, and other custom exception classes
- **Error code standardization**: Establishing unified error code system, facilitating problem location and troubleshooting
### 3. API Response Standardization
- **Unified response format**: All APIs return a unified response format containing a status code, message, data, and other fields
- **Response interceptor**: Implementing ResponseAdvice interceptor, automatically packaging controller return value
- **Status code specification**: Defining success, error, warning, and other status response specifications
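
A unified response envelope with the fields named above (status code, message, data) might look like the sketch below. The record name, the factory methods, and the concrete values `200` and `"success"` are assumptions for illustration; the project's `ApiResponse` class may be shaped differently.

```java
// Hypothetical response envelope; field names follow the list above.
record SketchApiResponse<T>(int code, String message, T data) {

    /** Wraps a successful result (200 and "success" are assumed conventions). */
    static <T> SketchApiResponse<T> success(T data) {
        return new SketchApiResponse<>(200, "success", data);
    }

    /** Wraps a failure with an error code and message and no data. */
    static <T> SketchApiResponse<T> error(int code, String message) {
        return new SketchApiResponse<>(code, message, null);
    }

    boolean isSuccess() {
        return code == 200;
    }
}
```

Because every endpoint returns the same shape, a client can check `code` once in a shared interceptor instead of handling each endpoint's errors separately.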
### 4. Configuration Optimization
- **Swagger documentation enhancement**: Improving API documentation configuration, adding more detailed descriptions and examples
- **Cross-domain support**: Configuring global CORS support, facilitating front-end and back-end separation development
- **Log configuration**: Optimizing log configuration, supporting file logs and console logs, facilitating problem troubleshooting
### 5. Agent Service Improvement
- **Session management**: Enhancing session management function, supporting session state tracking and history record
- **Streaming connection management**: Improving SSE connection lifecycle management, avoiding resource leakage
- **Parameter verification**: Adding request parameter verification, improving system robustness
## 🔮 Future Plans
1. **Multi-model support**: Integrating more large language models, such as OpenAI GPT, Claude, etc.
2. **Multi-modal capabilities**: Adding image understanding and generation capabilities
3. **Knowledge base extension**: Enriching tourism domain knowledge base, improving professional Q&A quality
4. **Tool ecosystem extension**: Developing more tourism-related tools, such as hotel booking, flight query, etc.
5. **Frontend interface optimization**: Developing a friendlier user interface, improving user experience
6. **Performance optimization**: Further optimizing system performance, supporting higher concurrency
7. **Multi-language support**: Adding multi-language support, serving international users
## 📄 License
This project is released under the MIT License; see the [LICENSE](LICENSE) file.