# 🚀 MultiAgentPPT
A multi-agent system based on A2A + MCP + ADK that supports streaming concurrent generation of high-quality (editable online) PPT content.
## 🧠 Project Introduction
MultiAgentPPT utilizes a multi-agent architecture to automate the process from topic input to complete presentation generation. The main steps include:
1. **Outline Generation Agent**: Generates an initial content outline based on user requirements.
2. **Topic Splitting Agent**: Breaks down the outline content into multiple topics.
3. **Research Agents Working in Parallel**: Multiple agents conduct in-depth research on each topic separately.
4. **Summary Agent Aggregates Output**: Compiles the research results into PPT content and streams it to the frontend in real time.
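
The four steps above can be sketched with plain `asyncio`. This is only an illustration of the control flow, not the project's implementation: the function names (`generate_outline`, `split_topics`, `research`, `summarize`, `run_pipeline`) are hypothetical stand-ins, and the real agents are built on A2A, MCP, and ADK rather than on these stubs.

```python
import asyncio
from typing import AsyncIterator


async def generate_outline(topic: str) -> list[str]:
    """Stand-in for the outline agent: returns section titles."""
    return [f"{topic}: background", f"{topic}: current state", f"{topic}: outlook"]


def split_topics(outline: list[str]) -> list[str]:
    """Stand-in for the topic-splitting agent: one research topic per section."""
    return list(outline)


async def research(topic: str) -> str:
    """Stand-in for one research agent; a real agent would call an LLM and tools."""
    await asyncio.sleep(0.01)  # simulate I/O-bound work
    return f"notes on {topic}"


async def summarize(notes: list[str]) -> AsyncIterator[str]:
    """Stand-in for the summary agent: yields one slide at a time (streaming)."""
    for page_number, note in enumerate(notes, start=1):
        yield f"slide {page_number}: {note}"


async def run_pipeline(topic: str) -> list[str]:
    outline = await generate_outline(topic)
    topics = split_topics(outline)
    # Research all topics concurrently; gather preserves the input order.
    notes = await asyncio.gather(*(research(t) for t in topics))
    slides = []
    async for slide in summarize(list(notes)):
        slides.append(slide)  # a real service would push each slide to the frontend here
    return slides


if __name__ == "__main__":
    for line in asyncio.run(run_pipeline("Electric vehicles")):
        print(line)
```

Because the research calls are I/O-bound, running them with `asyncio.gather` lets the total research time approach that of the slowest topic rather than the sum of all topics.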
## Advantages
- **Multi-Agent Collaboration**: Multiple agents work in parallel, improving the efficiency and accuracy of content generation.
- **Real-Time Streaming Return**: Generated PPT content is streamed back as it is produced, improving the user experience.
- **High-Quality Content**: Combines external retrieval with agent collaboration to produce high-quality outlines and presentations.
- **Scalability**: The flexible design makes it easy to add new agents and functional modules.
## Recent Upgrades
### ✅ Completed (Done)
- ✅ Fixed output bugs for models other than Gemini, which stemmed from package issues in ADK and A2A: [View Details](https://github.com/johnson7788/MultiAgentPPT/blob/stream/backend/birthday_planner/README.md)
- ✅ Image rendering: styles now switch dynamically (`object-cover` or `object-contain`) depending on whether the image is a background image, and explanatory text is displayed for non-background images. To keep each PPT page unique, the `page_number` from the LLM output now serves as the identifier instead of the title, which supports content updates and proofreading.
- ✅ A loop agent now generates the PPT page by page instead of producing all content at once, which makes longer decks possible and avoids LLM output token limits.
- ✅ Introduced the PPTChecker Agent to check the quality of each generated PPT page. It has performed well in testing. For real use, replace the sample image data and RAG content with real data.
- ✅ Frontend displays the generation process status of each agent.
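
To illustrate why `page_number` works better than the title as an identifier, here is a minimal sketch. The `Slide` class and `upsert_slide` helper are hypothetical and exist only for this example; the project's frontend is a Next.js application and its actual data model may differ.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    page_number: int  # unique identifier taken from the model output
    title: str
    body: str


def upsert_slide(deck: dict[int, Slide], slide: Slide) -> None:
    """Insert a new page, or overwrite an existing page with the same page_number."""
    deck[slide.page_number] = slide


deck: dict[int, Slide] = {}
upsert_slide(deck, Slide(1, "Overview", "first draft"))
upsert_slide(deck, Slide(2, "Overview", "same title, different page"))
upsert_slide(deck, Slide(1, "Overview", "proofread version"))  # replaces page 1 only

ordered = [deck[n] for n in sorted(deck)]
```

Two pages that share a title stay distinct, while a corrected version of page 1 replaces the earlier draft instead of being appended as a duplicate.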
### 📝 To Do
- 🔄 Multi-modal understanding of images: handle image orientation, size, and format so that images fit different positions in the PPT.
- 🔄 Metadata transmission: let the frontend pass configuration to the agents, and have the agents return metadata along with their results.
## Interface Screenshots
Here are the core functionality demonstrations of the MultiAgentPPT project:
### 1. Topic Input Interface
Users input the desired PPT topic content in the interface:

### 2. Streaming Outline Generation Process
The system returns the generated outline structure in real-time based on the input content:

### 3. Complete Outline Generation
Finally, the system displays the complete outline for user confirmation:

### 4. Streaming PPT Content Generation
After confirming the outline, the system begins to stream the content of each slide and returns it to the frontend:

### 5. Progress Detail Display for Multi-Agent Generated PPT in slide_agent





## 📊 Concurrent Multi-Agent Collaboration Process (slide_agent + slide_outline)
```mermaid
flowchart TD
A[User Inputs Research Content] --> B[Call Outline Agent]
B --> C[MCP Retrieves Data]
C --> D[Generate Outline]
D --> E{User Confirms Outline}
E --> F[Send Outline to PPT Generation Agent]
F --> G[Split Outline Agent Splits Outline]
G --> H[Parallel Agents Process]
%% Concurrent Research Agents
H --> I1[Research Agent 1]
H --> I2[Research Agent 2]
H --> I3[Research Agent 3]
I1 --> RAG1[Automatic Knowledge Base Retrieval RAG]
I2 --> RAG2[Automatic Knowledge Base Retrieval RAG]
I3 --> RAG3[Automatic Knowledge Base Retrieval RAG]
RAG1 --> J[Aggregate Research Results]
RAG2 --> J
RAG3 --> J
J --> L[Loop PPT Agent Generates Slide Pages]
subgraph Loop PPT Agent
L1[Write PPT Agent<br>Generates Each Slide]
L2[Check PPT Agent<br>Checks Each Page Quality, with up to 3 retries]
L1 --> L2
L2 --> L1
end
L --> L1
```
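
The write/check loop in the diagram can be sketched as follows. This is a simplified illustration, not the ADK loop agent the project uses: `generate_page`, `demo_write`, and `demo_check` are hypothetical names, and the sketch assumes that "up to 3 retries" means at most three rewrites after the first draft.

```python
from typing import Callable

MAX_RETRIES = 3  # assumption: three rewrites allowed after the first draft


def generate_page(
    page_number: int,
    write: Callable[[int, int], str],
    check: Callable[[str], bool],
) -> tuple[str, int]:
    """Write one page, rewriting it while the checker rejects it.

    Returns the last draft and the number of retries used.
    """
    draft = write(page_number, 0)
    retries = 0
    while not check(draft) and retries < MAX_RETRIES:
        retries += 1
        draft = write(page_number, retries)
    return draft, retries


# Hypothetical writer and checker: each page passes on its second retry.
def demo_write(page_number: int, attempt: int) -> str:
    return f"page {page_number} draft {attempt}"


def demo_check(draft: str) -> bool:
    return draft.endswith("draft 2")


pages = [generate_page(n, demo_write, demo_check) for n in range(1, 4)]
```

If the checker never accepts a page, the loop stops after the last allowed retry and returns that draft, so one bad page cannot block the rest of the deck.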
## 🗂️ Project Structure
```bash
MultiAgentPPT/
├── backend/ # Backend multi-agent service directory
│ ├── simpleOutline/ # Simplified outline generation service (no external dependencies)
│ ├── simplePPT/ # Simplified PPT generation service (no retrieval or concurrency)
│ ├── slide_outline/ # Outline generation service with external retrieval (more accurate outline based on MCP tool)
│ ├── slide_agent/ # Concurrent multi-agent PPT generation primarily in XML format
├── frontend/ # Next.js frontend interface
```
---
## ⚙️ Quick Start
### 🐍 Backend Environment Setup (Python)
1. Create and activate a Conda virtual environment (Python 3.11 or higher is recommended to avoid bugs):
```bash
conda create --name multiagent python=3.12
conda activate multiagent
```
2. Install dependencies:
```bash
cd backend
pip install -r requirements.txt
```
3. Set backend environment variables:
```bash
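# Copy the template configuration file for each module.
# The first path is relative to the repository root; if you are still
# inside backend/ from step 2, use `cd simpleOutline` instead.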
cd backend/simpleOutline && cp env_template .env
cd ../simplePPT && cp env_template .env
cd ../slide_outline && cp env_template .env
cd ../slide_agent && cp env_template .env
```
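
As an alternative to the shell commands in step 3, the same copy can be scripted in Python. This is a convenience sketch, not part of the repository: `copy_env_templates` is a hypothetical helper, and unlike `cp` it deliberately skips modules that already have a `.env` so existing settings are not overwritten.

```python
import shutil
from pathlib import Path

MODULES = ["simpleOutline", "simplePPT", "slide_outline", "slide_agent"]


def copy_env_templates(backend_dir: Path) -> list[Path]:
    """Copy env_template to .env in each module, skipping existing .env files."""
    created = []
    for name in MODULES:
        template = backend_dir / name / "env_template"
        target = backend_dir / name / ".env"
        if template.is_file() and not target.exists():
            shutil.copyfile(template, target)
            created.append(target)
    return created


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for path in copy_env_templates(Path("backend")):
        print(f"created {path}")
```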
---
### 🧪 Start Backend Services
| Module          | Function                                         | Default Port                       | Start Command        |
| --------------- | ------------------------------------------------ | ---------------------------------- | -------------------- |
| `simpleOutline` | Simple outline generation                        | 10001                              | `python main_api.py` |
| `simplePPT`     | Simple PPT generation                            | 10011                              | `python main_api.py` |
| `slide_outline` | High-quality outline generation (with retrieval) | 10001 (stop `simpleOutline` first) | `python main_api.py` |
| `slide_agent`   | Concurrent multi-agent full PPT generation       | 10011 (stop `simplePPT` first)     | `python main_api.py` |
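
Because two modules share each default port, it can help to check whether a port is already taken before starting a service. The following is a small sketch using only the standard library; `port_in_use` and `DEFAULT_PORTS` are hypothetical names, and the check assumes the services listen on `127.0.0.1`.

```python
import socket

# Default ports from the table above and the modules that use them.
DEFAULT_PORTS = {
    10001: ("simpleOutline", "slide_outline"),
    10011: ("simplePPT", "slide_agent"),
}


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port, modules in DEFAULT_PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port} ({' or '.join(modules)}): {state}")
```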
---
## 🧱 Frontend Setup: Database, Installation, and Running (Next.js)
The database stores user-generated PPTs:
1. Start PostgreSQL using Docker:
```bash
# Official image (use when Docker Hub is reachable, for example through a VPN)
docker run --name postgresdb -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=welcome -d postgres
# Mirror image for use within mainland China:
docker run --name postgresdb -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=welcome -d swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/sclorg/postgresql-15-c9s:latest
```
2. Modify the `.env` sample configuration:
```env
DATABASE_URL="postgresql://postgres:welcome@localhost:5432/presentation_ai"
A2A_AGENT_OUTLINE_URL="http://localhost:10001"
A2A_AGENT_SLIDES_URL="http://localhost:10011"
```
3. Install dependencies and push the database model:
```bash
# Install frontend dependencies
pnpm install
# Push database model and insert user data
pnpm db:push
# Start frontend
npm run dev
```
4. Open the browser and visit: [http://localhost:3000/presentation](http://localhost:3000/presentation)
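
If the frontend cannot reach the database or the agents, a quick sanity check of the `.env` values from step 2 can help. The sketch below parses the sample values with the standard library. `parse_env` is a hypothetical helper that handles only simple `KEY="value"` lines; it is not the loader the frontend actually uses.

```python
from urllib.parse import urlsplit

SAMPLE_ENV = '''
DATABASE_URL="postgresql://postgres:welcome@localhost:5432/presentation_ai"
A2A_AGENT_OUTLINE_URL="http://localhost:10001"
A2A_AGENT_SLIDES_URL="http://localhost:10011"
'''


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines; ignores blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


env = parse_env(SAMPLE_ENV)
db = urlsplit(env["DATABASE_URL"])
outline_port = urlsplit(env["A2A_AGENT_OUTLINE_URL"]).port
slides_port = urlsplit(env["A2A_AGENT_SLIDES_URL"]).port
print(db.hostname, db.port, db.path.lstrip("/"), outline_port, slides_port)
```

The outline and slides ports should match the ports of whichever backend modules are running (10001 and 10011 by default), and the database host and port should match the `-p 5432:5432` mapping of the PostgreSQL container.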
---
## 🧪 Sample Data Description
> The current system has a built-in research example: **“Overview of Electric Vehicle Development”**. For research on other topics, please configure the corresponding agent and connect to real data sources.
> To configure real data, simply change the prompt and the corresponding MCP tool.
---
## 📎 References
The frontend project is partially based on the open-source repository: [allweonedev/presentation-ai](https://github.com/allweonedev/presentation-ai)
## Author WeChat ID
johnsongzc