# 🚀 MultiAgentPPT

A multi-agent system based on A2A + MCP + ADK that supports streaming, concurrent generation of high-quality PPT content that can be edited online.

## 🧠 Project Introduction

MultiAgentPPT uses a multi-agent architecture to automate the process from topic input to complete presentation generation. The main steps are:

1. **Outline Generation Agent**: Generates an initial content outline based on user requirements.
2. **Topic Splitting Agent**: Breaks the outline down into multiple topics.
3. **Research Agents Working in Parallel**: Multiple agents research each topic in depth, separately.
4. **Summary Agent Aggregates Output**: Compiles the research results into PPT content and returns it to the frontend in real time.

## Advantages

- **Multi-Agent Collaboration**: Multiple agents working in parallel increase the efficiency and accuracy of content generation.
- **Real-Time Streaming Return**: Generated PPT content is streamed back as it is produced, improving the user experience.
- **High-Quality Content**: Combines external retrieval with agent collaboration to produce high-quality outlines and presentations.
- **Scalability**: The system is designed flexibly, making it easy to extend with new agents and functional modules.

## Recent Upgrades

### ✅ Completed (Done)

- ✅ Fixed output bugs except for Gemini, addressing package issues with ADK and A2A: [View Details](https://github.com/johnson7788/MultiAgentPPT/blob/stream/backend/birthday_planner/README.md)
- ✅ Image rendering: Styles switch dynamically depending on whether the image is a background image (`object-cover` or `object-contain`), and explanatory text is displayed for non-background images. To keep PPT pages unique, the `page_number` from the large model output is used as the unique identifier, replacing the previous title-based method, to support content updates and proofreading.
- ✅ Use a loop agent to generate each PPT page instead of generating all content at once, which makes it possible to generate more pages and avoids LLM output token limits.
- ✅ Introduced the PPTChecker Agent to check the quality of each generated PPT page. It has shown good results in testing; for real use, replace the sample data with real image data and content RAG data.
- ✅ The frontend displays the generation status of each agent.

### 📝 To Do

- 🔄 Multi-modal understanding of images: Handle image orientation, size, and other formats so that images fit different positions in the PPT.
- 🔄 Metadata transmission: Let the frontend pass configuration to agents, and let agents return results along with metadata.

## Interface Screenshots

Here are demonstrations of the core functionality of the MultiAgentPPT project:

### 1. Topic Input Interface

Users enter the desired PPT topic in the interface:

![Topic Input Interface](docs/1测试界面输入主题.png)

### 2. Streaming Outline Generation Process

The system returns the generated outline structure in real time based on the input:

![Streaming Outline Generation](docs/2流式生成大纲.png)

### 3. Complete Outline Generation

Finally, the system displays the complete outline for user confirmation:

![Complete Outline](docs/3完整大纲.png)

### 4. Streaming PPT Content Generation

After the outline is confirmed, the system streams the content of each slide back to the frontend:

![Streaming PPT Generation](docs/4流式生成PPT.png)

### 5. Progress Detail Display for Multi-Agent Generated PPT in slide_agent

![process_detail1.png](docs/process_detail1.png)
![process_detail2.png](docs/process_detail2.png)
![process_detail3.png](docs/process_detail3.png)
![process_detail4.png](docs/process_detail4.png)
![image_update.png](docs/image_update.png)

## 📊 Concurrent Multi-Agent Collaboration Process (slide_agent + slide_outline)

```mermaid
flowchart TD
    A[User Inputs Research Content] --> B[Call Outline Agent]
    B --> C[MCP Retrieves Data]
    C --> D[Generate Outline]
    D --> E{User Confirms Outline}
    E --> F[Send Outline to PPT Generation Agent]
    F --> G[Split Outline Agent Splits Outline]
    G --> H[Parallel Agents Process]

    %% Concurrent Research Agents
    H --> I1[Research Agent 1]
    H --> I2[Research Agent 2]
    H --> I3[Research Agent 3]

    I1 --> RAG1[Automatic Knowledge Base Retrieval RAG]
    I2 --> RAG2[Automatic Knowledge Base Retrieval RAG]
    I3 --> RAG3[Automatic Knowledge Base Retrieval RAG]

    RAG1 --> J
    RAG2 --> J
    RAG3 --> J

    J --> L[Loop PPT Agent Generates Slide Pages]

    subgraph Loop PPT Agent
        L1[Write PPT Agent<br>Generates Each Slide]
        L2[Check PPT Agent<br>Checks Each Page Quality, with up to 3 retries]
        L1 --> L2
        L2 --> L1
    end

    L --> L1
```

## 🗂️ Project Structure

```bash
MultiAgentPPT/
├── backend/              # Backend multi-agent service directory
│   ├── simpleOutline/    # Simplified outline generation service (no external dependencies)
│   ├── simplePPT/        # Simplified PPT generation service (no retrieval or concurrency)
│   ├── slide_outline/    # Outline generation service with external retrieval (more accurate outline based on MCP tool)
│   ├── slide_agent/      # Concurrent multi-agent PPT generation, primarily in XML format
├── frontend/             # Next.js frontend interface
```

---

## ⚙️ Quick Start

### 🐍 Backend Environment Setup (Python)

1. Create and activate a Conda virtual environment (Python 3.11 or higher is recommended to avoid bugs):

   ```bash
   conda create --name multiagent python=3.12
   conda activate multiagent
   ```

2. Install dependencies:

   ```bash
   cd backend
   pip install -r requirements.txt
   ```

3. Set backend environment variables:

   ```bash
   # Run from the repository root: copy the template configuration file for each module
   cd backend/simpleOutline && cp env_template .env
   cd ../simplePPT && cp env_template .env
   cd ../slide_outline && cp env_template .env
   cd ../slide_agent && cp env_template .env
   ```

---

### 🧪 Start Backend Services

| Module          | Function                                         | Default Port                       | Start Command        |
| --------------- | ------------------------------------------------ | ---------------------------------- | -------------------- |
| `simpleOutline` | Simple Outline Generation                        | 10001                              | `python main_api.py` |
| `simplePPT`     | Simple PPT Generation                            | 10011                              | `python main_api.py` |
| `slide_outline` | High-Quality Outline Generation (with retrieval) | 10001 (must close `simpleOutline`) | `python main_api.py` |
| `slide_agent`   | Multi-Agent Concurrent Full PPT Generation       | 10011 (must close `simplePPT`)     | `python main_api.py` |

---

## 🧱 Frontend Database Setup, Installation, and Running (Next.js)

The database stores user-generated PPTs:

1. Start PostgreSQL using Docker:

   ```bash
   # Use when connected to a VPN
   docker run --name postgresdb -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=welcome -d postgres

   # Use within mainland China:
   docker run --name postgresdb -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=welcome -d swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/sclorg/postgresql-15-c9s:latest
   ```

2. Modify the sample `.env` configuration:

   ```env
   DATABASE_URL="postgresql://postgres:welcome@localhost:5432/presentation_ai"
   A2A_AGENT_OUTLINE_URL="http://localhost:10001"
   A2A_AGENT_SLIDES_URL="http://localhost:10011"
   ```

3. Install dependencies and push the database model:

   ```bash
   # Install frontend dependencies
   pnpm install
   # Push database model and insert user data
   pnpm db:push
   # Start frontend
   npm run dev
   ```

4. Open the browser and visit: [http://localhost:3000/presentation](http://localhost:3000/presentation)

---

## 🧪 Sample Data Description

> The current system has a built-in research example: **“Overview of Electric Vehicle Development”**. For research on other topics, please configure the corresponding agent and connect it to real data sources.
> To configure real data, simply change the prompt and the corresponding MCP tool.

---

## 📎 References

The frontend project is partially based on the open-source repository: [allweonedev/presentation-ai](https://github.com/allweonedev/presentation-ai)

## Author

WeChat ID: johnsongzc
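
## Illustrative Sketch: Parallel Research and the Write/Check Loop

The parallel research step and the loop that writes and checks each page can be pictured roughly as follows. This is a minimal sketch in plain `asyncio`, not the project's ADK/A2A implementation: `research_topic`, `write_page`, `check_page`, and `generate_slides` are hypothetical stand-ins, and the retry limit of 3 is taken from the flowchart.

```python
import asyncio

MAX_RETRIES = 3  # the flowchart mentions up to 3 retries per page


async def research_topic(topic: str) -> str:
    """Hypothetical stand-in for one research agent (a real agent would query RAG/MCP tools)."""
    await asyncio.sleep(0)  # yield control so the research tasks can interleave
    return f"notes on {topic}"


def write_page(page_number: int, notes: str, attempt: int) -> dict:
    """Hypothetical stand-in for the writer agent."""
    return {"page_number": page_number, "content": notes, "attempt": attempt}


def check_page(page: dict) -> bool:
    """Hypothetical stand-in for the checker agent: accept any non-empty page."""
    return bool(page["content"].strip())


async def generate_slides(topics: list[str]) -> list[dict]:
    # Research every topic concurrently; gather returns results in input order.
    notes = await asyncio.gather(*(research_topic(t) for t in topics))
    pages = []
    # Write pages one at a time, rewriting a page that fails the check.
    for page_number, note in enumerate(notes, start=1):
        for attempt in range(1, MAX_RETRIES + 2):  # first try plus up to MAX_RETRIES retries
            page = write_page(page_number, note, attempt)
            if check_page(page):
                break
        pages.append(page)  # if every attempt fails, the last attempt is kept
    return pages


slides = asyncio.run(generate_slides(["battery technology", "charging networks"]))
```

Generating one page per loop iteration keeps each model call small, which is the reason the README gives for moving away from generating all content at once.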

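## Illustrative Sketch: Identifying Pages by `page_number`

Using `page_number` instead of the title as the unique identifier means that two pages may share a title, and a rewritten page replaces its earlier version rather than creating a duplicate. The sketch below shows that idea with a plain dictionary. Only the `page_number` field comes from this README; the `title` and `content` fields and the `upsert_page` helper are assumptions made for illustration.

```python
def upsert_page(deck: dict[int, dict], page: dict) -> None:
    """Insert a slide, or replace the existing slide with the same page_number."""
    deck[page["page_number"]] = page


deck: dict[int, dict] = {}
upsert_page(deck, {"page_number": 1, "title": "Overview", "content": "draft"})
# Same title as page 1, but a different page_number, so it is stored as a separate page.
upsert_page(deck, {"page_number": 2, "title": "Overview", "content": "same title, different page"})
# A proofread version of page 1 replaces the draft even though its title changed.
upsert_page(deck, {"page_number": 1, "title": "Overview (revised)", "content": "proofread"})

# Pages in presentation order.
ordered = [deck[n] for n in sorted(deck)]
```

With a title-based key, the second call would have overwritten the first page, and the retitled revision would have been added as a third page.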