Content
# BeanBuddy-AI

## 🎯 Project Introduction
> **BeanBuddy-AI** is a multi-agent collaborative multi-modal Q-version pixel art design system.
> Built on the [NAT framework](https://github.com/NVIDIA/NeMo-Agent-Toolkit/tree/develop), this system integrates multiple intelligent agent modules such as visual understanding, style transfer, pattern generation, and material planning, allowing users to automatically generate cute Q-version pixel art design schemes through text, sketches, or image input. The project uses the MCP protocol to implement inter-agent communication and task coordination, and combines generative AI technology to provide creative support and visual output, aiming to lower the threshold for pixel art creation and stimulate users' creativity.
### ✨ Core Features
- 🤖 **Official Architecture**: 100% uses the official NVIDIA NeMo Agent Toolkit
- 🔧 **Flexible Configuration**: Supports any OpenAI-compatible API interface
- 🎨 **Modern Interface**: Official UI, supports real-time dialogue and streaming responses
- 🚀 **One-Click Deployment**: Supports Windows/Linux/macOS
## 🏗️ Technical Architecture
### Frontend
- **Framework**: Next.js 14 + TypeScript
- **UI Library**: Official [NeMo-Agent-Toolkit-UI](https://github.com/NVIDIA/NeMo-Agent-Toolkit-UI)
- **Features**: Real-time chat, theme switching, history recording
### Backend
- **Core**: [NVIDIA NeMo Agent Toolkit (AIQ)](https://github.com/NVIDIA/NeMo-Agent-Toolkit/tree/develop)
- **Workflow**: React Agent
- **Tools**: Description Enhancement Tool, Knowledge Graph Query Tool, Subject Extraction Tool, Text-to-Image Tool, Image-to-Pixel Art Design Tool
### Model Support
- **Default**: Qwen model
- **Compatibility**: Any OpenAI format API
- **Customization**: Users can configure API keys, model names, base_url
## 🚀 Quick Start
### 📋 Environment Requirements
- **Python**: 3.12+
- **Node.js**: 18+
- **Git**: Latest version
- **Operating System**: Windows 10+/macOS 10.15+/Ubuntu 20.04+
### ⚡ One-Click Installation
#### Clone the project
```shell
git clone https://git@github.com:ItGarbager/BeanBuddy-AI.git
cd BeanBuddy-AI
```
#### Install dependencies
##### Backend
```shell
cd backend/
pip install -r requirements.txt
pip install -e beanbuddy_ai
```
##### Frontend
```shell
cd frontend
npm install # You can also use cnpm install to speed up the download
```
### 🔑 Configure API Keys
After installation, you need to configure the following API keys:
#### 1. Frontend OSS Object Storage Key
In `frontend/components/Chat/ChatInput.tsx`, replace `accessKeySecret`
```ts
new OSS({
region: process.env.REACT_APP_OSS_REGION || 'oss-cn-beijing',
accessKeyId: process.env.REACT_APP_OSS_ACCESS_KEY_ID || 'LTAI5tMdnfA1ZARnE1r8pVFf',
accessKeySecret: process.env.REACT_APP_OSS_ACCESS_KEY_SECRET || 'oss对象存储密钥',
bucket: process.env.REACT_APP_OSS_BUCKET || 'hackathon-aiqtoolkit',
authorizationV4: true, // Browser must enable V4 signature
secure: true, // Use HTTPS (avoid HTTP cross-domain issues)
});
```
#### 2. Large Model API Key
Edit `backend/beanbuddy_ai/src/beanbuddy_ai/configs/config.yml`, replace it with your own Bailian API Key:
```yaml
llms:
# Use BAILIAN API by default (users can modify)
default_llm:
_type: openai
model_name: "qwen-plus"
api_key: "阿里云百炼平台的API-KEY"
base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
temperature: 0.7
max_tokens: 2048
```
**Supported API Providers**:
- **Alibaba Cloud Bailian Platform Qwen Series**: `https://bailian.console.aliyun.com/?tab=model#/model-market`
- **Other**: Any OpenAI-compatible API
### 🎮 Start the System
#### Start the backend service
```shell
cd backend
nat serve --config_file beanbuddy_ai/src/beanbuddy_ai/configs/config.yml --host 0.0.0.0 --port 8001
```
#### Start the frontend service
```shell
cd frontend
npm run dev
```
### 🌐 Access Address
- **Frontend Interface**: http://localhost:3000
- **API Documentation**: http://localhost:8001/docs
- **Health Check**: http://localhost:8001/health
## 🧪 Functionality Testing
### Subject Pixel Art Design Generation
> - User: 佐助
> - AI:
> <img src="docs/images/entity_name/zuozhu_perler.png" width="40%">
>
> ### Material List
> #### Color Card: 卡卡
> | Bead Number | Quantity | Color Preview |
> | --- | --- | --- |
> | B09 | 1192 | $${\color{green}■}$$ |
> | B16 | 822 | $${\color{blue}■}$$ |
> | B08 | 806 | $${\color{red}■}$$ |
> | B159 | 771 | $${\color{yellow}■}$$ |
>
> ...
### Image-to-Pixel Art Design
> - User:
> <img src="docs/images/extract_subject/lizijia.jpg" width="40%">
> - AI:
> <img src="docs/images/extract_subject/lizijia_perler.png" width="40%">
### Description-to-Pixel Art Design
> - User: A horse with three legs in the air, one foot stepping on a flying swallow
> - AI:
> <img src="docs/images/dscription/mtfy.png" width="40%">
## 📁 Project Structure
```
BeanBuddy-AI/
├── backend/ # Backend project
│ └── beanbuddy_ai/
│ ├── src/ # Backend project source code
│ │ ├── beanbuddy_ai/
│ │ │ ├── configs/ # Startup configuration directory
│ │ │ ...
│ │ └── beanbuddy_ai.egg-info/
│ │
│ └── pyproject.toml # Project configuration
├── docs/ # Documentation directory
│ └── images/... # Documentation screenshots
│
├── frontend/ # Frontend project
└── README.md # Documentation
```
## ⚙️ Advanced Configuration
### Custom Tools
Add a new tool in the configuration file:
```yaml
functions:
your_custom_tool:
_type: your_tool_type
description: "Tool description"
# Other configuration parameters
```
### Custom Workflow
```yaml
workflow:
_type: react_agent
tool_names:
- internet_search
- current_datetime
- your_custom_tool
llm_name: default_llm
verbose: true
```
### Debug Mode
```bash
cd backend
nat serve --config_file beanbuddy_ai/src/beanbuddy_ai/configs/config.yml --verbose
```
## 🐛 Troubleshooting
### Common Issues
#### 1. Port Occupancy
```bash
# Check port occupancy
netstat -tlnp | grep :8001
# Use a different port
aiq serve --port 8002
```
#### 2. API Key Error
```text
Error 1
nat.agent.react_agent.agent - ERROR - [AGENT] Failed to call agent_node: 'ascii' codec can't encode characters in position 7-14: ordinal not in range(128)
Error 2
openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided. ', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}, 'request_id': 'xxx'}
```
- Check the API key configuration in `backend/beanbuddy_ai/src/beanbuddy_ai/configs/config.yml`
- Verify the validity and permissions of the API key
#### 3. rembg Model Cannot Be Downloaded
```text
rembg model link: https://pan.baidu.com/s/1VuEo_s_phxKlskoo-KUsag?pwd=ynua Extraction code: ynua
```
Download and store it in the `~/.u2net/` directory.
#### 4. Frontend Cannot Connect to Backend
- Check if the backend is running normally (access http://localhost:8001/health)
- Confirm that the port configuration is correct
- Check firewall settings
#### 4. System Missing OpenGL Graphics Library (Opencv Unavailable)
Ubuntu/Debian
```shell
sudo apt-get update && sudo apt-get install -y libgl1-mesa-glx libglib2.0-0
```
Centos/RHEL
```shell
sudo yum install -y mesa-libGL
```
### Log Viewing
```bash
# View backend logs
tail -f logs/aiq.log
# View frontend logs
cd external/aiqtoolkit-opensource-ui
npm run dev --verbose
```
## 📚 Related Resources
### Official Documentation
- [NVIDIA NeMo Agent Toolkit](https://github.com/NVIDIA/NeMo-Agent-Toolkit)
- [Official Documentation](https://docs.nvidia.com/nemo-agent-toolkit/)
- [NeMo Agent Toolkit UI](https://github.com/NVIDIA/NeMo-Agent-Toolkit-UI)
### API Documentation
- [Tavily API Documentation](https://docs.tavily.com/)
- [Alibaba Cloud Bailian Platform](https://bailian.console.aliyun.com/?tab=doc#/doc)
- [OpenAI API Documentation](https://platform.openai.com/docs/)
### Learning Resources
- [AI Agent Development Guide](https://docs.nvidia.com/nemo-agent-toolkit/user-guide/)
- [React Agent Workflow](https://docs.nvidia.com/nemo-agent-toolkit/workflows/react-agent/)
- [MCP Protocol Documentation](https://docs.nvidia.com/nemo-agent-toolkit/mcp/)
## 🏆 Hackathon Information
This project is developed to promote NVIDIA NeMo Agent Toolkit technology, aiming to:
- 🎯 **Showcase AI Agent Capabilities**: Demonstrate the powerful features of NVIDIA NeMo Agent Toolkit through practical applications
- 🚀 **Lower the Learning Threshold**: Provide complete sample code and detailed documentation to help developers get started quickly
- 🌟 **Promote Technical Exchange**: Provide a platform for AI Agent technology enthusiasts to learn and communicate
- 💡 **Stimulate Innovative Thinking**: Encourage developers to create more innovative applications based on this project
### Technical Highlights
- ✅ **Fully Official Architecture**: Strictly follows NVIDIA official technical specifications
- ✅ **Production-Grade Quality**: Includes complete error handling, logging, and monitoring
- ✅ **Easy to Extend**: Modular design, supports quick addition of new features
- ✅ **Cross-Platform Support**: One set of code, runs on multiple platforms
---
**🎯 Let's explore the infinite possibilities of AI Agents together!**
> This project demonstrates the powerful capabilities of NVIDIA NeMo Agent Toolkit in practical applications, contributing to the popularization and development of AI Agent technology. Whether you are an AI beginner or an experienced developer, you can gain valuable learning experience from this project.
Connection Info
You Might Also Like
markitdown
Python tool for converting files and office documents to Markdown.
Fetch
Retrieve and process content from web pages by converting HTML into markdown format.
oh-my-opencode
Background agents · Curated agents like oracle, librarians, frontend...
chatbox
User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
continue
Continue is an open-source project for seamless server management.
semantic-kernel
Build and deploy intelligent AI agents with Semantic Kernel's orchestration...