<h1 align="center">Xiaozhi ESP32 Server Java</h1>
<p align="center">
A Java server based on the <a href="https://github.com/78/xiaozhi-esp32">Xiaozhi ESP32</a> project, featuring a complete front-end and back-end management platform.<br/>
Provides robust back-end support and an intuitive management interface for smart hardware devices.
</p>
<p align="center">
<a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java/issues">Report Issues</a>
· <a href="#deployment">Deployment Documentation</a>
· <a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java/blob/main/CHANGELOG.md">Changelog</a>
</p>
<p align="center">
<a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java/graphs/contributors">
<img alt="GitHub Contributors" src="https://img.shields.io/github/contributors/joey-zhou/xiaozhi-esp32-server-java?logo=github" />
</a>
<a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java/issues">
<img alt="Issues" src="https://img.shields.io/github/issues/joey-zhou/xiaozhi-esp32-server-java?color=0088ff" />
</a>
<a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java/pulls">
<img alt="GitHub pull requests" src="https://img.shields.io/github/issues-pr/joey-zhou/xiaozhi-esp32-server-java?color=0088ff" />
</a>
<a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java/blob/main/LICENSE">
<img alt="License" src="https://img.shields.io/badge/license-MIT-white?labelColor=black" />
</a>
<a href="https://github.com/joey-zhou/xiaozhi-esp32-server-java">
<img alt="stars" src="https://img.shields.io/github/stars/joey-zhou/xiaozhi-esp32-server-java?color=ffcb47&labelColor=black" />
</a>
</p>
---
## Project Overview 📝
Xiaozhi ESP32 Server Java is a Java server based on the [Xiaozhi ESP32](https://github.com/78/xiaozhi-esp32) project, featuring a complete front-end and back-end management platform. It aims to give users a feature-rich, easy-to-use management interface for devices, configurations, and more.
Java is a mature language for enterprise applications, with a comprehensive ecosystem and strong concurrency support. We chose it for this server to meet enterprise-level needs and to leave room for future extension.
- **Back-end Framework**: Spring Boot + Spring MVC
- **Front-end Framework**: Vue.js + Ant Design
- **Data Storage**: MySQL + Redis
- **Global Responsiveness**: Adapts to various devices and resolutions
---
## Target Audience 👥
If you have purchased ESP32-related hardware and wish to control and manage your devices through a feature-complete and user-friendly management platform, this project is perfect for you. It is particularly suitable for:
- Users requiring enterprise-level stability
- Individual developers looking for a quick setup
- Users wanting a complete front-end management interface
- Users needing stronger data management and analysis capabilities
- Users with high requirements for system scalability
- Scenarios requiring support for a large number of concurrent device connections
- Applications with high demands for real-time data processing
---
## Feature Modules ✨
### Completed Features ✅
| Feature Module | Status | Description |
|----------------|--------|-------------|
| **Device Management** | ✅ | View the list of all connected devices, monitor device status in real time, add/edit/delete device information, and automatically apply default settings when a device is bound |
| **Voice Selection** | ✅ | Provides a range of voice templates, lets you preview each voice, and assigns different voice configurations to different devices |
| **Voice Cloning** | ✅ | Supports voice cloning with Volcano Engine and Alibaba Cloud, enabling personalized voice customization |
| **Chat History** | ✅ | View historical chat records, search chat content by date/keywords, delete messages, clear memory function |
| **Agent** | ✅ | Integrates with platforms like Coze and Dify to enable complex scenario dialogue capabilities |
| **Role Switching** | ✅ | Switch between preset roles (AI teacher, boyfriend/girlfriend, smart home assistant, etc.); roles can also be switched by voice |
| **Persistent Dialogue** | ✅ | Supports persistent dialogue records for easy viewing of historical conversation content |
| **LLM Multi-Platform Support** | ✅ | Supports various large language models such as OpenAI, Zhipu AI, iFlytek Spark, Ollama, etc. |
| **Default Configuration Management** | ✅ | Supports setting default configurations, automatically applying default settings for newly bound devices |
| **IoT Device Control** | ✅ | Supports managing IoT devices through voice commands, enabling smart home control |
| **Smart Function Invocation** | ✅ | Supports smart invocation of functions such as music playback and role switching. Music is provided by third-party services for personal entertainment use only; this project assumes no copyright responsibility |
| **Multi-Voice Recognition Services** | ✅ | Supports various voice recognition services such as FunASR, Alibaba, Tencent, Vosk, etc. |
| **Bidirectional Streaming Interaction** | ✅ | Supports real-time voice input and real-time reply output, improving dialogue fluency |
| **Real-Time Interruption** | ✅ | Supports real-time interruption functionality, enhancing dialogue fluency |
| **Local Offline Recognition** | ✅ | Supports Vosk local offline voice recognition, usable without internet connection |
| **WebSocket Communication** | ✅ | High-performance WebSocket communication, supports real-time status updates and control of devices |
| **MQTT Communication** | ✅ | Supports MQTT communication protocol, long connections, server-initiated wake-up |
| **Automatic Voice Wake-Up** | ✅ | Supports custom wake words for activation, no button press required to wake the device |
| **Multi-Device Concurrent Access** | ✅ | Supports simultaneous access of multiple devices, achieving full-house voice coverage |
| **TTS Multi-Engine Support** | ✅ | Supports various TTS engines such as Microsoft, Alibaba, Volcano, etc. |
| **Multi-User Support** | ✅ | Supports multi-user configuration to meet the needs of multiple family members |
| **Device Grouping** | ✅ | Supports device grouping management for easier classification and management of devices |
| **User End** | ✅ | Native card-style user end device management page for easy user configuration |
### Features Under Development 🚧
| Feature Module | Status | Description |
|----------------|--------|-------------|
| **Chat Data Visualization** | 🚧 | Data visualization features such as chat frequency statistics charts |
| **Mixed Mode Roles** | 🚧 | Supports multi-role mixed mode, waking different roles with different wake words (automatic switching) |
| **Memory Management** | 🚧 | Configurable number of remembered dialogue turns, summarization of historical dialogue, manual management of dialogue records |
| **Voiceprint Recognition** | 🚧 | Supports voiceprint recognition functionality for personalized voice assistants |
| **Multi-Language Support** | 🚧 | Supports multi-language interface to meet the needs of users from different regions |
| **Function Call** | 🚧 | Supports LLM function calling capabilities for complex task processing and intelligent decision-making |
| **Home Assistant** | 🚧 | Supports smart home device control, managing Home Assistant devices through voice commands |
| **Multimodal Interaction** | 🚧 | Supports image recognition and processing for richer interaction methods |
| **Sentiment Analysis** | 🚧 | Provides more humanized replies through voice sentiment analysis |
| **Multi-Device Collaboration** | 🚧 | Supports collaborative work of multiple devices, achieving a voice assistant system that covers the entire house |
| **Custom Plugin System** | 🚧 | Supports custom plugin development to extend system functionality |
| **Knowledge Base Integration** | 🚧 | Supports integration with external knowledge bases to enhance Q&A capabilities |
| **Voice Reminders and Alarms** | 🚧 | Supports setting voice reminders and alarm functions |
| **Remote Control** | 🚧 | Supports remote control of devices for management while away |
---
## UI Showcase 🎨
<div align="center">
<img src="docs/images/device.jpg" alt="Device Management" width="600" style="margin: 10px;" />
<p><strong>Device Management</strong> - Comprehensive management and monitoring of all connected devices</p>
</div>
<details>
<summary style="cursor: pointer; font-size: 1.2em; color: #0366d6; text-align: center; display: block; margin: 20px 0; padding: 10px; background-color:rgb(48, 48, 48); border-radius: 5px;">
<strong>👉 Click to see more interface screenshots 👈</strong>
</summary>
<div align="center">
<img src="docs/images/login.jpg" alt="Login Interface" width="600" style="margin: 10px;" />
<p><strong>Login Interface</strong> - Secure access point to the system</p>
<img src="docs/images/dashboard.jpg" alt="Dashboard" width="600" style="margin: 10px;" />
<p><strong>Dashboard</strong> - Overview of the system and display of key data</p>
<img src="docs/images/user.jpg" alt="User Management" width="600" style="margin: 10px;" />
<p><strong>User Management</strong> - Manage user information and permissions</p>
<img src="docs/images/message.jpg" alt="Message Records" width="600" style="margin: 10px;" />
<p><strong>Message Records</strong> - View and search historical dialogue content</p>
<img src="docs/images/model.jpg" alt="Model Management" width="600" style="margin: 10px;" />
<p><strong>Model Management</strong> - Configure and manage AI models</p>
<img src="docs/images/agent.jpg" alt="Agent Management" width="600" style="margin: 10px;" />
<p><strong>Agent Management</strong> - Set and switch agents, Coze/Dify</p>
<img src="docs/images/role.jpg" alt="Role Management" width="600" style="margin: 10px;" />
<p><strong>Role Management</strong> - Set and switch AI roles</p>
<img src="docs/images/voiceClone.jpg" alt="Voice Cloning" width="600" style="margin: 10px;" />
<p><strong>Voice Cloning</strong> - Clone your own voice for a personalized voice assistant</p>
</div>
</details>
---
<a id="deployment"></a>
## Deployment Documentation 📚
We provide various deployment methods to meet different user needs:
### 1. Local Source Run
- [Windows Deployment Documentation](./docs/WINDOWS_DEVELOPMENT.md) - Suitable for development and testing in a Windows environment - Provided by community member "汇合"
- [CentOS Deployment Documentation](./docs/CENTOS_DEVELOPMENT.md) - Suitable for deployment in a Linux server environment - Provided by community member "汇合"
Once the server starts successfully, the console prints the OTA and WebSocket connection addresses. Refer to the firmware compilation documentation to connect the device to the service.
### 2. Docker Deployment
- [Docker Deployment Documentation](./docs/DOCKER.md) - Quick containerized deployment solution - Provided by community member "💍Mr_li"
After the container starts successfully, devices must connect to the WebSocket service through the host machine's IP address, for example: `ws://192.168.31.100:8091/ws/xiaozhi/v1/`
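
As a rough illustration of the address format above, the sketch below assembles the WebSocket address from a host IP and port. The class name `XiaozhiWsAddress` and the `build` helper are hypothetical and not part of this project; the path and port come from the example address, and the loopback check is an assumption about why the host IP is required (a device on the network cannot reach the server through `localhost`).

```java
import java.net.URI;

public class XiaozhiWsAddress {
    // Path copied from the example address in this README.
    static final String WS_PATH = "/ws/xiaozhi/v1/";

    /**
     * Builds the WebSocket address for a device to connect to.
     * hostIp is assumed to be the LAN IP of the machine running Docker.
     */
    static URI build(String hostIp, int port) {
        // Assumption: loopback addresses are rejected because they point
        // at the device itself, not at the server host.
        if (hostIp.equals("localhost") || hostIp.startsWith("127.")) {
            throw new IllegalArgumentException(
                "Use the host's LAN IP instead of " + hostIp);
        }
        return URI.create("ws://" + hostIp + ":" + port + WS_PATH);
    }

    public static void main(String[] args) {
        // Values taken from the example address above.
        System.out.println(build("192.168.31.100", 8091));
    }
}
```

Replace the IP and port with the values for your own host; the address actually in use is the one the server reports at startup.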
### 3. Video Tutorial
- [Video Deployment Tutorial](https://doc.sivitacraft.com/article/xiaozhiai-javaserver/) - Recorded by community member "苦瓜"
### 4. Firmware Compilation
- [Firmware Compilation Documentation](./docs/FIRMWARE-BUILD.md) - Detailed firmware compilation and flashing process
After the firmware is flashed and the device is connected to the network, wake up Xiaozhi with the wake word and watch the server console for the resulting log output.
---
## Development Roadmap 🗺️
Based on our [project development requirements list](https://github.com/users/joey-zhou/projects/1), we plan to implement the following features in the future:
### Short-Term Plans (2025 Q2)
- Improve Function Call capabilities to support more complex task processing
- Implement mixed mode for multiple roles, supporting different wake words for different roles
- Optimize the memory management system to provide more flexible historical dialogue management
- Implement chat data visualization features to provide data analysis capabilities
### Mid-Term Plans (2025 Q3-Q4)
- Implement voiceprint recognition functionality to support personalized voice assistants
- Enhance Home Assistant integration to provide more comprehensive smart home control capabilities
- Develop multimodal interaction features to support image recognition and processing
- Implement a custom plugin system to support functional extensions
### Long-Term Plans (2026+)
- Develop a multi-device collaborative working mechanism to achieve a voice assistant system that covers the entire house
- Implement sentiment analysis functionality to provide a more humanized interaction experience
- Develop knowledge base integration features to enhance Q&A capabilities
- Implement multi-user support to meet the needs of multiple family members
We will continuously adjust the development plan based on community feedback and technological advancements to ensure the project continues to meet user needs.
---
## Contribution Guidelines 👐
We welcome contributions in any form! If you have good ideas or find issues, please contact us through the following methods:
### WeChat
The first group is full; scan the QR code below to join the second group.
<img src="docs/images/wechat_group.jpg" alt="WeChat" width="200" />
### QQ
Feel free to join our QQ group for discussions. QQ group number: 790820705
<img src="./web/static/img/qq.jpg" alt="QQ Group" width="200" />
### Custom Development
We accept custom development projects. If you have specific needs, please contact us via WeChat to discuss them.
<img src="./web/static/img/wechat.jpg" alt="WeChat" width="200" />
---
## Star History 📈
<a href="https://www.star-history.com/#joey-zhou/xiaozhi-esp32-server-java&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=joey-zhou/xiaozhi-esp32-server-java&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=joey-zhou/xiaozhi-esp32-server-java&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=joey-zhou/xiaozhi-esp32-server-java&type=Date" />
</picture>
</a>