# AuditLuma - Advanced Code Audit AI System 🔍
AuditLuma is an intelligent code audit system built on an innovative **hierarchical RAG architecture**. It combines multiple AI agents with a Haystack-AI orchestrator, txtai knowledge retrieval, R2R context enhancement, and Self-RAG validation to provide comprehensive, accurate security analysis for codebases.
## 🌟 Architecture Highlights
- 🏗️ **Hierarchical RAG Architecture** - Four-layer intelligent architecture: Haystack orchestration + txtai retrieval + R2R enhancement + Self-RAG validation
- 🚀 **Haystack-AI Orchestrator** - Intelligent task decomposition and result integration, supporting fallback to traditional orchestrators
- 🔍 **Intelligent Knowledge Retrieval** - txtai-driven semantic retrieval and contextual understanding
- 🎯 **Precise Validation** - Self-RAG multi-model cross-validation, effectively reducing false positives
- 🔄 **Adaptive Architecture** - Automatically selects the optimal architectural mode based on project scale
## ✨ Core Features
### 🏗️ Hierarchical RAG Architecture
- **Haystack Orchestration Layer** - Intelligent task decomposition, parallel execution, and result integration
- **txtai Knowledge Retrieval Layer** - Semantic retrieval and contextual understanding
- **R2R Context Enhancement Layer** - Dynamic context expansion and correlation analysis
- **Self-RAG Validation Layer** - Multi-model cross-validation and false positive filtering
### 🚀 Intelligent Orchestration System
- **Haystack-AI Orchestrator** - AI-based intelligent task orchestration (recommended)
- **Traditional Orchestrator** - Rule-driven stable orchestration solution
- **Automatic Fallback Mechanism** - Automatic switch when AI orchestrator is unavailable
- **Dynamic Architecture Selection** - Automatically selects the optimal architecture based on project scale
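The automatic fallback described above can be sketched as follows. The class and function names (`HaystackAIOrchestrator`, `TraditionalOrchestrator`, `select_orchestrator`) are illustrative placeholders, not AuditLuma's actual API:

```python
# Hypothetical sketch of the automatic fallback mechanism: prefer the
# Haystack-AI orchestrator, fall back to the rule-driven one on failure.
class TraditionalOrchestrator:
    name = "traditional"

class HaystackAIOrchestrator:
    name = "ai"
    def __init__(self, available: bool = True):
        if not available:
            raise RuntimeError("Haystack-AI backend unavailable")

def select_orchestrator(preferred: str = "ai", ai_available: bool = True):
    """Return the preferred orchestrator, falling back to the traditional one."""
    if preferred == "ai":
        try:
            return HaystackAIOrchestrator(available=ai_available)
        except RuntimeError:
            pass  # automatic fallback to the stable orchestrator
    return TraditionalOrchestrator()
```

The key design point is that fallback is transparent to the caller: the rest of the pipeline only sees an orchestrator object, regardless of which one was chosen.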
### 🔍 Advanced Analytical Capabilities
- 🛡️ **Comprehensive Security Analysis** - Thoroughly detect vulnerabilities and provide effective remediation suggestions
- 🌐 **Cross-File Security Analysis** - Identify cross-file vulnerabilities that traditional single-file analysis cannot detect
- 📊 **Global Context Construction** - Construct code call graphs, data flow diagrams, and dependencies
- 🎯 **Taint Analysis** - Trace the propagation path of user inputs within the code
- 🔄 **MCP (Multi-Agent Cooperation Protocol)** - Enhance coordination and collaboration between agents
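As an illustration of the taint-analysis idea (a minimal sketch of the concept, not AuditLuma's implementation), tracing user input reduces to a fixed-point propagation over data-flow edges:

```python
# Minimal illustrative taint propagation: starting from user-controlled
# sources, follow assignment/data-flow edges and flag any reached sinks.
def propagate_taint(edges, sources, sinks):
    """edges: list of (src_var, dst_var) data-flow pairs."""
    tainted = set(sources)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for src, dst in edges:
            if src in tainted and dst not in tainted:
                tainted.add(dst)
                changed = True
    return tainted & set(sinks)  # sinks reachable by tainted data

# Example: a request parameter flows into a SQL query string
flows = [("request.args", "user_id"), ("user_id", "query"), ("config", "timeout")]
print(propagate_taint(flows, sources={"request.args"}, sinks={"query", "timeout"}))
# → {'query'}
```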
### 🌐 Enterprise-Level Support
- **Multi-LLM Vendor Support** - Supports multiple vendors including OpenAI, DeepSeek, MoonShot, and Tongyi Qianwen.
- **Automatic Vendor Detection** - Automatically identifies and configures the correct vendor API based on the model name.
- **Asynchronous Parallel Processing** - Utilizes asynchronous concurrency techniques to enhance performance and accelerate analysis speed.
- **Visualization Features** - Generates dependency graphs and detailed security reports.
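The asynchronous parallel processing mentioned above can be sketched with `asyncio`; here `analyze_file` is a stand-in for a real per-file analysis call, and the semaphore bound plays the role of a `max_batch_size`-style setting:

```python
import asyncio

# Illustrative sketch: analyze many files concurrently while bounding
# parallelism with a semaphore.
async def analyze_file(path: str) -> str:
    await asyncio.sleep(0)  # stand-in for an LLM/API call
    return f"report for {path}"

async def analyze_project(paths, max_workers: int = 8):
    sem = asyncio.Semaphore(max_workers)
    async def bounded(p):
        async with sem:
            return await analyze_file(p)
    # gather preserves input order in its results
    return await asyncio.gather(*(bounded(p) for p in paths))

reports = asyncio.run(analyze_project(["a.py", "b.py", "c.py"]))
```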
## 📋 Table of Contents
- [Quick Start](#-quick-start)
- [Hierarchical RAG Architecture](#-hierarchical-rag-architecture)
- [Documentation](#-documentation)
- [Installation](#-installation)
- [Usage](#-usage)
- [Configuration](#-configuration)
- [Supported Languages](#-supported-languages)
- [Architecture](#-architecture)
- [Report Formats](#-report-formats)
- [Contributing](#-contributing)
- [License](#-license)
## 🚀 Quick Start
```bash
# 1. Clone the project
git clone https://github.com/Vistaminc/AuditLuma.git
cd AuditLuma

# 2. Install dependencies
pip install -r requirements.txt

# 3. Run an analysis with the hierarchical RAG architecture (recommended)
python main.py --architecture hierarchical --haystack-orchestrator ai -d ./your-project

# 4. View architecture information
python main.py --show-architecture-info
```
## 🏗️ Hierarchical RAG Architecture
AuditLuma 2.0 introduces an innovative four-layer RAG architecture, significantly enhancing analysis accuracy and efficiency:
```
┌─────────────────────────────────────────────────────────────┐
│ Hierarchical RAG Architecture │
├─────────────────────────────────────────────────────────────┤
│ Layer 1: Haystack Orchestration Layer │
│ ├─ Haystack-AI Orchestrator (Recommended) - Intelligent task decomposition and result integration │
│ └─ Traditional Orchestrator - Rule-driven stable solution │
├─────────────────────────────────────────────────────────────┤
│ Layer 2: txtai Knowledge Retrieval Layer │
│ ├─ Semantic search and similarity matching │
│ └─ Context understanding and knowledge graph construction │
├─────────────────────────────────────────────────────────────┤
│ Layer 3: R2R Context Enhancement Layer │
│ ├─ Dynamic context expansion │
│ └─ Correlation analysis and dependency tracking │
├─────────────────────────────────────────────────────────────┤
│ Layer 4: Self-RAG Validation Layer │
│ ├─ Multi-model cross-validation │
│ └─ False positive filtering and confidence assessment │
└─────────────────────────────────────────────────────────────┘
```
### Architectural Advantages
- **🎯 Improved Accuracy** - Four-layer verification mechanism significantly reduces false positives
- **⚡ Performance Optimization** - Intelligent caching and parallel processing enhance analysis speed
- **🔄 Adaptive** - Automatically selects the optimal configuration based on project scale
- **🛡️ Reliability** - Multiple fallback mechanisms ensure stable system operation
## 📚 Documentation
### 🚀 Getting Started Guide
- [Installation Guide](./docs/installation-guide.md) - Detailed installation steps and environment configuration
- [User Guide](./docs/user-guide.md) - A complete tutorial from beginner to advanced usage
- [Quick Reference](./docs/quick-reference.md) - A quick reference manual for commonly used commands and configurations
### 🏗️ Core Documentation
- [Hierarchical RAG Architecture Guide](./docs/hierarchical-rag-guide.md) - Detailed explanation and usage guide for the hierarchical RAG architecture
- [Configuration Reference](./docs/configuration-reference.md) - Complete configuration options and parameter descriptions
- [Best Practices](./docs/best-practices.md) - Usage recommendations, performance optimization, and security configuration
### 🔧 Technical Documentation
- [Architecture Design](./docs/architecture-design.md) - System architecture and design philosophy
- [Troubleshooting Guide](./docs/troubleshooting.md) - Common issues, error diagnosis, and solutions
- [Project Structure](./项目结构.md) - Detailed project directory structure and module descriptions
### 📖 Online Resources
- [AuditLuma Documentation](https://iwt6omodfh0.feishu.cn/drive/folder/OwWqf7EYblaqTNdaDbtcnQcHnTt) - Complete online documentation and tutorials
## 🚀 Installation
Clone the repository and install the dependencies:
```bash
git clone https://github.com/Vistaminc/AuditLuma.git
cd AuditLuma
pip install -r requirements.txt
```
### Optional Dependencies
**FAISS Vector Search Library**
By default, AuditLuma uses a simple built-in vector storage implementation. If you need to handle large codebases, it is recommended to install FAISS for improved performance:
```bash
# CPU Version
pip install faiss-cpu
# GPU Version (Supports CUDA)
pip install faiss-gpu
```
After installing FAISS, the system will automatically detect and use it for vector storage and retrieval, significantly improving performance when analyzing large projects.
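The optional-dependency pattern described above (use FAISS when present, otherwise a built-in store) can be sketched like this. The NumPy brute-force fallback is a simplification for illustration, not AuditLuma's actual built-in store:

```python
import numpy as np

# Prefer FAISS for vector search when it is installed; otherwise fall
# back to a brute-force NumPy search (fine for small codebases).
try:
    import faiss  # pip install faiss-cpu / faiss-gpu
    HAVE_FAISS = True
except ImportError:
    HAVE_FAISS = False

def nearest(vectors: np.ndarray, query: np.ndarray, k: int = 1):
    """Return indices of the k nearest vectors (L2 distance) to query."""
    if HAVE_FAISS:
        index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 index
        index.add(vectors.astype("float32"))
        _, idx = index.search(query.astype("float32").reshape(1, -1), k)
        return idx[0].tolist()
    dists = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(dists)[:k].tolist()

vecs = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(nearest(vecs, np.array([0.9, 1.1]), k=1))  # → [1]
```

Either code path returns the same neighbors; only the speed differs, which is why the switch can be fully automatic.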
## 🛠 Usage
### Basic Usage
```bash
# Use the hierarchical RAG architecture (recommended)
python main.py --architecture hierarchical -d ./your-project -o ./reports

# Use the Haystack-AI orchestrator (default, recommended)
python main.py --architecture hierarchical --haystack-orchestrator ai -d ./your-project

# Use the traditional orchestrator
python main.py --architecture hierarchical --haystack-orchestrator traditional -d ./your-project

# Automatic architecture selection (based on project size)
python main.py --architecture auto -d ./your-project

# Traditional RAG architecture (backward compatible)
python main.py --architecture traditional -d ./your-project
```
### Advanced Usage
```bash
# Enable performance comparison mode
python main.py --architecture hierarchical --enable-performance-comparison -d ./your-project

# View architecture information and configuration
python main.py --show-architecture-info

# Configuration migration (upgrade from a traditional configuration to hierarchical RAG)
python main.py --config-migrate

# AI-enhanced cross-file analysis
python main.py --architecture hierarchical --enhanced-analysis -d ./your-project
```
### Command Line Parameters
#### Basic Parameters
| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| `-d, --directory` | Target project directory | `./goalfile` |
| `-o, --output` | Report output directory | `./reports` |
| `-w, --workers` | Number of parallel worker threads | max_batch_size in configuration |
| `-f, --format` | Report format (html/pdf/json) | report_format in configuration |
#### Architecture Selection Parameters
| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| `--architecture` | RAG architecture mode (traditional/hierarchical/auto) | `auto` |
| `--haystack-orchestrator` | Haystack orchestrator type (traditional/ai) | `ai` |
| `--force-traditional` | Force the use of traditional RAG architecture | - |
| `--force-hierarchical` | Force the use of hierarchical RAG architecture | - |
| `--enable-performance-comparison` | Enable performance comparison mode | - |
| `--auto-switch-threshold` | File count threshold for automatic architecture switching | `100` |
#### Hierarchical RAG Specific Parameters
| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| `--enable-txtai` | Enable txtai knowledge retrieval layer | - |
| `--enable-r2r` | Enable R2R context enhancement layer | - |
| `--enable-self-rag-validation` | Enable Self-RAG validation layer | - |
| `--disable-caching` | Disable hierarchical caching system | - |
| `--disable-monitoring` | Disable performance monitoring | - |
#### Traditional Function Parameters
| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| `--no-mcp` | Disable Multi-Agent Collaboration Protocol | Enabled by default |
| `--no-self-rag` | Disable Self-RAG Retrieval | Enabled by default |
| `--no-deps` | Skip Dependency Analysis | Not skipped by default |
| `--no-remediation` | Skip Generating Remediation Suggestions | Not skipped by default |
| `--no-cross-file` | Disable Cross-File Vulnerability Detection | Enabled by default |
| `--enhanced-analysis` | Enable AI-Enhanced Cross-File Analysis | Disabled by default |
#### Other Parameters
| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| `--verbose` | Enable detailed logging | Disabled by default |
| `--dry-run` | Dry run mode (does not perform actual analysis) | - |
| `--config-migrate` | Migrate configuration to hierarchical RAG format | - |
| `--show-architecture-info` | Display current architecture information and exit | - |
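A hedged sketch of how the core options above could be wired up with `argparse`; it mirrors the documented flags and defaults but is not AuditLuma's actual parser:

```python
import argparse

# Illustrative parser covering a subset of the documented options.
parser = argparse.ArgumentParser(prog="main.py")
parser.add_argument("-d", "--directory", default="./goalfile",
                    help="Target project directory")
parser.add_argument("-o", "--output", default="./reports",
                    help="Report output directory")
parser.add_argument("--architecture", default="auto",
                    choices=["traditional", "hierarchical", "auto"])
parser.add_argument("--haystack-orchestrator", default="ai",
                    choices=["traditional", "ai"])
parser.add_argument("--auto-switch-threshold", type=int, default=100)
parser.add_argument("--verbose", action="store_true")

args = parser.parse_args(["--architecture", "hierarchical", "-d", "./your-project"])
```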
## ⚙️ Configuration
Configure the system by editing the `config/config.yaml` file. AuditLuma 2.0 supports hierarchical RAG (Retrieval-Augmented Generation) architecture configuration.
### Hierarchical RAG Configuration
```yaml
# Hierarchical RAG architecture model configuration
hierarchical_rag_models:
  # Whether to enable the hierarchical RAG architecture
  enabled: true

  # Haystack orchestration layer configuration
  haystack:
    # Orchestrator type: traditional or ai (Haystack-AI, recommended)
    orchestrator_type: "ai"  # Defaults to the Haystack-AI orchestrator
    # Default model (supports the model@provider format)
    default_model: "qwen3:32b@ollama"
    # Task-specific model configuration
    task_models:
      security_scan: "gpt-4@openai"                # Security scans use a stronger model
      syntax_check: "deepseek-chat@deepseek"       # Syntax checking
      logic_analysis: "qwen-turbo@qwen"            # Logic analysis
      dependency_analysis: "gpt-3.5-turbo@openai"  # Dependency analysis

  # txtai knowledge retrieval layer model configuration
  txtai:
    retrieval_model: "gpt-3.5-turbo@openai"           # Knowledge retrieval model
    embedding_model: "text-embedding-ada-002@openai"  # Embedding model

  # R2R context enhancement layer model configuration
  r2r:
    context_model: "gpt-3.5-turbo@openai"      # Context analysis model
    enhancement_model: "gpt-3.5-turbo@openai"  # Enhancement model

  # Self-RAG validation layer model configuration
  self_rag_validation:
    validation_model: "gpt-3.5-turbo@openai"  # Primary validation model
    cross_validation_models:                  # Models used for cross-validation
      - "gpt-4@openai"
      - "deepseek-chat@deepseek"
      - "gpt-3.5-turbo@openai"
```
### Model Specification Format
AuditLuma supports the use of a unified model specification format `model@provider` to specify the model and provider:
```
deepseek-chat@deepseek # Specifies the deepseek-chat model from the DeepSeek provider
gpt-4-turbo@openai # Specifies the gpt-4-turbo model from the OpenAI provider
qwen-turbo@qwen # Specifies the qwen-turbo model from the Qwen provider
```
If the provider is not specified (without using the @ symbol), the system will automatically infer the provider based on the model name.
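Parsing this format is straightforward; a minimal sketch (the function name is illustrative, not AuditLuma's API):

```python
# Split a "model@provider" spec; when no provider is given, return None
# so the caller can fall back to inferring it from the model name.
def parse_model_spec(spec: str):
    if "@" in spec:
        model, provider = spec.rsplit("@", 1)  # rsplit tolerates '@' in model names
        return model, provider
    return spec, None

print(parse_model_spec("deepseek-chat@deepseek"))  # → ('deepseek-chat', 'deepseek')
print(parse_model_spec("gpt-4-turbo"))             # → ('gpt-4-turbo', None)
```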
### Architecture Selection Configuration
```yaml
# Global settings
global:
  # Default architecture mode: traditional, hierarchical, auto
  default_architecture: "hierarchical"
  # Auto-switch threshold (number of files)
  auto_switch_threshold: 100
  # Enable performance comparison
  enable_performance_comparison: false
```
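Under these settings, `auto` mode can be read as: count the project's source files and pick the hierarchical architecture once the count exceeds `auto_switch_threshold`. A minimal sketch of that decision (the function name is illustrative):

```python
# Illustrative auto-selection: small projects use the traditional
# architecture, larger ones the hierarchical RAG architecture.
def choose_architecture(file_count: int, threshold: int = 100) -> str:
    return "hierarchical" if file_count > threshold else "traditional"

print(choose_architecture(42))   # → traditional
print(choose_architecture(350))  # → hierarchical
```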
### Multi-Vendor Support
AuditLuma supports multiple LLM (Large Language Model) vendors and can automatically detect the vendor based on the model name:
| Model Prefix | Vendor |
|--------------|--------|
| `gpt-` | OpenAI |
| `deepseek-` | DeepSeek |
| `qwen-` | Tongyi Qianwen |
| `glm-` or `chatglm` | Zhipu AI |
| `baichuan` | Baichuan |
| `ollama-` | ollama |
- Note: The OpenAI vendor setting is compatible with any relay platform that exposes an OpenAI-format API.
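The prefix table above amounts to a simple prefix lookup; a sketch (the mapping, function name, and OpenAI-compatible fallback for unknown prefixes are illustrative assumptions):

```python
# Infer the vendor from a model-name prefix, following the table above.
PREFIX_TO_VENDOR = {
    "gpt-": "openai",
    "deepseek-": "deepseek",
    "qwen-": "qwen",
    "glm-": "zhipu",
    "chatglm": "zhipu",
    "baichuan": "baichuan",
    "ollama-": "ollama",
}

def detect_vendor(model: str, default: str = "openai") -> str:
    for prefix, vendor in PREFIX_TO_VENDOR.items():
        if model.startswith(prefix):
            return vendor
    return default  # assumed fallback: treat unknown models as OpenAI-compatible

print(detect_vendor("deepseek-chat"))  # → deepseek
print(detect_vendor("qwen-turbo"))     # → qwen
```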
## 💻 Supported Languages
AuditLuma supports analyzing the following programming languages:
### Main Languages
- Python (.py)
- JavaScript (.js, .jsx)
- TypeScript (.ts, .tsx)
- Java (.java)
- C# (.cs)
- C++ (.cpp, .cc, .hpp)
- C (.c, .h)
- Go (.go)
- Ruby (.rb)
- PHP (.php)
- Lua (.lua)
### Other Supported Languages
- Rust (.rs)
- Swift (.swift)
- Kotlin (.kt)
- Scala (.scala)
- Dart (.dart)
- Bash (.sh, .bash)
- PowerShell (.ps1, .psm1)
### Markup and Configuration Languages
- HTML (.html, .htm)
- CSS (.css)
- JSON (.json)
- XML (.xml)
- YAML (.yml, .yaml)
- SQL (.sql)
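File-type detection for the lists above reduces to an extension map; a minimal, abbreviated sketch (illustrative, not AuditLuma's actual detector):

```python
from pathlib import Path

# Abbreviated extension-to-language map based on the lists above.
EXT_TO_LANG = {
    ".py": "Python", ".js": "JavaScript", ".jsx": "JavaScript",
    ".ts": "TypeScript", ".tsx": "TypeScript", ".java": "Java",
    ".cs": "C#", ".cpp": "C++", ".cc": "C++", ".hpp": "C++",
    ".c": "C", ".h": "C", ".go": "Go", ".rb": "Ruby",
    ".php": "PHP", ".lua": "Lua", ".rs": "Rust", ".sql": "SQL",
}

def detect_language(path: str):
    """Return the language for a file path, or None if unsupported."""
    return EXT_TO_LANG.get(Path(path).suffix.lower())

print(detect_language("app/models/user.rb"))  # → Ruby
```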
## 🏛 Architecture
AuditLuma uses a multi-agent architecture, consisting of the following components:

1. **Agent Orchestrator** - Coordinates all agents in the workflow
2. **Code Analysis Agent** - Analyzes code structure and extracts dependencies
3. **Security Analysis Agent** - Identifies security vulnerabilities
4. **Fix Suggestion Agent** - Generates targeted vulnerability remediation plans
5. **Visualization Component** - Produces intuitive reports and dependency graphs
## 📊 Report Formats
AuditLuma supports the following report formats:
- 📋 **HTML Report** - Contains vulnerability details, statistical charts, and interactive visualizations
- 📄 **PDF Report** - A format suitable for printing and sharing
- 🔄 **JSON Report** - A machine-readable format suitable for further processing and integration
## 💬 Contributing
We welcome code and suggestions! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Create a Pull Request
## 📞 Contact
- QQ: 1047736593
## 🤝 Partners
- [Marshmallow Cybersecurity Circle](https://vip.bdziyi.com/?ref=711)
## Support and Appreciation
If you find AuditLuma helpful, you are welcome to support us in the following ways:
- Your sponsorship helps us continuously improve and enhance AuditLuma!
<div style="display: flex; justify-content: space-between; max-width: 600px; margin: 0 auto;">
<div style="flex: 1; margin-right: 20px;">
<img src="https://github.com/Vistaminc/Miniluma/blob/main/ui/web/static/img/zanshang/wechat.jpg"/>
</div>
<div style="flex: 1;">
<img src="https://github.com/Vistaminc/Miniluma/blob/main/ui/web/static/img/zanshang/zfb.jpg"/>
</div>
</div>
## 📜 License
MIT
---
<div align="center">
<sub>Built with ❤️ by AuditLuma Team</sub>
</div>