# Alibaba Cloud Observability MCP Server (Go Edition)
<p align="center">
<a href="./README.md"><img alt="English" src="https://img.shields.io/badge/English-d9d9d9"></a>
</p>
---
> **📌 Important Note**
>
> This project has been rewritten in **Go**. If you need the original Python version, see the [`v1`](./v1) directory:
> - 📖 [v1/README.md](./v1/README.md) - Python version documentation
> - 📦 Install the Python version with `pip install mcp-server-aliyun-observability`
---
Alibaba Cloud Observability MCP Server is a Go implementation that gives AI models structured access to Alibaba Cloud Log Service (SLS) and Cloud Monitor (CMS). Built on the [Model Context Protocol](https://modelcontextprotocol.io/), it integrates with AI tools such as Cursor, Kiro, Cline, and Windsurf.
## Features
- Supports stdio, SSE, and streamable-http transport modes
- Modular toolset architecture: PaaS (Cloud Monitor 2.0), IaaS (SLS/CMS direct access), and Shared
- Flexible time expression parsing: relative time, absolute timestamp, Grafana style, and preset keywords
- Time series data comparison and analysis: statistical calculation, trend analysis, and difference scoring
- Structured error handling: English error descriptions with suggested solutions
- Stability safeguards: retry (exponential backoff), circuit breaker, and graceful shutdown
- Structured JSON logging (slog)
- Single binary with zero runtime dependencies
## Quick Start
### Download and Installation
Download the binary for your platform from the [Releases](https://github.com/aliyun/alibabacloud-observability-mcp-server/releases) page:
```bash
# Linux amd64
wget https://github.com/aliyun/alibabacloud-observability-mcp-server/releases/latest/download/alibabacloud-observability-mcp-server-linux-amd64.tar.gz
tar -xzf alibabacloud-observability-mcp-server-linux-amd64.tar.gz
# macOS arm64 (M1/M2)
wget https://github.com/aliyun/alibabacloud-observability-mcp-server/releases/latest/download/alibabacloud-observability-mcp-server-darwin-arm64.tar.gz
tar -xzf alibabacloud-observability-mcp-server-darwin-arm64.tar.gz
```
The extracted files include:
- `alibabacloud-observability-mcp-server` - executable file
- `config.yaml` - default configuration file
### Configure Credentials
```bash
# Set Alibaba Cloud AccessKey
export ALIBABA_CLOUD_ACCESS_KEY_ID=<your_access_key_id>
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=<your_access_key_secret>
```
> To obtain an AccessKey, see [Alibaba Cloud AccessKey Management](https://help.aliyun.com/document_detail/53045.html).
### Start the Service
```bash
# Start in stdio mode (launched directly by the MCP client)
./alibabacloud-observability-mcp-server start --stdio
# Start in network mode (default transport configured in config.yaml)
./alibabacloud-observability-mcp-server start --config config.yaml
```
### CLI Commands
```bash
# View version information
./alibabacloud-observability-mcp-server version
# List all registered tools
./alibabacloud-observability-mcp-server tools
```
---
## Build from Source
### Prerequisites
- Go 1.23+
### Build
```bash
# Clone the repository
git clone https://github.com/aliyun/alibabacloud-observability-mcp-server.git
cd alibabacloud-observability-mcp-server
# Build for the current platform
make build
# Build for all platforms (linux/darwin/windows × amd64/arm64)
make build-all
```
The generated binary files are located in the `bin/` directory.
## Configuration
The configuration consists of two layers:
1. `config.yaml` - server configuration (transport mode, logging, networking, etc.)
2. `.env` file or environment variables - credentials and runtime parameters
### Configuration File
```bash
cp config.yaml config.yaml.bak # Backup default configuration (optional)
cp .env.example .env # Credentials (AccessKey)
```
`config.yaml` search path: current directory → `./config/`
The `.env` file is loaded from the current directory and is suited to credentials that should not be committed to version control.
### config.yaml Structure
```yaml
# Server configuration
server:
  transport: streamable-http  # stdio, sse, streamable-http
  host: "0.0.0.0"
  port: 8080
# Logging configuration
logging:
  level: info  # debug, info, warn, error
  debug_mode: false
# Toolkit configuration
toolkit:
  scope: all  # all, paas, iaas
  # Fine-grained tool selection (optional, non-empty only registers tools in the list)
  # enabled_tools:
  #   - list_workspace
  #   - umodel_get_entities
  #   - sls_execute_sql
# Networking configuration
network:
  max_retry: 1
  retry_wait_seconds: 1
  read_timeout_ms: 610000
  connect_timeout_ms: 30000
# Localization configuration
locale:
  timezone: Asia/Shanghai
  language: en-US
# Runtime defaults (optional)
# Priority: environment variables > .env file > config.yaml
runtime:
  region: cn-hangzhou
  # workspace: ""
# Endpoint override (optional, for internal access)
# endpoints:
#   sls:
#     cn-hongkong: "cn-hongkong-intranet.log.aliyuncs.com"
#   cms:
#     cn-hongkong: "cms.cn-hongkong.aliyuncs.com"
```
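
The `network.max_retry` and `network.retry_wait_seconds` settings relate to the retry (exponential backoff) feature listed above. The following is a minimal sketch of that technique, not the server's actual code: it assumes the wait doubles on every retry starting from the configured base, and the names `retryWithBackoff`, `backoffSchedule`, `noSleep`, and `errTransient` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for a retryable API error (hypothetical).
var errTransient = errors.New("transient failure")

// backoffSchedule returns the waits before each retry: base, 2*base, 4*base, ...
// Assumption: the wait doubles per retry; the real server may differ.
func backoffSchedule(maxRetry int, base time.Duration) []time.Duration {
	if maxRetry < 0 {
		return nil
	}
	waits := make([]time.Duration, 0, maxRetry)
	for i := 0; i < maxRetry; i++ {
		waits = append(waits, base<<uint(i))
	}
	return waits
}

// noSleep lets the demo run instantly; pass time.Sleep for real waits.
func noSleep(time.Duration) {}

// retryWithBackoff runs op once, then retries up to maxRetry more times,
// sleeping according to backoffSchedule between attempts.
func retryWithBackoff(maxRetry int, base time.Duration, sleep func(time.Duration), op func() error) error {
	err := op()
	for _, wait := range backoffSchedule(maxRetry, base) {
		if err == nil {
			return nil
		}
		sleep(wait)
		err = op()
	}
	if err != nil {
		return fmt.Errorf("giving up after %d attempts: %w", maxRetry+1, err)
	}
	return nil
}

func main() {
	calls := 0
	// Mirrors max_retry: 1 and retry_wait_seconds: 1 from the sample config.
	err := retryWithBackoff(1, time.Second, noSleep, func() error {
		calls++
		if calls == 1 {
			return errTransient
		}
		return nil
	})
	fmt.Println("schedule for 3 retries:", backoffSchedule(3, time.Second))
	fmt.Println("calls:", calls, "err:", err)
}
```

Injecting the sleep function keeps the sketch testable without real delays; in production code `time.Sleep` (or a context-aware timer) would take its place.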
#### Fine-Grained Tool Selection
By default, `toolkit.scope` controls tool activation by category (`all`/`paas`/`iaas`). For finer-grained control, use `toolkit.enabled_tools` to specify the list of tools to enable:
```yaml
toolkit:
  scope: all
  enabled_tools:
    - list_workspace
    - list_domains
    - umodel_get_entities
    - umodel_get_metrics
    - sls_execute_sql
```
When `enabled_tools` is non-empty, only tools in the list are registered, and others are unavailable. `scope` still determines which toolkit modules to load, and `enabled_tools` further filters them.
For a complete list of tools and categorization, refer to the comment template in `config.yaml`.
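
The two-stage selection described above (scope first, then the optional `enabled_tools` allow-list) can be sketched as follows. This is an illustration only: the `tool` type, `sampleTools`, and `selectTools` are hypothetical names, and the sketch assumes shared tools are loaded for every scope, which the documentation does not state.

```go
package main

import "fmt"

// tool is a hypothetical registry entry: a name plus the toolset it belongs to.
type tool struct {
	Name    string
	Toolset string // "paas", "iaas", or "shared"
}

// sampleTools is a small illustrative subset of the documented tools.
var sampleTools = []tool{
	{"list_workspace", "shared"},
	{"umodel_get_entities", "paas"},
	{"umodel_get_metrics", "paas"},
	{"sls_execute_sql", "iaas"},
}

// selectTools applies scope first, then the optional enabled_tools allow-list.
// Assumption: shared tools are in scope regardless of the scope value.
func selectTools(all []tool, scope string, enabled []string) []string {
	allow := make(map[string]bool, len(enabled))
	for _, name := range enabled {
		allow[name] = true
	}
	var out []string
	for _, t := range all {
		inScope := scope == "all" || t.Toolset == scope || t.Toolset == "shared"
		if !inScope {
			continue
		}
		// An empty allow-list means "no further filtering".
		if len(enabled) > 0 && !allow[t.Name] {
			continue
		}
		out = append(out, t.Name)
	}
	return out
}

func main() {
	fmt.Println(selectTools(sampleTools, "all", nil))
	fmt.Println(selectTools(sampleTools, "iaas", []string{"sls_execute_sql", "umodel_get_entities"}))
}
```

In the second call, `umodel_get_entities` is dropped even though it is listed, because the `iaas` scope never loads the PaaS module in the first place.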
### CLI Parameters
| Parameter | Description | Default |
|------|------|--------|
| `--config` | Specify configuration file path | Automatic search |
| `--stdio` | Force stdio transport mode | false |
### Environment Variables (Credentials and Runtime Parameters)
| Environment Variable | Description | Required |
|---------|------|------|
| `ALIBABA_CLOUD_ACCESS_KEY_ID` | AccessKey ID | No* |
| `ALIBABA_CLOUD_ACCESS_KEY_SECRET` | AccessKey Secret | No* |
| `ALIBABA_CLOUD_SECURITY_TOKEN` | STS Token (temporary credential) | No |
| `ALIBABA_CLOUD_REGION` | Default region | No |
| `ALIBABA_CLOUD_WORKSPACE` | Default workspace (required for PaaS tools) | No |
> \* When no AccessKey is configured, the service automatically uses the [default credential chain](https://help.aliyun.com/zh/sdk/developer-reference/v2-manage-go-access-credentials) to obtain credentials (supporting ECS RAM Role, OIDC, configuration files, etc.). In ECS, Function Compute, and other cloud environments, no manual AccessKey configuration is required.
Credential resolution priority: CLI parameters / `.env` file > shell environment variables > default credential chain.
> **💡 Automatic default filling**
>
> When `ALIBABA_CLOUD_REGION` or `ALIBABA_CLOUD_WORKSPACE` is set and a tool call omits the `regionId` or `workspace` parameter, the service uses the environment variable value as the default. Values passed explicitly in the call are never overridden.
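
The default-filling rule can be sketched as below. This is a hedged illustration rather than the server's code: `fillDefaults` is a hypothetical helper, parameters are modeled as a plain string map, and treating an empty string as "not provided" is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// fillDefaults copies params and fills regionId/workspace from the given
// lookup (normally os.Getenv) only when the caller left them unset or empty.
// Explicitly passed values always win.
func fillDefaults(params map[string]string, getenv func(string) string) map[string]string {
	out := make(map[string]string, len(params)+2)
	for k, v := range params {
		out[k] = v
	}
	defaults := map[string]string{
		"regionId":  "ALIBABA_CLOUD_REGION",
		"workspace": "ALIBABA_CLOUD_WORKSPACE",
	}
	for param, envName := range defaults {
		if out[param] != "" {
			continue // explicit value, do not override
		}
		if v := getenv(envName); v != "" {
			out[param] = v
		}
	}
	return out
}

func main() {
	call := map[string]string{"regionId": "cn-hongkong", "domain": "apm"}
	// workspace is filled only if ALIBABA_CLOUD_WORKSPACE is set in the shell.
	fmt.Println(fillDefaults(call, os.Getenv))
}
```

Passing the lookup function in, instead of calling `os.Getenv` directly, makes the rule easy to exercise with a fake environment.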
## AI Tool Integration
### Cursor / Kiro / Cline
**Streamable-http mode (recommended):**
1. Configure `config.yaml` (set `server.transport: streamable-http`)
2. Start the service:
```bash
./bin/alibabacloud-observability-mcp-server start
```
3. Configure `mcp.json`:
```json
{
  "mcpServers": {
    "alibaba_cloud_observability": {
      "url": "http://localhost:8080"
    }
  }
}
```
**Stdio mode:**
1. Configure `mcp.json`:
```json
{
  "mcpServers": {
    "alibaba_cloud_observability": {
      "command": "./bin/alibabacloud-observability-mcp-server",
      "args": ["start", "--stdio"],
      "env": {
        "ALIBABA_CLOUD_ACCESS_KEY_ID": "<your_access_key_id>",
        "ALIBABA_CLOUD_ACCESS_KEY_SECRET": "<your_access_key_secret>"
      }
    }
  }
}
```
Note: In stdio mode, if `config.yaml` does not exist, the built-in default values are used.
## Toolset
A total of 33 tools, grouped into three toolsets.
### PaaS Toolset (Cloud Monitor 2.0, recommended)
These tools are built on the unified data model, and their names are prefixed with `umodel_` or `cms_`. 16 tools in total.
#### Entity Management Tools
| Tool | Description | Key Parameters |
|------|------|---------|
| `umodel_get_entities` | Get entity list | `workspace`, `domain`, `entity_set_name`, `regionId` (required); `entity_filter` (optional) |
| `umodel_get_neighbor_entities` | Get entity neighbor relationships | `workspace`, `src_entity_domain`, `src_name`, `src_entity_ids`, `regionId` (required) |
| `umodel_search_entities` | Search entities | `workspace`, `search_text`, `regionId` (required) |
#### Dataset Management Tools
| Tool | Description | Key Parameters |
|------|------|---------|
| `umodel_list_data_set` | List datasets | `workspace`, `domain`, `entity_set_name`, `regionId` (required); `data_set_types` (optional) |
| `umodel_search_entity_set` | Search entity sets | `workspace`, `search_text`, `regionId` (required) |
| `umodel_get_entity_set` | Get entity set schema definition | `domain`, `entity_set_name`, `workspace`, `regionId` (required); `detail` (optional) |
| `umodel_list_related_entity_set` | List related entity sets | `workspace`, `domain`, `entity_set_name`, `regionId` (required) |
#### Data Query Tools
| Tool | Description | Key Parameters |
|------|------|---------|
| `umodel_get_metrics` | Query metric data | `workspace`, `domain`, `entity_set_name`, `metric_domain_name`, `metric`, `regionId` (required); `analysis_mode` (basic/cluster/forecast/anomaly_detection), `offset` (time series comparison), `time_range` (optional) |
| `umodel_get_golden_metrics` | Query golden metrics | `workspace`, `domain`, `entity_set_name`, `regionId` (required); `offset`, `time_range` (optional) |
| `umodel_get_relation_metrics` | Query relationship metrics | `workspace`, `src_domain`, `src_entity_set_name`, `relation_type`, `direction` (in/out), `metric`, `metric_set_domain`, `regionId` (required); `dest_entity_set_name` (optional) |
| `umodel_get_logs` | Query log data | `workspace`, `domain`, `entity_set_name`, `log_set_domain`, `log_set_name`, `regionId` (required); `time_range`, `limit` (optional) |
| `umodel_get_events` | Query event data | `workspace`, `domain`, `entity_set_name`, `event_set_domain`, `event_set_name`, `regionId` (required); `time_range`, `limit` (optional) |
| `umodel_get_traces` | Query trace data | `workspace`, `domain`, `entity_set_name`, `trace_set_domain`, `trace_set_name`, `trace_ids`, `regionId` (required); `time_range` (optional) |
| `umodel_search_traces` | Search traces | `workspace`, `domain`, `entity_set_name`, `trace_set_domain`, `trace_set_name`, `regionId` (required); `conditions`, `limit`, `time_range` (optional) |
| `umodel_get_profiles` | Query performance profile data | `workspace`, `domain`, `entity_set_name`, `profile_set_domain`, `profile_set_name`, `entity_ids`, `regionId` (required); `time_range`, `limit` (optional) |
| `cms_natural_language_query` | Natural language data query | `query`, `workspace`, `regionId` (required); `time_range` (optional) |
### IaaS Toolset (SLS/CMS direct access)
These tools access the underlying APIs directly, and their names are prefixed with `sls_` or `cms_`. 14 tools in total.
#### SLS Tools
| Tool | Description | Key Parameters |
|------|------|---------|
| `sls_list_projects` | List projects | `regionId` (required); `project` (optional, fuzzy search) |
| `sls_list_logstores` | List logstores | `project`, `regionId` (required) |
| `sls_text_to_sql` | Natural language to SQL | `text`, `project`, `logStore`, `regionId` (required) |
| `sls_text_to_sql_old` | Natural language to SQL (old version, compatible with Python version) | `text`, `project`, `logStore`, `regionId` (required) |
| `sls_text_to_spl` | Natural language to SPL | `text`, `project`, `logStore`, `data_sample`, `regionId` (required) |
| `sls_execute_sql` | Execute SQL query | `project`, `logStore`, `query`, `regionId` (required); `from_time`, `to_time` (optional) |
| `sls_execute_spl` | Execute native SPL query | `query`, `workspace`, `regionId` (required); `from_time`, `to_time` (optional) |
| `sls_get_context_logs` | Get log context | `project`, `logStore`, `pack_id`, `pack_meta`, `regionId` (required); `back_lines`, `forward_lines` (optional) |
| `sls_log_explore` | Log exploration and analysis | `project`, `logStore`, `logField`, `regionId` (required); `from_time`, `to_time`, `filter_query`, `groupField` (optional) |
| `sls_log_compare` | Log comparison analysis | `project`, `logStore`, `logField`, `regionId` (required); `test_from_time`, `test_to_time`, `control_from_time`, `control_to_time`, `filter_query`, `groupField` (optional) |
| `sls_sop` | SLS operation assistant | `text`, `regionId` (required) |
#### CMS Tools
| Tool | Description | Key Parameters |
|------|------|---------|
| `cms_execute_promql` | Execute PromQL query | `project`, `metricStore`, `query`, `regionId` (required); `from_time`, `to_time` (optional) |
| `cms_text_to_promql` | Natural language to PromQL | `text`, `project`, `metricStore`, `regionId` (required) |
### Shared Tools
There are 3 tools.
| Tool | Description | Key Parameters |
|------|------|---------|
| `list_workspace` | List workspaces | `regionId` (required) |
| `list_domains` | List entity domains | `workspace`, `regionId` (required) |
| `introduction` | Service introduction | No parameters |
## Time Expressions
All data query tools support flexible time range formats:
| Format | Example |
|------|------|
| Relative presets | `last_5m`, `last_1h`, `last_3d`, `last_1w`, `last_1M`, `last_1y` |
| Relative time | `now()-1h`, `now-30m`, `now()-7d` |
| Grafana style | `now-15m~now-5m`, `now/d`, `now-1d/d` |
| Keywords | `today`, `yesterday` |
| Absolute timestamp | `1718451045` (seconds), `1718451045000` (milliseconds) |
| Date and time string | `2024-01-01 00:00:00`, `2024-01-01T00:00:00Z` |
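
As a rough illustration of two of the formats in the table, the sketch below resolves relative presets (`last_<n><unit>`) and absolute timestamps. It is not the project's `timeparse` package: `parsePreset`, `parseUnix`, and `demoNow` are hypothetical names, and telling seconds from milliseconds by digit count is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// demoNow is a fixed reference time so the example is deterministic.
var demoNow = time.Date(2024, 6, 15, 12, 0, 0, 0, time.UTC)

// parsePreset resolves "last_<n><unit>" (units m, h, d, w, M, y) to a
// [from, to] range ending at now.
func parsePreset(expr string, now time.Time) (from, to time.Time, err error) {
	if !strings.HasPrefix(expr, "last_") {
		return from, to, fmt.Errorf("not a relative preset: %q", expr)
	}
	body := strings.TrimPrefix(expr, "last_")
	if len(body) < 2 {
		return from, to, fmt.Errorf("missing amount or unit in %q", expr)
	}
	n, err := strconv.Atoi(body[:len(body)-1])
	if err != nil || n <= 0 {
		return from, to, fmt.Errorf("bad amount in %q", expr)
	}
	switch body[len(body)-1] {
	case 'm':
		from = now.Add(-time.Duration(n) * time.Minute)
	case 'h':
		from = now.Add(-time.Duration(n) * time.Hour)
	case 'd':
		from = now.AddDate(0, 0, -n)
	case 'w':
		from = now.AddDate(0, 0, -7*n)
	case 'M':
		from = now.AddDate(0, -n, 0)
	case 'y':
		from = now.AddDate(-n, 0, 0)
	default:
		return from, to, fmt.Errorf("unknown unit in %q", expr)
	}
	return from, now, nil
}

// parseUnix treats values with 13 or more digits as milliseconds and
// shorter ones as seconds (an assumed heuristic).
func parseUnix(expr string) (time.Time, error) {
	v, err := strconv.ParseInt(expr, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	if len(expr) >= 13 {
		return time.UnixMilli(v).UTC(), nil
	}
	return time.Unix(v, 0).UTC(), nil
}

func main() {
	from, to, err := parsePreset("last_3d", demoNow)
	fmt.Println(from, to, err)
	ts, err := parseUnix("1718451045000")
	fmt.Println(ts, err)
}
```

Calendar units (`d`, `w`, `M`, `y`) use `AddDate` so that month and year lengths are handled by the standard library instead of fixed durations.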
## Advanced Features
### Time Series Comparison and Analysis
`umodel_get_metrics` and `umodel_get_golden_metrics` support time series comparison through the `offset` parameter:
```
# Compare the last hour with the same window one day earlier
umodel_get_metrics(
    domain="apm", entity_set_name="apm.service",
    metric_domain_name="apm.metric.apm.service", metric="request_count",
    time_range="last_1h", offset="1d"
)
```
The result contains:
- `current`: Current period statistics (max, min, avg, count)
- `compare`: Comparison period statistics
- `diff`: Change analysis (trend, avg_change, avg_change_percent)
- `diff_score`: Difference score (0-1; a higher score indicates a more significant difference)
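
To make the `current`, `compare`, and `diff` fields concrete, here is a minimal sketch of how such statistics could be derived from two series. The types and functions (`stats`, `diff`, `summarize`, `compare`) and the trend labels `up`/`down`/`stable` are assumptions for illustration; `diff_score` is omitted because its formula is not documented here.

```go
package main

import (
	"fmt"
	"math"
)

// stats mirrors the documented per-period statistics.
type stats struct {
	Max, Min, Avg float64
	Count         int
}

// summarize computes max, min, avg, and count for one period.
func summarize(values []float64) stats {
	if len(values) == 0 {
		return stats{}
	}
	s := stats{Max: values[0], Min: values[0], Count: len(values)}
	sum := 0.0
	for _, v := range values {
		s.Max = math.Max(s.Max, v)
		s.Min = math.Min(s.Min, v)
		sum += v
	}
	s.Avg = sum / float64(len(values))
	return s
}

// diff mirrors the documented change analysis fields.
type diff struct {
	Trend            string // assumed labels: "up", "down", "stable"
	AvgChange        float64
	AvgChangePercent float64
}

// compare reports how the current period's average moved against the
// comparison period. The percentage is left at 0 when the baseline is 0.
func compare(current, previous stats) diff {
	d := diff{AvgChange: current.Avg - previous.Avg, Trend: "stable"}
	if previous.Avg != 0 {
		d.AvgChangePercent = d.AvgChange / math.Abs(previous.Avg) * 100
	}
	switch {
	case d.AvgChange > 0:
		d.Trend = "up"
	case d.AvgChange < 0:
		d.Trend = "down"
	}
	return d
}

func main() {
	current := summarize([]float64{120, 150, 180})  // e.g. the last hour
	previous := summarize([]float64{100, 100, 100}) // same window one day earlier
	fmt.Printf("current=%+v compare=%+v diff=%+v\n", current, previous, compare(current, previous))
}
```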
### Advanced Analysis Mode
`umodel_get_metrics` supports four analysis modes:
| Mode | Description | Output fields |
|------|------|---------|
| `basic` | Original time series data (default) | `__ts__`, `__value__`, `__labels__` |
| `cluster` | K-Means time series clustering | `__cluster_index__`, `__entities__`, `__sample_value__` |
| `forecast` | Time series forecasting (requires 1-5 days of historical data) | `__forecast_ts__`, `__forecast_value__`, `__forecast_lower/upper_value__` |
| `anomaly_detection` | Anomaly detection (requires 1-3 days of data) | `__anomaly_list_`, `__anomaly_msg__`, `__value_min/max/avg__` |
## Project Structure
```
├── cmd/server/ # CLI entry (cobra)
├── pkg/
│ ├── client/ # SLS/CMS client encapsulation
│ ├── config/ # Configuration management (viper + sync.Once)
│ ├── endpoint/ # Endpoint resolution
│ ├── errors/ # Structured errors and error code mapping
│ ├── logger/ # Structured logging (slog)
│ ├── server/ # MCP Server core (transport layer, lifecycle, health check)
│ ├── stability/ # Retry and circuit breaker
│ ├── timeparse/ # Time expression parsing
│ └── toolkit/ # Toolset interface and registry
│ ├── paas/ # PaaS toolset (umodel_*, cms_natural_language_query)
│ ├── iaas/ # IaaS toolset (sls_*, cms_execute_promql, cms_text_to_promql)
│ └── shared/ # Shared toolset (list_workspace, list_domains, introduction)
├── v1/ # Python version (historical reference)
├── Makefile
├── go.mod
└── go.sum
```
## Development
```bash
# Build
make build
# Run tests
make test
# Code check
make lint
# Clean build artifacts
make clean
```
### Testing
The project combines three kinds of tests: unit, property, and regression.
- Unit tests: table-driven, covering specific examples and boundary conditions
- Property tests: use [gopter](https://github.com/leanovate/gopter) to verify general correctness properties over generated inputs
- Regression tests: integration tests (`//go:build integration`) that check parameter consistency with the Python version; they require real Alibaba Cloud credentials
```bash
# Run all unit tests
go test ./... -v
# Run property tests only
go test ./... -run TestProperty_
# Run regression tests (requires environment variables)
ALIBABA_CLOUD_ACCESS_KEY_ID=xxx \
ALIBABA_CLOUD_ACCESS_KEY_SECRET=xxx \
ALIBABA_CLOUD_REGION=cn-hongkong \
ALIBABA_CLOUD_WORKSPACE=xxx \
go test -tags=integration ./pkg/toolkit/... -v
```
### AI Agent Development Guidelines
See [docs/AGENTS.md](docs/AGENTS.md) for the project structure, code style conventions, the process for adding a new tool, testing specifications, and more.
## Permission Requirements
To ensure that MCP Server can successfully access and operate your Alibaba Cloud observability resources, you need to configure the following permissions:
### Alibaba Cloud Access Key
- The service requires a valid Alibaba Cloud credential, supporting the following methods (in order of priority):
1. AccessKey ID + AccessKey Secret (passed in through `.env` file, environment variables, or CLI parameters)
2. STS temporary credential (set `ALIBABA_CLOUD_SECURITY_TOKEN` environment variable)
3. [Default credential chain](https://help.aliyun.com/zh/sdk/developer-reference/v2-manage-go-access-credentials) automatic discovery (ECS RAM Role, OIDC, credential configuration file, etc.)
- For obtaining and managing AccessKey, refer to the [Alibaba Cloud AccessKey management official documentation](https://help.aliyun.com/document_detail/53045.html)
### RAM Authorization
The RAM user or role associated with AccessKey **must** be granted the necessary permissions to access related cloud services.
**Following the principle of least privilege is strongly recommended**: grant only the minimum set of permissions required by the MCP tools you plan to use.
Depending on the tools you use, refer to the following documents for permission configuration:
| Service | Permission Documentation | Description |
|------|---------|------|
| Log Service (SLS) | [SLS Permission Description](https://help.aliyun.com/zh/sls/overview-8) | Required for `sls_*` tools |
| Application Real-time Monitoring (ARMS) | [ARMS Permission Description](https://help.aliyun.com/zh/arms/security-and-compliance/overview-8) | Required for `umodel_*` tools |
| Cloud Monitor (CMS) | [CMS Permission Description](https://help.aliyun.com/zh/cms/cloudmonitor-2-0/) | Required for `cms_*` tools |
**Special Permission Description**:
- Using SQL generation tools (such as `sls_text_to_sql`) requires granting `sls:CallAiTools` permission separately
- Using natural language query functionality (`cms_natural_language_query`) requires granting: `cms:CreateChat`, `cms:CreateThread`, `cms:GetThread`, `cms:ListThreads`
## Security Recommendations
- The service does not store the AccessKey; it is used only for API calls at runtime
- In SSE/HTTP mode, you are responsible for controlling access to the service endpoint
- Deploy in an internal network or VPC where possible to avoid direct exposure to the public network
- Do not expose a service endpoint that is configured with an AccessKey to the public network without authentication
- Consider deploying on Alibaba Cloud Function Compute (FC), configured to be accessible only within the VPC
## License
This project follows the same license agreement as the original Python version.