# MCP CLI - Model Context Protocol Command Line Interface
[CI](https://github.com/chrishayuk/mcp-cli/actions/workflows/ci.yml)
[PyPI](https://pypi.org/project/mcp-cli/)
A powerful, feature-rich command-line interface for interacting with Model Context Protocol servers. This client enables seamless communication with LLMs through integration with the [CHUK Tool Processor](https://github.com/chrishayuk/chuk-tool-processor) and [CHUK-LLM](https://github.com/chrishayuk/chuk-llm), providing tool usage, conversation management, and multiple operational modes.
**Default Configuration**: MCP CLI defaults to using Ollama with the `gpt-oss` reasoning model for local, privacy-focused operation without requiring API keys.
## 🆕 Recent Updates (v0.12.0)
### Performance & Polish (Tier 3)
- **O(1) Tool Lookups**: Indexed tool lookup replacing O(n) linear scans in both ToolManager and ChatContext
- **Cached LLM Tool Metadata**: Per-provider caching of tool definitions with automatic invalidation
- **Startup Progress**: Real-time progress messages during initialization instead of a single spinner
- **Token Usage Tracking**: Per-turn and cumulative token tracking with `/usage` command (aliases: `/tokens`, `/cost`)
- **Session Persistence**: Save/load/list conversation sessions with auto-save every 10 turns (`/sessions`)
- **Conversation Export**: Export conversations as Markdown or JSON with metadata (`/export`)
- **Trusted Domains**: Tools from trusted server domains (e.g. chukai.io) skip confirmation prompts
### Architecture & Performance
- **Updated to chuk-llm v0.16+**: Dynamic model discovery with capability-based selection, llama.cpp integration (1.53x faster), 52x faster imports
- **Updated to chuk-tool-processor v0.13+**: Now using CTP's production-grade middleware (retry, circuit breaker, rate limiting)
- **Slimmed ToolManager**: Reduced from 2000+ lines to ~800 lines by delegating to StreamManager while keeping OAuth, filtering, and LLM adaptation
### Reliability Improvements
- **Transport Failure Detection**: Automatic tracking of consecutive transport failures with warnings and recovery suggestions
- **Enhanced Tool Processing**: Improved MCP SDK ToolResult handling with proper content extraction from nested structures
- **Connection Monitoring**: Built-in health checks with automatic detection of unhealthy connections
### Bug Fixes
- **Fixed cmd mode**: `--provider` and `--model` flags now work correctly in command mode (PR #188)
- **OAuth Improvements**: Enhanced OAuth token handling and storage
- **Pydantic Migration**: Clean migration to Pydantic for better validation and type safety
## 🔄 Architecture Overview
The MCP CLI is built on a modular architecture with clean separation of concerns:
- **[CHUK Tool Processor](https://github.com/chrishayuk/chuk-tool-processor)**: Production-grade async tool execution with middleware (retry, circuit breaker, rate limiting), multiple execution strategies, and observability
- **[CHUK-LLM](https://github.com/chrishayuk/chuk-llm)**: Unified LLM provider with dynamic model discovery, capability-based selection, and llama.cpp integration (1.53x faster than Ollama with automatic model reuse)
- **[CHUK-Term](https://github.com/chrishayuk/chuk-term)**: Enhanced terminal UI with themes, cross-platform terminal management, and rich formatting
- **MCP CLI**: Command orchestration and integration layer (this project)
## 🌟 Features
### Multiple Operational Modes
- **Chat Mode**: Conversational interface with streaming responses and automated tool usage (default: Ollama/gpt-oss)
- **Interactive Mode**: Command-driven shell interface for direct server operations
- **Command Mode**: Unix-friendly mode for scriptable automation and pipelines
- **Direct Commands**: Run individual commands without entering interactive mode
### Advanced Chat Interface
- **Streaming Responses**: Real-time response generation with live UI updates
- **Reasoning Visibility**: See AI's thinking process with reasoning models (gpt-oss, GPT-5, Claude 4.5)
- **Concurrent Tool Execution**: Execute multiple tools simultaneously while preserving conversation order
- **Smart Interruption**: Interrupt streaming responses or tool execution with Ctrl+C
- **Performance Metrics**: Response timing, words/second, and execution statistics
- **Rich Formatting**: Markdown rendering, syntax highlighting, and progress indicators
- **Token Usage Tracking**: Per-turn and cumulative API token usage with `/usage` command
- **Session Persistence**: Auto-save and manual save/load of conversation sessions
- **Conversation Export**: Export to Markdown or JSON with metadata and token usage
### Comprehensive Provider Support
MCP CLI supports all providers and models from CHUK-LLM, including cutting-edge reasoning models:
| Provider | Key Models | Special Features |
|----------|------------|------------------|
| **Ollama** (Default) | 🧠 gpt-oss, llama3.3, llama3.2, qwen3, qwen2.5-coder, deepseek-coder, granite3.3, mistral, gemma3, phi3, codellama | Local reasoning models, privacy-focused, no API key required |
| **OpenAI** | 🚀 GPT-5 family (gpt-5, gpt-5-mini, gpt-5-nano), GPT-4o family, O3 series (o3, o3-mini) | Advanced reasoning, function calling, vision |
| **Anthropic** | 🧠 Claude 4.5 family (claude-4-5-opus, claude-4-5-sonnet), Claude 3.5 Sonnet | Enhanced reasoning, long context |
| **Azure OpenAI** 🏢 | Enterprise GPT-5, GPT-4 models | Private endpoints, compliance, audit logs |
| **Google Gemini** | Gemini 2.0 Flash, Gemini 1.5 Pro | Multimodal, fast inference |
| **Groq** ⚡ | Llama 3.1 models, Mixtral | Ultra-fast inference (500+ tokens/sec) |
| **Perplexity** 🌐 | Sonar models | Real-time web search with citations |
| **IBM watsonx** 🏢 | Granite, Llama models | Enterprise compliance |
| **Mistral AI** 🇪🇺 | Mistral Large, Medium | European, efficient models |
### Robust Tool System (Powered by CHUK Tool Processor v0.13+)
- **Automatic Discovery**: Server-provided tools are automatically detected and catalogued
- **Provider Adaptation**: Tool names are automatically sanitized for provider compatibility
- **Production-Grade Execution**: Middleware layers with timeouts, retries, exponential backoff, caching, and circuit breakers
- **Multiple Execution Strategies**: In-process (fast), isolated subprocess (safe), or remote via MCP
- **Concurrent Execution**: Multiple tools can run simultaneously with proper coordination
- **Rich Progress Display**: Real-time progress indicators and execution timing
- **Tool History**: Complete audit trail of all tool executions
- **Middleware**: Retry with exponential backoff, circuit breakers, and rate limiting via CTP
- **Streaming Tool Calls**: Support for tools that return streaming data
### Advanced Configuration Management
- **Environment Integration**: API keys and settings via environment variables
- **File-based Config**: YAML and JSON configuration files
- **User Preferences**: Persistent settings for active providers and models
- **Validation & Diagnostics**: Built-in provider health checks and configuration validation
### Enhanced User Experience
- **Cross-Platform Support**: Windows, macOS, and Linux with platform-specific optimizations via chuk-term
- **Rich Console Output**: Powered by chuk-term with 8 built-in themes (default, dark, light, minimal, terminal, monokai, dracula, solarized)
- **Advanced Terminal Management**: Cross-platform terminal operations including clearing, resizing, color detection, and cursor control
- **Interactive UI Components**: User input handling through chuk-term's prompt system (ask, confirm, select_from_list, select_multiple)
- **Command Completion**: Context-aware tab completion for all interfaces
- **Comprehensive Help**: Detailed help system with examples and usage patterns
- **Graceful Error Handling**: User-friendly error messages with troubleshooting hints
## 📚 Documentation
Comprehensive documentation is available in the `docs/` directory:
### Core Documentation
- **[Commands System](docs/COMMANDS.md)** - Complete guide to the unified command system, patterns, and usage across all modes
- **[Token Management](docs/TOKEN_MANAGEMENT.md)** - Comprehensive token management for providers and servers including OAuth, bearer tokens, and API keys
### Specialized Documentation
- **[OAuth Authentication](docs/OAUTH.md)** - OAuth flows, storage backends, and MCP server integration
- **[Streaming Integration](docs/STREAMING.md)** - Real-time response streaming architecture
- **[Package Management](docs/PACKAGE_MANAGEMENT.md)** - Dependency organization and feature groups
### UI Documentation
- **[Themes](docs/ui/themes.md)** - Theme system and customization
- **[Output System](docs/ui/output.md)** - Rich console output and formatting
- **[Terminal Management](docs/ui/terminal.md)** - Cross-platform terminal operations
### Testing Documentation
- **[Unit Testing](docs/testing/UNIT_TESTING.md)** - Test structure and patterns
- **[Test Coverage](docs/testing/TEST_COVERAGE.md)** - Coverage requirements and reporting
## 📋 Prerequisites
- **Python 3.11 or higher**
- **For Local Operation (Default)**:
  - Ollama: Install from [ollama.ai](https://ollama.ai)
  - Pull the default reasoning model: `ollama pull gpt-oss`
- **For Cloud Providers** (Optional):
  - OpenAI: `OPENAI_API_KEY` environment variable (for GPT-5, GPT-4, O3 models)
  - Anthropic: `ANTHROPIC_API_KEY` environment variable (for Claude 4.5, Claude 3.5)
  - Azure: `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` (for enterprise GPT-5)
  - Google: `GEMINI_API_KEY` (for Gemini models)
  - Groq: `GROQ_API_KEY` (for fast Llama models)
  - Custom providers: Provider-specific configuration
- **MCP Servers**: Server configuration file (default: `server_config.json`)
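Before installing, you can quickly verify the local prerequisites; the `curl` probe assumes Ollama's default endpoint on port 11434:
```bash
python3 --version                      # Needs Python 3.11 or higher
ollama list                            # Confirms Ollama is installed and shows pulled models
curl http://localhost:11434/api/tags   # Verifies the Ollama service is reachable
```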
## 🚀 Installation
### Quick Start with Ollama (Default)
1. **Install Ollama** (if not already installed):
```bash
# macOS/Linux
curl -fsSL https://ollama.ai/install.sh | sh
# Or visit https://ollama.ai for other installation methods
```
2. **Pull the default reasoning model**:
```bash
ollama pull gpt-oss # Open-source reasoning model with thinking visibility
```
3. **Install and run MCP CLI**:
```bash
# Using uvx (recommended)
uvx mcp-cli --help
# Or install from source
git clone https://github.com/chrishayuk/mcp-cli
cd mcp-cli
pip install -e "."
mcp-cli --help
```
### Using Different Models
```bash
# === LOCAL MODELS (No API Key Required) ===
# Use default reasoning model (gpt-oss)
mcp-cli --server sqlite
# Use other Ollama models
mcp-cli --model llama3.3 # Latest Llama
mcp-cli --model qwen2.5-coder # Coding-focused
mcp-cli --model deepseek-coder # Another coding model
mcp-cli --model granite3.3 # IBM Granite
# === CLOUD PROVIDERS (API Keys Required) ===
# GPT-5 Family (requires OpenAI API key)
mcp-cli --provider openai --model gpt-5 # Full GPT-5 with reasoning
mcp-cli --provider openai --model gpt-5-mini # Efficient GPT-5 variant
mcp-cli --provider openai --model gpt-5-nano # Ultra-lightweight GPT-5
# GPT-4 Family
mcp-cli --provider openai --model gpt-4o # GPT-4 Optimized
mcp-cli --provider openai --model gpt-4o-mini # Smaller GPT-4
# O3 Reasoning Models
mcp-cli --provider openai --model o3 # O3 reasoning
mcp-cli --provider openai --model o3-mini # Efficient O3
# Claude 4.5 Family (requires Anthropic API key)
mcp-cli --provider anthropic --model claude-4-5-opus # Most advanced Claude
mcp-cli --provider anthropic --model claude-4-5-sonnet # Balanced Claude 4.5
mcp-cli --provider anthropic --model claude-3-5-sonnet # Claude 3.5
# Enterprise Azure (requires Azure configuration)
mcp-cli --provider azure_openai --model gpt-5 # Enterprise GPT-5
# Other Providers
mcp-cli --provider gemini --model gemini-2.0-flash # Google Gemini
mcp-cli --provider groq --model llama-3.1-70b # Fast Llama via Groq
```
## 🧰 Global Configuration
### Default Configuration
MCP CLI defaults to:
- **Provider**: `ollama` (local, no API key required)
- **Model**: `gpt-oss` (open-source reasoning model with thinking visibility)
### Command-line Arguments
Global options available for all modes and commands:
- `--server`: Specify server(s) to connect to (comma-separated)
- `--config-file`: Path to server configuration file (default: `server_config.json`)
- `--provider`: LLM provider (default: `ollama`)
- `--model`: Specific model to use (default: `gpt-oss` for Ollama)
- `--disable-filesystem`: Disable filesystem access (default: enabled)
- `--api-base`: Override API endpoint URL
- `--api-key`: Override API key (not needed for Ollama)
- `--token-backend`: Override token storage backend (`auto`, `keychain`, `windows`, `secretservice`, `encrypted`, `vault`)
- `--verbose`: Enable detailed logging
- `--quiet`: Suppress non-essential output
### Environment Variables
```bash
# Override defaults
export LLM_PROVIDER=ollama # Default provider (already the default)
export LLM_MODEL=gpt-oss # Default model (already the default)
# For cloud providers (optional)
export OPENAI_API_KEY=sk-... # For GPT-5, GPT-4, O3 models
export ANTHROPIC_API_KEY=sk-ant-... # For Claude 4.5, Claude 3.5
export AZURE_OPENAI_API_KEY=sk-... # For enterprise GPT-5
export AZURE_OPENAI_ENDPOINT=https://...
export GEMINI_API_KEY=... # For Gemini models
export GROQ_API_KEY=... # For Groq fast inference
# Tool configuration
export MCP_TOOL_TIMEOUT=120 # Tool execution timeout (seconds)
```
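Environment variables can also be scoped to a single invocation, which keeps keys out of your shell profile; a small sketch using the variables and flags documented above:
```bash
# One-off cloud run without exporting the key globally
OPENAI_API_KEY=sk-your-key mcp-cli --provider openai --model gpt-5 --server sqlite

# Raise the tool timeout for one long-running batch job
MCP_TOOL_TIMEOUT=300 mcp-cli cmd --server sqlite --tool list_tables
```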
## 🌐 Available Modes
### 1. Chat Mode (Default)
Provides a natural language interface with streaming responses and automatic tool usage:
```bash
# Default mode with Ollama/gpt-oss reasoning model (no API key needed)
mcp-cli --server sqlite
# See the AI's thinking process with reasoning models
mcp-cli --server sqlite --model gpt-oss # Open-source reasoning
mcp-cli --server sqlite --provider openai --model gpt-5 # GPT-5 reasoning
mcp-cli --server sqlite --provider anthropic --model claude-4-5-opus # Claude 4.5 reasoning
# Use different local models
mcp-cli --server sqlite --model llama3.3
mcp-cli --server sqlite --model qwen2.5-coder
# Switch to cloud providers (requires API keys)
mcp-cli chat --server sqlite --provider openai --model gpt-5
mcp-cli chat --server sqlite --provider anthropic --model claude-4-5-sonnet
```
### 2. Interactive Mode
Command-driven shell interface for direct server operations:
```bash
mcp-cli interactive --server sqlite
# With specific models
mcp-cli interactive --server sqlite --model gpt-oss # Local reasoning
mcp-cli interactive --server sqlite --provider openai --model gpt-5 # Cloud GPT-5
```
### 3. Command Mode
Unix-friendly interface for automation and scripting:
```bash
# Process text with reasoning models
mcp-cli cmd --server sqlite --model gpt-oss --prompt "Think through this step by step" --input data.txt
# Use GPT-5 for complex reasoning
mcp-cli cmd --server sqlite --provider openai --model gpt-5 --prompt "Analyze this data" --input data.txt
# Execute tools directly
mcp-cli cmd --server sqlite --tool list_tables --output tables.json
# Pipeline-friendly processing
echo "SELECT * FROM users LIMIT 5" | mcp-cli cmd --server sqlite --tool read_query --input -
```
### 4. Direct Commands
Execute individual commands without entering interactive mode:
```bash
# List available tools
mcp-cli tools --server sqlite
# Show provider configuration
mcp-cli provider list
# Show available models for current provider
mcp-cli models
# Show models for specific provider
mcp-cli models openai # Shows GPT-5, GPT-4, O3 models
mcp-cli models anthropic # Shows Claude 4.5, Claude 3.5 models
mcp-cli models ollama # Shows gpt-oss, llama3.3, etc.
# Ping servers
mcp-cli ping --server sqlite
# List resources
mcp-cli resources --server sqlite
# UI Theme Management
mcp-cli theme # Show current theme and list available
mcp-cli theme dark # Switch to dark theme
mcp-cli theme --select # Interactive theme selector
mcp-cli theme --list # List all available themes
# Token Storage Management
mcp-cli token backends # Show available storage backends
mcp-cli --token-backend encrypted token list # Use specific backend
```
## 🤖 Using Chat Mode
Chat mode provides the most advanced interface with streaming responses and intelligent tool usage.
### Starting Chat Mode
```bash
# Simple startup with default reasoning model (gpt-oss)
mcp-cli --server sqlite
# Multiple servers
mcp-cli --server sqlite,filesystem
# With advanced reasoning models
mcp-cli --server sqlite --provider openai --model gpt-5
mcp-cli --server sqlite --provider anthropic --model claude-4-5-opus
```
### Chat Commands (Slash Commands)
#### Provider & Model Management
```bash
/provider # Show current configuration (default: ollama)
/provider list # List all providers
/provider config # Show detailed configuration
/provider diagnostic # Test provider connectivity
/provider set ollama api_base http://localhost:11434 # Configure Ollama endpoint
/provider openai # Switch to OpenAI (requires API key)
/provider anthropic # Switch to Anthropic (requires API key)
/provider openai gpt-5 # Switch to OpenAI GPT-5
# Custom Provider Management
/provider custom # List custom providers
/provider add localai http://localhost:8080/v1 gpt-4 # Add custom provider
/provider remove localai # Remove custom provider
/model # Show current model (default: gpt-oss)
/model llama3.3 # Switch to different Ollama model
/model gpt-5 # Switch to GPT-5 (if using OpenAI)
/model claude-4-5-opus # Switch to Claude 4.5 (if using Anthropic)
/models # List available models for current provider
```
#### Tool Management
```bash
/tools # List available tools
/tools --all # Show detailed tool information
/tools --raw # Show raw JSON definitions
/tools call # Interactive tool execution
/toolhistory # Show tool execution history
/th -n 5 # Last 5 tool calls
/th 3 # Details for call #3
/th --json # Full history as JSON
```
#### Server Management (Runtime Configuration)
```bash
/server # List all configured servers
/server list # List servers (alias)
/server list all # Include disabled servers
# Add servers at runtime (persists in ~/.mcp-cli/preferences.json)
/server add <name> stdio <command> [args...]
/server add sqlite stdio uvx mcp-server-sqlite --db-path test.db
/server add playwright stdio npx @playwright/mcp@latest
/server add time stdio uvx mcp-server-time
/server add fs stdio npx @modelcontextprotocol/server-filesystem /path/to/dir
# HTTP/SSE server examples with authentication
/server add github --transport http --header "Authorization: Bearer ghp_token" -- https://api.github.com/mcp
/server add myapi --transport http --env API_KEY=secret -- https://api.example.com/mcp
/server add events --transport sse -- https://events.example.com/sse
# Manage server state
/server enable <name> # Enable a disabled server
/server disable <name> # Disable without removing
/server remove <name> # Remove user-added server
/server ping <name> # Test server connectivity
# Server details
/server <name> # Show server configuration details
```
**Note**: Servers added via `/server add` are stored in `~/.mcp-cli/preferences.json` and persist across sessions. Project servers remain in `server_config.json`.
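Because user-added servers live in a plain JSON preferences file, you can inspect them directly; the exact schema is internal and may change between releases:
```bash
cat ~/.mcp-cli/preferences.json   # User-added servers are persisted here
```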
#### Conversation Management
```bash
/conversation # Show conversation history
/ch -n 10 # Last 10 messages
/ch 5 # Details for message #5
/ch --json # Full history as JSON
/save conversation.json # Save conversation to file
/compact # Summarize conversation
/clear # Clear conversation history
/cls # Clear screen only
```
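#### Token Usage, Sessions & Export
The token-tracking, session-persistence, and export features introduced in v0.12 are also available as slash commands (see Recent Updates above); the bare commands are the documented entry points:
```bash
/usage      # Per-turn and cumulative token usage (aliases: /tokens, /cost)
/sessions   # Save/load/list conversation sessions (auto-save every 10 turns)
/export     # Export the conversation as Markdown or JSON with metadata
```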
#### UI Customization
```bash
/theme # Interactive theme selector with preview
/theme dark # Switch to dark theme
/theme monokai # Switch to monokai theme
# Available themes: default, dark, light, minimal, terminal, monokai, dracula, solarized
# Themes are persisted across sessions
```
#### Token Management
```bash
/token # List all stored tokens
/token list # List all tokens explicitly
/token set <name> # Store a bearer token
/token get <name> # Get token details
/token delete <name> # Delete a token
/token clear # Clear all tokens (with confirmation)
/token backends # Show available storage backends
# Examples
/token set my-api # Prompts for token value (secure)
/token get notion --oauth # Get OAuth token for Notion server
/token list --api-keys # List only provider API keys
```
**Token Storage Backends:**
MCP CLI supports multiple secure token storage backends:
- **Keychain** (macOS) - Uses macOS Keychain (default on macOS)
- **Windows Credential Manager** - Native Windows storage (default on Windows)
- **Secret Service** - Linux desktop keyring (GNOME/KDE)
- **Encrypted File** - AES-256 encrypted local files (cross-platform fallback)
- **HashiCorp Vault** - Enterprise secret management
Override the default backend with `--token-backend`:
```bash
# Use encrypted file storage instead of keychain
mcp-cli --token-backend encrypted token list
# Use vault for enterprise environments
mcp-cli --token-backend vault token list
```
See [Token Management Guide](docs/TOKEN_MANAGEMENT.md) for comprehensive documentation.
#### Session Control
```bash
/verbose # Toggle verbose/compact display (Default: Enabled)
/confirm # Toggle tool call confirmation (Default: Enabled)
/interrupt # Stop running operations
/server # Manage MCP servers (see Server Management above)
/help # Show all commands
/help tools # Help for specific command
/exit # Exit chat mode
```
**For complete command documentation**, see [Commands System Guide](docs/COMMANDS.md).
### Chat Features
#### Streaming Responses with Reasoning Visibility
- **🧠 Reasoning Models**: See the AI's thinking process with gpt-oss, GPT-5, and Claude 4.5
- **Real-time Generation**: Watch text appear token by token
- **Performance Metrics**: Words/second, response time
- **Graceful Interruption**: Ctrl+C to stop streaming
- **Progressive Rendering**: Markdown formatted as it streams
#### Tool Execution
- Automatic tool discovery and usage
- Concurrent execution with progress indicators
- Verbose and compact display modes
- Complete execution history and timing
#### Provider Integration
- Seamless switching between providers
- Model-specific optimizations
- API key and endpoint management
- Health monitoring and diagnostics
## 🖥️ Using Interactive Mode
Interactive mode provides a command shell for direct server interaction.
### Starting Interactive Mode
```bash
mcp-cli interactive --server sqlite
```
### Interactive Commands
```bash
help # Show available commands
exit # Exit interactive mode
clear # Clear terminal
# Provider management
provider # Show current provider
provider list # List providers
provider anthropic # Switch provider
provider openai gpt-5 # Switch to GPT-5
# Model management
model # Show current model
model gpt-oss # Switch to reasoning model
model claude-4-5-opus # Switch to Claude 4.5
models # List available models
# Tool operations
tools # List tools
tools --all # Detailed tool info
tools call # Interactive tool execution
# Server operations
servers # List servers
ping # Ping all servers
resources # List resources
prompts # List prompts
```
## 📄 Using Command Mode
Command mode provides Unix-friendly automation capabilities.
### Command Mode Options
```bash
--input FILE # Input file (- for stdin)
--output FILE # Output file (- for stdout)
--prompt TEXT # Prompt template
--tool TOOL # Execute specific tool
--tool-args JSON # Tool arguments as JSON
--system-prompt TEXT # Custom system prompt
--raw # Raw output without formatting
--single-turn # Disable multi-turn conversation
--max-turns N # Maximum conversation turns
```
### Examples
```bash
# Text processing with reasoning models
echo "Analyze this data" | mcp-cli cmd --server sqlite --model gpt-oss --input - --output analysis.txt
# Use GPT-5 for complex analysis
mcp-cli cmd --server sqlite --provider openai --model gpt-5 --prompt "Provide strategic analysis" --input report.txt
# Tool execution
mcp-cli cmd --server sqlite --tool list_tables --raw
# Complex queries
mcp-cli cmd --server sqlite --tool read_query --tool-args '{"query": "SELECT COUNT(*) FROM users"}'
# Batch processing with GNU Parallel
ls *.txt | parallel mcp-cli cmd --server sqlite --input {} --output {}.summary --prompt "Summarize: {{input}}"
```
## 🔧 Provider Configuration
### Ollama Configuration (Default)
Ollama runs locally by default on `http://localhost:11434`. MCP CLI v0.11.1+ with CHUK-LLM v0.16+ includes **llama.cpp integration** that automatically discovers and reuses Ollama's downloaded models for 1.53x faster inference (311 vs 204 tokens/sec) without re-downloading.
To use reasoning and other models:
```bash
# Pull reasoning and other models for Ollama
ollama pull gpt-oss # Default reasoning model
ollama pull llama3.3 # Latest Llama
ollama pull llama3.2 # Llama 3.2
ollama pull qwen3 # Qwen 3
ollama pull qwen2.5-coder # Coding-focused
ollama pull deepseek-coder # DeepSeek coder
ollama pull granite3.3 # IBM Granite
ollama pull mistral # Mistral
ollama pull gemma3 # Google Gemma
ollama pull phi3 # Microsoft Phi
ollama pull codellama # Code Llama
# List available Ollama models
ollama list
# Configure remote Ollama server
mcp-cli provider set ollama api_base http://remote-server:11434
```
### Cloud Provider Configuration
To use cloud providers with advanced models, configure API keys:
```bash
# Configure OpenAI (for GPT-5, GPT-4, O3 models)
mcp-cli provider set openai api_key sk-your-key-here
# Configure Anthropic (for Claude 4.5, Claude 3.5)
mcp-cli provider set anthropic api_key sk-ant-your-key-here
# Configure Azure OpenAI (for enterprise GPT-5)
mcp-cli provider set azure_openai api_key sk-your-key-here
mcp-cli provider set azure_openai api_base https://your-resource.openai.azure.com
# Configure other providers
mcp-cli provider set gemini api_key your-gemini-key
mcp-cli provider set groq api_key your-groq-key
# Test configuration
mcp-cli provider diagnostic openai
mcp-cli provider diagnostic anthropic
```
### Custom OpenAI-Compatible Providers
MCP CLI supports adding custom OpenAI-compatible providers (LocalAI, custom proxies, etc.):
```bash
# Add a custom provider (persisted across sessions)
mcp-cli provider add localai http://localhost:8080/v1 gpt-4 gpt-3.5-turbo
mcp-cli provider add myproxy https://proxy.example.com/v1 custom-model-1 custom-model-2
# Set API key via environment variable (never stored in config)
export LOCALAI_API_KEY=your-api-key
export MYPROXY_API_KEY=your-api-key
# List custom providers
mcp-cli provider custom
# Use custom provider
mcp-cli --provider localai --server sqlite
mcp-cli --provider myproxy --model custom-model-1 --server sqlite
# Remove custom provider
mcp-cli provider remove localai
# Runtime provider (session-only, not persisted)
mcp-cli --provider temp-ai --api-base https://api.temp.com/v1 --api-key test-key --server sqlite
```
**Security Note**: API keys can be stored securely in OS-native keychains (macOS Keychain, Windows Credential Manager, Linux Secret Service) or HashiCorp Vault using the token management system. Alternatively, use environment variables following the pattern `{PROVIDER_NAME}_API_KEY` or pass via `--api-key` for session-only use. See [Token Management](docs/TOKEN_MANAGEMENT.md) for details.
### Manual Configuration
The `chuk_llm` library reads its configuration from `~/.chuk_llm/config.yaml`:
```yaml
ollama:
  api_base: http://localhost:11434
  default_model: gpt-oss
openai:
  api_base: https://api.openai.com/v1
  default_model: gpt-5
anthropic:
  api_base: https://api.anthropic.com
  default_model: claude-4-5-opus
azure_openai:
  api_base: https://your-resource.openai.azure.com
  default_model: gpt-5
gemini:
  api_base: https://generativelanguage.googleapis.com
  default_model: gemini-2.0-flash
groq:
  api_base: https://api.groq.com
  default_model: llama-3.1-70b
```
API keys can be provided via:
1. **Secure token storage** (recommended) - Stored in OS keychain/Vault, see [Token Management](docs/TOKEN_MANAGEMENT.md)
2. **Environment variables** - Export in your shell or add to `~/.chuk_llm/.env`:
```bash
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
AZURE_OPENAI_API_KEY=sk-your-azure-key-here
GEMINI_API_KEY=your-gemini-key
GROQ_API_KEY=your-groq-key
```
3. **Command-line** - Pass `--api-key` for session-only use (not persisted)
## 📂 Server Configuration
MCP CLI supports two types of server configurations:
1. **Project Servers** (`server_config.json`): Shared project-level configurations
2. **User Servers** (`~/.mcp-cli/preferences.json`): Personal runtime-added servers that persist across sessions
### Configuration File Discovery
MCP CLI searches for `server_config.json` in the following priority order:
1. **Explicit path** via `--config-file` option:
```bash
mcp-cli --config-file /path/to/custom-config.json
```
2. **Current directory** - Automatically detected when running from a project directory:
```bash
cd /path/to/my-project
mcp-cli --server sqlite # Uses ./server_config.json if it exists
```
3. **Bundled default** - When running via `uvx` or from any directory without a local config:
```bash
uvx mcp-cli --server cloudflare_workers # Uses packaged server_config.json
```
This means you can:
- **Override per-project**: Place a `server_config.json` in your project directory with project-specific server configurations
- **Use defaults globally**: Run `uvx mcp-cli` from anywhere and get the bundled default servers
- **Customize explicitly**: Use `--config-file` to specify any configuration file location
### Bundled Default Servers
MCP CLI v0.11.1+ comes with an expanded set of pre-configured servers in the bundled `server_config.json`:
| Server | Type | Description | Configuration |
|--------|------|-------------|---------------|
| **sqlite** | STDIO | SQLite database operations | `uvx mcp-server-sqlite --db-path test.db` |
| **echo** | STDIO | Echo server for testing | `uvx chuk-mcp-echo stdio` |
| **math** | STDIO | Mathematical computations | `uvx chuk-mcp-math-server` |
| **playwright** | STDIO | Browser automation | `npx @playwright/mcp@latest` |
| **brave_search** | STDIO | Web search via Brave API | Requires `BRAVE_API_KEY` token |
| **notion** | HTTP | Notion workspace integration | `https://mcp.notion.com/mcp` (OAuth) |
| **cloudflare_workers** | HTTP | Cloudflare Workers bindings | `https://bindings.mcp.cloudflare.com/mcp` (OAuth) |
| **monday** | HTTP | Monday.com integration | `https://mcp.monday.com/mcp` (OAuth) |
| **linkedin** | HTTP | LinkedIn integration | `https://linkedin.chukai.io/mcp` |
| **weather** | HTTP | Weather data service | `https://weather.chukai.io/mcp` |
**Note**: HTTP servers and API-based servers require authentication. Use the [Token Management](docs/TOKEN_MANAGEMENT.md) system to configure access tokens.
To use these servers:
```bash
# Use bundled servers from anywhere
uvx mcp-cli --server sqlite
uvx mcp-cli --server echo
uvx mcp-cli --server math
uvx mcp-cli --server playwright
# API-based servers require tokens
mcp-cli token set brave_search --type bearer
uvx mcp-cli --server brave_search
# HTTP/OAuth servers require OAuth authentication
uvx mcp-cli token set notion --oauth
uvx mcp-cli --server notion
# Use multiple servers simultaneously
uvx mcp-cli --server sqlite,math,playwright
```
### Project Configuration
Create a `server_config.json` file with your MCP server configurations:
```json
{
"mcpServers": {
"sqlite": {
"command": "python",
"args": ["-m", "mcp_server.sqlite_server"],
"env": {
"DATABASE_PATH": "database.db"
}
},
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/files"],
"env": {}
},
"brave-search": {
"command": "npx",
"args": ["-y", "@brave/brave-search-mcp-server"],
"env": {
"BRAVE_API_KEY": "${TOKEN:bearer:brave_search}"
}
},
"notion": {
"url": "https://mcp.notion.com/mcp",
"headers": {
"Authorization": "Bearer ${TOKEN:bearer:notion}"
}
}
}
}
```
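Before connecting, it can help to confirm the file parses as valid JSON; `python -m json.tool` from the standard library works, though any JSON validator will do:
```bash
python3 -m json.tool server_config.json   # Prints the parsed config, or an error with the offending line
```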
### Secure Token Replacement
MCP CLI supports automatic token replacement from secure storage using the `${TOKEN:namespace:name}` syntax:
**Syntax**: `${TOKEN:<namespace>:<token-name>}`
**Examples**:
```json
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": ["-y", "@brave/brave-search-mcp-server"],
"env": {
"BRAVE_API_KEY": "${TOKEN:bearer:brave_search}"
}
},
"api-server": {
"url": "https://api.example.com/mcp",
"headers": {
"Authorization": "Bearer ${TOKEN:bearer:my_api}",
"X-API-Key": "${TOKEN:api-key:my_service}"
}
}
}
}
```
**Token Storage**:
```bash
# Store tokens securely (never in config files!)
mcp-cli token set brave_search --type bearer
# Enter token value when prompted (hidden input)
mcp-cli token set my_api --type bearer --value "your-token-here"
# Tokens are stored in OS-native secure storage:
# - macOS: Keychain
# - Windows: Credential Manager
# - Linux: Secret Service (GNOME Keyring/KWallet)
```
**Supported Locations**:
- `env`: Environment variables for STDIO servers
- `headers`: HTTP headers for HTTP/SSE servers
**Namespaces**:
- `bearer`: Bearer tokens (default for `--type bearer`)
- `api-key`: API keys (default for `--type api-key`)
- `oauth`: OAuth tokens (automatic)
- `generic`: Custom tokens
**Benefits**:
- ✅ Never store API keys in config files
- ✅ Share `server_config.json` safely (no secrets)
- ✅ Tokens encrypted in OS-native secure storage
- ✅ Works across all transport types (STDIO, HTTP, SSE)
See [Token Management Guide](docs/TOKEN_MANAGEMENT.md) for complete documentation.
### Runtime Server Management
Add servers dynamically during runtime without editing configuration files:
```bash
# Add STDIO servers (most common)
mcp-cli
> /server add sqlite stdio uvx mcp-server-sqlite --db-path mydata.db
> /server add playwright stdio npx @playwright/mcp@latest
> /server add time stdio uvx mcp-server-time
# Add HTTP servers with authentication
> /server add github --transport http --header "Authorization: Bearer ghp_token" -- https://api.github.com/mcp
> /server add myapi --transport http --env API_KEY=secret -- https://api.example.com/mcp
# Add SSE (Server-Sent Events) servers
> /server add events --transport sse -- https://events.example.com/sse
# Manage servers
> /server list # Show all servers
> /server disable sqlite # Temporarily disable
> /server enable sqlite # Re-enable
> /server remove myapi # Remove user-added server
```
**Key Points:**
- User-added servers persist in `~/.mcp-cli/preferences.json`
- Survive application restarts
- Can be enabled/disabled without removal
- Support STDIO, HTTP, and SSE transports
- Environment variables and headers for authentication
## 📈 Advanced Usage Examples
### Reasoning Model Comparison
```bash
# Compare reasoning across different models
> /provider ollama
> /model gpt-oss
> Think through this problem step by step: If a train leaves New York at 3 PM...
[See the complete thinking process with gpt-oss]
> /provider openai
> /model gpt-5
> Think through this problem step by step: If a train leaves New York at 3 PM...
[See GPT-5's reasoning approach]
> /provider anthropic
> /model claude-4-5-opus
> Think through this problem step by step: If a train leaves New York at 3 PM...
[See Claude 4.5's analytical process]
```
### Local-First Workflow with Reasoning
```bash
# Start with default Ollama/gpt-oss (no API key needed)
mcp-cli chat --server sqlite
# Use reasoning model for complex problems
> Think through this database optimization problem step by step
[gpt-oss shows its complete thinking process before answering]
# Try different local models for different tasks
> /model llama3.3 # General purpose
> /model qwen2.5-coder # For coding tasks
> /model deepseek-coder # Alternative coding model
> /model granite3.3 # IBM's model
> /model gpt-oss # Back to reasoning model
# Switch to cloud when needed (requires API keys)
> /provider openai
> /model gpt-5
> Complex enterprise architecture design...
> /provider anthropic
> /model claude-4-5-opus
> Detailed strategic analysis...
> /provider ollama
> /model gpt-oss
> Continue with local processing...
```
### Multi-Provider Workflow
```bash
# Start with local reasoning (default, no API key)
mcp-cli chat --server sqlite
# Compare responses across providers
> /provider ollama
> What's the best way to optimize this SQL query?
> /provider openai gpt-5 # Requires API key
> What's the best way to optimize this SQL query?
> /provider anthropic claude-4-5-sonnet # Requires API key
> What's the best way to optimize this SQL query?
# Use each provider's strengths
> /provider ollama gpt-oss # Local reasoning, privacy
> /provider openai gpt-5 # Advanced reasoning
> /provider anthropic claude-4-5-opus # Deep analysis
> /provider groq llama-3.1-70b # Ultra-fast responses
```
### Complex Tool Workflows with Reasoning
```bash
# Use reasoning model for complex database tasks
> /model gpt-oss
> I need to analyze our database performance. Think through what we should check first.
[gpt-oss shows thinking: "First, I should check the table structure, then indexes, then query patterns..."]
[Tool: list_tables] → products, customers, orders
> Now analyze the indexes and suggest optimizations
[gpt-oss thinks through index analysis]
[Tool: describe_table] → Shows current indexes
[Tool: read_query] → Analyzes query patterns
> Create an optimization plan based on your analysis
[Complete reasoning process followed by specific recommendations]
```
### Automation and Scripting
```bash
# Batch processing with different models
for file in data/*.csv; do
# Use reasoning model for analysis
mcp-cli cmd --server sqlite \
--model gpt-oss \
--prompt "Analyze this data and think through patterns" \
--input "$file" \
--output "analysis/$(basename "$file" .csv)_reasoning.txt"
# Use coding model for generating scripts
mcp-cli cmd --server sqlite \
--model qwen2.5-coder \
--prompt "Generate Python code to process this data" \
--input "$file" \
--output "scripts/$(basename "$file" .csv)_script.py"
done
# Pipeline with reasoning
cat complex_problem.txt | \
mcp-cli cmd --model gpt-oss --prompt "Think through this step by step" --input - | \
mcp-cli cmd --model llama3.3 --prompt "Summarize the key points" --input - > solution.txt
```
### Performance Monitoring
```bash
# Check provider and model performance
> /provider diagnostic
Provider Diagnostics
Provider | Status | Response Time | Features | Models
ollama | ✅ Ready | 56ms | 📡🔧 | gpt-oss, llama3.3, qwen3, ...
openai | ✅ Ready | 234ms | 📡🔧👁️ | gpt-5, gpt-4o, o3, ...
anthropic | ✅ Ready | 187ms | 📡🔧 | claude-4-5-opus, claude-4-5-sonnet, ...
azure_openai | ✅ Ready | 198ms | 📡🔧👁️ | gpt-5, gpt-4o, ...
gemini | ✅ Ready | 156ms | 📡🔧👁️ | gemini-2.0-flash, ...
groq | ✅ Ready | 45ms | 📡🔧 | llama-3.1-70b, ...
# Check available models
> /models
Models for ollama (Current Provider)
Model | Status
gpt-oss | Current & Default (Reasoning)
llama3.3 | Available
llama3.2 | Available
qwen2.5-coder | Available
deepseek-coder | Available
granite3.3 | Available
... and 6 more
# Monitor tool execution with reasoning
> /verbose
> /model gpt-oss
> Analyze the database and optimize the slowest queries
[Shows complete thinking process]
[Tool execution with timing]
```
## 🔍 Troubleshooting
### Common Issues
1. **Ollama not running** (default provider):
```bash
# Start Ollama service
ollama serve
# Or check if it's running
curl http://localhost:11434/api/tags
```
2. **Model not found**:
```bash
# For Ollama (default), pull the model first
ollama pull gpt-oss # Reasoning model
ollama pull llama3.3 # Latest Llama
ollama pull qwen2.5-coder # Coding model
# List available models
ollama list
# For cloud providers, check supported models
mcp-cli models openai # Shows GPT-5, GPT-4, O3 models
mcp-cli models anthropic # Shows Claude 4.5, Claude 3.5 models
```
3. **Provider not found or API key missing**:
```bash
# Check available providers
mcp-cli provider list
# For cloud providers, set API keys
mcp-cli provider set openai api_key sk-your-key
mcp-cli provider set anthropic api_key sk-ant-your-key
# Test connection
mcp-cli provider diagnostic openai
```
4. **Connection issues with Ollama**:
```bash
# Check Ollama is running
ollama list
# Test connection
mcp-cli provider diagnostic ollama
# Configure custom endpoint if needed
mcp-cli provider set ollama api_base http://localhost:11434
```
### Debug Mode
Enable verbose logging for troubleshooting:
```bash
mcp-cli --verbose chat --server sqlite
mcp-cli --log-level DEBUG interactive --server sqlite
```
## 🔒 Security Considerations
### Privacy & Local-First
- **Local by Default**: Ollama with gpt-oss runs locally, keeping your data private
- **No Cloud Required**: Full functionality without external API dependencies
### Token & Authentication Security
- **Secure Token Storage**: Tokens stored in OS-native credential stores (macOS Keychain, Windows Credential Manager, Linux Secret Service) under the "mcp-cli" service identifier
- **Multiple Storage Backends**: Choose between keychain, encrypted files, or HashiCorp Vault based on security requirements
- **API Keys**: Only needed for cloud providers (OpenAI, Anthropic, etc.), stored securely using token management system
- **OAuth 2.0 Support**: Secure authentication for MCP servers using PKCE and resource indicators (RFC 7636, RFC 8707)
### Execution Security
- **Tool Validation**: All tool calls are validated before execution
- **Timeout Protection**: Configurable timeouts prevent hanging operations (v0.13+)
- **Circuit Breakers**: Automatic failure detection and recovery to prevent cascading failures (v0.13+)
- **Server Isolation**: Each server runs in its own process
- **File Access**: Filesystem access can be disabled with `--disable-filesystem`
- **Transport Monitoring**: Automatic detection of connection failures with warnings (v0.11+)
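These controls compose; a sketch of a locked-down local session using the global flags documented earlier:
```bash
# Local model, no filesystem access, tokens in AES-256 encrypted file storage
mcp-cli --disable-filesystem --token-backend encrypted --server sqlite
```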
## 🚀 Performance Features
### LLM Provider Performance (v0.16+)
- **52x Faster Imports**: Reduced from 735ms to 14ms through lazy loading
- **112x Faster Client Creation**: Automatic thread-safe caching
- **llama.cpp Integration**: 1.53x faster inference (311 vs 204 tokens/sec) with automatic Ollama model reuse
- **Dynamic Model Discovery**: Zero overhead capability-based model selection
### Tool Execution Performance (v0.13+)
- **Production Middleware**: Timeouts, retries with exponential backoff, circuit breakers, and result caching
- **Concurrent Tool Execution**: Multiple tools can run simultaneously with proper coordination
- **Connection Health Monitoring**: Automatic detection and recovery from transport failures
- **Optimized Tool Manager**: Reduced from 2000+ to ~800 lines while maintaining all functionality
### Runtime Performance
- **Local Processing**: Default Ollama provider minimizes latency
- **Reasoning Visibility**: See the AI's thinking process with gpt-oss, GPT-5, and Claude 4.5
- **Streaming Responses**: Real-time response generation
- **Connection Pooling**: Efficient reuse of client connections
- **Caching**: Tool metadata and provider configurations are cached
- **Async Architecture**: Non-blocking operations throughout
## 📦 Dependencies
Core dependencies are organized into feature groups:
- **cli**: Terminal UI and command framework (Rich, Typer, chuk-term)
- **dev**: Development tools, testing utilities, linting
- **chuk-tool-processor v0.13+**: Production-grade tool execution with middleware, multiple execution strategies, and observability
- **chuk-llm v0.16+**: Unified LLM provider with dynamic model discovery, capability-based selection, and llama.cpp integration; delivers 52x faster imports and 112x faster client creation
- **chuk-term**: Enhanced terminal UI with themes, prompts, and cross-platform support
Install with specific features:
```bash
pip install "mcp-cli[cli]" # Basic CLI features
pip install "mcp-cli[cli,dev]" # CLI with development tools
```
## 🤝 Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
### Development Setup
```bash
git clone https://github.com/chrishayuk/mcp-cli
cd mcp-cli
pip install -e ".[cli,dev]"
pre-commit install
```
### Demo Scripts
Explore the capabilities of MCP CLI:
```bash
# Command Mode Demos
# General cmd mode features (bash)
bash examples/cmd_mode_demo.sh
# LLM integration with cmd mode (bash)
bash examples/cmd_mode_llm_demo.sh
# Python integration example
uv run examples/cmd_mode_python_demo.py
# Custom Provider Management Demos
# Interactive walkthrough demo (educational)
uv run examples/custom_provider_demo.py
# Working demo with actual inference (requires OPENAI_API_KEY)
uv run examples/custom_provider_working_demo.py
# Simple shell script demo (requires OPENAI_API_KEY)
bash examples/custom_provider_simple_demo.sh
# Terminal management features (chuk-term)
uv run examples/ui_terminal_demo.py
# Output system with themes (chuk-term)
uv run examples/ui_output_demo.py
# Streaming UI capabilities (chuk-term)
uv run examples/ui_streaming_demo.py
```
### Running Tests
```bash
pytest
pytest --cov=mcp_cli --cov-report=html
```
## 📜 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- **[CHUK Tool Processor](https://github.com/chrishayuk/chuk-tool-processor)** - Production-grade async tool execution with middleware and observability
- **[CHUK-LLM](https://github.com/chrishayuk/chuk-llm)** - Unified LLM provider with dynamic model discovery, llama.cpp integration, and GPT-5/Claude 4.5 support (v0.16+)
- **[CHUK-Term](https://github.com/chrishayuk/chuk-term)** - Enhanced terminal UI with themes and cross-platform support
- **[Rich](https://github.com/Textualize/rich)** - Beautiful terminal formatting
- **[Typer](https://typer.tiangolo.com/)** - CLI framework
- **[Prompt Toolkit](https://github.com/prompt-toolkit/python-prompt-toolkit)** - Interactive input
## 🔗 Related Projects
- **[Model Context Protocol](https://modelcontextprotocol.io/)** - Core protocol specification
- **[MCP Servers](https://github.com/modelcontextprotocol/servers)** - Official MCP server implementations
- **[CHUK Tool Processor](https://github.com/chrishayuk/chuk-tool-processor)** - Production-grade tool execution with middleware and observability
- **[CHUK-LLM](https://github.com/chrishayuk/chuk-llm)** - LLM provider abstraction with dynamic model discovery, GPT-5, Claude 4.5, O3 series support, and llama.cpp integration
- **[CHUK-Term](https://github.com/chrishayuk/chuk-term)** - Terminal UI library with themes and cross-platform support