<p align="center">
<img src="assets/images/memento_mcp_logo_transparent.png" width="400" alt="Memento MCP Logo">
</p>
<p align="center">
<a href="https://github.com/JinHo-von-Choi/memento-mcp/releases">
<img src="https://img.shields.io/github/v/release/JinHo-von-Choi/memento-mcp?style=flat&label=release&color=4c8bf5" alt="GitHub Release" />
</a>
<a href="https://github.com/JinHo-von-Choi/memento-mcp/stargazers">
<img src="https://img.shields.io/github/stars/JinHo-von-Choi/memento-mcp?style=flat&color=f5c542" alt="GitHub Stars" />
</a>
<a href="LICENSE">
<img src="https://img.shields.io/badge/license-Apache%202.0-blue?style=flat" alt="License" />
</a>
<a href="https://lobehub.com/mcp/jinho-von-choi-memento-mcp">
<img src="https://lobehub.com/badge/mcp/jinho-von-choi-memento-mcp" alt="MCP Badge" />
</a>
</p>
<p align="center">
<a href="README.en.md">📖 English Documentation</a>
</p>
# Memento MCP
> Memento MCP is an AI long-term memory server based on the Model Context Protocol (MCP). It retains important facts, decisions, error patterns, and procedures even after a session ends and restores them in the next session.
Memento MCP is more than a memory library. As feedback accumulates, connections strengthen; repeated experiences are abstracted into patterns; and as sessions continue, episodes connect into a narrative. The goal is an AI that grows from experience, not one that merely remembers.
## 30-second Experience
Here is the flow of having the AI remember something and retrieving it in the next session:
```
[Session 1]
User: "Our project uses PostgreSQL 15 and tests with Vitest"
→ AI calls remember → 2 fragments stored
[Session 2 — Next day]
→ AI calls context → "Use PostgreSQL 15", "Vitest test" automatically restored
User: "How do I test it?"
→ AI calls recall → "Execute test with Vitest" fragment returned
→ AI: "This project uses Vitest. Run it with npx vitest."
```
You don't need to repeat the same explanation every session.
## Installation
Requirements: Node.js 20+ and PostgreSQL 14+ with the pgvector extension.
```bash
cp .env.example.minimal .env
# Edit the values in .env, then export them into the current shell
export $(grep -v '^#' .env | grep '=' | xargs)
npm install
npm run migrate
node server.js
```
To use local embeddings without the OpenAI API, add `EMBEDDING_PROVIDER=transformers` to `.env`. The `Xenova/multilingual-e5-small` model is downloaded automatically on startup. Use this mode only in a freshly migrated DB: its vectors have a different dimension from OpenAI embeddings, so the two must not be mixed in the same database.
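The dimension mismatch can be illustrated with a small sketch. Everything here is hypothetical and not part of Memento: `cosineSimilarity` is an illustrative helper, and the 1536-element vector merely stands in for an OpenAI embedding (the real size depends on the OpenAI model in use). The 384 figure is the default for `Xenova/multilingual-e5-small` stated in this README.

```javascript
// Hypothetical helper: cosine similarity is only defined for equal-length vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new RangeError(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const localVector = new Array(384).fill(0.1);   // sized like a local e5-small embedding
const storedVector = new Array(1536).fill(0.1); // illustrative size for a stored OpenAI embedding

let mismatchError = null;
try {
  cosineSimilarity(localVector, storedVector);
} catch (err) {
  mismatchError = err;
}
console.log(mismatchError.message); // dimension mismatch: 384 vs 1536
```

Vectors of different sizes cannot be compared at all, which is why switching providers calls for a fresh database rather than a mixed one.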
After the server is up, verify the operation with [First Memory Flow](docs/getting-started/first-memory-flow.md).
Refer to the [Compatible Platforms](#compatible-platforms) table for other platform settings.
### Update
```bash
cd ~/memento-mcp
git pull origin main
npm install
npm run migrate
# Restart service (depending on environment such as systemd / pm2 / docker)
```
- `npm run migrate` picks up the DB settings from `.env` automatically, so there is no need to specify `DATABASE_URL` manually.
- The pgvector schema is detected automatically. Setting `PGVECTOR_SCHEMA` is usually unnecessary.
### Claude Code Integration
Register the server with the `claude mcp add` CLI:
```bash
claude mcp add memento http://localhost:57332/mcp \
--transport http \
--scope user \
--header "Authorization: Bearer YOUR_ACCESS_KEY"
```
The registration is stored in `~/.claude.json`. To confirm it:
```bash
claude mcp list
# memento: http://localhost:57332/mcp (HTTP) - ✓ Connected
```
To share the configuration per project, define it in `.mcp.json` at the repository root. See [Claude Code Configuration](docs/getting-started/claude-code.md) for detailed settings.
### Supported Environments
| Environment | Recommendation | Getting Started |
| --- | --- | --- |
| Linux / macOS | Recommended | [Quick Start](docs/getting-started/quickstart.md) |
| Windows + WSL2 | Most Recommended | [Windows WSL2 Setup](docs/getting-started/windows-wsl2.md) |
| Windows + PowerShell | Limited Support | [Windows PowerShell Setup](docs/getting-started/windows-powershell.md) |
## Compatible Platforms
Memento is a standard MCP (Model Context Protocol) server. It works with any AI platform that supports MCP, not just Claude Code.
| Platform | Setting Location | Connection Method |
| --- | --- | --- |
| Claude Code | `claude mcp add` CLI (`~/.claude.json`) or `.mcp.json` | Streamable HTTP |
| Claude Desktop | claude_desktop_config.json | Streamable HTTP |
| Claude.ai Web | Settings > Integrations | OAuth (RFC 7591) |
| Cursor | .cursor/mcp.json | Streamable HTTP |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | Streamable HTTP |
| GitHub Copilot | VS Code MCP Marketplace | Streamable HTTP |
| Codex CLI | ~/.codex/config.toml | Streamable HTTP |
| ChatGPT Desktop | Developer Mode > Apps | OAuth (RFC 7591) |
| Continue | config.json | Streamable HTTP |
Common settings: Server URL `http://localhost:57332/mcp`, Authorization header with `Bearer YOUR_ACCESS_KEY`.
The Claude.ai Web and ChatGPT integrations use OAuth. Enter the issued API key (`mmcp_xxx`) as the `client_id` to connect without Dynamic Client Registration (RFC 7591). Redirect URIs on trusted domains (claude.ai, chatgpt.com) are approved automatically.
See [Integration Guides](docs/getting-started/) for platform-specific settings.
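For clients not listed above, the common settings translate into a plain JSON-RPC request over HTTP. The following is a minimal sketch, not Memento's own client code: `buildToolCall` is a hypothetical helper, the `query` argument name is an assumption (see SKILL.md for the actual tool schemas), and a real client would first perform the MCP `initialize` handshake before calling tools.

```javascript
// Builds (but does not send) a fetch()-style request for an MCP "tools/call".
function buildToolCall(serverUrl, accessKey, toolName, args, id = 1) {
  return {
    url: serverUrl,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Streamable HTTP clients accept both plain JSON and SSE responses.
        "Accept": "application/json, text/event-stream",
        "Authorization": `Bearer ${accessKey}`,
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: toolName, arguments: args },
      }),
    },
  };
}

const request = buildToolCall(
  "http://localhost:57332/mcp",
  "YOUR_ACCESS_KEY",
  "recall",
  { query: "test framework" } // argument name is an assumption
);
console.log(request.init.headers.Authorization); // Bearer YOUR_ACCESS_KEY
console.log(request.init.body);
```

The returned object can be passed to `fetch(request.url, request.init)` once a session has been established.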
## 7 Fragment Types
| Type | Description | Use Case |
| --- | --- | --- |
| `fact` | Fact | Objective information such as settings, paths, and versions |
| `decision` | Decision | Architecture selection, technology stack decisions, and rationale |
| `error` | Error | Occurred errors, causes, and solutions |
| `preference` | Preference | User style, coding rules, and work methods |
| `procedure` | Procedure | Repeatable steps such as deployment, build, and testing |
| `relation` | Relation | Connections between entities, dependencies, and ownership relationships |
| `episode` | Episode | Narrative memories that include context (up to 1000 characters; other types are limited to 300) |
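A client can check a fragment against the table above before sending it. This is a sketch under stated assumptions: `validateFragment` is a hypothetical helper, the `type` and `content` field names are illustrative, and length is counted with JavaScript's `String.length` (UTF-16 code units), which may differ from how the server counts characters.

```javascript
const FRAGMENT_TYPES = [
  "fact", "decision", "error", "preference", "procedure", "relation", "episode",
];
const EPISODE_MAX_CHARS = 1000; // limit for the episode type
const DEFAULT_MAX_CHARS = 300;  // limit for all other types

// Hypothetical pre-flight check; the server remains the source of truth.
function validateFragment({ type, content }) {
  if (!FRAGMENT_TYPES.includes(type)) {
    return { ok: false, reason: `unknown type: ${type}` };
  }
  const limit = type === "episode" ? EPISODE_MAX_CHARS : DEFAULT_MAX_CHARS;
  if (content.length > limit) {
    return { ok: false, reason: `${type} exceeds ${limit} characters` };
  }
  return { ok: true, reason: null };
}

console.log(validateFragment({ type: "fact", content: "Uses PostgreSQL 15" }));
console.log(validateFragment({ type: "fact", content: "x".repeat(301) }));
console.log(validateFragment({ type: "episode", content: "x".repeat(301) }));
```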
## Core Features
| Feature | Description |
| --- | --- |
| `remember` | Store important information as atomic fragments |
| `recall` | Return only necessary memories with keyword + semantic 3-layer search |
| `context` | Automatically restore core context at session start |
| Automatic organization | Duplicate merging, contradiction detection, importance attenuation, TTL-based forgetting |
| **Link reconsolidation** | Feedback from `tool_feedback` is reflected in real time in the weight/confidence of `fragment_links` (ReconsolidationEngine). Contradictory links are automatically quarantined. |
| **Spreading activation** | Passing `contextText` to `recall` proactively boosts the activation_score of related fragments so that contextually relevant results are prioritized (SpreadingActivation). |
| **Episode continuity** | Automatically generates `preceded_by` edges between episode fragments created after `reflect`, preserving the flow of experience as a graph (EpisodeContinuityService). |
| Management console | Memory exploration, knowledge graph, statistics dashboard, API key group/state filter, daily-limit inline editing |
| OAuth integration | RFC 7591 Dynamic Client Registration, Claude.ai / ChatGPT Web integration support |
| Workspace isolation | Memories are separated by project, job, and client even within the same key. Automatically tagged with `api_keys.default_workspace` and filtered during search. |
| Batch processing | `batch_remember` uses multi-row single INSERT (256KB or 500-row chunk). `reflect` delegates 5 categories to a single batch call. EmbeddingWorker processes a queue bundle with a single generateBatchEmbeddings + multi-row UPDATE. |
| Consistency Gate | Tracks morpheme indexing completion with `fragments.morpheme_indexed` column. Incomplete fragments are automatically excluded from L3 morpheme search paths. |
| Mode preset | `recall-only` / `write-only` / `onboarding` / `audit` JSON preset. Limit tool exposure range with `X-Memento-Mode` header or `api_keys.default_mode`. |
| Affective tagging | `fragments.affect` column (neutral / frustration / confidence / surprise / doubt / satisfaction). Filter by emotional label during `remember` / `recall`. |
| Local embedding | Use `@huggingface/transformers` pipeline-based embedding (Xenova/multilingual-e5-small, 384d by default) without external API by setting `EMBEDDING_PROVIDER=transformers`. |
See [SKILL.md](SKILL.md) for the full MCP tool list.
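The chunking rule in the batch processing row (256KB or 500 rows per INSERT) can be sketched as follows. This is not Memento's implementation: `chunkRows` is a hypothetical helper, and measuring row size as the UTF-8 byte length of its JSON form is an assumption made for illustration.

```javascript
const MAX_ROWS = 500;
const MAX_BYTES = 256 * 1024;

// Splits rows into chunks, closing a chunk when adding the next row would
// exceed either the row limit or the byte limit.
function chunkRows(rows, maxRows = MAX_ROWS, maxBytes = MAX_BYTES) {
  const chunks = [];
  let current = [];
  let currentBytes = 0;
  for (const row of rows) {
    const rowBytes = Buffer.byteLength(JSON.stringify(row), "utf8");
    const wouldOverflow =
      current.length + 1 > maxRows || currentBytes + rowBytes > maxBytes;
    if (current.length > 0 && wouldOverflow) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    // A single oversized row still gets a chunk of its own.
    current.push(row);
    currentBytes += rowBytes;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

const smallRows = Array.from({ length: 1200 }, () => ({ content: "x" }));
console.log(chunkRows(smallRows).map((c) => c.length)); // [ 500, 500, 200 ]

const largeRows = Array.from({ length: 3 }, () => ({ content: "x".repeat(100 * 1024) }));
console.log(chunkRows(largeRows).map((c) => c.length)); // [ 2, 1 ]
```

Each chunk would then map to one multi-row INSERT statement.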
## CLI
You can operate a remote MCP server directly, without running a local node. Specify the target with the global `--remote URL --key KEY` flags or the `MEMENTO_CLI_REMOTE` / `MEMENTO_CLI_KEY` environment variables.
```bash
# Recall from remote server (environment variable method)
MEMENTO_CLI_REMOTE=https://example.com/mcp MEMENTO_CLI_KEY=mmcp_xxx memento-mcp recall "query"
# Recall from remote server (flag method)
memento-mcp recall "query" --remote https://example.com/mcp --key mmcp_xxx
# Table format output, 5 results
memento-mcp recall "query" --format table --limit 5
# Prevent duplicate storage with idempotency key
memento-mcp remember "content" --topic project_name --idempotency-key k1
```
Select the output format with `--format table|json|csv`; 11 subcommands support `--help`/`-h`. See [docs/cli.md](docs/cli.md) for detailed flags.
## API Response Meta
`recall` / `context` responses include a `_meta: { searchEventId, hints, suggestion }` field.
```json
{
"fragments": [...],
"_meta": {
"searchEventId": "evt-abc123",
"hints": { "signal": "consider_context" },
"suggestion": { "code": "large_limit_no_budget", "message": "..." }
}
}
```
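A client might read these fields defensively, as in the sketch below. `summarizeMeta` is a hypothetical helper, and treating every `_meta` member as optional is an assumption; the sample object mirrors the response shape shown above.

```javascript
// Extracts the parts of _meta a client is likely to act on, tolerating absent fields.
function summarizeMeta(response) {
  const meta = response._meta ?? {};
  return {
    searchEventId: meta.searchEventId ?? null,
    signal: meta.hints?.signal ?? null,
    suggestionCode: meta.suggestion?.code ?? null,
  };
}

const sampleResponse = {
  fragments: [],
  _meta: {
    searchEventId: "evt-abc123",
    hints: { signal: "consider_context" },
    suggestion: { code: "large_limit_no_budget", message: "..." },
  },
};

console.log(summarizeMeta(sampleResponse));
console.log(summarizeMeta({ fragments: [] })); // all fields null
```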
- `remember` / `link` / `forget` / `amend` accept a `dryRun: true` parameter and return the expected result without side effects.
- All responses include the `X-RateLimit-Limit` / `X-RateLimit-Remaining` / `X-RateLimit-Resource` headers. The headers are omitted when the master key is used or when the limit is set to null.
- `recall` accepts a `fields` array that restricts the returned fields to a whitelist of 17.
- `remember` / `batchRemember` accept an `idempotencyKey` parameter (max 128 characters) that prevents duplicate storage within the same key_id scope.
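One way to stay within the 128-character `idempotencyKey` limit is to derive the key from a hash of the content. This is a sketch, not a documented convention: `makeIdempotencyKey` and the `remember-` prefix are hypothetical, and hashing topic plus content is only one possible choice.

```javascript
const { createHash } = require("node:crypto");

const MAX_IDEMPOTENCY_KEY_LENGTH = 128;

// Derives a deterministic key: the same topic and content always map to the same key.
function makeIdempotencyKey(topic, content) {
  const digest = createHash("sha256")
    .update(`${topic}\n${content}`, "utf8")
    .digest("hex"); // 64 hex characters
  const key = `remember-${digest}`; // hypothetical prefix; 9 + 64 = 73 characters
  if (key.length > MAX_IDEMPOTENCY_KEY_LENGTH) {
    throw new RangeError("idempotency key exceeds 128 characters");
  }
  return key;
}

const keyA = makeIdempotencyKey("project_name", "Tests run with Vitest");
const keyB = makeIdempotencyKey("project_name", "Tests run with Vitest");
console.log(keyA === keyB); // true
console.log(keyA.length);   // 73
```

Because the key is deterministic, retrying the same `remember` call after a network failure reuses the same key instead of creating a second fragment.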
## Security
- RBAC default-deny: Tool names missing from the `TOOL_PERMISSIONS` map are denied immediately, regardless of the caller's permissions.
- Tenant isolation: `forget` / `amend` / `link` / `fragment_history` enforce a SQL-level `key_id` condition, so fragments of other tenants are inaccessible. "Not found" and "no permission" return the same message to avoid revealing whether a fragment exists.
- injectSessionContext: Internal fields sent by the client, such as `_keyId` / `_permissions`, are overwritten with the server's authentication results to prevent session context forgery.
- Admin rate limit: IP-based rate limiting applies to `/auth`, `/keys` (POST), and `/import` (POST).
- OpenAPI: `GET /openapi.json` endpoint (enabled with `ENABLE_OPENAPI=true`). With the master key it returns every path; with an API key it returns a spec filtered by that key's permissions.
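The default-deny and context reinjection rules above can be sketched in a few lines. This illustrates the ideas only and is not the server's code: the entries in `TOOL_PERMISSIONS`, the `read`/`write` permission names, and the shape of `authResult` are all assumptions.

```javascript
// Hypothetical permission map; the real TOOL_PERMISSIONS contents are not shown in this README.
const TOOL_PERMISSIONS = new Map([
  ["recall", "read"],
  ["remember", "write"],
  ["forget", "write"],
]);

// Default-deny: a tool that is not in the map is rejected no matter what the caller holds.
function isToolAllowed(toolName, grantedPermissions) {
  const required = TOOL_PERMISSIONS.get(toolName);
  if (required === undefined) return false;
  return grantedPermissions.includes(required);
}

// Internal fields are always taken from the server-side authentication result,
// overwriting anything the client tried to send.
function injectSessionContext(clientArgs, authResult) {
  return {
    ...clientArgs,
    _keyId: authResult.keyId,
    _permissions: authResult.permissions,
  };
}

console.log(isToolAllowed("unknown_tool", ["read", "write"])); // false
console.log(isToolAllowed("recall", ["read"]));                // true

const forged = { query: "secrets", _keyId: "someone-else", _permissions: ["write"] };
const safe = injectSessionContext(forged, { keyId: "key-1", permissions: ["read"] });
console.log(safe._keyId); // key-1
```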
## Symbolic Verification Layer
An optional layer that provides explainability, advisory link-integrity checks, polarity conflict detection, and soft gating by policy rules. It consists of 9 core modules and 5 rule files. All of its flags are disabled by default.
## Smart Recall
- ProactiveRecall: on `remember()`, automatically links similar fragments based on keyword overlap.
- CaseRewardBackprop: automatically backpropagates importance to evidence fragments when a case verification event occurs.
- SearchParamAdaptor: automatically tunes search thresholds based on usage patterns.
- CBR (Case-Based Reasoning): `recall(caseMode=true)` searches for similar cases so that past solution patterns can be reused.
- Depth filter: controls search depth by Planner/Executor role (`"high-level"` / `"detail"` / `"tool-level"`).
- `key_id` in recall responses: returned fragments include the owning tenant's identifier.
- Reconsolidation: updates `fragment_links` weights/confidence in real time based on `tool_feedback` (`ENABLE_RECONSOLIDATION=true`).
- Spreading Activation: `recall(contextText=...)` pre-activates related fragments based on the conversation context (`ENABLE_SPREADING_ACTIVATION=true`).
`fragments.id` has the format `frag-{16-character hex}`. It is not a UUID, so take care when generating or parsing IDs externally.
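External code that handles fragment IDs can check the format with a regular expression. In this sketch `isFragmentId` is a hypothetical helper, and accepting only lowercase hex digits is an assumption.

```javascript
// Matches "frag-" followed by exactly 16 hex digits (lowercase assumed).
const FRAGMENT_ID_PATTERN = /^frag-[0-9a-f]{16}$/;

function isFragmentId(value) {
  return typeof value === "string" && FRAGMENT_ID_PATTERN.test(value);
}

console.log(isFragmentId("frag-0123456789abcdef"));                // true
console.log(isFragmentId("123e4567-e89b-12d3-a456-426614174000")); // false (UUID)
console.log(isFragmentId("frag-0123"));                            // false (too short)
```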
The `/metrics` endpoint exposes metrics in a Prometheus-compatible format. Collection and visualization are left to your own tooling.
## Recall vs. Rules
Recall fragments injected by Memento have lower priority than the system prompt. Factual memories such as "We use PostgreSQL 15" work well, but behavioral rules such as "Always use the Given-When-Then pattern when writing tests" may be ignored when they conflict with the system prompt.
Set behavioral rules in high-priority channels such as CLAUDE.md, AGENTS.md, hooks, and skills instead.
## Benchmark
Results on the 500 questions of [LongMemEval-S](https://arxiv.org/abs/2407.15460):
| Metric | Score | Comparison |
|------|------|------|
| Search recall@5 | 88.3% | +8 to 18 pp vs. Stella 1.5B in the LongMemEval paper |
| QA accuracy | 45.4% | With temporal metadata (baseline 40.4%) |
| Fragment throughput | 89,006 fragments in 27 seconds | Entire pipeline including ingestion, embedding, and search |
Search achieves over 80% recall in 5 out of 6 question types. However, there's a significant gap between search recall (88.3%) and QA accuracy (45.4%), mainly due to limitations in the reader stage, especially in multi-session and temporal reasoning.
See the [Benchmark Report](docs/benchmark.md) for detailed analysis.
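The headline figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers from the table above; the variable names are illustrative.

```javascript
const recallAt5 = 88.3;   // search recall@5, percent
const qaAccuracy = 45.4;  // QA accuracy with temporal metadata, percent
const qaBaseline = 40.4;  // stated QA baseline, percent
const fragments = 89006;
const seconds = 27;

// Gap between what retrieval finds and what the reader answers correctly.
const retrievalToQaGap = (recallAt5 - qaAccuracy).toFixed(1);
// QA accuracy relative to the stated baseline.
const gainOverBaseline = (qaAccuracy - qaBaseline).toFixed(1);
// End-to-end pipeline throughput.
const fragmentsPerSecond = Math.round(fragments / seconds);

console.log(`${retrievalToQaGap} pp`);           // 42.9 pp
console.log(`${gainOverBaseline} pp`);           // 5.0 pp
console.log(`${fragmentsPerSecond} fragments/s`); // 3297 fragments/s
```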
## Usage Patterns
Memento is optimized for fact caching. For scenarios where context is crucial:
- Store narratives as `episode` fragments so the context can be reconstructed.
- Store a `contextSummary` alongside each fragment so the context comes back at recall time.
- Pair Memento with a primary memory system (MEMORY.md, etc.) in a dual structure that covers both fact search and context restoration.
## Who Can Benefit
- Developers using AI agents like Claude Code, Cursor, or Windsurf daily.
- Those tired of repeating the same explanations in every session.
- Users who want AI to remember their project context.
## Further Reading
| Document | Content |
|------|------|
| [Quick Start](docs/getting-started/quickstart.md) | Detailed installation guide |
| [Architecture](docs/architecture.md) | System structure, DB schema, 3-tier search, TTL |
| [Configuration](docs/configuration.md) | Environment variables, MEMORY_CONFIG, embedding providers |
| [API Reference](docs/api-reference.md) | HTTP endpoints, prompts, resources |
| [CLI](docs/cli.md) | 9 terminal commands |
| [Internals](docs/internals.md) | Evaluators, integrators, contradiction detection |
| [Benchmark](docs/benchmark.md) | Detailed analysis of LongMemEval-S benchmark |
| [SKILL.md](SKILL.md) | MCP tool reference |
| [INSTALL.md](docs/INSTALL.md) | Migration, hook setup, detailed installation |
| [CHANGELOG](CHANGELOG.md) | Version-by-version changes |
## Operations
- `/health`: Checks DB, Redis, pgvector, and worker status together. Returns a degraded response on partial failure.
- Rate limiting: 100 requests/minute per API key and 30 requests/minute per IP, adjustable via environment variables.
- Worker recovery: Embedding and evaluation workers retry automatically with exponential backoff (1s→60s).
- Graceful shutdown: Waits for in-flight workers to finish (30 seconds), then auto-reflects sessions.
- OAuth endpoint: Returns a `WWW-Authenticate` header so clients can start the authentication flow automatically. Session TTL defaults to 240 minutes.
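The worker retry schedule (1s→60s) can be pictured with a short sketch. Only the 1-second start and 60-second cap come from this README; the doubling factor, the absence of jitter, and the `backoffDelayMs` name are assumptions.

```javascript
const BASE_DELAY_MS = 1000;  // first retry after 1 second
const MAX_DELAY_MS = 60000;  // delays are capped at 60 seconds

// attempt is 0-based: 0 is the first retry.
function backoffDelayMs(attempt) {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

const schedule = Array.from({ length: 8 }, (_, attempt) => backoffDelayMs(attempt));
console.log(schedule);
// [ 1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000 ]
```

Under these assumptions the cap is reached on the seventh retry, after which every further attempt waits the full 60 seconds.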
## Known Limitations
- The L1 Redis cache supports isolation by API key only. Multi-agent environments need to rely on L2/L3 isolation.
- Automatic quality evaluation applies only to the decision, preference, and relation types. The fact, procedure, and error types are excluded.
- If `MEMENTO_ACCESS_KEY` is not set, authentication is disabled. Always set it in externally exposed environments.
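A deployment script could guard against the last limitation before starting the server. This is a hypothetical pre-flight check, not something Memento ships: `checkAuthConfig` and the `bindHost` argument are illustrative names, and treating loopback addresses as safe is an assumption.

```javascript
// Refuses an unauthenticated setup unless the server is bound to a loopback address.
function checkAuthConfig(env, bindHost) {
  const hasKey =
    typeof env.MEMENTO_ACCESS_KEY === "string" && env.MEMENTO_ACCESS_KEY.length > 0;
  const isLoopback = ["127.0.0.1", "::1", "localhost"].includes(bindHost);

  if (hasKey) return { ok: true, message: "authentication enabled" };
  if (isLoopback) return { ok: true, message: "authentication disabled (loopback only)" };
  return {
    ok: false,
    message: "MEMENTO_ACCESS_KEY must be set when the server is reachable externally",
  };
}

console.log(checkAuthConfig({ MEMENTO_ACCESS_KEY: "secret" }, "0.0.0.0").ok); // true
console.log(checkAuthConfig({}, "127.0.0.1").ok);                             // true
console.log(checkAuthConfig({}, "0.0.0.0").ok);                               // false
```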
## Tech Stack
- Node.js 20+
- PostgreSQL 14+ (with pgvector extension)
- Redis 6+ (optional)
- OpenAI Embedding API (optional) or `EMBEDDING_PROVIDER=transformers` (local low-cost mode)
- Gemini CLI / Codex CLI / GitHub Copilot CLI (for quality evaluation, morpheme analysis, and automatic reflect; optional, with LLM_PRIMARY / LLM_FALLBACKS)
- @huggingface/transformers + ONNX Runtime (for NLI contradiction classification and local embedding; CPU-only)
- MCP Protocol 2025-11-25
Core functionality works with PostgreSQL only. Adding Redis enables L1 cascade search and SessionActivityTracker. Adding OpenAI API or `EMBEDDING_PROVIDER=transformers` enables L3 semantic search and automatic linking.
## Motivation
<details>
<summary>Expand/Collapse</summary>
In practical AI usage, I experienced inefficiencies from repeatedly explaining the same context daily. System prompts had limitations, and note-taking became unmanageable as the number of fragments grew. Old and new information often conflicted.
The main issue was dealing with repeated explanations and settings. Authentication issues would resurface, and files would need to be opened directly to verify settings. Despite thorough explanations, the same issues would recur. Restarting sessions meant repeating the same process.
It felt like onboarding a new employee every single day, albeit one with a stellar educational background.
"Hey, do you remember me?" – Without clues, nothing comes to mind, but a single phrase like "our classmate in elementary school" brings back a flood of memories. AI works similarly. Yesterday's bug fixes, last week's decisions, and preferred coding styles are all important. Instead of resetting every session, Memento remembers.
To address this, I designed a system that breaks down memories into atomic units, searches them hierarchically, and naturally forgets over time. Like humans, this system incorporates "appropriate forgetting."
It doesn't stop there. As feedback accumulates, connections strengthen, and weak links disappear. Repeated experiences abstract into patterns. Episodes between sessions connect, and context becomes a narrative. The goal is not to build a library but to create an AI that grows through experience.
Memory isn't a prerequisite for intelligence; it's a condition. Knowing chess strategies doesn't help if you can't recall yesterday's game. Speaking multiple languages is useless if you can't remember the previous conversation. Having billions of parameters to store knowledge doesn't make a difference if you can't recall the previous interaction.
Memory enables relationships, and relationships build trust.
Memories don't disappear; they move to a cold tier. Forgotten fragments are eventually deleted during the next consolidation cycle. This is a design feature, not a bug. Unnecessary memories make room for new ones. Even St. Augustine's palace needs organization.
Even famously forgetful goldfish remember things for months.
Now, your AI can too.
</details>
## License
Apache 2.0
<p align="center">
Made by <a href="mailto:jinho.von.choi@nerdvana.kr">Jinho Choi</a> |
<a href="https://buymeacoffee.com/jinho.von.choi">Buy me a coffee</a>
</p>