All notable changes to SuperLocalMemory V3 will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- License: Changed from Elastic-2.0 to AGPL-3.0-or-later to protect research IP
P0 silent data loss fix. The async `/remember` pipeline had been broken since
v3.4.32 — memories were marked "queued" and acknowledged but never
actually persisted to `memory.db` during runtime. Only a daemon restart drained
the pending queue (limit 20 per restart). Between April 15-26, 2026, 18 memories
hit a NoneType-iterable crash and were marked permanently failed; all were
recoverable because the content was preserved in `pending.db`.
- Materializer `_engine` NameError (`unified_daemon.py`). The background pending materializer thread referenced a module-level `_engine` global that was never declared. Result: every iteration threw `NameError: name '_engine' is not defined`, the exception was caught and logged as "materializer loop error", and the thread slept 5 s and retried forever without ever processing pending memories. Bug present since v3.4.32. Fixed by declaring `_engine = None` at module level and assigning `_engine = engine` in the FastAPI lifespan after `engine.initialize()`.
- scene_builder NoneType crash (`encoding/scene_builder.py:assign_to_scene`). When the embedding worker was unavailable (cold-start timeout, crash), `embedder.embed()` returned None. The code checked `theme_emb is None` but never checked `fact_emb is None`, so `_cosine(None, theme_emb)` called `zip(None, theme_emb)` → `'NoneType' object is not iterable`, propagating up through `engine.store()` → mark_failed → permanent loss. Fixed by guarding `fact_emb is None` (skip scene assignment, still create the scene) and adding a defensive `None` check to `_cosine()` itself.
- Retry-aware mark_failed (`cli/pending_store.py`). Previously, ANY exception during materialization permanently marked the memory as failed — even transient errors like an embedding worker timeout. Now uses the existing `retry_count` column: keeps status as `pending` until 3 retries, only marks `failed` after all retries are exhausted.
- Diagnostic logging in materializer — "Materializer: waiting for engine to init...", "engine acquired, starting drain loop", "processing N pending memories" — so operators can verify the materializer is alive without grepping for absence of error messages.
- `tests/test_integration/test_async_remember_e2e.py` — full production pipeline test: POST `/remember` (async, default mode) → wait up to 60 s → verify content in `memory.db` → recall returns it. This is the test that was missing for 8+ months. The 4,501 existing test functions test components in isolation (mocking `store_pending`) and never exercise the full async flow that real users hit.
On install, if you have existing failed records in `pending.db`, they will
be auto-retried on the next daemon restart by `engine._process_pending_memories()`.
To manually recover, run:

```python
import os
import sqlite3

# expanduser is required: sqlite3 does not expand '~' itself
db = sqlite3.connect(os.path.expanduser('~/.superlocalmemory/pending.db'))
db.execute("UPDATE pending_memories SET status='pending', retry_count=0, error=NULL WHERE status='failed'")
db.commit()
```

Then `slm restart`.
P0 RAM fix. Total SLM footprint reduced from ~14 GB peak to ~2.3 GB peak (84% reduction). Idle dropped from ~2.5 GB to ~1.0 GB. Users with 16 GB laptops can now run SLM without uninstalling.
- CoreML EP allocation — Added `ORT_DISABLE_COREML=1` to `recall_worker.py`, `cli/commands.py` (warmup diagnose path), and the Popen environment dicts in `core/embeddings.py` and `retrieval/reranker.py`. Previously only `embedding_worker.py` and `reranker_worker.py` set this. On ARM64 Mac, ONNX Runtime's CoreML Execution Provider allocated 3-5 GB per missing guard.
- Duplicate MemoryEngine — The QueueConsumer (`recall_queue.db` drain) was routing through `WorkerPool` → `recall_worker` subprocess, which loaded a SECOND full MemoryEngine inside the daemon. Now routes through the daemon's in-process engine via the new `EngineRecallAdapter`. Eliminates ~800 MB of duplication.
- Eager warmup — Removed `WorkerPool.shared().warmup()` from daemon startup. The recall_worker subprocess no longer spawns at boot. It remains available as a fallback for dashboard/chat routes.
- RSS limits tightened:
  - `embedding_worker` self-kill: 4000 MB → 1800 MB
  - `recall_worker` self-kill: 2500 MB → 1500 MB
  - Daemon watchdog `MAX_WORKER_MB`: 4096 MB → 1800 MB
  - `HealthMonitor.global_rss_budget_mb`: 4096 MB → 2500 MB
- Watchdog interval: 60 s → 15 s in both the daemon watchdog and `HealthMonitor` `check_interval_sec`. Catches memory spikes faster.
- Idle timeouts: `SLM_EMBED_IDLE_TIMEOUT` 1800 s (30 min) → 300 s (5 min); `SLM_RERANKER_IDLE_TIMEOUT` 1800 s → 300 s. Reduces idle RAM held by ML model subprocesses.
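The tightened caps above reduce to a small policy check the watchdog can run each interval. A minimal sketch with illustrative names; the real watchdog reads RSS from the OS, here the readings are injected:

```python
# Hypothetical limits mirroring the changelog's tightened self-kill caps (MB).
LIMITS_MB = {"embedding_worker": 1800, "recall_worker": 1500}

def check_workers(rss_by_worker, limits=LIMITS_MB):
    """Return the workers whose RSS exceeds their self-kill cap.

    rss_by_worker: mapping of worker name -> current RSS in MB
    (in a real daemon this would come from psutil or /proc).
    Unknown workers have no cap and are never flagged.
    """
    return [name for name, rss in rss_by_worker.items()
            if rss > limits.get(name, float("inf"))]
```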
- `EngineRecallAdapter` in `unified_daemon.py` — wraps the in-process MemoryEngine to satisfy `RecallPoolProtocol` for the QueueConsumer. Eliminates the recall_worker subprocess on the hot path.
Persistent hook daemon: recall latency drops from ~2.2s to sub-second by eliminating Python subprocess startup on every prompt.
- `hooks/hook_daemon.py` — Unix domain socket server that keeps a long-lived process for recall requests. Claude Code connects via socket instead of spawning a fresh Python interpreter per prompt. Eliminates ~300-500 ms of subprocess overhead. Starts/stops with the SLM daemon.
- Auto-restart watchdog: `ensure_hook_daemon()` checks socket health and restarts the daemon if it died. Claude Code hooks call this before connecting, so a crashed daemon is transparent to the user.
- Graceful fallback: if the socket is unavailable, the hook automatically falls back to the v3.4.35 subprocess path. Claude Code performance is NEVER impacted by daemon failure.
- 9 new tests for daemon lifecycle, socket protocol, ack detection, watchdog, fallback, and memory safety.
- Ack prompts: ~5ms via socket (was 30ms via subprocess)
- Substantive recall: target sub-1s (was 2.2s p50 via subprocess)
- Hook daemon RSS: ~15-20MB (no engine, no ONNX, no PyTorch)
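As an illustration of the socket-plus-fallback shape described above, here is a minimal client sketch. The socket path, the newline-delimited JSON framing, and the fallback payload are assumptions for illustration, not the real protocol:

```python
import json
import socket

# Hypothetical path; the real daemon chooses its own socket location.
SOCKET_PATH = "/tmp/slm_hook_example.sock"

def recall_via_daemon(prompt: str, timeout: float = 2.0):
    """Try the long-lived hook daemon over a Unix socket; fall back if the
    socket is dead, mirroring the fail-open design (the hook never blocks)."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect(SOCKET_PATH)
            # Assumed framing: one newline-terminated JSON request/response.
            s.sendall(json.dumps({"prompt": prompt}).encode() + b"\n")
            return json.loads(s.makefile().readline())
    except OSError:
        # Daemon unavailable: a real hook would spawn the slower subprocess
        # path here; this sketch just signals the fallback.
        return {"fallback": True, "prompt": prompt}
```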
Production auto-recall: every Claude Code prompt automatically retrieves the top relevant memories via the unified queue, so the agent has continuous learning context without the user invoking recall manually.
- `hooks/auto_recall_hook.py` — production UserPromptSubmit handler. Reads stdin JSON from Claude Code, detects ack prompts (silent fast path), enqueues substantive prompts to `recall_queue.db`, polls for the result with mode-aware timeout (A=10s, B=25s, C=40s), and injects the top-K memories as Claude Code's `hookSpecificOutput.additionalContext` envelope. Wraps recalled content in untrusted-boundary markers so the LLM treats it as data, not instructions. Fail-open on any error.
- `core/queue_consumer.py` — daemon background thread that drains `recall_queue.db`. Claims jobs atomically, routes through `pool.recall()` (engine never loaded in MCP/hook processes), writes results back. Priority lanes (high=recall, low=consolidate). Periodic cleanup of completed rows.
- `slm hook auto_recall` CLI subcommand wires Claude Code to the hook.
- 50 new tests — `test_queue_consumer.py` (11) + `test_auto_recall_hook.py` (39). Full TDD coverage including ack detection, fencing, dedup, fail-open.
- `core/recall_queue.py` — `complete()` now wrapped in `BEGIN IMMEDIATE` for fencing-token atomicity under multi-process access. Dedup hash includes `namespace` to prevent cross-namespace result collisions.
- `server/unified_daemon.py` — starts QueueConsumer on boot, stops on shutdown.
- `hooks/hook_handlers.py` — dispatches `auto_recall` to the new hook.
- p50 recall latency: 1.75s (40-prompt integration test, Mode B)
- p99 recall latency: 11.83s
- Hook process RSS: ~20 MB (no engine loading, no memory blast)
- Ack prompts: 30 ms (silent, no recall)
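A fencing-token-guarded `complete()` of the kind described above can be sketched as follows. The table schema, column names, and return convention are illustrative, not the project's actual ones:

```python
import sqlite3

def complete(conn, job_id, token, result):
    """Write a job result only if our fencing token still matches.

    The whole check-and-write runs inside BEGIN IMMEDIATE, so a stale worker
    holding an old token cannot clobber a newer claim: the token comparison
    and the UPDATE are atomic with respect to other writers.
    """
    conn.isolation_level = None  # manage transactions explicitly
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")
    try:
        row = cur.execute(
            "SELECT fencing_token FROM recall_jobs WHERE id=?", (job_id,)
        ).fetchone()
        if row is None or row[0] != token:
            cur.execute("ROLLBACK")
            return False  # fenced out: caller logs a WARNING
        cur.execute(
            "UPDATE recall_jobs SET status='completed', result=? WHERE id=?",
            (result, job_id),
        )
        cur.execute("COMMIT")
        return True
    except Exception:
        cur.execute("ROLLBACK")
        raise
```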
Fix: user's mode choice can no longer be silently overwritten.
- Mode protection in `SLMConfig.save()`. Any `save()` call that would change the mode in `config.json` is now blocked unless the caller passes `mode_change=True`. This prevents accidental mode resets when code creates a fresh `SLMConfig()` (defaults to Mode A) and calls `save()` to persist an unrelated field change. A warning is logged when a silent mode change is blocked.
- MCP `set_mode` preserves user settings. Previously `set_mode` created a fresh `SLMConfig.for_mode()` that lost all user customizations (LLM provider, API keys, embedding config, active profile). Now carries forward all settings from the existing config, matching the dashboard behavior.
- All intentional mode-change paths (`slm mode`, MCP `set_mode`, dashboard PUT `/api/v3/mode`, setup wizard) pass `mode_change=True`.
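A minimal sketch of the save-guard pattern. This toy `SLMConfig` models only the mode field; the real class persists many more settings and its API may differ:

```python
import json
import warnings
from pathlib import Path

class SLMConfig:
    """Toy config: save() refuses to silently change the mode on disk."""

    def __init__(self, mode="A", path="config.json"):
        self.mode = mode  # fresh instances default to Mode A
        self.path = Path(path)

    def save(self, mode_change=False):
        if self.path.exists():
            on_disk = json.loads(self.path.read_text())
            if on_disk.get("mode") != self.mode and not mode_change:
                # Block the silent mode flip and keep the user's choice.
                warnings.warn("blocked silent mode change; pass mode_change=True")
                self.mode = on_disk["mode"]
        self.path.write_text(json.dumps({"mode": self.mode}))
```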
Fix: daemon leaked SQLite connections to learning.db via bandit threadlocals.
- Bandit threadlocal connection leak. `reward_proxy.settle_stale_plays` creates a `ContextualBandit` that opens a threadlocal connection via `_conn_for`. When called from `asyncio.to_thread` (`bandit_loops.py`, every 60 s), each thread-pool thread kept its connection open for the process lifetime. Over 24 h this accumulated 12+ leaked file descriptors and ~100 MB of wasted SQLite page-cache RAM. A new `bandit.close_threadlocal_conn()` function, called in the `settle_stale_plays` finally block, ensures pool threads release their connections immediately.
- Corrected embedding worker memory comment. The "~200 MB footprint" note was written for `all-MiniLM-L6-v2`; the default model `nomic-ai/nomic-embed-text-v1.5` uses ~1.1 GB via ONNX.
Fix: concurrent remembers no longer block recalls on the shared embedder.
- Daemon `/remember` is now async by default. Writes to the pending queue in under 100 ms and returns a `pending_id`; a background thread then drains the queue. Previously, the synchronous `engine.store()` on the FastAPI event loop could block `/search` and `/health` for 30+ seconds while the single embedder worker processed a large write. Under concurrent load the daemon could appear hung.
- Materializer yields to active recalls. While any `/search` is in flight the drainer sleeps between items, so user-initiated recalls always get the embedder first.
- MCP remember tool simplified. Writes to `pending.db` and returns; the daemon's materializer completes the pipeline. Removes the redundant in-process `pool.store` background task that previously contended with `/search`.
- `pool_store` returns `["pending:<id>"]` when the daemon is async, keeping a stable identifier for callers without blocking on the embedder.
- `?wait=true` query parameter on `POST /remember` for callers that need synchronous behaviour and real `fact_id`s in the response.
- `superlocalmemory.core.recall_gate` module — shared counter that lets the materializer detect in-flight recalls and yield priority.
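The recall-gate idea reduces to a shared in-flight counter. A minimal sketch with illustrative names; the real module's API may differ:

```python
import threading
from contextlib import contextmanager

_lock = threading.Lock()
_in_flight = 0  # number of /search requests currently running

@contextmanager
def recall_in_flight():
    """Wrap each recall request so the materializer can detect activity."""
    global _in_flight
    with _lock:
        _in_flight += 1
    try:
        yield
    finally:
        with _lock:
            _in_flight -= 1

def recalls_active() -> bool:
    """Materializer checks this between pending items and yields if true."""
    with _lock:
        return _in_flight > 0
```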
- No action required. Existing clients continue to work; the response shape is compatible (`ok`, `count` still present). Scripts that depended on `fact_ids` to validate the write should switch to `pending_id` or pass `?wait=true` to opt in to the legacy behaviour.
Dashboard truth, memory vs fact clarity, and self-cleaning pending queue.
- Dashboard now shows both memory counts honestly. Parent memories (what you stored) and atomic facts (what retrieval indexes) appear as two distinct cards with their ratio. No more "Total Memories: 6,000" when you actually have 2,000 memories decomposed into 6,000 facts.
- "Browse atomic facts" relabeled for clarity — this view lists the indexed atomic units.
- Visible search box in the Memories tab — previously hidden behind the Recall Lab only. Search now debounces 280 ms on input.
- `/api/memories/{id}/detail` — full memory + all child atomic facts in one call. Powers the click-to-expand modal.
- `/api/facts/{id}` — single atomic fact detail with source memory content, entities, and canonical entities.
- Pagination UI — Prev/Next controls show "Showing 1–50 of 6,123". Previously hardcoded to 50 with no navigation.
- CSV export — new `format=csv` option on `/api/export` plus a dedicated "Export All (CSV)" menu item. JSON and JSONL still work.
- Export progress toast — "Preparing JSON export…" notification before the download starts.
- `total_facts` + `facts_per_memory` in `/api/stats` response.
- Pending queue auto-cleanup — the maintenance scheduler now sweeps the pending queue every cycle: completed rows > 7 days, failed rows over the retry limit, and stuck rows > 7 days are removed; a 30-day hard cap prevents runaway growth on any status.
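A sweep of this shape can be expressed as one guarded DELETE. The sketch below uses illustrative table and column names (`pending_memories`, `created_at`, `retry_count`), not necessarily the real schema:

```python
import sqlite3
import time

WEEK = 7 * 86400
HARD_CAP = 30 * 86400   # 30-day hard cap on any status
MAX_RETRIES = 3

def sweep_pending(db_path, now=None):
    """One maintenance pass over the pending queue, per the policy above:
    old completed rows, exhausted failed rows, stuck pending rows, and
    anything older than the hard cap are removed."""
    now = now or time.time()
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success
        conn.execute(
            "DELETE FROM pending_memories WHERE "
            "(status='completed' AND created_at < ?) OR "
            "(status='failed' AND retry_count >= ?) OR "
            "(status='pending' AND created_at < ?) OR "
            "created_at < ?",
            (now - WEEK, MAX_RETRIES, now - WEEK, now - HARD_CAP),
        )
    conn.close()
```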
- Test isolation — `pending_store` now honors `SLM_DATA_DIR`. Four MCP remember tests were writing to the live `~/.superlocalmemory/` instead of `tmp_path`. Root conftest now forces `SLM_DATA_DIR=tmp_path` for every test unless explicitly opted out.
- Fact click popup — was calling `/api/v3/recall/trace` with a text substring (re-query by first 100 chars) and colliding with the memory row click handler. Now scoped to `.fact-result-item` only, and hits the new `/api/facts/{fact_id}` endpoint.
- Memory modal ID confusion — the modal labeled `mem.id` as "ID" regardless of whether it was a memory_id or fact_id. Now displays both "Memory ID" and "Fact ID" when they differ.
- Memory modal hydration — fetches the full memory + fact list asynchronously when opened, so source content and entity data appear even for rows that arrived from the search endpoint.
Multi-IDE shared worker, silent migration, and security hardening.
- Multi-IDE RAM sharing. MCP processes share a single recall worker via the daemon. Total RSS stays below 2 GB with four IDEs open.
- Feedback and learning signals flow from every IDE session to the daemon, not just the first.
- Setup wizard validates the data directory at install time and rejects iCloud, Dropbox, OneDrive, Box, Google Drive, and `Library/CloudStorage` paths that silently corrupt SQLite WAL.
- One-time upgrade banner after `pip install -U` / `npm install -g` points users to `slm doctor`.
- `docs/errors.md` — canonical error catalog with codes, recovery steps, exit codes, and HTTP status mappings.
- CI matrix now runs on `ubuntu-22.04`, `macos-14` (Apple Silicon), and `windows-latest` with `portalocker`.
- Silent, atomic data migration on upgrade — no manual steps.
- Migration serialized via file lock so parallel pip + npm installs cannot race.
- Concurrent-safe MCP engine singleton with double-checked locking.
- Pool adapter returns frozen dataclasses instead of `SimpleNamespace`.
- File permissions tightened: marker files written at 0600, parent directories at 0700.
- Symlink-following blocked on version marker reads.
- Cloud-synced directory detection extended to
Library/CloudStorage(macOS 13+).
- Silent error swallows in daemon shutdown, migration probe, and banner emission now log at WARNING.
- Fenced-out `complete()` writes (stale worker claims) emit a WARNING log instead of vanishing silently.
- Daemon-start migration guarded behind the `is_ready` sentinel — skips when already applied.
Critical hotfix on top of 3.4.22 for two end-user-facing regressions.
- Daemon error log no longer balloons. A ternary passed as the `logger.info` format string caused a `TypeError` on every startup in 24/7 mode. Python's logging module then dumped the full FastAPI `merged_lifespan` stack to stderr; over a day the LaunchAgent log grew to tens of MB. The call is now pre-formatted. A defensive log-rotation pass at startup truncates any daemon log over 10 MB so users upgrading from 3.4.22 get a clean slate on first boot.
- Dashboard no longer hangs after a daemon upgrade. Static JS/CSS/HTML was served without cache headers, so browsers served stale modules after `slm restart` and the dashboard showed an infinite spinner. All static responses now ship `Cache-Control: no-cache, must-revalidate`, and `index.html` embeds the server version; on mismatch the tab clears `localStorage` (preserving theme) and hard-reloads once.
- Fetches can no longer hang forever. A global `fetch` patch attaches a 15-second `AbortController` timeout to every relative-URL request, so a dead socket surfaces as a rejection instead of leaving a spinner spinning. No callsite changes required.
- `GET /api/version` — returns the running daemon version; consumed by the dashboard version-fingerprint auto-reload.
Hardening release — correctness, stability, and security fixes.
- `slm benchmark` plus escape-hatch commands (`disable`, `enable`, `clear-cache`, `reconfigure`).
- One-time upgrade banner on first boot after install.
- Tighter defaults for the interactive installer.
- Licence: AGPL-3.0-or-later.
- Node.js prerequisite: ≥ 18.
- Hardened redaction, path validation, and token handling per internal audit. No end-user-visible behaviour change.
- Fully backward compatible.
- `atomic_facts` is never modified by any migration. All upgrades are additive.
- Recall cold-start eliminated. Embedding + reranker workers stay warm for 30 minutes by default instead of 2 minutes, so bursts of recalls no longer pay a 30-60 second model-load tax on every other query.
- `SLM_EMBED_IDLE_TIMEOUT` — seconds to keep the embedding worker warm (default 1800). Set to 120 to restore pre-v3.4.19 behavior.
- `SLM_RERANKER_IDLE_TIMEOUT` — same, for the cross-encoder reranker (default 1800).
- pip and npm installs now ship identical functionality. Semantic search and cross-encoder reranking work out of the box on pip (previously required `pip install superlocalmemory[search]`).
- First pip run auto-installs Claude Code hooks when Claude Code is detected, matching the npm postinstall experience.
- Entity Explorer no longer stuck on "No entities found" after switching operating modes.
- Engine-backed routes (entity, ingest, recall, remember, list) auto-recover after mode changes — no daemon restart required.
- Mode change audit log at `~/.superlocalmemory/logs/mode-audit.log`.
- Mode C now requires an explicit API key via Settings to prevent accidental cloud-mode writes.
Varun Pratap Bhardwaj Solution Architect
SuperLocalMemory V3 - Intelligent local memory system for AI coding assistants.
- Excessive memory usage during rapid file edits — auto-observe now reuses a single background process instead of spawning one per edit. Rapid multi-file operations (parallel agents, branch switching, batch edits) no longer risk high memory usage.
- Observation debounce — rapid-fire observations are batched and deduplicated within a short window, reducing redundant work.
- Memory-aware worker management — new safety check skips heavy processing when system memory is low.
| Variable | Default | Description |
|---|---|---|
| `SLM_OBSERVE_DEBOUNCE_SEC` | `3.0` | Observation batching window |
| `SLM_MIN_AVAILABLE_MEMORY_GB` | `2.0` | Min free RAM for background processing |
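The debounce window can be sketched as pure batching logic: events closer together than the window join one deduplicated batch, and a gap larger than the window starts a new one. Illustrative only, not the actual implementation:

```python
def debounce_batch(events, window_sec=3.0):
    """Group (timestamp, payload) events into debounced batches.

    A new batch starts whenever the gap since the previous event exceeds
    the window; duplicate payloads within a batch are dropped, mirroring
    the observation dedup described above.
    """
    batches, current, last_ts = [], [], None
    for ts, payload in events:
        if last_ts is not None and ts - last_ts > window_sec:
            batches.append(current)
            current = []
        if payload not in current:  # dedupe within the window
            current.append(payload)
        last_ts = ts
    if current:
        batches.append(current)
    return batches
```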
- Langevin dynamics now active — positions were never initialized at store time, causing the entire Langevin lifecycle system to be inert (0 positioned facts). New facts now receive near-origin positions (Strategy A).
- Backfill for existing facts — maintenance now initializes unpositioned facts using metadata-aware equilibrium seeding (Strategy B) followed by 50-step burn-in (Strategy C). Old, rarely-accessed facts land in their correct lifecycle zones immediately.
- Maintenance returns `langevin_backfilled` count for observability
- Health check now reports positioned facts accurately after backfill
- Adaptive Memory Lifecycle — memories naturally strengthen with use and fade when neglected. No manual cleanup needed.
- Smart Compression — embedding precision adapts to memory importance, achieving up to 32x storage savings on low-priority memories.
- Cognitive Consolidation — automatic pattern extraction from clusters of related memories. Your knowledge graph self-organizes.
- Pattern Learning — auto-learned soft prompts injected into agent context at session start. The system teaches itself what matters.
- Hopfield Retrieval — 6th retrieval channel for vague or partial query completion. Ask half a question, get the whole answer.
- Process Health — automatic detection and cleanup of orphaned SLM processes. No more zombie workers.
- `slm decay` — run memory lifecycle review
- `slm quantize` — run smart compression cycle
- `slm consolidate --cognitive` — extract patterns from memory clusters
- `slm soft-prompts` — view auto-learned patterns
- `slm reap` — clean orphaned processes
- `forget` — programmatic memory archival via lifecycle rules
- `quantize` — trigger smart compression on demand
- `consolidate_cognitive` — extract and store patterns from memory clusters
- `get_soft_prompts` — retrieve auto-learned patterns for context injection
- `reap_processes` — clean orphaned SLM processes
- `get_retention_stats` — memory lifecycle analytics
- 7 new API endpoints for lifecycle stats, compression stats, patterns, and process health
- New dashboard tabs: Memory Lifecycle, Compression, Patterns
- Mode A/B memory usage reduced from ~4GB to ~40MB (100x reduction)
- Embedding migration on mode switch (auto-detects model change)
- Forgetting filter in retrieval pipeline (archived memories excluded from results)
- 6-channel retrieval (was 5)
- Fully backward compatible with 3.2.x
- New tables created automatically on first run
- No manual migration needed
- Performance improvements for retrieval pipeline
- New memory management capabilities with configurable lifecycle controls
- Enhanced dashboard with 3 additional monitoring tabs
- 9 new API endpoints for configuration and status
- 5 new MCP tools for proactive memory operations
- 5 new CLI commands for configuration management
- Internal retrieval architecture optimized with additional search channel
- Schema extensions for improved data management (9 new tables)
- Memory surfacing engine with multi-signal scoring
- Significant latency reduction in recall operations (vector-indexed retrieval)
- Idle-time memory optimization for large stores
- Reduced memory footprint for long-running sessions
- Windows `slm --version` / `slm -v` — `.bat` and `.cmd` wrappers now intercept `--version`/`-v` directly (fast path, no Python needed) and set `PYTHONPATH` to the npm package's `src/` directory before launching Python. Previously, Windows users hitting `slm.bat` instead of the Node.js wrapper got `unrecognized arguments: --version` because Python resolved an older pip-installed version without the flag.
- Unix bash wrapper (`bin/slm`) — now sets `PYTHONPATH` and intercepts `--version`/`-v`, matching the Node.js wrapper's behavior. Previously relied on npm's shim always routing to `slm-npm`.
- `postinstall.js` — now runs `pip install .` to install the `superlocalmemory` Python package itself (not just dependencies). Prevents stale pip-installed versions from shadowing the npm-distributed source. Falls back to `--user` for PEP 668 environments.
- `preuninstall.js` — corrected version string from "V2" to "V3".
- Windows Python detection — added `py -3` (Python Launcher for Windows) as a fallback candidate in `slm.bat`.
- Environment parity — all three entry points (`slm-npm`, `slm`, `slm.bat`) now set identical PyTorch memory-prevention env vars (`PYTORCH_MPS_HIGH_WATERMARK_RATIO`, `TORCH_DEVICE`, etc.).
- `slm doctor` command — comprehensive pre-flight check: Python version, all dependency groups, embedding worker functional test, Ollama connectivity, API key validation, disk space, database integrity. Supports `--json` for agent-native output.
- `slm hooks install` listed in CLI reference and README.
- Dashboard, learning (lightgbm), and performance (diskcache, orjson) dependencies now install automatically during `npm install`.
- Warmup reliability — increased subprocess timeout from 60s to 180s for first-time model download. Added step-by-step progress output and direct in-process import diagnostics when worker fails.
- Mode B default model — changed from `phi3:mini` to `llama3.2` to match `provider_presets()` and reduce first-time setup friction.
- `postinstall.js` — now installs all 5 dependency groups (core, search, dashboard, learning, performance) with clear status messages per group.
- Error messages — all embedding worker failures, engine fallbacks, and dashboard errors now suggest `slm doctor` for diagnosis.
- `pyproject.toml` — added `diskcache` and `orjson` to core dependencies; aligned optional dependency versions with core.
- Profile switching and display use correct identifiers
- Profile sync across CLI, Dashboard, and MCP — all entry points now see the same profiles
- Profile switching now persists correctly across restarts
- Resolve circular import in server module loading
- Environment variable support across all CLI tools
- Multi-tool memory database sharing
- Paweł Przytuła (@pawelel) - Issue #7 and PR #8
- Windows installation and cross-platform compatibility
- Database stability under concurrent usage
- Forward compatibility with latest Python versions
- Full Windows support with PowerShell scripts for all operations
slm attributioncommand for license and creator information
- Overall reliability and code quality
- Dependency management for reproducible installs
- Windows compatibility for repository cloning (#7)
- Updated test assertions for v2.8 behavioral feature dimensions
Release Type: Major Feature Release — "Memory That Manages Itself"
SuperLocalMemory now manages its own memory lifecycle, learns from action outcomes, and provides enterprise-grade compliance — all 100% locally on your machine.
- Memory Lifecycle Management — Memories automatically organize themselves over time based on usage patterns, keeping your memory system fast and relevant
- Behavioral Learning — The system learns what works by tracking action outcomes, extracting success patterns, and transferring knowledge across projects
- Enterprise Compliance — Full access control, immutable audit trails, and retention policy management for GDPR, HIPAA, and EU AI Act
- 6 New MCP Tools — `report_outcome`, `get_lifecycle_status`, `set_retention_policy`, `compact_memories`, `get_behavioral_patterns`, `audit_trail`
- Improved Search — Lifecycle-aware recall that automatically promotes relevant memories and filters stale ones
- Performance Optimized — Real-time lifecycle management and access control
- Enhanced ranking algorithm with additional signals for improved relevance
- Improved search ranking using multiple relevance factors
- Search results include lifecycle state information
- Configurable storage limits prevent unbounded memory growth
- Documentation organization and navigation
- Per-profile learning — each profile learns its own preferences independently
- Thumbs up/down and pin feedback on memory cards
- Learning data management in Settings (backup + reset)
- "What We Learned" summary card in Learning tab
- Smarter learning from your natural usage patterns
- Recall results improve automatically over time
- Privacy notice for all learning features
- All dashboard tabs refresh on profile switch
- Enhanced trust scoring accuracy
- Improved search result relevance across all access methods
- Better error handling for optional components
- Learning Dashboard Tab — View your ranking phase, preferences, workflow patterns, and privacy controls
- Learning API — Endpoints for dashboard learning features
- One-click Reset — Reset all learning data directly from the dashboard
Release Type: Major Feature Release — "Your AI Learns You"
SuperLocalMemory now learns your patterns, adapts to your workflow, and personalizes recall. All processing happens 100% locally — your behavioral data never leaves your machine.
- Adaptive Learning System — Detects your tech preferences, project context, and workflow patterns across all your projects
- Personalized Recall — Search results automatically re-ranked based on your learned preferences. Gets smarter over time.
- Zero Cold-Start — Personalization works from day 1 using your existing memory patterns
- Multi-Channel Feedback — Tell the system which memories were useful via MCP, CLI, or dashboard
- Source Quality Scoring — Learns which tools produce the most useful memories
- Workflow Detection — Recognizes your coding workflow sequences and adapts retrieval accordingly
- Engagement Metrics — Track memory system health locally with zero telemetry
- Isolated Learning Data — Behavioral data stored separately from memories. One-command erasure for full GDPR compliance.
- 3 New MCP Tools — Feedback signal, pattern transparency, and user correction
- 2 New MCP Resources — Learning status and engagement metrics
- New CLI Commands — Learning management, engagement tracking, pattern correction
- New Skill — View learned preferences in Claude Code and compatible tools
- Auto Python Installation — Installer now auto-detects and installs Python for new users
- Interactive Knowledge Graph — Fully interactive visualization with zoom, pan, and click-to-explore
- Mobile & Accessibility Support — Touch gestures, keyboard navigation, and screen reader compatibility
Release Type: Security Hardening & Scalability — "Battle-Tested"
- Rate Limiting — Protection against abuse with configurable thresholds
- API Key Authentication — Optional authentication for API access
- CI Workflow — Automated testing across multiple Python versions
- Trust Enforcement — Untrusted agents blocked from write and delete operations
- Advanced Search Index — Faster search at scale with graceful fallback
- Hybrid Search — Combined search across multiple retrieval methods
- SSRF Protection — Webhook URLs validated against malicious targets
- Higher memory graph capacity with intelligent sampling
- Hardened profile isolation across all queries
- Bounded resource usage under high load
- Optimized index rebuilds for large databases
- Sanitized error messages — no internal details leaked
- Capped resource pools for stability
Release Type: Framework Integration — "Plugged Into the Ecosystem"
- LangChain Integration — Persistent chat history for LangChain applications
- LlamaIndex Integration — Chat memory storage for LlamaIndex
- Session Isolation — Framework memories tagged separately from normal recall
Release Type: Major Feature Release — "Your AI Memory Has a Heartbeat"
SuperLocalMemory transforms from passive storage to active coordination layer. Every memory operation now triggers real-time events.
- Reliable Concurrent Access — No more "database is locked" errors under multi-agent workloads
- Real-Time Events — Live event broadcasting across all connected tools
- Subscriptions — Durable and ephemeral event subscriptions with filters
- Webhook Delivery — HTTP notifications with automatic retry on failure
- Agent Registry — Track connected AI agents with protocol and activity monitoring
- Memory Provenance — Track who created or modified each memory, and from which tool
- Trust Scoring — Behavioral trust signals collected per agent
- Dashboard: Live Events — Real-time event stream with filters and stats
- Dashboard: Agents — Connected agents table with trust scores and protocol badges
- Refactored core modules for reliability and performance
- Dashboard modernized with modular architecture
- Profile isolation bug in dashboard — graph stats now filter by active profile
- Hierarchical Clustering — Large knowledge clusters auto-subdivided for finer-grained topic discovery
- Cluster Summaries — Structured topic reports for every cluster in the knowledge graph
Release Type: Profile System & Intelligence
- Memory Profiles — Single database, multiple profiles. Switch instantly from any IDE or CLI.
- Auto-Backup — Configurable automatic backups with retention policy
- Confidence Scoring — Statistical confidence tracking for learned patterns
- Profile Management UI — Create, switch, and delete profiles from the dashboard
- Settings Tab — Backup configuration, history, and profile management
- Column Sorting — Click headers to sort in Memories table
- `--full` flag to show complete memory content without truncation
- Smart truncation for large memories
- CLI `get` command now retrieves memories correctly
- ChatGPT Connector — Search and fetch memories from ChatGPT via MCP
- Streamable HTTP Transport — Additional transport option for MCP connections
- Dashboard Enhancements — Memory detail modal, dark mode, export, search score visualization
- Security improvement in dashboard event handling
Release Type: Universal Integration
SuperLocalMemory now works across 16+ IDEs and CLI tools.
- Auto-Configuration — Automatic setup for Cursor, Windsurf, Claude Desktop, Continue.dev, Codex, Copilot, Gemini, JetBrains
- Universal CLI — `slm` command works in any terminal
- Skills Installer — One-command setup for supported editors
- Tool Annotations — Read-only, destructive, and open-world hints for all MCP tools
Release Type: Feature Release — Advanced Search
- Advanced Search — Faster, more accurate search with multiple retrieval strategies
- Query Optimization — Spell correction, query expansion, and technical term preservation
- Search Caching — Frequently-used queries return near-instantly
- Combined Search — Results fused from multiple search methods for better relevance
- Fast Vector Search — Sub-10ms search at scale (optional)
- Local Embeddings — Semantic search with GPU acceleration (optional)
- Modular Installation — Install only what you need: core, UI, search, or everything
Release Type: Major Feature Release — Universal Integration
- 6 Universal Skills — remember, recall, list-recent, status, build-graph, switch-profile
- MCP Server — Native IDE integration with tools, resources, and prompts
- Attribution Protection — Multi-layer protection ensuring proper credit
- 11+ IDE Support — Cursor, Windsurf, Claude Desktop, Continue.dev, Cody, Aider, ChatGPT, Perplexity, Zed, OpenCode, Antigravity
SuperLocalMemory V3 represents a complete architectural rewrite with intelligent knowledge graphs, pattern learning, and enhanced organization.
- 4-Layer Architecture — Storage, Hierarchical Index, Knowledge Graph, Pattern Learning
- Automatic Entity Extraction — Discovers key topics and concepts from your memories
- Intelligent Clustering — Automatic thematic grouping of related memories
- Pattern Learning — Tracks your preferences across frameworks, languages, architecture, security, and coding style
- Storage Optimization — Progressive compression reduces storage by up to 96%
- Profile Management — Multi-profile support with isolated data
We use Semantic Versioning:
- MAJOR: Breaking changes (e.g., 2.0.0 → 3.0.0)
- MINOR: New features (backward compatible, e.g., 2.0.0 → 2.1.0)
- PATCH: Bug fixes (backward compatible, e.g., 2.1.0 → 2.1.1)
Current Version: v3.3.0
Website: superlocalmemory.com
npm: npm install -g superlocalmemory
SuperLocalMemory V3 is released under the AGPL-3.0-or-later license (changed from Elastic License 2.0; see the license entry above).
100% local. 100% private. 100% yours.