feat: add PostgreSQL + pgvector backend for shared multi-agent QMD deployments #342

Open
chrisdietr wants to merge 5 commits into tobi:main from chrisdietr:feat/pgvector-backend
@chrisdietr

Summary

This PR adds PostgreSQL + pgvector as an opt-in storage backend for QMD.

SQLite remains the default backend and continues to be the right choice for
local, single-user workflows. This change is specifically aimed at shared and
multi-agent deployments, where multiple processes may need concurrent access to
the same index and SQLite becomes a limiting operational choice.

The goal is to extend QMD to that environment without changing the default
experience for existing users.

Motivation

QMD works very well today as a local-first tool. But multi-agent setups place
different demands on the storage layer than single-user local use.

When several agents or processes need to share the same memory/index and perform
concurrent reads and writes, a server-backed database is a better fit than
SQLite, which allows only one writer at a time and is built around access to a
local file. PostgreSQL supports concurrent writers over a network connection
while still letting QMD preserve the same indexing, search, and retrieval model
it already has.

In short:

  • SQLite stays the default for simple local use
  • PostgreSQL becomes available for shared, concurrent, multi-agent setups

What changed

Backend selection

  • add QMD_BACKEND to select the storage backend
  • add QMD_POSTGRES_URL for PostgreSQL connection configuration
  • preserve SQLite as the default when no backend is specified
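
For reviewers who want a picture of the selection rule, here is a minimal sketch of how the two variables could be resolved. `resolveBackend` and `BackendConfig` are hypothetical names used for illustration only, and the case-insensitive matching and error messages are assumptions; the actual logic lives in src/db.ts and may differ.

```typescript
type Backend = "sqlite" | "postgres";

interface BackendConfig {
  backend: Backend;
  postgresUrl?: string;
}

// Hypothetical helper, not the code in src/db.ts.
// Takes the environment as a plain record so it can be exercised without process.env.
function resolveBackend(env: Record<string, string | undefined>): BackendConfig {
  // An unset or empty QMD_BACKEND falls back to SQLite, the default.
  const raw = (env.QMD_BACKEND ?? "sqlite").trim().toLowerCase();
  if (raw === "" || raw === "sqlite") {
    return { backend: "sqlite" };
  }
  if (raw === "postgres") {
    const url = env.QMD_POSTGRES_URL;
    // Assumption: a missing URL is treated as a configuration error.
    if (!url) {
      throw new Error("QMD_BACKEND=postgres requires QMD_POSTGRES_URL");
    }
    return { backend: "postgres", postgresUrl: url };
  }
  throw new Error(`Unsupported QMD_BACKEND: ${raw}`);
}
```

Passing the environment in as an argument keeps the rule easy to test; a caller would hand it `process.env`.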

Database abstraction

  • extend the existing DB abstraction so QMD can operate against either:
    • SQLite
    • PostgreSQL

PostgreSQL support

  • add PostgreSQL schema initialization
  • add PostgreSQL full-text search using tsvector
  • add pgvector-backed semantic search
  • add HNSW vector indexing
  • add backend-aware vector table management and cleanup
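
To make the pieces above concrete, the following sketch builds the kind of DDL such a schema might use: the `vector` extension, a `tsvector` column with a GIN index for full-text search, and an HNSW index for semantic search. The table name `chunks_example`, its columns, the `'english'` text search configuration, and the choice of `vector_cosine_ops` are all assumptions for illustration; they are not taken from this PR's schema.

```typescript
// Hypothetical DDL builder. Table and column names are illustrative only.
function postgresSchemaSketch(dimensions: number): string[] {
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`Invalid embedding dimension: ${dimensions}`);
  }
  return [
    // pgvector ships as an extension named "vector".
    "CREATE EXTENSION IF NOT EXISTS vector",
    // Stored generated column keeps the tsvector in sync with the text (PostgreSQL 12+).
    `CREATE TABLE IF NOT EXISTS chunks_example (
  id BIGSERIAL PRIMARY KEY,
  body TEXT NOT NULL,
  body_tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED,
  embedding vector(${dimensions})
)`,
    // GIN index serves full-text queries against the tsvector column.
    "CREATE INDEX IF NOT EXISTS chunks_example_tsv_idx ON chunks_example USING GIN (body_tsv)",
    // HNSW index serves approximate nearest-neighbour search (cosine distance assumed).
    "CREATE INDEX IF NOT EXISTS chunks_example_hnsw_idx ON chunks_example USING hnsw (embedding vector_cosine_ops)",
  ];
}
```

Each statement is idempotent, so running the list at startup would be safe against an already initialized database.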

CLI / status output

  • update qmd status so it reports PostgreSQL backend information correctly
    instead of assuming a SQLite file path
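
As a rough illustration of the status change, the sketch below derives a status line per backend. `backendStatusLine` is a hypothetical helper, and the assumption is that the database name is taken from the path of the connection URL; only the name is printed, so credentials in the URL do not leak into the output. The size query uses PostgreSQL's built-in `pg_database_size()`.

```typescript
// Query a PostgreSQL backend could use for the size line (both functions are built in).
const DATABASE_SIZE_SQL = "SELECT pg_database_size(current_database())";

// Hypothetical helper, not the code in this PR.
// `target` is a file path for SQLite and a connection URL for PostgreSQL.
function backendStatusLine(backend: "sqlite" | "postgres", target: string): string {
  if (backend === "sqlite") {
    return `Index: ${target}`;
  }
  // Assumption: the database name is the URL path without its leading slash.
  const dbName = decodeURIComponent(new URL(target).pathname.replace(/^\//, ""));
  return `Backend: PostgreSQL (${dbName})`;
}
```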

Documentation

  • document PostgreSQL backend setup and usage in README.md
  • clarify the intended use case as shared / multi-agent deployments

Tests

  • add PostgreSQL integration tests
  • remove personal/local machine references from the PostgreSQL test setup
  • replace a flaky wall-clock LLM batching assertion with a deterministic test
    that validates batching behavior without depending on machine timing

Compatibility

This change is intended to be backward-compatible.

Existing users

  • SQLite remains the default backend
  • existing CLI workflows are unchanged
  • existing MCP workflows are unchanged

PostgreSQL users

PostgreSQL support is opt-in:

export QMD_BACKEND=postgres
export QMD_POSTGRES_URL=postgresql://user:pass@localhost:5432/qmd

Vector handling

Internal vector handling remains aligned with the existing sqlite-vec path.
Embeddings were already represented at the storage boundary as Float32Array;
the PostgreSQL backend follows the same convention and converts values as needed
for pgvector.
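
A minimal sketch of that conversion, assuming values are passed to pgvector in its text form (`[x1,x2,...]`). The helper names `toPgVectorLiteral` and `fromPgVectorLiteral` are hypothetical and the adapter in this PR may convert differently.

```typescript
// Hypothetical helpers for moving embeddings across the storage boundary.

// Float32Array -> pgvector text literal, e.g. "[1,2.5,-3]".
function toPgVectorLiteral(embedding: Float32Array): string {
  for (let i = 0; i < embedding.length; i++) {
    // Reject NaN and Infinity rather than sending them to the database.
    if (!Number.isFinite(embedding[i])) {
      throw new Error(`Non-finite value at index ${i}`);
    }
  }
  // Values print as the double-precision widening of each float32,
  // so some decimals print with more digits than were originally written.
  return `[${Array.from(embedding).join(",")}]`;
}

// pgvector text literal -> Float32Array.
function fromPgVectorLiteral(text: string): Float32Array {
  const inner = text.trim().replace(/^\[/, "").replace(/\]$/, "").trim();
  if (inner === "") {
    return new Float32Array(0);
  }
  return Float32Array.from(inner.split(","), (part) => Number(part));
}
```

The literal would typically be bound as a query parameter and cast with `::vector` on the SQL side.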

Testing

Default suite:

npx vitest run --reporter=verbose test/

Result:

  • 616 passed
  • 3 skipped

The 3 skipped tests are the opt-in PostgreSQL integration tests in
test/store.postgres.test.ts.

PostgreSQL integration suite:

QMD_ENABLE_POSTGRES_TESTS=1 bun test --preload ./src/test-preload.ts test/store.postgres.test.ts

Result:

  • 3 passed
  • 0 failed

Note: the PostgreSQL integration test provisions the test database/schema inside
an existing PostgreSQL server, but does not start PostgreSQL itself. It assumes
a reachable Postgres instance with the vector extension available.

Notes for reviewers

  • SQLite remains the default/simple path
  • PostgreSQL support is intentionally opt-in
  • the new integration tests are opt-in because they require external Postgres
    setup with pgvector available
  • the included LLM test change is test stabilization, not a product behavior
    change; I kept it in this PR because it was necessary to make the branch
    reliably green

Tooling note

This PR was prepared with pi using GPT-5.4 xhigh.

- src/pg.ts: PostgreSQL adapter (Worker thread + Atomics.wait for sync API)
- src/pg-worker.ts: Worker thread for async postgres queries
- src/db.ts: Backend selection via QMD_BACKEND env (default 'sqlite')
- src/store.ts: Backend-aware queries (vec0→pgvector HNSW, FTS5→tsvector+GIN)
- test/store.postgres.test.ts: Integration tests (gated by QMD_ENABLE_POSTGRES_TESTS)
- README.md: Backend documentation

All 28 existing tests pass. 3 new Postgres integration tests pass.
Zero breaking changes to SQLite behavior.

Built by gpt-5.3-codex (xhigh reasoning, 154K tokens)

SQLite allows selecting non-aggregated columns that are not listed in GROUP BY; Postgres does not.
Wrap c.doc in MIN() on the Postgres path to satisfy standard SQL GROUP BY rules.
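
A small sketch of the idea, assuming the query is assembled as a string per backend. Only `c.doc` comes from the commit above; `docSelectExpr`, `groupedDocQuery`, the table `content_example`, and the column `hash` are hypothetical.

```typescript
type SqlBackend = "sqlite" | "postgres";

// Postgres requires every selected column to be grouped or aggregated,
// so the bare column is wrapped in MIN() there. SQLite accepts it as-is.
function docSelectExpr(backend: SqlBackend): string {
  return backend === "postgres" ? "MIN(c.doc)" : "c.doc";
}

// Hypothetical query shape: one row per hash, with a representative doc value.
function groupedDocQuery(backend: SqlBackend): string {
  return `SELECT c.hash, ${docSelectExpr(backend)} AS doc FROM content_example c GROUP BY c.hash`;
}
```

MIN() picks a deterministic value per group, whereas SQLite's bare-column behaviour returns a value from an arbitrary row of the group.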

When QMD_BACKEND=postgres, status now shows:
- Backend: PostgreSQL (dbname) instead of Index: /path/to/sqlite
- Size from pg_database_size() instead of file stat