Skip to content

Local-first multi-machine sync #692

Description

@maphew

Progress branch: https://github.com/maphew/agentsview/tree/docs/local-first-multi-machine-sync

Local-First Multi-Machine Sync: the Artifact Ledger

Status Draft proposal
Date 2026-06-12
Authors maphew, with Claude Fable 5 (multi-agent research workflow)
Inputs local-first-sync-research/ (audits, research, critique)

Summary

Make agentsview local-first with serverless multi-machine sync, in the
mold of fossil-scm: every machine holds the complete archive, sync is an
idempotent exchange of immutable artifacts over any dumb transport
(Syncthing folder, S3 bucket, or another agentsview instance over HTTP),
and no machine is architecturally privileged. No CRDT library is needed:
session content is single-writer-per-machine and append-mostly (a
grow-only set), and the thin layer of user-mutable metadata (renames,
trash, stars, pins) is handled by a fossil-ticket-style append-only
change log replayed deterministically with hybrid logical clocks.

SQLite remains exactly what the project already declares it to be: a
local, rebuildable derivation. The live database file never crosses the
wire. PostgreSQL push survives as an optional aggregation peer, demoted
from coordination point.

Motivation

Today a user with a laptop and a desktop has three partial options, each
with a mandatory hub or a one-way arrow:

  • pg push / pg serve: requires an always-on PostgreSQL server.
  • agentsview sync --host (SSH pull): serverless and proven, but
    pull-only, manual/cron-only, re-downloads a full tar of every agent
    dir each run, only covers file-based agents, and never propagates
    user metadata (a rename or star made on one machine is invisible
    everywhere else).
  • duckdb push / quack serve: a read mirror, beta, no FTS.

Upstream demand for something better is visible: issue 572 ("How to
synchronize data across multiple macOS devices"), issue 412 (periodic
SSH remote sync), issue 517 (multiple named pg targets), issue 484
(stars/pins in pg serve), issue 332 (pg push machine attribution), and
issue 655 (pg push same-id collision ping-pong).

Goals

  1. Every machine ends up with the full archive: all sessions from all
    machines, queryable and searchable locally (FTS5 intact).
  2. User curation converges: renames, trash/restore, stars, pins made on
    any machine reach all machines.
  3. No mandatory server. Any always-on peer, NAS share, or object-store
    bucket improves availability, but none is required by the
    architecture (fossil's "central server is a social convention").
  4. Mixed app versions keep syncing: machine A on a newer release must
    interoperate with machine B on an older one.
  5. The existing single-machine experience is unchanged when sync is not
    configured.

Non-goals

  • Real-time collaborative editing. Convergence latency is transport
    latency plus sync cadence (seconds to minutes).
  • Partial/selective sync, subscriptions, or multi-tenant sharing. The
    trust model is a fully mutually trusted personal fleet (see Trust).
  • Replacing pg push/pg serve; they remain as an optional hub.

Why no CRDT engine

The data shape decides this. Audit of every table and column
(local-first-sync-research/01-codebase-audits.md) shows two classes:

  1. Bulk session content (sessions, messages, tool_calls,
    tool_result_events, usage_events, secret_findings): derived
    deterministically from session files, created by exactly one
    machine, append-mostly. Merging two machines' archives is set-union
    of records that cannot conflict — the degenerate "grow-only set"
    CRDT that needs no library. Fossil's own docs describe its artifact
    bag the same way.
  2. User-mutable metadata (sessions.display_name,
    sessions.deleted_at, starred_sessions, pinned_messages + notes,
    excluded_sessions tombstones, worktree_project_mappings): the only
    genuinely multi-writer data, edited rarely and by one human. An
    append-only log of timestamped change events with last-writer-wins
    replay — fossil's ticket model verbatim — is sufficient, auditable,
    and deterministic.

General-purpose CRDT machinery would solve a problem this data does not
have, while charging real costs (below).

Alternatives considered and rejected

Full sourcing in local-first-sync-research/03-technology-research.md.
State of the ecosystem as of mid-2026:

  • Automerge: automerge-go is effectively unmaintained (last
    commit Oct 2024, no tagged release, wraps a pre-3.0 core via cgo, no
    transport layer). Even Automerge 3 requires full in-memory document
    loads; the maintainers scope documents as "units of collaboration"
    and their own bulk-data research (sedimentree) moves large content
    out of the CRDT into content-addressed blobs — i.e. toward this
    design. Rejected for bulk and for metadata.
  • cr-sqlite: upstream dormant since v0.16.3 (Jan 2024); the only
    maintained lineage is Fly.io's purpose-built fork for Corrosion. CRR
    constraints collide with this schema head-on: FTS5 virtual tables
    cannot be CRRs, the messages rowid PK and external-content FTS
    linkage break, triggers/CASCADE FKs are restricted, __crsql_clock
    shadow tables bloat a multi-GB DB, and Fly documented an ALTER TABLE
    metadata-backfill storm — this repo alters the sessions table
    routinely. Rejected.
  • SQLite session extension (changesets): maintained forever as part
    of SQLite and accessible from Go via zombiezen/modernc (not via
    mattn/go-sqlite3 — issue 825 there, open since 2020) — but the binary
    changeset format is coupled to table column count, so every release
    that adds a sessions column (frequent here; dataVersion is at 36)
    bricks mixed-version sync. NDJSON's ignore-unknown-fields tolerance
    gives version skew handling for free. Rejected; its HLC/two-tier
    semantics are adopted, its codec is not.
  • Whole-DB replication (Litestream v0.5 read replicas,
    sqlite3_rsync): healthy tools, but they produce N separate replica
    DBs. The entire Store interface, UI, and analytics assume one
    queryable DB; cross-replica fan-out would touch everything. Also
    one-way by design. Rejected as the architecture; fine as a backup
    strategy alongside.
  • Raw-session-file mirror over Syncthing (sync the agent dirs
    themselves, let each machine parse everything): fastest to build and
    the best version-skew story, but structurally blind to non-file
    sessions — at least 7 agents in the registry are FileBased:false,
    plus uploads, claude.ai/ChatGPT imports, SSH-pulled sessions, and
    orphan-preserved sessions whose files are gone. It would also
    file-copy Zed's live threads.db SQLite database, the exact
    corruption hazard this design exists to avoid. Rejected as the sync
    unit; raw files return as optional fallback artifacts (see
    Invariants).
  • Server-light engines (ElectricSQL, PowerSync, Evolu, Jazz,
    Ditto, Turso/libSQL embedded replicas, Marmot v2): all require a
    sync service, an always-on cluster, or have no Go story. Rejected.

Design

Overview

Each install maintains a write-once, content-addressed artifact store
alongside (never inside) the SQLite DB:

$AGENTSVIEW_DATA_DIR/artifacts/<origin>/
  checkpoints/cp-<seq>.json        append-only numbered index files
  manifests/<hash>.json.zst        session manifests
  segments/<hash>.ndjson.zst       message segments
  meta/<hlc>-<hash>.json           user-edit change feed
  raw/<hash>                       optional raw source file fallback

A machine writes only under its own origin prefix (single-writer per
prefix means transports cannot conflict). Sync between any two stores —
or between a store and a folder/bucket/peer — is idempotent set-union
of immutable files. Ingestion derives the local SQLite rows from
foreign artifacts through the existing write paths.

machine A                                            machine B
sync engine -> SQLite -> exporter -> artifacts/A --\
                                                    >-- transport --
ingester <- artifacts/B <---------------------------/   (folder/S3/
    |                                                    HTTP peer)
    v
SQLite (A + B merged, FTS5 maintained by normal triggers)

Origin identity

Each install generates and persists an origin ID once: configured
machine name (default os.Hostname(), reusing the validation that
rejects the local sentinel, internal/config/config.go,
internal/postgres/sync.go) plus a short random suffix, e.g.
thinkpad-x9k2. The suffix survives hostname changes and distinguishes
restored/cloned machines; persistence copies the EnsureAuthToken
pattern (internal/config/config.go).

Global session identity is (origin, native_session_id). Locally
produced rows are untouched (bare IDs, machine='local'). Foreign
sessions are stored as id = origin + "~" + nativeID, machine = origin — byte-for-byte the proven SSH remote-sync convention
(EngineConfig.IDPrefix/Machine in internal/ssh/sync.go,
applyRemoteRewrites in internal/sync/engine.go, StripHostPrefix
in internal/parser/types.go), which every read path, the UI, and
GetMachines already render correctly. This avoids composite-PK
surgery across SQLite/PG/DuckDB under the Backend Parity rule.

Artifact kinds

  1. Message segment: canonical NDJSON (zstd) of N consecutive parsed
    messages keyed by natural coordinates — ordinal, source_uuid, role,
    content, tool_calls by (ordinal, call_index), tool_result_events by
    event_index, token fields. Natural-coordinate keying is already the
    schema's cross-copy convention (secret_findings; the orphan copier's
    (session_id, ordinal) joins). Message rowids are explicitly unstable
    (nextMessageIDTx) and never appear in artifacts.
  2. Session manifest: small JSON with the parser-derived session
    header (the same field set sessionPushFingerprint enumerates), an
    ordered list of segment hashes, inline usage_events, the producer's
    data_version, and a generation counter. A newer manifest for the
    same session supersedes older ones, ordered by (data_version,
    generation). Steady-state appends emit one tail segment plus a new
    manifest reusing prior segment hashes. Superseded manifests/segments
    become unreferenced and GC-able after a grace window.
  3. Meta change event: tiny JSON
    {v, hlc, origin, session_gid, op, value} with op in {rename,
    soft_delete, restore, star, unstar, pin, unpin, purge}; pins anchor
    by source_uuid with ordinal fallback (the existing
    savePinsTx/restorePinsTx logic). Append-only forever; the full
    edit history is retained.
  4. Checkpoint: cp-<seq>.json mapping session_gid to current
    manifest hash plus the meta-feed high-water mark. Append-only
    numbered files keep the store fully write-once; discovery of changes
    is O(changed), not O(store).
  5. Raw source fallback (optional, on by default for file-based
    agents): the original session file stored as a content-addressed
    blob and referenced from the manifest. See Invariants for why.

Export

Export reads from the DB, not from source files, so non-file agents,
uploads, imports, SSH-pulled sessions, and orphan-preserved sessions
all publish. After each successful session write, the session is queued
and debounced by reusing the existing pg-watch loop: the artifact
exporter implements the same small target interface
(cmd/agentsview/pg_watch.go), so agentsview sync --watch is the
existing daemon with a different sink. Change detection reuses the
fingerprint-skip discipline (sessionPushFingerprint plus per-session
last-exported-manifest state, modeled on pg_sync_state). Export is
scoped to machine-owned rows; machine='local' is rewritten to the
origin ID at export time. Uploads (which default to machine='remote')
are explicitly included — both prior designs fumbled this.

Ingestion

A new internal/artifact importer per foreign origin: read the latest
checkpoint, diff manifest hashes against an artifact_sync_state table
(modeled on pg_sync_state), fetch and hash-verify missing segments,
assemble db.Session plus []db.Message, apply the origin~ prefix,
and write through the existing paths (UpsertSession,
ReplaceSessionMessages/WriteSessionBatchAtomic). That single
decision inherits, for free: FTS5 maintenance via the normal triggers
(including the bulk trigger-swap fast path), excluded/trashed tombstone
rejection, pin re-attachment by source_uuid, and stats triggers. The
importer then replays new meta events in HLC order and fires the SSE
broadcaster (closing the existing gap where non-engine writes never
emit data_changed).

Manifests that reference segments not yet delivered are recorded as
phantoms (fossil's term) and retried on the next pass, tolerating
arbitrary delivery order from dumb transports.

Metadata ledger

Every user-mutation handler (rename, soft-delete/restore/permanent
delete, star, pin) additionally appends one meta event to the
machine's own feed. Replay is ordered by (HLC, artifact-hash tiebreak)
— a data-intrinsic ordering key, so every node derives identical state
from identical artifact sets regardless of local clocks. The HLC is
persisted across restarts, monotonic per node, with a bounded-drift
clamp (the Actual Budget pattern). Per-field last-writer-wins; when two
origins write the same field within clock-skew distance, the losing
value is appended to a meta_conflicts log and the UI shows a fork
badge with both values — converge automatically, never silently lose
(fossil's lesson). Applied events go through the existing DB mutators
in a suppress-re-export mode so no echo loops arise.

Deletes

  • Soft delete / restore: ordinary meta events; deleted_at converges.
  • Permanent delete (purge): per-event opt-in only ("delete
    everywhere" confirmation). It propagates an excluded_sessions
    tombstone, which UpsertSession already enforces against
    resurrection, and peers locally shun the session's bulk artifacts.
    Default remains today's semantics: EmptyTrash is local-only.
  • Checkpoint absence is never deletion (see Invariants).

Transports

One verb, three target shapes, all the same set-union:

  1. Folderagentsview sync /path/to/share (Syncthing, Dropbox,
    NFS, rclone mount). Safe for dumb file sync because every file is
    immutable, written temp+rename, and single-writer-per-prefix. The
    live SQLite DB never crosses the wire — the documented corruption
    class (SQLite's "How To Corrupt", Zotero's KB, Syncthing forums)
    does not apply to write-once artifact files.
  2. HTTP peeragentsview sync https://desktop:8080. Four routes
    on the existing embedded server behind the existing Bearer-token
    middleware: list origins, get checkpoint, get artifact by hash, post
    artifact (hash-verified, write-once, idempotent). Stateless and
    resumable; fossil's igot/gimme reduced to HTTP GETs because
    content-addressing makes "have" a stat call. Any running agentsview
    is a rendezvous, like fossil ui.
  3. Object storage — same layout under an S3/B2 prefix; rclone
    against the folder shape covers it until native support lands.

Interaction with resync and dataVersion

The artifact store lives outside the DB file, so ResyncAll's atomic
swap does not touch it; artifact_sync_state is carried across the
swap alongside the existing metadata copy (CopySessionMetadataFrom).
After a parser-version resync, changed sessions re-export with bumped
data_version manifests — the same "force full push after resync" rule
pg push uses. Segments whose canonical content is unchanged keep their
hashes; a parser change touching a common message field genuinely
re-ships content, which the raw-file fallback hedges (peers may
re-derive locally instead of re-downloading).

Mixed versions: bundles are NDJSON with ignore-unknown-fields and
skip-unknown-ops rules plus an explicit format version, so an older
reader skips fields it does not know and a newer reader tolerates their
absence. Each machine's own parser and dataVersion govern only its own
DB.

What pg push becomes

Short term: unchanged (ingested peer sessions are ordinary rows, and
their machine column carries true origin). Medium term: extract the
small SessionSink interface latent in push.go's orchestration (which
contains no SQL); PG becomes one sink, the artifact exporter another.
Long term: PostgreSQL is an optional aggregation/analytics peer.
Prerequisite fixes regardless of this design: machine-scoped export
(upstream issues 332 and 655).

Invariants (pinned before any code)

  1. Canonical serialization is a forever-contract. Sorted keys,
    fixed number formatting, explicit format version; golden tests
    enforce byte-stability. Any silent change re-hashes every segment
    and triggers a fleet-wide reship. The raw-source fallback artifact
    exists so that even a broken contract degrades to local re-derive,
    not re-download.
  2. Checkpoint absence is never deletion. Tombstone events are the
    only delete mechanism. A session vanishing from an origin's
    checkpoint (local EmptyTrash, export bug, truncation) must not
    propagate removal.
  3. ErrSessionTrashed/ErrSessionExcluded on import means retry
    later, never advance the watermark.
    Meta events are tiny and
    segments large, so a soft-delete routinely arrives before content;
    if the watermark advanced anyway, a later restore would strand stale
    content with nothing to trigger a re-fetch.
  4. Single-writer-per-prefix is the only write rule on shared
    transports.
    Colliding origin IDs (cloned machine, restored backup)
    must be detected (checkpoint seq conflict) and surfaced loudly, not
    merged.
  5. The live SQLite file never crosses the wire. Documentation must
    say this explicitly and warn against syncing the data dir.

Trust model

A fully mutually trusted personal fleet. Folder transports have no
per-writer identity (prefix discipline is convention; Syncthing has no
per-subdir ACL), and HTTP mode is one shared symmetric Bearer token —
any peer can forge any origin's metadata. That is acceptable for one
person's machines and must be documented as exactly that. Per-peer
tokens are the minimum follow-up before any sharing story; origin
signatures are the eventual answer.

Practical availability note, stated plainly in docs: two
intermittently-on laptops sync only when both are online. A NAS folder,
S3 bucket, or any always-on peer is the practical rendezvous — by
social convention, not architecture, exactly as in fossil.

Migration

Fully additive. Upgrade generates an origin ID; no rewrite of existing
rows. agentsview sync --init backfills artifacts for the whole
existing DB (including orphans) and seeds the meta feed from current
display_name/deleted_at/stars/pins timestamped with local_modified_at.
Machines without sync configured behave exactly as today. New tables
(artifact_sync_state, meta_clock/conflicts) arrive via the existing
idempotent migration pattern; no dataVersion bump, no resync.

Phasing

  1. Prereq fixes (days): machine-scoped pg push export; preserve
    per-session machine at push time (upstream issues 332, 655). Real
    bugs regardless of this design.
  2. Phase 1 (2-4 weeks): artifact store, canonical serializer with
    golden tests, exporter, folder-transport set-union, importer,
    sync --init. Delivers the headline want — every machine sees all
    sessions — read-only, over Syncthing/Dropbox/NFS, no schema surgery.
  3. Phase 2 (2-3 weeks): HLC, meta ledger, deterministic replay,
    fork badges, purge with confirm UX. Delivers converging curation.
  4. Phase 3 (1-2 weeks): HTTP peer endpoints behind existing auth,
    sync --watch via the pg-watch loop, peers page in the UI.
  5. Phase 4 (ongoing): GC of superseded artifacts, native S3 target,
    SessionSink refactor of pg push, two-instance E2E harness.

Estimated 6-10 weeks total for one developer; each phase ships value
alone.

Risks

  • Canonical-serialization drift (highest variance; mitigated by golden
    tests, format version, raw fallback).
  • HLC/LWW edge cases: skewed clocks, restart persistence, tie
    determinism — table-driven tests required; replay must be idempotent
    under any feed permutation.
  • Storage growth: a third local copy of the corpus (source files + DB +
    compressed artifacts) plus peers'. zstd gives 5-10x on JSONL; GC of
    superseded bulk artifacts is mandatory, with a grace window against
    slow peers.
  • Meta feed file count: one small file per edit grows forever;
    personal-scale fine, needs a batching/compaction story eventually.
  • FTS5 initial ingest at multi-GB scale re-tokenizes every peer's
    corpus; use the existing Drop/Rebuild bulk path for first ingest.
  • N-times row counts make the machine filter load-bearing UX for the
    sidebar and analytics.
  • Scope creep: this deliberately stops at set-union plus LWW. Partial
    sync, subscriptions, or content merging would erode the simplicity
    that makes it safe to own (~3-4k LOC).

Open questions

  • Should the raw-source fallback be mandatory rather than optional for
    file-based agents? (Cost: storage; benefit: version-skew immunity.)
  • Per-agent or per-project export excludes (selective publish, fossil
    private-branch analog) in v1 or later?
  • Dotfile-synced agent dirs produce visible duplicates under two
    origins (today: silent same-id merge). Document "pick one transport
    per agent dir", or attempt content-hash coalescing in the UI later?
  • Does worktree_project_mappings (already machine-keyed) ride the
    meta ledger or stay local-only?

Related upstream issues

  • 332 — pg push overwrites original machine name on remote-synced
    sessions (open).
  • 655 — pg push: sessions.id sole PG PK; same-id pushes from two
    machines silently merge and ping-pong (filed from this work).
  • 412 — periodic SSH remote sync from serve (open feature request).
  • 517 — multiple named pg targets (open feature request).
  • 484 — stars/pins in pg serve (closed; metadata demand signal).
  • 572 — multi-machine dashboard question (closed; demand signal).

References

Full research underlying every claim here, including sources and
line-level code citations, lives in local-first-sync-research/:
codebase audits (01), the SSH remote-sync deep audit (02), technology
research with sources (03), the three competing design proposals (04),
and the adversarial critique that selected and hardened this design
(05).

gpt-5.5 on behalf of maphew.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions