feat(import): Open Knowledge Format (OKF) import #101

Merged: ethanj merged 42 commits into main from feat/okf-import on Jun 14, 2026

ethanj commented Jun 14, 2026

Stacked on the OKF export branch (#100). Base will retarget to main once that merges; the diff here is the import delta only.

What

Adds llmwiki import --okf <dir> [--trusted], the inverse of the OKF export — it reads a conformant Open Knowledge Format (OKF) v0.1 bundle and maps each document back into a llmwiki page, reusing the shared, reversible OKF↔llmwiki mapping.

  • Default: staged for review. Imported docs become review candidates (reviewMode: imported, held reason imported-okf) and flow through the existing review list/show/approve lifecycle. External knowledge never reaches the live wiki without a human in the loop.
  • --trusted: write live. Runs the identical pipeline (read → map → validate → collision-check) and only skips the human-staging step, writing pages straight into wiki/. It does not skip validation or collision safety, and the live write + index/MOC refresh run under the same .llmwiki lock that compile and approve use.
  • Durable, attributable origin. Imported pages carry provenanceState: imported plus an okf:<bundle> source token that survives approval, so downstream retrieval and evals can always tell foreign knowledge from locally-compiled, citation-validated pages. Their locally-computed freshness is unverified.
  • Lossless, identity round-trip. A native export→import preserves the canonical body byte-for-byte, along with the page kind, sources, and citations (via inline ^[…] claim markers), and routes query pages back to wiki/queries/. The page slug is identity (concepts/rag.md ↔ rag).
  • Tolerant of foreign bundles. An unknown type maps to kind: concept with the raw type preserved under x-okf.type; the full original frontmatter is snapshotted under x-okf.originalFrontmatter; unknown keys and broken links are kept, never rejected.
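
To make the foreign-bundle tolerance and the durable origin marker concrete, here is a minimal sketch of the frontmatter mapping. It is not the code in this PR: `mapForeignFrontmatter`, `ImportedMeta`, and the `KNOWN_KINDS` table are hypothetical names, and the set of known kinds is an assumption. Only the field names `kind`, `provenanceState`, `x-okf.type`, `x-okf.originalFrontmatter`, and the `okf:<bundle>` token come from the description above.

```typescript
// Hypothetical shapes for illustration; the real llmwiki schema may differ.
type PageKind = "concept" | "query";
type Frontmatter = Record<string, unknown>;

interface ImportedMeta {
  kind: PageKind;
  provenanceState: "imported";
  sources: string[];
  "x-okf": { type?: string; originalFrontmatter: Frontmatter };
}

// Assumption: the OKF types that have a direct llmwiki kind.
const KNOWN_KINDS = new Map<string, PageKind>([
  ["concept", "concept"],
  ["query", "query"],
]);

function mapForeignFrontmatter(bundle: string, fm: Frontmatter): ImportedMeta {
  const t = fm["type"];
  const rawType = typeof t === "string" ? t : undefined;
  const known = rawType === undefined ? undefined : KNOWN_KINDS.get(rawType);

  // Keep only string sources; anything else still survives in the snapshot.
  const s = fm["sources"];
  const declared: string[] = Array.isArray(s)
    ? s.filter((x): x is string => typeof x === "string")
    : [];

  // Snapshot the full original frontmatter so unknown keys are never lost.
  // A JSON copy is a simplification: it drops non-JSON values such as Dates.
  const xOkf: ImportedMeta["x-okf"] = {
    originalFrontmatter: JSON.parse(JSON.stringify(fm)) as Frontmatter,
  };
  // An unknown type falls back to "concept" and keeps the raw type.
  if (known === undefined && rawType !== undefined) {
    xOkf.type = rawType;
  }

  return {
    kind: known ?? "concept",
    provenanceState: "imported",
    // The okf:<bundle> token marks the page as foreign knowledge.
    sources: declared.concat(`okf:${bundle}`),
    "x-okf": xOkf,
  };
}
```

Because the snapshot is a copy rather than a reference, later edits to the parsed frontmatter cannot alter what was recorded as the original.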

Why

OKF is an emerging standard for portable agent knowledge. Pairing import with the export target makes compiled wikis a true round-trip surface — bundles can leave as OKF and come back without losing the compiler's richer provenance — while keeping foreign, untrusted bundles gated behind review.

Safety (untrusted-input ingestion)

Import treats every bundle as untrusted. A single confined, resource-bounded reader rejects (never truncates) bundles that exceed the file-count, per-doc, or total-size caps; every file is realpath-confined to the bundle root (symlink escapes are skipped); frontmatter is parsed with js-yaml's default safe schema; and both write paths run page validation and the skip-and-warn collision policy, which never overwrites a live page, a pending candidate, or an intra-bundle duplicate.
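
Two of these guarantees, reject-never-truncate caps and skip-and-warn collisions, can be sketched as pure functions. This is an illustration under assumptions, not the reader in this PR: `enforceCaps`, `planWrites`, the `Caps` fields, and the warning strings are all hypothetical, and path confinement and YAML parsing are left out.

```typescript
// Hypothetical shapes; cap names and values are assumptions.
interface BundleDoc { slug: string; bytes: number; }
interface Caps { maxFiles: number; maxDocBytes: number; maxTotalBytes: number; }
interface WritePlan { write: string[]; warnings: string[]; }

// Reject the whole bundle when any cap is exceeded. Nothing is truncated:
// either every document is within bounds or the import fails.
function enforceCaps(docs: BundleDoc[], caps: Caps): void {
  if (docs.length > caps.maxFiles) {
    throw new Error(`bundle has ${docs.length} files, cap is ${caps.maxFiles}`);
  }
  let total = 0;
  for (const doc of docs) {
    if (doc.bytes > caps.maxDocBytes) {
      throw new Error(`${doc.slug} is ${doc.bytes} bytes, cap is ${caps.maxDocBytes}`);
    }
    total += doc.bytes;
    if (total > caps.maxTotalBytes) {
      throw new Error(`bundle exceeds total-size cap of ${caps.maxTotalBytes} bytes`);
    }
  }
}

// Skip-and-warn: a colliding document is skipped with a warning and the
// rest of the bundle proceeds. Existing pages are never overwritten.
function planWrites(docs: BundleDoc[], live: Set<string>, pending: Set<string>): WritePlan {
  const seen = new Set<string>();
  const plan: WritePlan = { write: [], warnings: [] };
  for (const doc of docs) {
    let reason: string | undefined;
    if (live.has(doc.slug)) reason = "live page exists";
    else if (pending.has(doc.slug)) reason = "pending candidate exists";
    else if (seen.has(doc.slug)) reason = "duplicate within bundle";

    if (reason !== undefined) {
      plan.warnings.push(`skipped ${doc.slug}: ${reason}`);
      continue;
    }
    seen.add(doc.slug);
    plan.write.push(doc.slug);
  }
  return plan;
}
```

In this sketch the same `planWrites` result would feed both the staged path and the `--trusted` path, which matches the description's point that `--trusted` skips only the human-staging step.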

Test plan

  • npx tsc --noEmit, npm run build, npm test (1599 passed), npx fallow (0 above threshold), CI-strict dupes clean
  • Native round-trip: canonicalBody(imported) === canonicalBody(original), kind/sources/citations preserved, query page routed to wiki/queries/
  • Durable imported provenance + okf: token survive approval; no state.json ownership written
  • Path confinement (symlink escape), resource-cap rejection, collision skip (incl. under --trusted), nested-slug non-collapse, freshness unverified, and a subprocess CLI test
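
The byte-for-byte body check depends on separating frontmatter from the body without normalizing the body. A minimal sketch of that idea follows, assuming a `---` delimited frontmatter block; `joinDoc` and `splitDoc` are hypothetical helpers and are not the project's `canonicalBody`.

```typescript
interface SplitDoc { frontmatter: string; body: string; }

const OPEN = "---\n";
const CLOSE = "\n---\n";

// Serialize: frontmatter text between delimiters, body appended untouched.
function joinDoc(frontmatter: string, body: string): string {
  return OPEN + frontmatter + CLOSE + body;
}

// Parse: cut at the first closing delimiter and return the rest verbatim.
// The body is never trimmed or re-encoded, so CRLF line endings, trailing
// spaces, and inline claim markers all survive exactly.
function splitDoc(doc: string): SplitDoc {
  if (!doc.startsWith(OPEN)) {
    return { frontmatter: "", body: doc };
  }
  const end = doc.indexOf(CLOSE, OPEN.length);
  if (end === -1) {
    throw new Error("unterminated frontmatter block");
  }
  return {
    frontmatter: doc.slice(OPEN.length, end),
    body: doc.slice(end + CLOSE.length),
  };
}
```

Only the first closing delimiter ends the frontmatter, so a `---` rule inside the body stays part of the body.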

ethanj and others added 30 commits June 13, 2026 23:50
Map ExportPage to OKF frontmatter with x-llmwiki provenance block.
Includes canonical body stripping, sha256 content-hash, safeRefName helper.
…rences

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ethanj ethanj changed the base branch from feat/okf-export to main June 14, 2026 21:14
@ethanj ethanj merged commit 15f6b50 into main Jun 14, 2026
3 checks passed
@ethanj ethanj deleted the feat/okf-import branch June 14, 2026 21:22