diff --git a/.gitignore b/.gitignore
index a60f863..aff0cf1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,26 +1,25 @@
-node_modules/
-dist/
-.datasets/
-.gcr/
-public/data/
+
+# Site-specific generated files (produced by generate-data per SITE_ID)
+*.gem
+*.log
+*.tgz
 *.tsbuildinfo
 .DS_Store
+.datasets/
 .env
-.env.local
 .env.*
-
-# Site-specific generated files (produced by generate-data per SITE_ID)
+.env.local
+.gcr/
+.idea/
+.vscode/
+TODO*
+TODO.update-browser/
+coverage/
+dist/
+node_modules/
+public/data/
 public/datasets.json
+public/logos/
 public/routing.json
 public/site-config.json
-public/logos/
-
-TODO*
 site-configs.yml
-TODO.update-browser/
-*.gem
-coverage/
-*.log
-*.tgz
-.idea/
-.vscode/
diff --git a/CLAUDE.md b/CLAUDE.md
index 3bb64ae..65f2f12 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -16,7 +16,7 @@ Glossarist Concept Browser (`@glossarist/concept-browser`) — a Vue 3 SPA that
 - Run a single test: `npx vitest run src/__tests__/graph.test.ts`
 - `npm run fetch-datasets` — Clone/update source repos into `.datasets/`, harmonize concepts to canonical format. Supports `DATASET_SOURCE_{ID}` env var for local path override.
 - `npm run generate-data` — Convert harmonized YAML concepts to JSON-LD. Reads from `.datasets/` (populated by fetch-datasets) and `datasets.yml`.
-- `node scripts/build-edges.js` — Pre-compute cross-reference edges from generated concept JSON files (run after `generate-data`)
+- `node scripts/build-edges.js` — Pre-compute cross-reference and domain edges from generated concept JSON files and write `edges.json` + `domain-nodes.json` (run after `generate-data`)
 - `npm run build:full` — Full pipeline: fetch + generate + build-edges + build
 - `npx concept-browser <command>` — CLI: fetch, generate, edges, build
 
@@ -32,7 +32,7 @@ All datasets are harmonized to ONE canonical YAML format before `generate-data.m
 The target architecture uses GCR (Glossarist Concept Repository) files — sealed ZIP archives with harmonized concepts + metadata, modeled after LXR from `lutaml-xsd`. See `docs/gcr-spec.md`. Currently, the browser reads from cloned repos; when the glossarist gem provides `glossarist package`, the pipeline will switch to consuming `.gcr` files.
 
 ### Data Flow
-`public/datasets.json` → lists dataset IDs → each maps to `public/data/{id}/` containing `manifest.json`, `index.json`, `edges.json`, and `concepts/*.json`. The `AdapterFactory` discovers datasets at startup, loads manifests and indexes, then concepts are fetched on-demand when a user navigates to one.
+`public/datasets.json` → lists dataset IDs → each maps to `public/data/{id}/` containing `manifest.json`, `index.json`, `edges.json` (cross-reference + domain edges), `domain-nodes.json` (domain classification nodes with concept counts), and `concepts/*.json`. The `AdapterFactory` discovers datasets at startup, loads manifests and indexes, then concepts are fetched on-demand when a user navigates to one.
 
 ### Key Layers
 
diff --git a/README.md b/README.md
index c2cd1d2..f49e208 100644
--- a/README.md
+++ b/README.md
@@ -57,9 +57,10 @@ datasets.yml
               └─> public/data/{id}/
                   ├── manifest.json      Dataset metadata
                   ├── index.json         Concept listing (chunked for large sets)
-                  ├── edges.json         Pre-computed cross-references
+                  ├── edges.json         Pre-computed cross-reference + domain edges
+                  ├── domain-nodes.json  Domain classification nodes
                   └── concepts/*.json    Individual concept documents
-                  └─> scripts/build-edges.js  (extract graph edges)
+          └─> scripts/build-edges.js  (extract graph + domain edges)
 ```
 
 ### Step-by-step
diff --git a/TODO.generalized/01-canonical-concept-format.md b/TODO.generalized/01-canonical-concept-format.md
deleted file mode 100644
index 7d568d7..0000000
--- a/TODO.generalized/01-canonical-concept-format.md
+++ /dev/null
@@ -1,71 +0,0 @@
-# Status: DONE
-
-# 01 — Canonical Concept Format Specification
-
-## Context
-
-All glossarist datasets currently use slightly different YAML formats (IEV bare strings, Geolexica arrays, osgeo `authoritative_source`). The browser must not handle format variants — all datasets must conform to ONE canonical format before the browser sees them.
-
-## Task
-
-Create `docs/dataset-schema.md` defining the canonical concept YAML format and the harmonization rules.
-
-### Canonical concept YAML
-
-```yaml
-termid: "102-01-01"              # string, unique within dataset
-term: equality                   # convenience: preferred English term
-eng:                             # language block (ISO 639-2 code)
-  terms:                         # REQUIRED, at least 1
-    - type: expression           # expression | symbol | abbreviation
-      designation: equality
-      normative_status: preferred # preferred | deprecated | admitted
-      gender: f                  # optional
-      plurality: singular        # optional
-      usage_info: Mathematik     # optional
-  definition:                    # ALWAYS array of {content: "..."} objects
-    - content: "relation between two entities..."
-  notes:                         # optional, array of strings
-    - "Note 1 content"
-  examples:                      # optional, array of strings
-    - "Example 1"
-  language_code: eng
-  entry_status: valid            # valid | superseded | withdrawn | draft
-  sources:                       # ALWAYS array (normalize singular forms)
-    - type: authoritative        # authoritative | lineage
-      origin:
-        ref: ISO 1087-1:2000
-        clause: "3.4.16"
-        link: https://www.iso.org/standard/20057.html
-  dates:                         # ALWAYS array of {type, date}
-    - type: accepted
-      date: "2008-08-01T00:00:00+00:00"
-  review_date: "2024-01-01"
-  review_decision_date: "2024-01-01"
-  review_decision_event: published
-```
-
-### Harmonization rules
-
-| Variant | Source format | Harmonized to |
-|---------|--------------|---------------|
-| Definition | bare string `"text"` | `[{content: "text"}]` |
-| Definition | `[{content: "text"}]` | unchanged |
-| Sources | `authoritative_source: {link: "..."}` | `sources: [{type: authoritative, origin: {link: "..."}}]` |
-| Sources | `sources: [{type, origin}]` | unchanged |
-| Sources | absent (IEV) | absent (kept absent) |
-| Dates | `date_accepted: "..."` scalar | `dates: [{type: accepted, date: "..."}]` |
-| Dates | `dates: [{type, date}]` array | unchanged |
-| Entry status | `"Standard"` | `"valid"` |
-| Notes | bare strings | bare strings (kept) |
-| Terms | `abbrev: true` (osgeo) | `type: abbreviation` |
-| `_revisions` | present (isotc211) | **stripped** |
-
-## Files
-
-- Create: `docs/dataset-schema.md`
-
-## Verification
-
-- Document exists, covers all fields, lists all harmonization rules
-- Cross-referenced by GCR spec and adding-a-dataset doc
diff --git a/TODO.generalized/02-gcr-packaging-format.md b/TODO.generalized/02-gcr-packaging-format.md
deleted file mode 100644
index 81abbe7..0000000
--- a/TODO.generalized/02-gcr-packaging-format.md
+++ /dev/null
@@ -1,85 +0,0 @@
-# Status: DONE
-
-# 02 — GCR Packaging Format Specification
-
-## Context
-
-Modeled after LXR from `lutaml-xsd`. A sealed `.gcr` ZIP file bundles harmonized concept data + metadata so that datasets are immutable, self-describing artifacts. The browser pipeline reads GCR files instead of raw repos.
-
-## Task
-
-Create `docs/gcr-spec.md` defining the GCR format.
-
-### GCR ZIP structure
-
-```
-my-dataset.gcr (ZIP)
-├── metadata.yaml              # Dataset metadata + statistics
-├── register.yaml              # Original register metadata from source repo
-├── concepts/                  # Harmonized concept YAML files (canonical format)
-│   ├── 102-01-01.yaml
-│   ├── 102-01-02.yaml
-│   └── ...
-└── concepts_data/             # Pre-serialized (optional, for fast loading)
-    └── ...                    # Future: JSON or Marshal serialized concepts
-```
-
-### metadata.yaml schema
-
-```yaml
-title: IEC Electropedia (IEV)               # required
-description: International Electrotechnical...  # required
-glossarist_version: 2.4.0                    # required
-created_at: "2026-04-28T12:00:00+09:00"     # required
-created_by: glossarist CLI                   # required
-
-statistics:                                  # required
-  concept_count: 22228
-  languages: [eng, ara, deu, fra, ...]
-  concepts_with_definitions: 20000
-  concepts_with_sources: 18000
-
-owner: IEC TC 1                              # optional
-homepage: https://www.electropedia.org       # optional
-repository: https://github.com/glossarist/...  # optional
-license: CC-BY-SA                            # optional
-tags: [electrotechnical, iec, multilingual]  # optional
-
-appearance:                                  # optional
-  color: "#3366ff"
-
-links:                                       # optional
-  - name: IEC Electropedia
-    url: https://www.electropedia.org
-
-schema_version: "1.0.0"                      # required
-```
-
-### Validation rules (for `glossarist validate`)
-
-- `metadata.yaml` exists and parses
-- `concepts/` directory exists with ≥1 YAML file
-- Each concept has `termid` (string)
-- Each concept has ≥1 language block with ≥1 term
-- No duplicate `termid` values
-- `definition` is always array of `{content: "..."}` (harmonized)
-- `sources` is always array (no `authoritative_source` singular)
-- `entry_status` values are from allowed set: `valid`, `superseded`, `withdrawn`, `draft`
-- Cross-references (if present) are valid concept IDs
-
-### Reference: LXR format (lutaml-xsd)
-
-The LXR format is a ZIP with `metadata.yaml` + `schemas/*.xsd` + `schemas_data/*.marshal`. Key files:
-- `/Users/mulgogi/src/lutaml/lutaml-xsd/lib/lutaml/xsd/schema_repository_package.rb` — ZIP read/write
-- `/Users/mulgogi/src/lutaml/lutaml-xsd/lib/lutaml/xsd/package_builder.rb` — serialization orchestration
-- `/Users/mulgogi/src/lutaml/lutaml-xsd/lib/lutaml/xsd/schema_repository_metadata.rb` — metadata model
-- `/Users/mulgogi/src/lutaml/lutaml-xsd/lib/lutaml/xsd/package_configuration.rb` — strategy configuration
-
-## Files
-
-- Create: `docs/gcr-spec.md`
-
-## Verification
-
-- Document exists, specifies ZIP structure, metadata schema, validation rules
-- References canonical format from `docs/dataset-schema.md`
diff --git a/TODO.generalized/03-datasets-yml.md b/TODO.generalized/03-datasets-yml.md
deleted file mode 100644
index c17d52e..0000000
--- a/TODO.generalized/03-datasets-yml.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Status: DONE
-
-# 03 — Create datasets.yml + .gitignore
-
-## Context
-
-The browser needs a configuration file listing all datasets with their source repos, colors, and metadata. Currently the dataset list is hardcoded in `generate-data.mjs` (lines 309-346). Externalizing it to `datasets.yml` means adding a dataset requires only editing one file.
-
-## Task
-
-### Create `datasets.yml`
-
-```yaml
-# datasets.yml — Glossarist Vocabulary Browser dataset registry
-# Add a new dataset by adding an entry below. No code changes required.
-# Run: npm run fetch-datasets && npm run generate-data && npm run build-edges
-
-datasets:
-  - id: iev
-    sourceRepo: https://github.com/glossarist/glossarist-data-iev
-    title: "IEC Electropedia (IEV)"
-    owner: IEC TC 1
-    existingSiteUrl: https://www.electropedia.org
-    color: "#3366ff"
-    tags: [electrotechnical, iec, multilingual]
-
-  - id: isotc211
-    sourceRepo: https://github.com/geolexica/isotc211-glossary
-    owner: ISO/TC 211
-    existingSiteUrl: https://isotc211.geolexica.org
-    color: "#0d9488"
-    tags: [geographic-information, gis, iso, multilingual]
-
-  - id: isotc204
-    sourceRepo: https://github.com/geolexica/isotc204-glossary
-    owner: ISO/TC 204
-    existingSiteUrl: https://isotc204.geolexica.org
-    color: "#d97706"
-    tags: [transport, its, iso, automated-driving]
-
-  - id: osgeo
-    sourceRepo: https://github.com/geolexica/osgeo-glossary
-    owner: OSGeo
-    existingSiteUrl: https://osgeo.geolexica.org
-    color: "#059669"
-    tags: [osgeo, open-source, gis]
-```
-
-Metadata resolution: `datasets.yml` overrides → repo's `register.yaml` → defaults.
-
-### Create `.gitignore`
-
-```
-node_modules/
-dist/
-.datasets/
-public/data/
-*.tsbuildinfo
-.DS_Store
-.env
-.env.local
-```
-
-## Files
-
-- Create: `datasets.yml`
-- Create: `.gitignore`
-
-## Verification
-
-- `datasets.yml` parses as valid YAML
-- `.gitignore` excludes generated data directories
diff --git a/TODO.generalized/04-fetch-datasets.md b/TODO.generalized/04-fetch-datasets.md
deleted file mode 100644
index 96cdc38..0000000
--- a/TODO.generalized/04-fetch-datasets.md
+++ /dev/null
@@ -1,48 +0,0 @@
-# Status: DONE
-
-# 04 — Create scripts/fetch-datasets.mjs
-
-## Context
-
-Currently dataset source directories are hardcoded absolute paths in `generate-data.mjs` (lines 11-13). Need a script that reads `datasets.yml`, clones/updates the source repos, and makes them available for data generation.
-
-## Task
-
-Create `scripts/fetch-datasets.mjs` that:
-
-1. Reads `datasets.yml` (using `js-yaml`, already a devDependency)
-2. For each dataset:
-   - Check `DATASET_SOURCE_{ID}` env var for local path override
-   - If no override, `git clone --depth 1` into `.datasets/{id}/` (or `git fetch` + `reset` if exists)
-   - Supports `GITHUB_TOKEN` for private repos
-3. Reads `.datasets/{id}/register.yaml` for metadata (title, description, languages)
-4. Validates source directory exists with `.yaml` concept files
-5. Outputs resolved metadata
-
-### Key implementation details
-
-- Use `child_process.execSync` for git operations
-- Clone with `--depth 1` for speed (we don't need history)
-- If `.datasets/{id}/` already exists, do `git fetch origin && git reset --hard origin/HEAD`
-- Read `register.yaml` for `name` (→ title), `description`, `subregisters` (→ languages)
-- Exit gracefully if a repo fails (don't block other datasets)
-- Support `DATASET_SOURCE_IEV=/local/path` env var override for development
-
-### Example usage
-
-```bash
-npm run fetch-datasets
-# or with local override:
-DATASET_SOURCE_IEV=/Users/me/src/glossarist/glossarist-data-iev npm run fetch-datasets
-```
-
-## Files
-
-- Create: `scripts/fetch-datasets.mjs`
-- Modify: `package.json` — add `"fetch-datasets": "node scripts/fetch-datasets.mjs"` script
-
-## Verification
-
-- `npm run fetch-datasets` creates `.datasets/` with all 4 repos
-- Re-running updates existing repos without errors
-- `DATASET_SOURCE_IEV=/local/path npm run fetch-datasets` uses local path
diff --git a/TODO.generalized/05-update-generate-data.md b/TODO.generalized/05-update-generate-data.md
deleted file mode 100644
index d3f70e8..0000000
--- a/TODO.generalized/05-update-generate-data.md
+++ /dev/null
@@ -1,86 +0,0 @@
-# Status: DONE
-
-# 05 — Update scripts/generate-data.mjs
-
-## Context
-
-`generate-data.mjs` has hardcoded paths (lines 11-13), hardcoded cross-ref maps (lines 17-19), and format-variant handling (bare strings in `defsToJsonLd`, inline text scanning in `extractInlineRefs`). Must read from `datasets.yml` + `.datasets/` and handle only the canonical format.
-
-## Task
-
-### Remove
-
-- Hardcoded `IEV_DIR`, `TC211_DIR`, `TC204_DIR` constants (lines 11-13)
-- Hardcoded `REF_PREFIX_MAP` and `URN_STANDARD_MAP` (lines 17-19) — inline refs are pre-extracted during harmonization
-- Hardcoded `DATASETS` array (lines 309-346)
-- Format-variant handling in `defsToJsonLd()` (line 57: `typeof defs === 'string' ? [...] : defs`)
-- Format-variant handling in `extractInlineRefs()` (lines 86-91: bare string normalization)
-- The entire `extractInlineRefs()` function — references are pre-extracted as `gl:references` during harmonization
-
-### Add
-
-- Read `datasets.yml` for dataset list and configuration
-- Read `.datasets/{id}/register.yaml` for metadata (title, description, languages)
-- Resolve source dirs from `.datasets/{id}/concepts/` or `DATASET_SOURCE_{ID}` env var
-- Merge metadata: `datasets.yml` overrides → `register.yaml` → defaults
-- Simplify `defsToJsonLd()` to assume array-of-objects format only
-
-### Keep unchanged
-
-- All JSON-LD conversion logic (`yamlToJsonLd`, `termToDesignation`, `sourcesToJsonLd`)
-- `processDataset()` flow (chunking, manifest generation)
-- `DS_PALETTE` fallback (used when no color in datasets.yml)
-
-### Simplified `defsToJsonLd`
-
-```js
-function defsToJsonLd(defs) {
-  if (!defs || !Array.isArray(defs)) return [];
-  return defs
-    .map(d => ({
-      '@type': 'gl:DetailedDefinition',
-      'gl:content': d.content || '',
-    }))
-    .filter(d => d['gl:content']);
-}
-```
-
-### Main loop reads from datasets.yml
-
-```js
-import datasetsConfig from './datasets.yml' with { type: 'yaml' }; // or parse at runtime
-
-for (const ds of datasetsConfig.datasets) {
-  const dir = process.env[`DATASET_SOURCE_${ds.id.toUpperCase()}`]
-    || path.join(ROOT, '.datasets', ds.id, 'concepts');
-  if (!fs.existsSync(dir)) {
-    console.warn(`Skipping ${ds.id}: source not found (${dir})`);
-    continue;
-  }
-  // Read register.yaml for metadata
-  const registerYaml = readYaml(path.join(ROOT, '.datasets', ds.id, 'register.yaml'));
-  processDataset(dir, ds.id, {
-    title: ds.title || registerYaml.name,
-    description: ds.description || registerYaml.description,
-    owner: ds.owner,
-    languages: ds.languages || Object.keys(registerYaml.subregisters || {}),
-    color: ds.color || DS_PALETTE[idx % DS_PALETTE.length],
-    sourceRepo: ds.sourceRepo,
-    existingSiteUrl: ds.existingSiteUrl,
-    tags: ds.tags,
-  });
-}
-```
-
-## Files
-
-- Modify: `scripts/generate-data.mjs`
-
-## Verification
-
-- `npm run generate-data` works with datasets from `.datasets/`
-- `npm run generate-data` works with `DATASET_SOURCE_IEV` env var
-- No hardcoded dataset paths remain
-- `defsToJsonLd` does not handle bare strings
-- `extractInlineRefs` removed
-- All 4 datasets generate successfully (iev, isotc211, isotc204, osgeo)
diff --git a/TODO.generalized/06-harmonize-osgeo.md b/TODO.generalized/06-harmonize-osgeo.md
deleted file mode 100644
index eaff0ba..0000000
--- a/TODO.generalized/06-harmonize-osgeo.md
+++ /dev/null
@@ -1,7 +0,0 @@
-# 06 — Harmonize osgeo-glossary Dataset
-
-## Status: DONE (integrated into fetch-datasets.mjs)
-
-The harmonization is handled by `scripts/fetch-datasets.mjs` which normalizes all concept YAML files to canonical format during the fetch step. No separate script needed.
-
-See `docs/dataset-schema.md` for the harmonization rules applied.
diff --git a/TODO.generalized/07-harmonize-iev.md b/TODO.generalized/07-harmonize-iev.md
deleted file mode 100644
index 3b4cabb..0000000
--- a/TODO.generalized/07-harmonize-iev.md
+++ /dev/null
@@ -1,7 +0,0 @@
-# 07 — Harmonize IEV Dataset
-
-## Status: DONE (integrated into fetch-datasets.mjs)
-
-The harmonization is handled by `scripts/fetch-datasets.mjs` which normalizes all concept YAML files to canonical format during the fetch step. No separate script needed.
-
-See `docs/dataset-schema.md` for the harmonization rules applied.
diff --git a/TODO.generalized/08-spa-deployment-config.md b/TODO.generalized/08-spa-deployment-config.md
deleted file mode 100644
index 4bc21bb..0000000
--- a/TODO.generalized/08-spa-deployment-config.md
+++ /dev/null
@@ -1,119 +0,0 @@
-# Status: DONE
-
-# 08 — SPA Deployment Configuration
-
-## Context
-
-The browser needs to deploy as an SPA to GitHub Pages at https://www.geolexica.org. This requires:
-- Base path configuration in Vite and Vue Router
-- SPA fallback (404.html) for client-side routing
-- GitHub Actions CI/CD pipeline
-
-## Task
-
-### vite.config.ts
-
-Add `base` option:
-
-```typescript
-export default defineConfig({
-  base: process.env.BASE_PATH || '/',
-  // ... rest unchanged
-})
-```
-
-### src/router/index.ts (line 34)
-
-```typescript
-history: createWebHistory(import.meta.env.BASE_URL),
-```
-
-### scripts/generate-404.js
-
-Copy `dist/index.html` → `dist/404.html` for GitHub Pages SPA fallback.
-
-```js
-import { copyFileSync } from 'fs';
-import { join, dirname } from 'path';
-import { fileURLToPath } from 'url';
-
-const __dirname = dirname(fileURLToPath(import.meta.url));
-const dist = join(__dirname, '..', 'dist');
-copyFileSync(join(dist, 'index.html'), join(dist, '404.html'));
-console.log('Created dist/404.html for SPA fallback');
-```
-
-### package.json scripts
-
-Add:
-```json
-{
-  "fetch-datasets": "node scripts/fetch-datasets.mjs",
-  "build:full": "npm run fetch-datasets && npm run generate-data && node scripts/build-edges.js && npm run build",
-  "postbuild": "node scripts/generate-404.js"
-}
-```
-
-### .github/workflows/deploy.yml
-
-```yaml
-name: Deploy to GitHub Pages
-
-on:
-  push:
-    branches: [main]
-  workflow_dispatch:
-
-permissions:
-  contents: read
-  pages: write
-  id-token: write
-
-concurrency:
-  group: pages
-  cancel-in-progress: false
-
-jobs:
-  build:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
-        with:
-          node-version: 20
-          cache: npm
-      - run: npm ci
-      - run: npm run fetch-datasets
-        env:
-          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-      - run: npm run generate-data
-      - run: node scripts/build-edges.js
-      - run: npm run build
-      - uses: actions/upload-pages-artifact@v3
-        with:
-          path: dist
-
-  deploy:
-    needs: build
-    runs-on: ubuntu-latest
-    environment:
-      name: github-pages
-      url: ${{ steps.deployment.outputs.page_url }}
-    steps:
-      - id: deployment
-        uses: actions/deploy-pages@v4
-```
-
-## Files
-
-- Modify: `vite.config.ts`
-- Modify: `src/router/index.ts`
-- Modify: `package.json`
-- Create: `scripts/generate-404.js`
-- Create: `.github/workflows/deploy.yml`
-
-## Verification
-
-- `npm run build` creates `dist/404.html`
-- SPA routes work with direct URL access (404.html fallback)
-- GitHub Actions workflow runs on push to main
diff --git a/TODO.generalized/09-update-docs.md b/TODO.generalized/09-update-docs.md
deleted file mode 100644
index 9283093..0000000
--- a/TODO.generalized/09-update-docs.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# Status: DONE
-
-# 09 — Update Documentation
-
-## Context
-
-`docs/adding-a-dataset.md` is outdated — it references the old color system (per-dataset Tailwind colors, `dsColor()` functions), old CLI flags (`--input`, `--id`), and inline cross-reference patterns that are being removed. The new pipeline uses `datasets.yml` + `fetch-datasets` + `generate-data` with no code changes.
-
-## Task
-
-### Rewrite `docs/adding-a-dataset.md`
-
-Reflect the new pipeline:
-
-1. Add entry to `datasets.yml` (id, sourceRepo, owner, color, tags)
-2. Run `npm run fetch-datasets && npm run generate-data && npm run build-edges`
-3. No code changes needed
-4. Reference `docs/dataset-schema.md` for canonical concept format
-5. Reference `docs/gcr-spec.md` for GCR packaging format
-
-Remove all references to:
-- Per-dataset Tailwind color configuration
-- `dsColor()`, `dsAccent()`, `REGISTER_COLORS` functions
-- Inline cross-reference patterns (`{{...IEV:...}}`, `{urn:iso:...}`)
-- `--input`, `--id`, `--title` CLI flags
-- Manual `datasets.json` editing
-
-### Update `docs/architecture.md`
-
-Update data pipeline description to reflect:
-- Source repos → `datasets.yml` + `fetch-datasets.mjs` → `.datasets/`
-- `.datasets/` → `generate-data.mjs` (canonical format only) → `public/data/`
-- No format-variant handling
-
-### Update `CLAUDE.md`
-
-Update to reflect:
-- `datasets.yml` as the dataset registry (not `DATASETS` array in generate-data.mjs)
-- `npm run fetch-datasets` command
-- `npm run build:full` command
-- GCR packaging format reference
-- Canonical concept format
-
-## Files
-
-- Modify: `docs/adding-a-dataset.md`
-- Modify: `docs/architecture.md`
-- Modify: `CLAUDE.md`
-
-## Verification
-
-- No references to old color system remain
-- No references to hardcoded paths remain
-- Pipeline documentation matches actual scripts
diff --git a/TODO.generalized/10-glossarist-gem-commands.md b/TODO.generalized/10-glossarist-gem-commands.md
deleted file mode 100644
index f52827c..0000000
--- a/TODO.generalized/10-glossarist-gem-commands.md
+++ /dev/null
@@ -1,73 +0,0 @@
-# Status: DONE
-
-# 10 — Glossarist Gem: upgrade, package, validate Commands
-
-## Context
-
-The glossarist-ruby gem (`/Users/mulgogi/src/glossarist/glossarist-ruby/`) currently has only `generate_latex`. Three new commands are needed to support the GCR workflow. This is a **separate repo and separate effort** from the browser.
-
-Reference implementations from `lutaml-xsd`:
-- `schema_repository_package.rb` — ZIP read/write logic
-- `package_builder.rb` — serialization orchestration
-- `schema_repository_metadata.rb` — metadata model with extensibility
-- `package_configuration.rb` — strategy configuration
-- `commands/package_command.rb` — CLI build/validate/info commands
-
-## Task
-
-### `glossarist harmonize <source_dir> -o <output_dir>`
-
-Reads a source concept repository (any format variant), normalizes to canonical format.
-
-Harmonization rules (from `docs/dataset-schema.md`):
-- Definitions: bare string → `[{content: "text"}]`
-- Sources: `authoritative_source` → `sources` array
-- Dates: scalar → `dates` array
-- Entry status: `"Standard"` → `"valid"`
-- Terms: `abbrev: true` → `type: abbreviation`
-- Inline refs: `{{term, IEV:xxx}}` → structured `references`
-- `_revisions`: stripped
-- `termid`: cast to string
-
-### `glossarist package <harmonized_dir> -o <output.gcr>`
-
-Creates a `.gcr` ZIP file:
-1. Read harmonized YAML directory
-2. Generate `metadata.yaml` (from `register.yaml` + computed statistics)
-3. Compute statistics (concept count, languages, concepts with definitions/sources)
-4. Assemble ZIP with `metadata.yaml`, `register.yaml`, `concepts/*.yaml`
-
-### `glossarist validate <path>`
-
-Validates a source directory or `.gcr` file:
-- `metadata.yaml` exists and parses
-- `concepts/` directory with ≥1 YAML file
-- Each concept has `termid` (string)
-- Each concept has ≥1 language block with ≥1 term
-- No duplicate `termid` values
-- Format compliance (canonical format rules)
-- Cross-reference integrity (optional)
-
-### Implementation approach
-
-1. Add `Glossarist::CLI` Thor commands in `lib/glossarist/cli.rb`
-2. Add `Glossarist::Package` module with `GcrPackage`, `GcrMetadata`, `GcrBuilder` classes
-3. Use `rubyzip` gem for ZIP creation/extraction
-4. Reuse `ManagedConceptCollection.load_from_files()` for reading concepts
-5. Statistics computed from loaded collection
-
-## Files (in glossarist-ruby repo)
-
-- Modify: `lib/glossarist/cli.rb`
-- Create: `lib/glossarist/package/`
-- Create: `lib/glossarist/package/gcr_package.rb`
-- Create: `lib/glossarist/package/gcr_metadata.rb`
-- Create: `lib/glossarist/package/gcr_builder.rb`
-- Modify: `glossarist.gemspec` — add `rubyzip` dependency
-
-## Verification
-
-- `glossarist harmonize` produces canonical YAML from any source format
-- `glossarist package` creates a valid `.gcr` file
-- `glossarist validate` catches format violations
-- Browser pipeline can read `.gcr` output