Skip to content

feat(memex): principal attributes memex — schema.org-keyed disclosure pre-fill, multi-agent approval#389

Merged
toadkicker merged 48 commits intomainfrom
feat/1ed9-continue-branch
May 5, 2026
Merged

feat(memex): principal attributes memex — schema.org-keyed disclosure pre-fill, multi-agent approval#389
toadkicker merged 48 commits intomainfrom
feat/1ed9-continue-branch

Conversation

@toadkicker
Copy link
Copy Markdown
Contributor

@toadkicker toadkicker commented May 5, 2026

Summary

  • Principal attributes memex: principal_attributes SQLite table (prop_name TEXT PRIMARY KEY, value TEXT, last_used) stores per-principal schema.org vocab values across sessions. Three agents asking for schema:departureAirport all share the same stored row — no hardcoded field list.
  • Multi-agent approval plan: IntentPlan gains candidates: Vec<AgentCandidate> (up to 3 scored agents). resolve_top_agents(n) replaces single-agent resolution; resolve_agent delegates to it.
  • AwaitingApproval block redesign: Agent selector checkboxes (all checked by default, hidden for single agent) + union disclosure form pre-filled from memex via get_principal_attributes Tauri command. Reactive Leptos signals for both.
  • canvas_approve_block accepts filled_values and selected_agent_names, stores non-credential values to memex, dispatches handshake to each selected agent with values merged into Phase 3 disclosures via HandshakeParams.extra_disclosures.
  • Security: credential params (api_key, token, access_key, etc.) are never persisted to plaintext memex — guarded via CREDENTIAL_PARAM_NAMES. reject_block now requires signedChallenge to prevent gate-leak on rejection. Reserved Phase 3 keys (@type, query) cannot be overwritten by extra_disclosures.
  • CI fixes: clippy::manual_split_once in pap-agents, rustfmt pass, removed "+ Note" button, Sources tab, and broken "< Canvas" settings link.

Test plan

  • cargo test --workspace — 1,209+ tests, all pass
  • just check-wasm / WASM compile check — clean
  • Clippy clean (cargo clippy -D warnings)
  • cargo fmt --check passes

🤖 Generated with Claude Code

toadkicker and others added 15 commits April 29, 2026 06:40
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The approve_block, reject_block, and block persistence methods were
silently discarding backend errors using `let _ = invoke(...).await`,
causing blocks to get stuck in unrecoverable states.

**Root cause**: When backend calls failed:
- Errors were never logged
- Block IDs remained in approval_in_flight forever, blocking retries
- Blocks stayed in AwaitingApproval state with no recovery path
- Block creation failures left orphaned UI-only blocks

**Fixes applied**:
1. approve_block: Proper match on invoke result
   - Logs errors to console
   - Transitions block to Failed state with error message
   - Always cleans up approval_in_flight (allows retry)
   - Warns on duplicate approval attempts

2. reject_block: Same error handling pattern
   - Logs rejection failures
   - Transitions to Failed state on error

3. Block persistence (canvas_block_create):
   - Logs when persistence fails
   - Provides helpful message that block remains memory-only
   - Doesn't block handshake on persistence failure

4. Message persistence (canvas_message_add):
   - Logs failures with warning
   - Continues operation even if DB write fails

**Impact**: Users now see actual backend errors (e.g. "Block not found",
"No Definitions Found") in console, retry buttons work correctly, and
the system degrades gracefully when persistence fails.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When canvas_retry fails with 'Block not found' (because block persistence
failed), the old orphaned UI block is deleted and a fresh prompt is submitted
instead of showing a permanent error state.

This fixes the 'command pattern' issue where failed persistence left blocks
in limbo - retry now gracefully falls back to creating a new block.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The .setup-overlay had no pointer-events rule, causing it to intercept
all clicks even when trying to interact with content behind it. Now
the overlay itself is click-through (pointer-events: none) while the
wizard box re-enables clicks (pointer-events: auto).

This was discovered via Playwright testing showing strict mode violation
when trying to click workflow elements.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…vements and e2e test work

Merges 7 commits from codex-papillon-canvas-browser-runtime: canvas/browser
runtime refinements, Ghost run panel, surface title, local catalog handshake,
orchestrator runtime, workflow labels, CSS overhaul, setup wizard fixes, and
Playwright e2e test suite updates. Excludes node_modules tracked on source branch
by adding node_modules/ to .gitignore.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…owser-runtime`:

**New files:**
- `canvas_ghost_run_panel.rs` — Ghost preflight panel component
- `canvas_surface_title.rs` — canvas surface title component
- `handshake/local_catalog.rs` — local catalog handshake logic (367 lines)
- `orchestrator_runtime.rs` — orchestrator runtime wiring
- `workflow_labels.rs` — workflow label definitions
- `e2e/playwright.local.config.ts` — local Playwright config
- `FIXES_APPLIED.md` — summary of all fixes applied on that branch

**Major modifications:**
- `canvas_workflow_pipeline.rs` — large canvas UX overhaul
- `styles/main.css` — substantial CSS changes (~2100 lines changed)
- `state/canvas.rs`, `handshake/mod.rs`, `pages/canvas.rs` — runtime wiring
- `e2e/tests/workflow.spec.ts` — ~444 lines of e2e test updates
- `setup_wizard.rs`, `topbar.rs`, `app.rs` — UI fixes

Also added `node_modules/` to `.gitignore` to prevent the Playwright deps from being accidentally committed. The 3 loose root files (`package.json`, `test-papillon.js`, `papillon-01-initial.png`) are untracked — let me know if you want to add them to `.gitignore` as well or relocate them.
Federation e2e plan was already implemented; keeping the plan file adds
noise to the docs tree.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix an infinite reactive loop in TemplatesTab where the auto-generate
Effect tracked new_config, causing new_config.set() to re-trigger it
endlessly. This blocked the JS thread for 60+ seconds whenever any
input event fired on SchemaTypeInput.

Two Rust fixes:
- templates_tab/mod.rs: use get_untracked() for new_config in the
  auto-generate Effect so the signal write doesn't re-trigger the Effect
- schema_type_input.rs: remove per-item RwSignal<bool> hover state from
  the For loop dropdown; replace with CSS .schema-suggestion:hover class

E2e test fixes:
- templates.spec.ts: restore form-based createTemplate now that the loop
  is fixed; update persist test to close settings overlay before navigating
- wysiwyg-registry.spec.ts, agents.spec.ts: fix for canvas-surface-status
  selector and autocomplete interaction
- tier2-functional.spec.ts: add 429 skip guards for rate-limited endpoints;
  relax CORS assertions to match Chrysalis server behavior
- tauri-mock.ts: add canvas persistence IPC handlers and template CRUD

Result: 369 passed, 32 skipped, 0 failed (401 total tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… cases

- Delete canary.spec.ts and chrysalis-integration.spec.ts: Chrysalis
  integration tests belong in the Chrysalis repo, not Papillon's e2e suite.
  Removes the CHRYSALIS_URL testIgnore condition from playwright.config.ts.
- Expand helpers.ts: add goToSettingsTab(page, tabName), submitAndWaitForBlock,
  and createEmptyCanvas as shared composable utilities.
- Remove duplicated local helpers from 5 spec files (goToTemplatesTab in
  templates.spec.ts and wysiwyg-registry.spec.ts, openNetworkTab in
  agents.spec.ts, submitAndWaitForBlock in advertiser-phase.spec.ts and
  reshape.spec.ts, createEmptyCanvas in workflow.spec.ts).
- Add error/failure cases to agents.spec.ts, wysiwyg-registry.spec.ts,
  ollama-config.spec.ts, and wysiwyg-registry.spec.ts: unknown DID no-ops,
  invalid JSON imports, duplicate template prevention, missing config fields,
  and empty-arg edge cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…est files

The chrysalis-integration.spec.ts and canary.spec.ts files were deleted
(Chrysalis protocol tests belong in the Chrysalis repo, not Papillon's e2e
suite). The CI chrysalis-e2e job hardcoded paths to both files, causing
'No tests found' failures. Remove the entire chrysalis-e2e job from ci.yml
and replace the canary health check in release-papillon.yml with the
smoke/web-standalone suite that actually ships with Papillon.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
kayak_flights.toml used Skyscanner URL params ({from}, {to}, {date})
that dynamic_handler.rs never substitutes — only {query} is replaced.
The broken endpoint returned 'No Definitions Found' which propagated as
a phase 4 failure in the UI.

Two fixes:
1. Remove the broken Skyscanner endpoint from kayak_flights.toml; switch
   to LLM-only with richer flight-focused instructions.
2. Change dynamic_handler.rs non-2xx path to fall through to the LLM
   (same as network errors) instead of surfacing the third-party API
   error string directly as a phase 4 failure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 5, 2026

Greptile Summary

  • Removes the broken Skyscanner/RapidAPI endpoint from kayak_flights.toml (URL parameters {from}, {to}, {date} were never substituted by the handler, only {query} is) and converts the agent to LLM-only mode; however, the new llm_instructions value itself contains {query} which is also never substituted, leaving a literal placeholder in the LLM system prompt on every call.
  • Changes the non-2xx HTTP response path in dynamic_handler.rs to fall through to the LLM instead of returning a TransportError::ServerError, which is the correct fix for the "Failed at phase 4" user-facing error; the response body is dropped unread which may affect connection pool reuse.
  • provider = \"DuckDuckGo\" in the updated TOML is inaccurate since no DuckDuckGo endpoint is used.

Confidence Score: 3/5

The core handler fix is correct but the new TOML introduces a runtime defect (unsubstituted {query} placeholder in the LLM system prompt) that should be resolved before merging.

One P1 defect (literal {query} in llm_instructions is never replaced, sending a malformed system prompt to the LLM on every flight query) combined with two P2 issues pulls the score below the P1 ceiling of 4.

crates/pap-agents/catalog/travel/kayak_flights.toml — the llm_instructions {query} substitution bug and the misleading provider field both need to be addressed.

Important Files Changed

Filename Overview
crates/pap-agents/src/dynamic_handler.rs Non-2xx branch now silently falls through to LLM instead of returning an error; let _ = status; is a no-op and the response body is dropped unread, which may hurt connection pooling.
crates/pap-agents/catalog/travel/kayak_flights.toml Broken Skyscanner endpoint removed and agent converted to LLM-only; introduces a {query} placeholder in llm_instructions that is never substituted at runtime, and an incorrect provider = "DuckDuckGo" value.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User query routed to DynamicAgentHandler] --> B{Endpoint configured?}
    B -- Yes --> C[Build & send HTTP request]
    B -- No --> G
    C --> D{HTTP response}
    D -- 2xx --> E{Extract JSON via response_mapping / jsonpath}
    E -- data found --> F[Return schema.org object]
    E -- no data --> G
    D -- "non-2xx (OLD: return ServerError)" --> G
    D -- network error --> G
    G[Call LLM with llm_instructions + query] --> H{agent returns pap:IntentClassification?}
    H -- Yes --> I[Parse & return JSON]
    H -- No --> J[Return schema:Answer]

    style D fill:#f9c,stroke:#c66
    style G fill:#cfc,stroke:#6a6
Loading

Fix All in Claude Code

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
crates/pap-agents/catalog/travel/kayak_flights.toml:11
**`{query}` in `llm_instructions` is never substituted**

`dynamic_handler.rs` only calls `endpoint.url_template.replace("{query}", &query)` and the body-template equivalent; it passes `self.def.llm_instructions` to `client.complete()` verbatim (line 235). The new `llm_instructions` value deliberately embeds `{query}` expecting runtime substitution, but no substitution happens — the LLM system prompt will always contain the literal string `{query}` instead of the actual flight search term.

The user query is still forwarded as the second argument to `complete`, so results won't be completely wrong, but the system-prompt instruction "The user is searching for: {query}" is always broken, sending a confusing placeholder to the model on every call.

### Issue 2 of 3
crates/pap-agents/catalog/travel/kayak_flights.toml:4
**`provider` field is misleading**

`provider = "DuckDuckGo"` is set, but no DuckDuckGo endpoint exists — this is a pure LLM-only agent. The `provider` field in other catalog agents identifies the actual data source (e.g. the old `"KAYAK / Flights API"`); setting it to an unrelated third party may confuse both the UI and future maintainers. A value like `"LLM"` or simply the actual LLM backend name would be more accurate.

### Issue 3 of 3
crates/pap-agents/src/dynamic_handler.rs:223
**Unconsumed response body may prevent connection reuse**

`let _ = status;` is a no-op — `status` was already bound on line 189 and copied (it's `Copy`). More importantly, dropping `resp` here without reading its body prevents `reqwest`'s blocking client from returning the underlying TCP connection to the pool, so every non-2xx API response forces a new connection on the next call. Consider draining the body with `let _ = resp.bytes();` before falling through to keep connection pooling working.

Reviews (1): Last reviewed commit: "fix(agents): fall through to LLM on non-..." | Re-trigger Greptile

returns = ["schema:Flight"]
source = "Catalog"
llm_instructions = "You are a travel assistant. Provide helpful travel information about Flight Search: relevant details, tips, and context for travelers."
llm_instructions = "You are a travel assistant helping users find flight information. The user is searching for: {query}. Provide helpful information about flights, typical price ranges, airlines that serve those routes, and booking tips. If the query includes airports or cities, mention common carriers and approximate travel times. Note that actual real-time prices require checking airline websites directly."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 {query} in llm_instructions is never substituted

dynamic_handler.rs only calls endpoint.url_template.replace("{query}", &query) and the body-template equivalent; it passes self.def.llm_instructions to client.complete() verbatim (line 235). The new llm_instructions value deliberately embeds {query} expecting runtime substitution, but no substitution happens — the LLM system prompt will always contain the literal string {query} instead of the actual flight search term.

The user query is still forwarded as the second argument to complete, so results won't be completely wrong, but the system-prompt instruction "The user is searching for: {query}" is always broken, sending a confusing placeholder to the model on every call.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/pap-agents/catalog/travel/kayak_flights.toml
Line: 11

Comment:
**`{query}` in `llm_instructions` is never substituted**

`dynamic_handler.rs` only calls `endpoint.url_template.replace("{query}", &query)` and the body-template equivalent; it passes `self.def.llm_instructions` to `client.complete()` verbatim (line 235). The new `llm_instructions` value deliberately embeds `{query}` expecting runtime substitution, but no substitution happens — the LLM system prompt will always contain the literal string `{query}` instead of the actual flight search term.

The user query is still forwarded as the second argument to `complete`, so results won't be completely wrong, but the system-prompt instruction "The user is searching for: {query}" is always broken, sending a confusing placeholder to the model on every call.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

name = "Flight Search"
provider = "KAYAK / Flights API"
description = "Search and compare flight prices across major airlines."
provider = "DuckDuckGo"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 provider field is misleading

provider = "DuckDuckGo" is set, but no DuckDuckGo endpoint exists — this is a pure LLM-only agent. The provider field in other catalog agents identifies the actual data source (e.g. the old "KAYAK / Flights API"); setting it to an unrelated third party may confuse both the UI and future maintainers. A value like "LLM" or simply the actual LLM backend name would be more accurate.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/pap-agents/catalog/travel/kayak_flights.toml
Line: 4

Comment:
**`provider` field is misleading**

`provider = "DuckDuckGo"` is set, but no DuckDuckGo endpoint exists — this is a pure LLM-only agent. The `provider` field in other catalog agents identifies the actual data source (e.g. the old `"KAYAK / Flights API"`); setting it to an unrelated third party may confuse both the UI and future maintainers. A value like `"LLM"` or simply the actual LLM backend name would be more accurate.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

// third-party API's error message directly to the user.
// API errors (missing keys, wrong params, rate limits) are
// implementation details — the LLM can still answer the query.
let _ = status;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unconsumed response body may prevent connection reuse

let _ = status; is a no-op — status was already bound on line 189 and copied (it's Copy). More importantly, dropping resp here without reading its body prevents reqwest's blocking client from returning the underlying TCP connection to the pool, so every non-2xx API response forces a new connection on the next call. Consider draining the body with let _ = resp.bytes(); before falling through to keep connection pooling working.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/pap-agents/src/dynamic_handler.rs
Line: 223

Comment:
**Unconsumed response body may prevent connection reuse**

`let _ = status;` is a no-op — `status` was already bound on line 189 and copied (it's `Copy`). More importantly, dropping `resp` here without reading its body prevents `reqwest`'s blocking client from returning the underlying TCP connection to the pool, so every non-2xx API response forces a new connection on the next call. Consider draining the body with `let _ = resp.bytes();` before falling through to keep connection pooling working.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

Benchmark Regression Report

PAP Protocol Benchmark Regression Check
========================================
Baseline: .bench-baseline/baseline.json
Threshold: 55%

  ed25519_keypair_generation                19.6 µs  (baseline: 19.6 µs, +0.1%)  [ok]
  did_key_derivation                         1.5 µs  (baseline: 1.5 µs, -0.3%)  [ok]
  mandate_create_sign                       24.0 µs  (baseline: 23.8 µs, +0.8%)  [ok]
  mandate_chain_verify_depth3              164.6 µs  (baseline: 144.8 µs, +13.7%)  [ok]
  sd_jwt_issue_5claims                      27.8 µs  (baseline: 27.8 µs, +0.3%)  [ok]
  sd_jwt_verify_disclose_3of5               55.3 µs  (baseline: 50.2 µs, +10.0%)  [ok]
  session_open_full_lifecycle              114.1 µs  (baseline: 108.3 µs, +5.3%)  [ok]
  receipt_create_cosign                     48.0 µs  (baseline: 48.0 µs, +0.0%)  [ok]
  federation_announce_local                 61.7 µs  (baseline: 66.6 µs, -7.3%)  [ok]


P99 Tail-Latency Check
----------------------
Results: target/p99_results.json
Threshold: 50%

  session_open_full_lifecycle(p99)         132.2 µs  (baseline: 500.0 µs, -73.5%)  [ok]
  mandate_chain_verify_depth3(p99)         159.9 µs  (baseline: 480.0 µs, -66.6%)  [ok]
  receipt_create_cosign(p99)                57.9 µs  (baseline: 210.0 µs, -72.4%)  [ok]

All benchmarks within 55% of baseline.

Threshold: 10% regression vs baseline from main

Todd Baur and others added 12 commits May 4, 2026 21:09
An agent represents delegated authority for a specific action. If the
HTTP endpoint returns non-2xx or a network error, that action failed —
silently substituting an LLM response violates the mandate and would
co-sign a receipt for something that never executed.

Revert the LLM fallthrough on non-2xx and network errors introduced in
cffd64d. The kayak_flights fix (removing the broken Skyscanner endpoint)
is the correct solution: no endpoint = LLM is the designed execution
path, not a fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…on-and-auth-disclosure.md`.

**7 tasks, key design decisions:**

- **Task 1** — new `param_extractor.rs` with `extract_template_params()` and `resolve_param()`. Auth params (`api_key`, `access_key`, etc.) return `None` — they never come from query text, only from disclosures/settings. Protocol intact.
- **Task 2** — `DynamicSession` gains `disclosed_props: HashMap<String, String>`. `handle_disclosure()` extracts all JSON fields (not just `query`). `execute()` substitutes all `{param}` in priority order: disclosed → agent_props → extractor.
- **Task 3** — `register_dynamic_with_props()` added to `AgentSet`. `state.rs` loads `agent_settings` for each agent at startup and passes them in.
- **Task 4** — 24 TOML files: `apikey=demo` → `{api_key}`, add `requires_disclosure = ["api_key"]` and `configurable_properties` entry.
- **Task 5** — 19 TOML files: add structural params to `requires_disclosure` so the orchestrator can surface them in mandate previews.
- **Task 6** — `build_disclosures()` helper includes any agent-required properties (like `api_key`) from saved settings into the Phase 3 disclosure object.
- **Task 7** — e2e test verifying no literal `{owner}` / `{repo}` reaches the user.

**Two execution options:**

**1. Subagent-Driven (recommended)** — dispatch a fresh subagent per task, review between tasks

**2. Inline Execution** — execute in this session using executing-plans

Which approach?
**1. BM25 routes to agents, doesn't extract entities.** The new `EntityExtractor` is a separate layer — invoked after BM25 has selected the right agent, to resolve `{owner}`, `{lat}`, etc. from the query text. It's LLM-first (structured JSON prompt listing the params needed), with the existing regex patterns as fallback. Auth params return `None` from this layer unconditionally.

**2. API keys via SD-JWT vault, not flat settings.** The flow is: agent declares `requires_disclosure = ["api_key"]` → Phase 3 handshake calls `vault_disclose_for_agent` (new Tauri command) → vault decrypts the stored key, returns it ephemerally → merged into the Phase 3 disclosure object → arrives at `handle_disclosure()` as `{"api_key": "sk-live-..."}` → substituted in the URL template. The principal unlocks the vault once; the orchestrator gates credential access per-agent without the user wrangling keys manually.

The plan is 8 tasks. Ready to execute whenever you want.
…am resolution

Implements extract_template_params(), extract_heuristic(), and EntityExtractor
struct to resolve {param} placeholders in dynamic agent URL templates from
natural-language queries, with AUTH_PARAMS exclusion for credential vault safety.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- extract_template_params: iterate with char_indices() instead of raw
  bytes so {param} extraction never panics on multibyte characters
- split_origin_destination: guard byte offsets from to_lowercase() with
  is_char_boundary() before slicing the original string
- parse_owner_repo: only treat the first slash-delimited segment as a
  hostname when it contains a dot, preventing "owner.name/repo" from
  being misread as "hostname/name/repo"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… entity extractor

Expands DynamicSession with a disclosed_props map populated from all
scalar Phase 3 disclosure properties (not just query). execute() now
applies a three-priority resolution chain — disclosed props, agent_props,
then EntityExtractor (LLM → heuristic) — to substitute every {param}
placeholder in endpoint URL templates before the SSRF safety check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n lock per execute

Pre-fill resolved map from disclosed_props (highest priority) and agent_props
before invoking EntityExtractor, so the LLM/heuristic path is only triggered
for params not already known. Adds EntityExtractor::resolve_params() to accept
a param name list directly without re-parsing a URL template. Also collapses
the two sessions.with() calls in execute() into one to avoid holding the lock
twice per request.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…I key disclosure

Adds ApiCredential variant to VaultItemData/VaultItemType, store_credential/get_credential
methods to Vault<S>, vault field to AppState, and four Tauri IPC commands (vault_open,
vault_seal, vault_store_credential, vault_disclose_for_agent) that gate all access through
the existing AES-256-GCM encrypted vault.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…WT path

Adds `build_disclosures` helper that merges vault-resolved credentials
into the Phase-3 disclosure object. Phase 3 of `execute()` now calls
`vault_disclose_for_agent` for each credential param in the agent's
`requires_disclosure` list before building the disclosure payload.
…vault errors in Phase 3

Two agents with similar names could previously map to the same slug and
share vault credentials. Use agent_did as the namespace key instead.
Vault failures in Phase 3 are now hard errors with actionable messages
instead of being silently dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…osure params

Added TDD validation test no_catalog_agent_has_hardcoded_demo_key that scans
all 309 catalog TOML files and asserts no url_template contains =demo, =api2demo,
or =DEMO. Added walkdir dev-dependency. Added serde(default) to published_to,
created_at, and updated_at fields so catalog TOMLs (which omit those fields) can
be deserialized by the test.

Fixed 34 agents total (24 from task spec + 10 discovered by the test): replaced
demo/DEMO/DEMO_KEY placeholders with {api_key}, added "api_key" to
requires_disclosure, and added configurable_properties blocks with
PropertyValueSpecification so the settings UI knows to prompt for the key.
pocket_recommendations.toml receives two auth params: {api_key} (consumer_key)
and {access_token}, with requires_disclosure = ["api_key", "access_token"].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…alidation test + fixes)

Added TDD validation test that asserts every {param} in a catalog agent's
url_template (excluding {query}) appears in requires_disclosure, so the
orchestrator can surface all needed params in the mandate preview. Fixed
23 catalog TOML files with missing structural params (lat/lon, owner/repo,
sport/league, etc.). Also percent-encoded literal JSON braces in
patents_view.toml url_template that were confusing the template parser.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@toadkicker toadkicker changed the title fix(agents): fall through to LLM on non-2xx API responses feat(agents): entity extraction + SD-JWT credential disclosure for broken catalog agents May 5, 2026
Todd Baur and others added 18 commits May 4, 2026 23:12
…proval_challenge command; add just web recipe
**Root cause:** `canvas_approve_block` (Tauri command) requires a `SignedChallenge` to prove the principal authorized the approval, but `approve_block` in the frontend was sending the request without one — guaranteed deserialization failure, meaning the `[ GRANT ACCESS ]` button never actually worked.

**Fix:** Added `sign_approval_challenge` Tauri command — issues a nonce and signs it internally with the principal's Ed25519 key (the key lives in Rust, so the frontend can't sign it directly). `approve_block` now calls this first, gets back a `SignedChallenge`, then includes it in the `canvas_approve_block` call.

**Demo path for tomorrow:**
```
just papillon          # Tauri desktop app
just registry-local    # docker compose up or plain HTTP registry
```
Or browser-only:
```
just web               # trunk serve at localhost:1420 (new recipe)
```

Settings → General → set Ollama endpoint + model → Save. Enter a prompt → block shows `AwaitingApproval` with disclosure scope → click `[ GRANT ACCESS ]` → handshake runs phases 1-6 → block renders on canvas.
…-memex.md`. It covers 8 tasks:

1. **DB schema** — `principal_attributes` table + 3 `DatabaseOps` methods (native/WASM/IndexedDB), with 4 unit tests
2. **`AgentCandidate` type** — new struct + `candidates` field on `IntentPlan`
3. **`resolve_top_agents`** — scoring + handler-building refactored to return top-N; `resolve_agent` delegates to it
4. **`get_principal_attributes` command** — new Tauri command that returns `HashMap<String,String>`
5. **`canvas_plan_prompt`** — calls `resolve_top_agents(3)`, builds union disclosure, populates `plan.candidates`
6. **`canvas_approve_block`** — accepts `filled_values` + `selected_agent_names`, stores to memex, multi-agent dispatch with `extra_disclosures` threaded through Phase 3
7. **Frontend AwaitingApproval** — agent checkboxes (all selected by default), memex-prefilled text inputs, updated `approve_block` signature
8. **Full workspace test + WASM check**

**Two execution options:**

**1. Subagent-Driven (recommended)** — I dispatch a fresh subagent per task, review spec compliance then code quality between tasks, continuous execution

**2. Inline Execution** — execute tasks in this session via the executing-plans skill

Which approach?
Adds a persistent key-value store for principal attributes keyed by
exact schema.org vocab strings (e.g. schema:givenName). Native backend
uses SQLite with upsert semantics; WASM uses in-memory HashMap;
IndexedDB delegates and persists. Includes 4 passing tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… tests

Add principal_attributes to build_snapshot_json() and restore_state() in
IndexedDbDatabase so browser page reloads no longer silently lose attribute
data. Add 4 WasmDatabase unit tests covering round-trip, upsert, missing-key,
and get-all cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… candidates

Extract build_handler as a private helper, add resolve_top_agents(n) that
scores all local + remote candidates and returns the top-N with handlers
built, then simplify resolve_agent to delegate to resolve_top_agents(1).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…entPlan.candidates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…per-agent with extra_disclosures

- HandshakeParams gains extra_disclosures field, merged into Phase 3 disclosures
- process_prompt_inner accepts extra_disclosures (after retry_count); process_prompt_with_extras added
- AppState.approval_values stores (selected_agent_names, filled_values) keyed by approval_request_id
- canvas_approve_block accepts filled_values/selected_agent_names (Option), persists attrs to memex
- canvas_plan_prompt post-approval path reads approval_values and dispatches per-agent with extras

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…irst-wins dispatch; cleanup

- Phase 3: skip @type and query keys in extra_disclosures merge to prevent injection
- approval_values cleanup now happens unconditionally after receiver.await (not only on approval)
- Multi-dispatch loop changed to first-wins semantics (break after first success)
- Added unit test for extra_disclosures merge guard
- Removed unused mod.rs re-export of process_prompt_with_extras (approval.rs imports directly)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…isclosure form

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…otes button, sources tab, Canvas back link

- Fix `clippy::manual_split_once` in entity_extractor.rs (pap-agents)
- Run cargo fmt --all to clear all format diffs across modified files
- Remove "+ Note" button from canvas page
- Remove Sources tab from CanvasBackFace (Workflow-only, no tab bar)
- Remove broken "< Canvas" back link from settings nav
- Drop unused source_panel module export and leptos_router::A import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… params from plaintext memex

- reject_block now calls sign_approval_challenge before invoking canvas_approve_block
  (missing challenge caused backend deserialization failure, leaving gate permanently stuck)
- canvas_approve_block skips persisting credential params (api_key, token, access_key, etc.)
  to principal_attributes — those belong in the vault, not the plaintext memex
- Fix misleading comment: process_prompt_with_extras does not emit block_resolved; the
  outer caller does (single event per dispatch, not one per candidate)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@toadkicker toadkicker changed the title feat(agents): entity extraction + SD-JWT credential disclosure for broken catalog agents feat(memex): principal attributes memex — schema.org-keyed disclosure pre-fill, multi-agent approval May 5, 2026
Todd Baur and others added 2 commits May 5, 2026 10:46
topbar-address-input was hardcoded to rgba(9,11,22,0.55) / rgba(255,255,255,0.08)
— always dark regardless of theme. Switch to --input-bg / --input-border which
are correctly defined for dark, light, and auto modes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ary_map_or

- resolution.rs: extract ScoredCandidate type alias for Vec<(String,String,Vec<String>,...,f64)>
- state.rs: extract ApprovalPayload type alias for (Vec<String>, HashMap<String,String>)
- execution.rs: #[allow(clippy::too_many_arguments)] on process_prompt_with_extras (8 params)
- dynamic.rs: replace .map_or(false, |ext| ext == "toml") with .is_some_and(...)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@toadkicker toadkicker merged commit 712449b into main May 5, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant