Skip to content

Move semantic hints to user prompt for cross-request caching#130

Merged
neuromechanist merged 2 commits into
developfrom
feature/issue-129-cache-friendly-prompts
Mar 30, 2026
Merged

Move semantic hints to user prompt for cross-request caching#130
neuromechanist merged 2 commits into
developfrom
feature/issue-129-cache-friendly-prompts

Conversation

@neuromechanist

Copy link
Copy Markdown
Member

Summary

  • Move semantic hints from system prompt to user prompt so the system prompt is static per schema version
  • Add a pointer in the system prompt directing the LLM to check the user message for hints
  • Update tests for the new prompt structure

Problem

Prompt caching broke between requests because semantic hints (different per image) were embedded in the system prompt. Since Anthropic's caching uses prefix matching, any change invalidated the cache for the entire ~1000-tag vocabulary and rules section.

Solution

The system prompt now contains only static content (vocabulary, rules, patterns). A short pointer says "Check the user message for SEMANTIC HINTS." The actual hints are in the user prompt, which already changes per request.

For batch processing of 1000 images, the system prompt cost is paid once and cached for all subsequent requests (within the 5-minute TTL).

Test plan

  • 455 tests pass, 0 failures
  • Comprehensive guide tests updated for new structure
  • Keyword extraction tests still pass (hints flow through user prompt)

Fixes #129

System prompt is now static per schema version, enabling prompt caching
across requests. Semantic hints (which change per image/description)
are placed in the user prompt instead. The system prompt includes a
pointer instructing the LLM to check the user message for hints.

Fixes #129
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Mar 30, 2026

Copy link
Copy Markdown

Deploying hedit with  Cloudflare Pages  Cloudflare Pages

Latest commit: 2a33a32
Status: ✅  Deploy successful!
Preview URL: https://c4a1b4e7.hedit.pages.dev
Branch Preview URL: https://feature-issue-129-cache-frie.hedit.pages.dev

View logs

@codecov

codecov Bot commented Mar 30, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

- Rename _format_semantic_hints to format_semantic_hints (public API,
  used cross-module)
- Align header: system prompt pointer and actual section both say
  "SEMANTIC HINTS"
- Soften system prompt wording to "may include" (hints are optional)
- Skip hints with empty tag keys
- Add debug logging when hints are included in user prompt
- Add 10 tests: user prompt with/without hints, confidence bucketing,
  system prompt caching invariant
@neuromechanist

Copy link
Copy Markdown
Member Author

PR Review Summary (3 agents: code-reviewer, silent-failure-hunter, test-analyzer)

Critical Issues (0 found)

None.

Important Issues (4 found, ALL FIXED in 2a33a32)

  • [code-reviewer] Header mismatch: system prompt pointer said "SEMANTIC HINTS" but actual output was "## POTENTIALLY RELEVANT TAGS". Aligned both to "SEMANTIC HINTS".

  • [code-reviewer + silent-failure-hunter] System prompt said "Check the user message for..." (imperative) even when no hints exist. Changed to "The user message may include... If no hints section is present, proceed without them."

  • [silent-failure-hunter] Deferred import of private _format_semantic_hints function. Made it public (format_semantic_hints) and moved import to module level. Also added continue for hints with empty tag keys.

  • [test-analyzer] Zero test coverage for hints in user prompt. Added 10 tests:

    • test_first_pass_with_semantic_hints
    • test_correction_pass_with_semantic_hints
    • test_no_hints_no_hints_section
    • test_empty_hints_no_hints_section
    • TestFormatSemanticHints (4 tests: None, empty, valid, confidence bucketing, empty tags)
    • TestSystemPromptCaching (system prompt has pointer but not dynamic content)

Suggestions (noted, not fixed)

  • [silent-failure-hunter] No debug logging when hints included. Fixed: added logger.debug.
  • [test-analyzer] Module-level format_semantic_hints could use boundary-score tests (exactly 0.8, 0.5). Low priority.

All tests pass

  • 465 passed, 0 failures

@neuromechanist neuromechanist merged commit c37650c into develop Mar 30, 2026
14 checks passed
@neuromechanist neuromechanist deleted the feature/issue-129-cache-friendly-prompts branch March 30, 2026 10:46
neuromechanist added a commit that referenced this pull request Mar 30, 2026
…hing (#135)

* Move semantic hints to user prompt for cross-request caching (#130)

* Move semantic hints from system prompt to user prompt

System prompt is now static per schema version, enabling prompt caching
across requests. Semantic hints (which change per image/description)
are placed in the user prompt instead. The system prompt includes a
pointer instructing the LLM to check the user message for hints.

Fixes #129

* Address review findings for cache-friendly prompts

- Rename _format_semantic_hints to format_semantic_hints (public API,
  used cross-module)
- Align header: system prompt pointer and actual section both say
  "SEMANTIC HINTS"
- Soften system prompt wording to "may include" (hints are optional)
- Skip hints with empty tag keys
- Add debug logging when hints are included in user prompt
- Add 10 tests: user prompt with/without hints, confidence bucketing,
  system prompt caching invariant

* Bump version to 0.7.6.dev3

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
neuromechanist added a commit that referenced this pull request Apr 1, 2026
* Move semantic hints to user prompt for cross-request caching (#130)

* Move semantic hints from system prompt to user prompt

System prompt is now static per schema version, enabling prompt caching
across requests. Semantic hints (which change per image/description)
are placed in the user prompt instead. The system prompt includes a
pointer instructing the LLM to check the user message for hints.

Fixes #129

* Address review findings for cache-friendly prompts

- Rename _format_semantic_hints to format_semantic_hints (public API,
  used cross-module)
- Align header: system prompt pointer and actual section both say
  "SEMANTIC HINTS"
- Soften system prompt wording to "may include" (hints are optional)
- Skip hints with empty tag keys
- Add debug logging when hints are included in user prompt
- Add 10 tests: user prompt with/without hints, confidence bucketing,
  system prompt caching invariant

* Bump version to 0.7.6.dev3

* Update default models to latest Qwen and Anthropic

- Evaluation: qwen/qwen3-235b-a22b-2507 -> qwen/qwen3.5-397b-a17b
  (most capable Qwen MoE, $0.39/M prompt)
- Vision: qwen/qwen3-vl-30b-a3b-instruct -> qwen/qwen3-vl-32b-instruct
  (newer VL model, $0.10/M prompt)
- Annotation: keep anthropic/claude-haiku-4.5 (unchanged)
- Replace all legacy gpt-oss-120b references in defaults and docs
- Provider: let OpenRouter auto-route for Qwen models

* Bump version to 0.7.6.dev4

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant