fix(feedback): add transfer evidence cards by rodboev · Pull Request #47 · hexo-ai/sia

rodboev · 2026-06-21T03:33:12Z

Closes #38

Summary

Add a per-generation transfer_evidence.json artifact that separates reusable improvements from task-specific residue before the next feedback handoff consumes generation output. The new artifact becomes the structured reuse boundary for feedback-context and context.md, while improvement.md stays the human-authored analysis file.
Tighten the transfer-evidence heuristic so that higher-is-better metrics count as reusable only when they strictly improve, and percent-formatted scores are parsed instead of silently falling back to unscored reuse decisions.
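A minimal sketch of the scoring rule described above, for illustration only. The function names `parse_score` and `score_supports_reuse`, the `higher_is_better` flag, and the choice to scale percent strings to fractions are assumptions and are not taken from the PR diff. A return value of `None` stands for an unscored decision.

```python
import math
from typing import Optional


def parse_score(raw: object) -> Optional[float]:
    """Read a score given as a number or a string such as '87.5%'.

    Hypothetical helper: returns None when no finite number can be read.
    Percent strings are scaled to fractions (an assumption of this sketch).
    """
    if isinstance(raw, bool):
        return None  # bool is a subclass of int; do not treat it as a score
    if isinstance(raw, (int, float)):
        value = float(raw)
    elif isinstance(raw, str):
        text = raw.strip()
        is_percent = text.endswith("%")
        if is_percent:
            text = text[:-1].strip()
        try:
            value = float(text)
        except ValueError:
            return None
        if is_percent:
            value /= 100.0
    else:
        return None
    return value if math.isfinite(value) else None


def score_supports_reuse(
    previous: object, current: object, higher_is_better: bool = True
) -> Optional[bool]:
    """Return True only for a strict improvement in the metric's direction.

    Returns None (unscored) when either score cannot be parsed.
    """
    prev = parse_score(previous)
    curr = parse_score(current)
    if prev is None or curr is None:
        return None
    # Equal scores are stagnant, so they never count as reusable gains.
    return curr > prev if higher_is_better else curr < prev
```

Under these assumptions, `score_supports_reuse("80%", "85%")` is a reusable gain, an unchanged score is not, and a loss-oriented metric counts only when it strictly decreases.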

Why

Current origin/main only carries forward raw execution and evaluation context plus a truncated carryover from improvement.md. That forces later generations to infer what was actually reusable, which can flatten score gains, formatting wins, task-specific residue, and unsupported claims into the same channel.

Scope

write transfer_evidence.json next to results.json after each generation
derive a conservative first-pass card from evaluation status, score delta when available, and bounded improvement.md bullets
render a deterministic TRANSFER EVIDENCE section in feedback context
project only bounded reusable guidance plus explicit cautions into context.md
update the harness feedback prompt and the focused goldens and tests that lock these text surfaces
keep the change narrow to the transfer-evidence contract; PR #36 may require a rebase because it touches the same files, but this PR does not expand into that separate leak-fix scope
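The write and render steps in the scope list above could look roughly like the following. This is a sketch under stated assumptions: the card keys (`status`, `reusable`, `task_specific`, `cautions`), the function names, and the `max_items` bound are hypothetical, and the actual schema in the PR may differ. Only the file name `transfer_evidence.json` and the `TRANSFER EVIDENCE` heading come from the description.

```python
import json
from pathlib import Path


def write_transfer_evidence(generation_dir: Path, card: dict) -> Path:
    """Write the card into the generation directory.

    Assumes generation_dir is where results.json lives, so the two files
    end up side by side. Sorted keys keep the file byte-stable across runs.
    """
    path = generation_dir / "transfer_evidence.json"
    text = json.dumps(card, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def render_transfer_evidence(card: dict, max_items: int = 3) -> str:
    """Render a deterministic text section from a card.

    Sections appear in a fixed order and each list is capped at max_items,
    so the same card always yields the same bounded output.
    """
    lines = ["TRANSFER EVIDENCE", f"status: {card.get('status', 'unknown')}"]
    for key in ("reusable", "task_specific", "cautions"):
        items = list(card.get(key, []))[:max_items]
        lines.append(f"{key}:")
        if items:
            lines.extend(f"  - {item}" for item in items)
        else:
            lines.append("  - (none)")
    return "\n".join(lines)
```

Fixing the section order and capping each list keeps the rendered text stable enough to lock with golden tests, which is the property the scope list asks for.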

Test plan

python -m pytest tests/test_orchestrator_helpers.py tests/test_feedback_context_golden.py tests/test_context_manager.py tests/test_context_golden.py tests/test_prompts_snapshot.py tests/test_generation_loop.py -q: 31 passed in 0.17s. After the successful exit, pytest emitted the recurring Windows temp-cleanup warning for C:\Users\Rod\AppData\Local\Temp\pytest-of-Rod\pytest-current.
python -m ruff check sia/layout.py sia/results.py sia/orchestrator.py sia/context_manager.py sia/prompts.py tests/test_orchestrator_helpers.py tests/test_feedback_context_golden.py tests/test_context_manager.py tests/test_context_golden.py tests/test_prompts_snapshot.py tests/test_generation_loop.py: passed.
python -m ruff format sia/layout.py sia/results.py sia/orchestrator.py sia/context_manager.py sia/prompts.py tests/test_orchestrator_helpers.py tests/test_feedback_context_golden.py tests/test_context_manager.py tests/test_context_golden.py tests/test_prompts_snapshot.py tests/test_generation_loop.py --check: passed, 11 files already formatted.
ty check sia/: exited cleanly. The only warnings were pre-existing unresolved-import warnings in the optional openhands and pydantic_ai integrations, which are outside this change.

rodboev added 7 commits June 20, 2026 22:08

feat(feedback): add transfer evidence cards

d9d35b5

fix(feedback): avoid arbitrary transfer score deltas

be087bd

fix(feedback): align transfer evidence with the issue contract

dfb38d7

fix(feedback): reject regressing transfer claims

1c3144f

fix(feedback): stop promoting rejected transfer changes

2497f1b

fix(feedback): respect loss-oriented transfer gains

f83e1cd

fix(feedback): reject stagnant transfer claims

c1994d7

rodboev wants to merge 7 commits into hexo-ai:main from rodboev:pr/38-transfer-evidence-cards
