Add black-box tests for AgentMemorySystem #1

Open
FluffyAIcode wants to merge 5 commits into main from cursor/blackbox-test-agent-memory-4122

@FluffyAIcode (Owner) commented Apr 16, 2026

Summary

  • add first-round black-box tests plus a detailed report and a minimal transformers 5.x failure reproducer
  • add second-round black-box matrix assets covering stress, long-text, cross-domain contamination, and stability scenarios, plus representative and full execution results
  • add a third-round black-box runner covering boundary inputs, abnormal inputs, and performance/latency baselines, plus full execution results
  • add an expanded cross-domain contamination heatmap generator with multiple prompt variants per domain and publish the generated heatmap artifacts and summary report

Testing

  • python3 /workspace/blackbox_test_agent_memory_system.py
  • manual reproduction of the public generate() failure under transformers 5.5.4
  • python3 /workspace/blackbox_test_agent_memory_round2.py --suite representative --json-out /workspace/reports/agent_memory_blackbox_round2_results.json
  • second-round full coverage completed via representative scenarios plus additional full-only scenario executions; aggregated results written to /workspace/reports/agent_memory_blackbox_round2_full_results.json
  • python3 /workspace/reports/generate_cross_domain_contamination_heatmap.py
  • third-round full coverage completed via per-scenario executions; aggregated results written to /workspace/reports/agent_memory_blackbox_round3_results.json
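
The per-scenario runs above are merged into a single results file. A minimal sketch of that aggregation step follows; the function name `aggregate_results`, the one-JSON-file-per-scenario layout, and the `passed` field are all assumptions made for illustration and are not taken from the scripts in this PR.

```python
import json
from pathlib import Path


def aggregate_results(result_files, out_path):
    """Merge per-scenario JSON result files into one summary file.

    Hypothetical schema: each input file holds one JSON object, and a
    scenario counts as passed only if its "passed" field is exactly True.
    """
    scenarios = {}
    for path in result_files:
        path = Path(path)
        with path.open(encoding="utf-8") as fh:
            # Key each scenario by its file name without the extension.
            scenarios[path.stem] = json.load(fh)

    summary = {
        "scenario_count": len(scenarios),
        "passed": sum(1 for r in scenarios.values() if r.get("passed") is True),
        "scenarios": scenarios,
    }
    Path(out_path).write_text(
        json.dumps(summary, indent=2, sort_keys=True), encoding="utf-8"
    )
    return summary
```

Keeping the raw per-scenario payloads under `scenarios` means the aggregated file can be re-summarized later without rerunning the suite.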

cursoragent and others added 5 commits April 16, 2026 00:48
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
@FluffyAIcode FluffyAIcode marked this pull request as ready for review April 16, 2026 02:45

@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 867235f.

(2, "2", "moderate"),
(3, "3", "moderate"),
(4, "4", "high"),
]
Unused HEATMAP_SCALE constant is dead code

Low Severity

HEATMAP_SCALE is defined but never referenced anywhere in the file. The actual symbol mapping is handled by the heat_symbol function, which implements a different mapping (collapsing counts 2 and 3 into symbol "2"), making this constant both unused and inconsistent with the logic that replaced it.
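
One way to resolve the inconsistency, sketched here under stated assumptions, is to make the table the single source of truth and have `heat_symbol` read from it. Only the last three `HEATMAP_SCALE` rows are visible in the reviewed snippet; the first two rows, the reading of each tuple as `(minimum count, symbol, label)`, and the function body are hypothetical and may differ from the file in this PR.

```python
# Rows marked "assumed" do not appear in the reviewed snippet.
HEATMAP_SCALE = [
    (0, ".", "none"),      # assumed
    (1, "1", "low"),       # assumed
    (2, "2", "moderate"),
    (3, "3", "moderate"),
    (4, "4", "high"),
]


def heat_symbol(count: int) -> str:
    """Return the symbol of the highest threshold that `count` reaches.

    Assumes HEATMAP_SCALE is sorted by ascending minimum count.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    symbol = HEATMAP_SCALE[0][1]
    for threshold, candidate, _label in HEATMAP_SCALE:
        if count >= threshold:
            symbol = candidate
    return symbol
```

With this shape, a count of 3 maps to "3" rather than collapsing into "2", and the constant is no longer dead code. If collapsing 2 and 3 is the intended behavior, the alternative fix is to delete the constant or change its rows to match.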

2 participants