Contributor

@ycedres ycedres commented Oct 14, 2025

The goal of this PR is to improve the test runner and the test results. More work on automated testing is still needed, but this is a step forward.

  • Add placeholders to make test configuration easier.

  • Improve the judge model's system prompt.

  • Improve the confirmation mechanism for write-enabled tools (this also lets the confirmation mechanism work during testing).

  • Include the patch from "Fix nested tool calls #34" so that the related tests pass.

@ycedres ycedres requested a review from cbbayburt November 11, 2025 11:06
@ycedres ycedres marked this pull request as ready for review November 11, 2025 11:06
