APPENG-3801-B - Agent performance fixes - all agent stages #134

etsien · 2025-10-20T10:26:05Z

Comprehensive Prompting Improvements Across Vulnerability Analysis Pipeline

This PR makes systematic improvements to every prompting stage of the vulnerability analysis pipeline, focusing on consistency, clarity, reduced verbosity, and fewer LLM hallucinations. It also incorporates the tool improvements from the prior PR (APPENG-3801-A).

Edit:
Also rebased to incorporate the latest PRs that were merged upstream into the rh-aiq-main branch.

Summary of Changes

All 7 stages of the vulnerability analysis pipeline have been improved with:

- Consistent XML-style section markers (<TASK>, <INSTRUCTIONS>, <EXAMPLES>, etc.)
- Reduced verbosity while maintaining technical precision
- Dynamic tool awareness to prevent instructing agents to use disabled tools
- Separated LLM responsibilities from deterministic code operations
- Improved example quality and diversity
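
As a rough illustration of the first and third points (XML-style section markers and dynamic tool awareness), here is a minimal sketch. It is not the PR's actual `build_tool_descriptions()` code: the tool names, descriptions, and the helper names `describe_enabled_tools` and `build_sectioned_prompt` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable

# Hypothetical tool catalogue: names and descriptions are illustrative only.
TOOL_CATALOGUE: Dict[str, str] = {
    "code_search": "Search the container source code for a keyword.",
    "docs_search": "Search the project documentation.",
    "web_search": "Search the public web for CVE context.",
}


def describe_enabled_tools(enabled: Iterable[str]) -> str:
    """Return one bullet per enabled tool; unknown or disabled tools are skipped."""
    enabled_set = set(enabled)
    return "\n".join(
        f"- {name}: {description}"
        for name, description in TOOL_CATALOGUE.items()
        if name in enabled_set
    )


def build_sectioned_prompt(task: str, enabled: Iterable[str]) -> str:
    """Assemble a prompt from XML-style sections, listing only enabled tools."""
    return (
        f"<TASK>\n{task}\n</TASK>\n"
        f"<TOOLS>\n{describe_enabled_tools(enabled)}\n</TOOLS>"
    )


if __name__ == "__main__":
    print(build_sectioned_prompt("Check whether the CVE is exploitable.", ["code_search"]))
```

Because the tool section is generated from the enabled set, a disabled tool never appears in the prompt text, so the agent is never told to call it.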

Files Modified

Core Prompting and Utilities

src/vuln_analysis/utils/prompting.py - Major restructuring
- Added build_tool_descriptions() consolidated base function
- Updated 6 prompt constants with structured sections
- Simplified get_agent_prompt() and get_cvss_prompt() functions
src/vuln_analysis/utils/intel_source_score.py
- Separated LLM scoring from arithmetic calculation
- LLM now provides only individual criterion scores
- Code calculates total with validation
src/vuln_analysis/utils/checklist_prompt_generator.py
- Added tool_names parameter to generate_checklist()
- Dynamic tool descriptions formatted for Jinja2 rendering
- Structured requirements section
src/vuln_analysis/utils/justification_parser.py
- Restructured JUSTIFICATION_PROMPT with clear sections
- Added exploitation conditions definition
- Explicit logical precedence order for 12 categories

Function Implementations

src/vuln_analysis/functions/cve_agent.py
- Updated _create_agent() to use build_tool_descriptions()
- Strategic tool guidance formatted locally
- Uses partial_variables for tool_selection_strategy
src/vuln_analysis/functions/cve_checklist.py
- Added agent_name field to CVEChecklistToolConfig
- Retrieves agent tool configuration
- Passes agent tool names to checklist generation
src/vuln_analysis/functions/cve_generate_cvss.py
- Simplified _create_agent() function
- Removed conditional example insertion
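
The cve_agent.py note about using partial_variables for tool_selection_strategy amounts to binding one prompt variable once, when the agent is created, and leaving the rest to be filled per request. The sketch below mimics that idea with the standard library's `functools.partial` rather than the prompt library's own mechanism; the template text and the names `render_agent_prompt` and `make_renderer` are hypothetical.

```python
from functools import partial
from typing import Callable

# Hypothetical template; section names and placeholders are illustrative only.
AGENT_TEMPLATE = (
    "<TOOL_SELECTION_STRATEGY>\n{tool_selection_strategy}\n</TOOL_SELECTION_STRATEGY>\n"
    "<QUESTION>\n{question}\n</QUESTION>"
)


def render_agent_prompt(question: str, tool_selection_strategy: str) -> str:
    """Fill every placeholder of the template."""
    return AGENT_TEMPLATE.format(
        tool_selection_strategy=tool_selection_strategy,
        question=question,
    )


def make_renderer(tool_selection_strategy: str) -> Callable[..., str]:
    """Bind the strategy once; callers later supply only the question."""
    return partial(render_agent_prompt, tool_selection_strategy=tool_selection_strategy)


if __name__ == "__main__":
    render = make_renderer("Prefer code search before documentation search.")
    print(render(question="Is the vulnerable function reachable?"))
```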

Tests

tests/test_base_tool_descriptions.py - New test suite
- Tests consolidated build_tool_descriptions() function
- Validates tool description generation
- Verifies MOD_FEW_SHOT structure

vbelouso · 2025-10-20T10:31:16Z

/ok-to-test

etsien · 2025-10-21T19:27:26Z

bugfix pushed for the vdb tool issue

zvigrinberg

Hi @etsien,
The agent does not start with these changes.
Please see my review comment for a quick fix, and we'll continue from there.
In the meantime, we fixed it manually and continued with the second batch run of the confusion matrix.

Thanks.

src/vuln_analysis/functions/cve_checklist.py

zvigrinberg · 2025-11-05T07:37:43Z

/retest

zvigrinberg · 2025-11-05T07:39:50Z

/retest vulnerability-analysis-on-pr

etsien · 2025-11-05T18:04:20Z

Just rebased onto rh-aiq-main, incorporating the discussed description changes to the tool prompting and configs.

etsien · 2025-11-05T18:04:30Z

/retest vulnerability-analysis-on-pr

zvigrinberg · 2025-11-06T08:14:46Z

Hi @etsien,
After you resolved the conflicts, the agent now crashes on startup.

Can you please take a look and fix it?

Thank you.

etsien · 2025-11-06T15:17:25Z

> Hi @etsien , After you've resolved the conflicts, now the agent is crashing on startup Can you please take a look and fix?
>
> Thank you.

Patched. I forgot to bring over the bugfix from the other branch.

zvigrinberg · 2025-11-09T11:51:36Z

/retest vulnerability-analysis-on-pr

zvigrinberg · 2025-11-10T11:58:53Z

@etsien Please rebase and resolve the conflicts, and we'll merge it (this PR improved the consistency and the results significantly).

etsien · 2025-11-11T14:52:00Z

/retest vulnerability-analysis-on-pr

zvigrinberg · 2025-11-18T15:50:24Z

LGTM, approved.
The remaining problems with the Summarize stage, and aligning the Function Locator tool with the other agent tools, will be handled as part of #142 (already branched off this PR's head branch).

etsien added 18 commits October 2, 2025 16:13

bugfix in the testing env

a1834bd

update tool descriptions for clarity

1d79035

refactor tool names to be class constants instead of disparate strings

6864fa7

add initial unit tests

e05ea7a

rename tool names to be more consistent and distinct

8893f5c

update unit tests with tool names and tool constants

dd18463

cleanup startup guide notebook

4efffcd

rework intel source score section

8f3182e

update agent execution stage prompts and make tool descriptions dynamic

dd215cf

add tests for dynamic tool descriptions

35ee318

revamp the tool description list, as well as the checklist prompt for better clarity and quality

0af8e7a

revamp checklist prompt implementation, as well as add in dynamic tool descriptions to both checklist and agent prompts

a882f88

update tests for tool descriptions

26f0d74

add more detailed agent examples with more useful MRKL-formatted steps

186350d

update for summary prompt

f71e9f8

update justification prompt with more logic and explanations on how to pick the class

9707671

update CVSS prompts and cleanup examples and guidance

faeb811

bugfix on intel source

dcf836f

zvigrinberg mentioned this pull request Oct 21, 2025

Appeng 3801-A agent performance fixes - tool use stability #132

Closed

bug patch for vdb generation

efa84ad

adding in constants during the vdb generation check

zvigrinberg self-requested a review October 22, 2025 11:02

zvigrinberg requested changes Oct 22, 2025

src/vuln_analysis/functions/cve_checklist.py (outdated)

etsien added 2 commits October 22, 2025 09:53

bugfix by Tamar

be0b27d

update register_function() and transitive_search() descriptions

538257d

etsien added 2 commits November 5, 2025 11:14

bugfix in the testing env

36bb6d3

update tool descriptions for clarity

7b7695d

etsien added 15 commits November 5, 2025 11:59

revamp the tool description list, as well as the checklist prompt for better clarity and quality

0df74b9

revamp checklist prompt implementation, as well as add in dynamic tool descriptions to both checklist and agent prompts

333c1eb

update tests for tool descriptions

29306b3

add more detailed agent examples with more useful MRKL-formatted steps

a5368b6

update for summary prompt

dbab156

update justification prompt with more logic and explanations on how to pick the class

f585abb

update CVSS prompts and cleanup examples and guidance

b3b53e1

bugfix on intel source

c26936e

bug patch for vdb generation

ee0c6af

adding in constants during the vdb generation check

bugfix by Tamar

309a554

update register_function() and transitive_search() descriptions

40de82e

add function locator descriptions

7e0bddb

add names to configs

eb09d06

add local output for local testing

8754762

Merge branch 'APPENG-3801-B-Agent-performance-fixes-checklist-and-execution-stages' of https://github.com/etsien/vulnerability-analysis into APPENG-3801-B-Agent-performance-fixes-checklist-and-execution-stages

3449499

etsien mentioned this pull request Nov 6, 2025

APPENG-3853 Prompt Standardization and Thinking mode changes #142

Open

bugfix

554ac47

Update tool_names.py

45fc146

Merge branch 'rh-aiq-main' into APPENG-3801-B-Agent-performance-fixes-checklist-and-execution-stages

a2a7f94

etsien requested a review from zvigrinberg November 10, 2025 15:41

zvigrinberg merged commit 338adb7 into RHEcosystemAppEng:rh-aiq-main Nov 18, 2025
1 check passed
