fix(agent): use actual value in fillForm instead of hallucinated placeholder#1863
fix(agent): use actual value in fillForm instead of hallucinated placeholder#1863guoyangzhen wants to merge 1 commit intobrowserbase:mainfrom
Conversation
…eholder The fillForm tool previously only passed an 'action' string to observe(), which caused the LLM to hallucinate placeholder values like 'test@example.com' instead of using the actual intended text. Changes: - Add explicit 'value' field to the fillForm input schema - Override observe() results with the actual value before calling act() - Use substituteVariables() to support %variableName% substitution - Update action description to clarify action vs value separation Fixes browserbase#1789
|
|
This PR is from an external contributor and must be approved by a stagehand team member with write access before CI can run. |
Greptile SummaryThis PR fixes a real bug in the Key changes:
Issues found:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant LLM as Agent LLM
participant FT as fillFormTool
participant Obs as observe()
participant Act as act()
LLM->>FT: fields=[{action, value}, ...]
FT->>Obs: instruction = join(f.action for f in fields)
Obs-->>FT: observeResults[] (LLM-ordered)
Note over FT: Loop: observeResults[i].arguments[0]<br/>= substituteVariables(fields[i].value)<br/>⚠️ assumes same ordering as input fields
loop for each observeResult
FT->>Act: act(res, {variables, timeout})
Act-->>FT: actResult
end
FT-->>LLM: {success, actions, playwrightArguments}
Last reviewed commit: "fix(agent): use actu..." |
| for (let i = 0; i < observeResults.length && i < fields.length; i++) { | ||
| const res = observeResults[i]; | ||
| if (res.method === "fill" && res.arguments && res.arguments.length > 0) { | ||
| const actualValue = substituteVariables(fields[i].value, variables); | ||
| res.arguments[0] = actualValue; | ||
| } | ||
| } |
There was a problem hiding this comment.
Value/field index misalignment risk
The loop assumes observeResults[i] corresponds to fields[i] in a 1-to-1, order-preserving way. However, observe() delegates to an LLM which processes the combined instruction string and returns actions in whatever order it infers (typically DOM/accessibility-tree order, not declaration order). If the LLM returns results in a different sequence—or returns fewer results because one element wasn't found—the wrong value will be injected into the wrong field.
A concrete example: given fields = [{action: "email input", value: "user@example.com"}, {action: "password input", value: "s3cr3t"}], if the password field appears before the email field in the DOM, the observe LLM may return [{...password element...}, {...email element...}]. The loop would then do:
observeResults[0](password element) ←fields[0].value="user@example.com"❌observeResults[1](email element) ←fields[1].value="s3cr3t"❌
In the old code the value was embedded in the action string so each observe result already carried its own value regardless of ordering. The new separation of value from action makes the ordering dependency a correctness issue.
A safer approach would be to make one observe() call per field (matching each result directly to its value), or to enrich the instruction with a unique tag per field and parse that tag back from the observe results to reconstruct the mapping.
| const actionDescription = hasVariables | ||
| ? `Must follow the pattern: "type <exact value> into the <field name> <fieldType>". Use %variableName% to substitute a variable value. Available: ${Object.keys(variables).join(", ")}. Examples: "type %email% into the email input", "type %password% into the password input"` | ||
| : 'Must follow the pattern: "type <exact value> into the <field name> <fieldType>". Examples: "type john@example.com into the email input", "type John into the first name input"'; | ||
| ? `Describe which field to target, e.g. "type into the email input", "type into the password field". Use %variableName% to substitute a variable value. Available: ${Object.keys(variables).join(", ")}. Example: "type %email% into the email input"` | ||
| : 'Describe which field to target, e.g. "type into the email input", "type into the first name input"'; |
There was a problem hiding this comment.
Variable token in
action description is misleading
The actionDescription for the hasVariables case still mentions %variableName% syntax and includes the example "type %email% into the email input". This suggests putting variable tokens in the action field, but the action string is used only to build the observe() instruction and no substituteVariables() call is ever applied to it—so any %variableName% token placed there would be passed verbatim to the observe LLM and never resolved.
Since the intent of this PR is to separate element targeting (action) from value typing (value), the action description should be cleaned up to remove the variable-substitution guidance entirely:
| const actionDescription = hasVariables | |
| ? `Must follow the pattern: "type <exact value> into the <field name> <fieldType>". Use %variableName% to substitute a variable value. Available: ${Object.keys(variables).join(", ")}. Examples: "type %email% into the email input", "type %password% into the password input"` | |
| : 'Must follow the pattern: "type <exact value> into the <field name> <fieldType>". Examples: "type john@example.com into the email input", "type John into the first name input"'; | |
| ? `Describe which field to target, e.g. "type into the email input", "type into the password field". Use %variableName% to substitute a variable value. Available: ${Object.keys(variables).join(", ")}. Example: "type %email% into the email input"` | |
| : 'Describe which field to target, e.g. "type into the email input", "type into the first name input"'; | |
| ? `Describe which field to target, e.g. "the email input", "the password field".` | |
| : 'Describe which field to target, e.g. "the email input", "the first name input"'; |
There was a problem hiding this comment.
No issues found across 1 file
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Architecture diagram
sequenceDiagram
participant Agent as LLM Agent
participant Tool as fillFormTool
participant Vars as Variable Utils
participant Engine as V3 Engine (Observe/Act)
Note over Agent,Tool: NEW: Input schema now strictly separates<br/>Target (action) and Content (value)
Agent->>Tool: execute(fields: [{action, value}])
Tool->>Engine: v3.observe(instruction)
Note right of Engine: LLM finds element but may<br/>hallucinate "value" argument
Engine-->>Tool: observeResults (hallucinated values)
loop For each field/result pair
Tool->>Vars: NEW: substituteVariables(field.value, variables)
Vars-->>Tool: actualValue
Note over Tool: NEW: Inject actualValue into observeResults<br/>overriding LLM-generated arguments
Tool->>Tool: res.arguments[0] = actualValue
Tool->>Engine: v3.act(res)
Engine-->>Tool: status
end
Tool-->>Agent: completion status
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Add one-off context when rerunning by tagging
@cubic-dev-aiwith guidance or docs links (includingllms.txt) - Ask questions if you need clarification on any suggestion
Problem
The
fillFormagent tool receives a value parameter but never uses it. When the tool callsobserve(), it only passes theactiondescription, causing the LLM to hallucinate placeholder values liketest@example.cominstead of using the actual provided value.This affects:
Fixes #1789
Root Cause
The
fillForminput schema only had anactionfield. Theobserve()call used only the action description to identify elements, but the observe LLM then hallucinates what value to type instead of using the intended one.Changes
observe()returns element identification results, the code injects the actual value into the observe result's arguments before callingact()substituteVariables()to support variable tokens in values, matching the behavior of thetypetoolVerification
The fix can be verified by:
Summary by cubic
Make the agent’s
fillFormtool use the actual provided value for each field instead of hallucinated placeholders. This fixes flaky fills and ensures exact values (e.g., TOTP codes) are typed.Bug Fixes
valueto each field in the input schema;actionnow only identifies the target element.observe()results beforeact(), withsubstituteVariables()support.Migration
fillFormcalls to include avaluefor each field.actionintovalue; keepactionfocused on the target (e.g., "type into the email input").%variableName%tokens are supported invaluewhen variables are available.Written for commit 331efb3. Summary will update on new commits. Review in cubic