variables for observe #1808

Merged
filip-michalsky merged 10 commits into main from fm/stg-1438-add-variables-to-observe
Mar 23, 2026

@filip-michalsky
Collaborator

why

act() already supports variables, but observe() did not. That made safety-sensitive flows like login automation harder to use correctly: callers could inspect observe() results before executing them, but then had to guess which returned action should receive each secret value before calling act().

This change brings variable support to observe() so it can return placeholder-backed actions like %username% and %password%. That preserves the existing safe pattern of:

  1. observe() candidate actions
  2. validate the returned actions
  3. act() the validated actions with real variable values at execution time
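
The validation step in this pattern could look roughly like the sketch below. It is not taken from the PR: the `ObservedAction` shape, the `validateAction` helper, and the placeholder regex are assumptions made for illustration, loosely modeled on the `{ method, arguments }` examples in this conversation.

```typescript
// Hypothetical shape of an observed action (assumption, not the SDK type).
interface ObservedAction {
  selector: string;
  description: string;
  method?: string;
  arguments?: string[];
}

// Matches a whole argument of the form %variableName%.
const PLACEHOLDER = /^%([A-Za-z_][A-Za-z0-9_]*)%$/;

// Returns a list of problems; an empty list means the action passed validation.
function validateAction(
  action: ObservedAction,
  allowedMethods: ReadonlySet<string>,
  allowedVariables: ReadonlySet<string>,
): string[] {
  const problems: string[] = [];
  if (!action.method || !allowedMethods.has(action.method)) {
    problems.push(`unexpected method: ${action.method ?? "(none)"}`);
  }
  for (const arg of action.arguments ?? []) {
    const name = PLACEHOLDER.exec(arg)?.[1];
    if (name === undefined) {
      // A literal value here may mean the model ignored the placeholder hint.
      problems.push(`argument is not a placeholder: ${arg}`);
    } else if (!allowedVariables.has(name)) {
      problems.push(`unknown variable: ${name}`);
    }
  }
  return problems;
}
```

A caller would run this over every action returned by observe() and only pass an action to act() when the returned list is empty.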

what changed

  • Added variables?: Variables support to observe() in the public SDK types and internal handler params.
  • Threaded observe variables through the local SDK path, inference layer, and prompt builder.
  • Updated the observe prompt so the model sees available variable names and returns %variableName% placeholders in action arguments instead of literal sensitive values.
  • Added observe variable support to the hosted/API path, including schema updates and flattening rich variable values to the existing wire format.
  • Updated the internal fillForm tool to pass variables into observe() as well as act().
  • Added docs for observe({ variables }) and the validate-then-act login flow.
  • Added a dedicated example at packages/core/examples/observe_variables_login.ts showing placeholder-based login planning with observe(), explicit validation, and execution via act().
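
As a rough illustration of the prompt change described above, the sketch below builds a hint that lists only variable names and never their values. The function name `buildVariablesHint` and the wording of the hint are assumptions; the actual text in packages/core/lib/prompt.ts is not shown in this conversation.

```typescript
// Sketch only: produce a prompt fragment that names the available placeholders.
// Values are deliberately never read, so secrets cannot leak into the prompt.
function buildVariablesHint(variables: Record<string, unknown> = {}): string {
  const names = Object.keys(variables);
  if (names.length === 0) return "";
  const placeholders = names.map((name) => `%${name}%`).join(", ");
  return (
    `Available variables: ${placeholders}. ` +
    "Use these placeholders in action arguments instead of literal values."
  );
}
```

Because only `Object.keys` is consulted, the same function works for flat values and for richer variable objects.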

test plan

  • Ran targeted unit tests covering:
    • public ObserveOptions type support
    • observe variable forwarding into inference/prompting
    • placeholder preservation in returned observe actions
    • API client observe variable serialization
    • fillForm forwarding variables to observe()
  • Ran:
    • pnpm --filter @browserbasehq/stagehand exec vitest run --config /tmp/stagehand-vitest-source.config.mjs tests/unit/public-api/public-types.test.ts tests/unit/agent-execution-model.test.ts tests/unit/timeout-handlers.test.ts tests/unit/api-client-observe-variables.test.ts
  • Ran formatting checks on changed files with Prettier.
  • Added integration coverage for observe request schemas in both v3 and v4 server tests.
  • Full repo typecheck/build is still blocked by unrelated pre-existing issues in packages/core/lib/v3/launch/browserbase.ts and existing server test environment/type-resolution failures.

@changeset-bot

changeset-bot bot commented Mar 11, 2026

⚠️ No Changeset found

Latest commit: c99a0a4

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@greptile-apps
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR adds variables support to observe(), matching the existing capability in act(). The design correctly exposes only variable names (as %variableName% tokens) to the LLM during observation, so secrets are never sent to the model until act() is called with the resolved values. The change is threaded consistently through the public types, internal handler params, inference layer, prompt builder, API client wire serialization (via flattenVariables), and the fillFormTool agent helper.

Key changes:

  • ObserveOptions and ObserveHandlerParams gain variables?: Variables
  • buildObserveSystemPrompt appends a variable-names hint to the system prompt without leaking values
  • StagehandAPIClient.observe calls flattenVariables to convert rich Variables to Record<string, string> before wire serialization, keeping the server schema (z.record(z.string(), z.string())) valid
  • New example (observe_variables_login.ts), unit tests, and docs demonstrate the validate-then-act login flow
  • Two small issues in the example file: timeoutMs is not a valid Playwright page.goto() option (should be timeout, causing the custom timeout to be silently ignored) and a typo in the example email domain (browserbaser.com)
  • A minor double-period appears in the assembled observe system prompt when variablesString is non-empty (...literal value.. When choosing...) due to the trailing period in variablesString combined with the period already present in the template string

Confidence Score: 4/5

  • This PR is safe to merge with minor fixes — the core variable-privacy design is sound, but the example file has a real silent bug with the wrong Playwright option name.
  • The security-sensitive core of this PR (exposing only variable names to the LLM and preserving placeholders through to act()) is implemented correctly and consistently across all code paths (local handler, API client, agent fillForm tool). The type threading, inference forwarding, and prompt construction are all aligned. The deductions are for: the timeoutMs bug in the example that silently ignores the intended timeout, a typo in the example email, and a minor double-period in the LLM system prompt. None of these affect production correctness of the feature itself.
  • packages/core/examples/observe_variables_login.ts (wrong timeoutMs option, typo) and packages/core/lib/prompt.ts (double-period in assembled prompt)

Important Files Changed

Filename | Overview
--- | ---
packages/core/lib/prompt.ts | Adds variables parameter to buildObserveSystemPrompt; correctly exposes only variable names (not values) to the LLM. Minor double-period formatting issue in the assembled prompt string when both variables and actions are present.
packages/core/lib/inference.ts | Threads variables parameter through observe() inference function into buildObserveSystemPrompt. Change is minimal, correct, and consistent with how variables are handled in act().
packages/core/lib/v3/handlers/observeHandler.ts | Forwards variables from ObserveHandlerParams into the runObserve call. Clean, correct, no issues found.
packages/core/lib/v3/api.ts | Calls flattenVariables to convert rich Variables to Record<string, string> before wire serialization. Works correctly, but the schema description is misleading and the mismatch between the public Variables type and the wire-format schema is undocumented.
packages/core/lib/v3/types/public/api.ts | Adds variables field to ObserveOptionsSchema as z.record(z.string(), z.string()). Schema type is flat strings only, which diverges from the SDK-level Variables type that supports rich objects; misleading description implies substitution happens during observe rather than act.
packages/core/examples/observe_variables_login.ts | New example demonstrating observe({ variables }) with placeholder validation before act(). Contains two issues: timeoutMs is not a valid Playwright page.goto() option (should be timeout), causing the custom timeout to be silently ignored, and a typo in the email domain (browserbaser.com).
packages/core/lib/v3/agent/tools/fillform.ts | Now passes variables into both observe() and act() calls within the fill-form tool, ensuring placeholder-backed actions flow end-to-end through the agent tool. Change is clean and correct.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant V3
    participant ObserveHandler
    participant LLM
    participant ActHandler

    Caller->>V3: observe(instruction, { variables })
    V3->>ObserveHandler: ObserveHandlerParams { instruction, variables }
    ObserveHandler->>LLM: buildObserveSystemPrompt(userInstructions, supportedActions, variables)<br/>System prompt exposes only %variableName% keys — no values
    LLM-->>ObserveHandler: Action[] with %placeholder% tokens in arguments
    ObserveHandler-->>V3: Action[] (placeholders preserved)
    V3-->>Caller: Action[] (e.g. [{ method: "fill", arguments: ["%username%"] }])

    Note over Caller: Caller validates returned actions

    Caller->>V3: act(emailAction, { variables })
    V3->>ActHandler: substituteVariables(%username% → "user@example.com")
    ActHandler-->>Caller: ActResult (real value used only at execution time)
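
The final step of the diagram, where placeholders are swapped for real values at execution time, could be sketched as follows. This is an assumption-laden illustration: `substitutePlaceholders` is a hypothetical name, and the real substituteVariables logic in the SDK is not shown in this conversation.

```typescript
// Sketch only: replace each %name% token with its value at execution time.
// Tokens without a matching variable are left untouched.
function substitutePlaceholders(
  text: string,
  variables: Record<string, string>,
): string {
  return text.replace(
    /%([A-Za-z_][A-Za-z0-9_]*)%/g,
    (token: string, name: string) => {
      // Guard against inherited keys such as "constructor".
      if (!Object.prototype.hasOwnProperty.call(variables, name)) return token;
      return variables[name] ?? token;
    },
  );
}
```

Leaving unknown tokens intact makes a missing variable visible to the caller instead of silently typing an empty string into a form field.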

Last reviewed commit: 10dc220

Comment on lines +543 to +550
  variables: z
    .record(z.string(), z.string())
    .optional()
    .meta({
      description:
        "Variables to substitute into observed action arguments using %variableName% placeholders",
      example: { username: "john_doe" },
    }),
Contributor


Misleading schema description — observe doesn't substitute variables

The description says "Variables to substitute into observed action arguments", but observe() doesn't substitute values — it tells the model to return %variableName% placeholders. Actual substitution happens later in act(). A more accurate description would be something like "Variables whose names are exposed to the model so it returns %variableName% placeholders in action arguments instead of literal values".

Also note this schema accepts only z.record(z.string(), z.string()) (flat strings), while the public ObserveOptions.variables type accepts rich Variables (including { value, description? } objects). flattenVariables is called client-side to reconcile this, but a developer calling the server API directly with a rich object would get a schema validation error without a clear reason.
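
To make the mismatch concrete, here is a minimal sketch of what client-side flattening might do. The type names and the `flattenVariablesSketch` function are hypothetical, and the assumption that primitives include numbers and booleans is not confirmed by this review; the real flattenVariables may differ.

```typescript
// Assumed shapes: a variable is either a flat primitive or a rich object.
type Primitive = string | number | boolean;
type RichVariable = { value: Primitive; description?: string };
type Variables = Record<string, Primitive | RichVariable>;

// Sketch only: collapse every entry to a string so the payload satisfies
// a flat Record<string, string> wire schema.
function flattenVariablesSketch(variables: Variables): Record<string, string> {
  const flat: Record<string, string> = {};
  for (const [name, entry] of Object.entries(variables)) {
    const value = typeof entry === "object" ? entry.value : entry;
    flat[name] = String(value);
  }
  return flat;
}
```

A caller who skips this step and posts the rich object directly would hit the flat-string schema and be rejected, which is the failure mode described above.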

Member


yeah this part is crucial:

Also note this schema accepts only z.record(z.string(), z.string()) (flat strings), while the public ObserveOptions.variables type accepts rich Variables (including { value, description? } objects). flattenVariables is called client-side to reconcile this, but a developer calling the server API directly with a rich object would get a schema validation error without a clear reason.

Collaborator Author


Agreed. observe() does not substitute values; it only exposes variable names so the model returns %variableName% placeholders in action arguments. Actual substitution happens later in act(). I updated the schema description to reflect that and added coverage for both SDK flattening and direct server requests with rich variables.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


cubic analysis

2 issues found across 18 files

Confidence score: 4/5

  • This PR looks safe to merge with minimal risk: both findings are low-to-moderate severity quality issues rather than likely functional breakages in core runtime behavior.
  • Most severe issue: in packages/core/examples/observe_variables_login.ts, Playwright goto uses timeoutMs (invalid) instead of timeout, so the intended 30s limit is ignored; this is user-facing because example code is likely to be copied.
  • In packages/core/lib/prompt.ts, prompt assembly can produce a double period (..) when variables are present, which is minor but can degrade prompt polish/readability.
  • Pay close attention to packages/core/examples/observe_variables_login.ts and packages/core/lib/prompt.ts - fix the invalid Playwright timeout option and the punctuation duplication in generated prompts.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/prompt.ts">

<violation number="1" location="packages/core/lib/prompt.ts:137">
P2: Double period in the generated prompt when variables are provided. `variablesString` ends with `.` and the template appends another `.` right after, producing `...literal value.. When choosing...` after whitespace collapse. Remove the trailing period from `variablesString` or from the template literal to avoid the stutter.</violation>
</file>
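
One way to avoid the double period flagged for prompt.ts is to let a single helper own the terminal punctuation when fragments are joined. The sketch below is an assumption, not the fix that was merged; `joinSentences` is a hypothetical helper and the fragment text in the usage note is illustrative.

```typescript
// Sketch only: join prompt fragments so each ends with exactly one
// terminal punctuation mark, and empty fragments are dropped.
function joinSentences(...fragments: string[]): string {
  return fragments
    .map((fragment) => fragment.trim())
    .filter((fragment) => fragment.length > 0)
    .map((fragment) => (/[.!?]$/.test(fragment) ? fragment : `${fragment}.`))
    .join(" ");
}
```

With this approach it no longer matters whether a fragment such as the variables hint already carries its own trailing period.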

<file name="packages/core/examples/observe_variables_login.ts">

<violation number="1" location="packages/core/examples/observe_variables_login.ts:59">
P2: `timeoutMs` is not a valid Playwright `goto` option — it will be silently ignored. Use `timeout` instead so the 30 s limit actually takes effect. Since users are likely to copy this example, the typo is worth fixing.</violation>
</file>

Linked issue analysis

Linked issue: STG-1438: Add variables to observe

Acceptance criteria | Notes
--- | ---
Add variables support to observe in the public SDK types (ObserveOptions) | ObserveOptions and schema now include variables
Add variables param to internal ObserveHandlerParams and handler signature | ObserveHandlerParams and observe() accept variables
Thread variables through local SDK path (v3.ts) into handler params | V3 passes options?.variables into handler params
Thread variables into inference.observe() signature and calls | inference.observe now accepts and forwards variables
Update prompt builder to show available variable names to model | buildObserveSystemPrompt now includes variablesString guidance
Make model return %variableName% placeholders (prompt instructs so) | Prompt instructs placeholder usage and inference path preserves them
Preserve placeholders in returned Action[] (no literal secrets leaked) | Test asserts observed action arguments remain %username%
Add observe variable support to hosted/API path and schemas | API ObserveOptionsSchema now accepts variables and server tests added
Flatten rich variable values to existing wire format before sending API request | StagehandAPIClient flattens variables via flattenVariables
Update internal fillForm tool to pass variables into observe() | fillFormTool now includes variables in observeOptions
Ensure fillFormTool forwarding to v3.observe is tested | Unit test verifies fillFormTool passes variables to v3.observe
Add docs for observe({ variables }) and validate-then-act login flow | Docs updated with variables sections and examples
Add example demonstrating placeholder-based login planning and act() execution | Example file added showing validate-then-act with variables
Add unit tests covering public ObserveOptions type support | public-types.test updated to include Variables in ObserveOptions
Add unit tests verifying observe variable forwarding into inference/prompting | Test asserts observeInference called with variables
Add unit tests for API client observe variable serialization | api-client-observe-variables.test asserts flattened payload
Add integration tests for observe request schemas in v3 and v4 server tests | Integration tests post observe with variables and assert success
Architecture diagram
sequenceDiagram
    participant User as User Script
    participant SDK as Stagehand SDK (V3)
    participant API as API Client / Server
    participant Handler as ObserveHandler
    participant Prompt as Prompt Builder
    participant LLM as LLM (Inference)

    Note over User,LLM: Variable-aware Observation Flow (Secure Login Pattern)

    User->>SDK: observe(instruction, { variables })
    
    opt If Hosted/Remote Execution
        SDK->>API: observe(instruction, { variables })
        API->>API: NEW: flattenVariables()
        Note right of API: Converts rich objects {value, desc} <br/>to simple strings for wire format
    end

    SDK->>Handler: observe(params)
    Handler->>Handler: Capture DOM & Accessibility Tree
    
    Handler->>Prompt: NEW: buildObserveSystemPrompt(..., variables)
    Prompt->>Prompt: Extract variable keys (e.g., "username")
    Note right of Prompt: Injects instruction: "When an action needs a <br/>sensitive value, return %variableName% placeholder"

    Handler->>LLM: inference.observe(prompt, snapshot)
    LLM-->>Handler: Return Action[] (with %placeholders%)
    Handler-->>SDK: Return observed actions
    SDK-->>User: Return actions (e.g., [{ method: "fill", args: ["%password%"] }])

    Note over User: Security Boundary: Real secrets never sent to LLM

    User->>User: CHANGED: Validate observed actions <br/>contain expected placeholders
    
    alt Happy Path: Validation Success
        User->>SDK: act(action, { variables })
        SDK->>SDK: CHANGED: Substitute %placeholder% with real value
        SDK->>Handler: Perform browser action with secret
    else Unhappy Path: Malicious/Unexpected Action
        User->>User: Abort execution (Safety check)
    end

    opt Internal Tooling: fillForm
        SDK->>SDK: fillFormTool.execute(fields, variables)
        SDK->>Handler: NEW: observe(instruction, { variables })
        Note right of SDK: fillForm now uses variable-aware <br/>observation to find form fields
    end

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

"Model configuration object or model name string (e.g., 'openai/gpt-5-nano')",
}),
variables: z
.record(z.string(), z.string())
Member


wrong shape, shouldn't it be the new variables shape that supports description or flat?

Collaborator Author


Valid point. The earlier version only accepted Record<string, string>, but ObserveOptions.variables uses the shared Variables shape. I updated the wire schema/OpenAPI to accept flat primitives and { value, description? } objects so direct API callers match the SDK surface too.
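
The widened wire shape described in this reply can be pictured with a plain type guard, shown below without any schema library. This is only a sketch: the names `isVariableEntry` and `isVariablesPayload` are hypothetical, the exact set of accepted primitives is an assumption, and the merged change uses the project's own schema definitions rather than this code.

```typescript
// Assumed shapes for illustration; the merged schema may differ in detail.
type Primitive = string | number | boolean;
type RichVariable = { value: Primitive; description?: string };

function isPrimitive(x: unknown): x is Primitive {
  return typeof x === "string" || typeof x === "number" || typeof x === "boolean";
}

// Accepts either a flat primitive or a { value, description? } object.
function isVariableEntry(x: unknown): x is Primitive | RichVariable {
  if (isPrimitive(x)) return true;
  if (typeof x !== "object" || x === null || Array.isArray(x)) return false;
  const candidate = x as Record<string, unknown>;
  const description = candidate["description"];
  return (
    isPrimitive(candidate["value"]) &&
    (description === undefined || typeof description === "string")
  );
}

// Validates a whole variables payload as received by a server endpoint.
function isVariablesPayload(
  x: unknown,
): x is Record<string, Primitive | RichVariable> {
  if (typeof x !== "object" || x === null || Array.isArray(x)) return false;
  return Object.values(x).every((entry) => isVariableEntry(entry));
}
```

Accepting both shapes on the server means a direct API caller and an SDK caller can send the same variables object.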

@github-actions
Contributor

github-actions bot commented Mar 19, 2026

✱ Stainless preview builds

This PR will update the stagehand SDKs with the following commit message.

feat: variables for observe
stagehand-openapi studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅

⚠️ stagehand-typescript studio · code

Your SDK build had at least one "warning" diagnostic.
generate ⚠️build ✅lint ✅test ✅

npm install https://pkg.stainless.com/s/stagehand-typescript/cc0e058196099528f7d82aaebbbf20a570f79404/dist.tar.gz
stagehand-ruby studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅build ⏭️lint ✅test ✅

stagehand-java studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅build ✅lint ✅test ✅

Add the following URL as a Maven source: 'https://pkg.stainless.com/s/stagehand-java/b10e2ad429ebac12464ac332eb32c2d11035c1ff/mvn'
stagehand-kotlin studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅build ✅lint ✅test ✅

⚠️ stagehand-csharp studio · code

Your SDK build had a failure in the build CI job, which is a regression from the base state.
generate ⚠️build ❗lint ❗test ✅

⚠️ stagehand-python studio · code

Your SDK build had at least one "warning" diagnostic.
generate ⚠️build ✅lint ✅test ✅

pip install https://pkg.stainless.com/s/stagehand-python/93ef31098a0cb2b4acfd157ae4c45cedc2f2e58c/stagehand-3.6.0-py3-none-any.whl
stagehand-go studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅build ⏭️lint ✅test ✅

go get github.com/stainless-sdks/stagehand-go@1b2c60e3ba18fa94b1b123debce9fdf6359a069f
⚠️ stagehand-php studio · code

Your SDK build had at least one "warning" diagnostic.
generate ⚠️lint ✅test ✅


This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-03-23 19:14:44 UTC

@filip-michalsky
Collaborator Author

@pirate lint / test passing after feedback addressed!

@filip-michalsky filip-michalsky merged commit 4b835b7 into main Mar 23, 2026
193 checks passed
3 participants