feat: add legacy /chat/completions support (#1804)
Conversation
Thread modelBaseURL from x-model-base-url header through to V3 options, enabling providers like ZhipuAI, Ollama, and other OpenAI-compatible endpoints. Uses Chat Completions API (not Responses API) when a custom baseURL is set, and adds robust response coercion for models without native structured output support.
Adds "chatcompletions" as a generic provider that uses the Chat Completions API (/chat/completions) instead of the Responses API, for endpoints like ZhipuAI and Ollama. Also simplifies response coercion for models without native structured output support.
🦋 Changeset detected. Latest commit: e19a7e0. The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages.
1 issue found across 8 files
Confidence score: 2/5
- There is a high-confidence regression risk in packages/core/lib/v3/llm/LLMProvider.ts: the chatcompletions → .chat() mapping is only applied in the hasValidOptions path, so behavior diverges between configured and default client flows.
- When clientOptions are absent, the else branch calls provider(subModelName) on the default openai instance, which can route chatcompletions models incorrectly and cause user-facing failures in common usage.
- Pay close attention to packages/core/lib/v3/llm/LLMProvider.ts: ensure model normalization/dispatch is consistent in both branches so default and custom options behave the same.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/lib/v3/llm/LLMProvider.ts">
<violation number="1" location="packages/core/lib/v3/llm/LLMProvider.ts:53">
P1: The `chatcompletions` → `.chat()` handling only exists in the `hasValidOptions` branch. When no `clientOptions` are provided, the `else` branch calls `provider(subModelName)` on the default `openai` instance, which uses the Responses API — silently defeating the purpose of this provider.
Add the same `.chat()` handling in the `else` branch so `chatcompletions/model-name` always uses the Chat Completions API regardless of whether client options are present.</violation>
</file>
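A minimal sketch of the fix the review asks for: apply the same `chatcompletions` → `.chat()` normalization in both the configured and default branches. The provider shape below mimics the AI SDK's OpenAI provider (callable for the Responses API, `.chat()` for /chat/completions); the names `selectLanguageModel` and `makeProvider` are illustrative, not the real code in `LLMProvider.ts`.

```typescript
// A LanguageModel stand-in recording which API surface it was created from.
type LanguageModel = { api: "responses" | "chat"; model: string };

// Mimics the AI SDK OpenAI provider: callable (Responses API) plus .chat().
interface OpenAILikeProvider {
  (model: string): LanguageModel;
  chat(model: string): LanguageModel;
}

function makeProvider(): OpenAILikeProvider {
  const p = ((model: string) => ({ api: "responses", model })) as OpenAILikeProvider;
  p.chat = (model: string) => ({ api: "chat", model });
  return p;
}

// Single dispatch helper used by BOTH branches, so chatcompletions/* always
// targets /chat/completions whether or not clientOptions were supplied.
function selectLanguageModel(
  providerName: string,
  subModelName: string,
  provider: OpenAILikeProvider,
): LanguageModel {
  return providerName === "chatcompletions"
    ? provider.chat(subModelName)
    : provider(subModelName);
}
```

Because the branch-specific logic collapses into one helper, the configured and default flows cannot drift apart again.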
Architecture diagram
sequenceDiagram
participant Client
participant Server as Server (Fastify)
participant Store as Session Store
participant Prov as LLM Provider (Core)
participant SDK as AI SDK Wrapper
participant LLM as External LLM API
Note over Client, LLM: Runtime flow for Model Base URL & Chat Completions Support
Client->>Server: Request (Header: x-model-base-url, Body: provider/model)
Server->>Server: NEW: getModelBaseURL()
Note right of Server: Checks body.options.model.baseURL <br/>OR x-model-base-url header
Server->>Store: createSession(modelBaseURL, apiKey, ...)
Store->>Prov: getAISDKLanguageModel(provider, model, baseURL)
alt NEW: Provider prefix is "chatcompletions/"
Prov->>Prov: Map to OpenAI provider instance
Prov->>Prov: NEW: Force .chat() method (bypasses /responses)
else Standard Provider
Prov->>Prov: Initialize standard AI SDK provider
end
Prov-->>Store: LanguageModel instance (with baseURL)
Store->>SDK: generateObject(schema, options)
alt NEW: Model requires Prompt JSON Fallback
SDK->>LLM: generateObject(output: "no-schema")
LLM-->>SDK: Raw JSON String / Partial Object
SDK->>SDK: NEW: Coerce stringified fields (e.g., "[]" to [])
alt Schema Validation Fails
SDK->>SDK: NEW: Heuristic fix (default missing arrays to [])
SDK->>SDK: safeParse() retry
end
else Native Structured Output
SDK->>LLM: generateObject(schema: ZodSchema)
LLM-->>SDK: Structured Data
end
SDK-->>Store: Validated Object
Store-->>Server: Session Result
Server-->>Client: 200 OK / Stream Response
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Greptile Summary
This PR adds support for OpenAI-compatible providers that only expose the /chat/completions endpoint. Key implementation areas:
Confidence Score: 3/5
Sequence Diagram
sequenceDiagram
participant Client as SDK Client
participant Server as server-v3
participant Header as header.ts
participant Store as InMemorySessionStore
participant Core as LLMProvider
participant AISDK as AI SDK
Client->>Server: POST /v1/sessions/start<br/>x-model-base-url header<br/>modelName: chatcompletions/glm-4-flash
Server->>Header: getModelBaseURL(request)
Header-->>Server: baseURL value
Server->>Header: getModelApiKey(request)
Header-->>Server: apiKey value
Server->>Store: getOrCreateStagehand(sessionId, ctx)
Store->>Core: getAISDKLanguageModel("chatcompletions", "glm-4-flash", clientOptions)
Note over Core: hasValidOptions = true<br/>(baseURL or apiKey present)
Core->>AISDK: createOpenAI({ baseURL, apiKey })
AISDK-->>Core: provider instance
Core->>AISDK: provider.chat("glm-4-flash")
Note over AISDK: Targets /chat/completions<br/>instead of /responses
AISDK-->>Core: LanguageModelV2
Core-->>Store: AISdkClient
Store-->>Server: V3 instance
Server-->>Client: sessionId + cdpUrl
packages/core/lib/v3/llm/aisdk.ts (Outdated)
for (const issue of firstTry.error.issues) {
  if (
    issue.code === "invalid_type" &&
    issue.expected === "array" &&
    issue.path.length === 1
  ) {
    raw[issue.path[0] as string] = [];
  }
}
parsed = options.response_model.schema.parse(raw);
Second parse() call can throw an untyped ZodError
After the array-field defaulting loop, options.response_model.schema.parse(raw) is called without a try/catch. If the response still fails validation for any reason other than a missing top-level array field (e.g., a nested object type mismatch, an extra required field), a raw ZodError is thrown. That error is caught by the outer catch (err) block, but that block only checks for NoObjectGeneratedError.isInstance(err) — a ZodError will just be re-thrown without the special logging context.
Consider wrapping this in a try/catch that converts ZodError into something more informative, or using .safeParse() again and surfacing the issues clearly:
  for (const issue of firstTry.error.issues) {
    if (
      issue.code === "invalid_type" &&
      issue.expected === "array" &&
      issue.path.length === 1
    ) {
      raw[issue.path[0] as string] = [];
    }
  }
- parsed = options.response_model.schema.parse(raw);
+ const secondTry = options.response_model.schema.safeParse(raw);
+ if (!secondTry.success) {
+   throw new Error(
+     `Model response could not be coerced into the expected schema: ${secondTry.error.message}`,
+   );
+ }
+ parsed = secondTry.data;
Try structured output (schema:) first for all models. Only fall back to no-schema + response coercion when the call fails and the model matches a known fallback pattern. This avoids degrading DeepSeek/Kimi which already work with schema:.
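The strategy described above can be sketched as follows. This is a hedged illustration, not the shipped code: `generateStructured` and `generateNoSchema` are hypothetical wrappers standing in for the two `generateObject` call shapes (`schema:` vs `output: "no-schema"` plus coercion).

```typescript
type Gen = (prompt: string) => Promise<unknown>;

// Models known to lack reliable native structured output.
const PROMPT_JSON_FALLBACK_PATTERNS = [/deepseek/i, /kimi/i, /glm/i];

async function generateWithFallback(
  modelId: string,
  prompt: string,
  generateStructured: Gen, // generateObject with schema:
  generateNoSchema: Gen, // generateObject with output: "no-schema" + coercion
): Promise<unknown> {
  try {
    // Try structured output first for all models, so providers that already
    // support schema: (e.g. DeepSeek/Kimi via some gateways) are not degraded.
    return await generateStructured(prompt);
  } catch (err) {
    // Only fall back when the model matches a known fallback pattern;
    // otherwise surface the original failure.
    if (PROMPT_JSON_FALLBACK_PATTERNS.some((p) => p.test(modelId))) {
      return generateNoSchema(prompt);
    }
    throw err;
  }
}
```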
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/lib/v3/llm/aisdk.ts">
<violation number="1" location="packages/core/lib/v3/llm/aisdk.ts:179">
P1: Models in `PROMPT_JSON_FALLBACK_PATTERNS` (deepseek, kimi, glm) will now always make a wasted API call that fails before falling back to no-schema mode. Previously these models skipped straight to the no-schema path. This doubles latency and cost for every extract call on these providers.
Consider keeping the original structure where `needsPromptJsonFallback` is checked *before* the first call, and only use the try-then-fallback pattern for models that are *not* in the known fallback list (i.e., the `chatcompletions/` prefix models that aren't predictable from the model ID).</violation>
</file>
packages/core/lib/v3/llm/aisdk.ts (Outdated)
// Try structured output first. If the provider doesn't support
// response_format (e.g. chatcompletions/ endpoints), this will throw
// and we fall back to no-schema mode with response coercion below.
objectResponse = await generateObject({
P1: Models in PROMPT_JSON_FALLBACK_PATTERNS (deepseek, kimi, glm) will now always make a wasted API call that fails before falling back to no-schema mode. Previously these models skipped straight to the no-schema path. This doubles latency and cost for every extract call on these providers.
Consider keeping the original structure where needsPromptJsonFallback is checked before the first call, and only use the try-then-fallback pattern for models that are not in the known fallback list (i.e., the chatcompletions/ prefix models that aren't predictable from the model ID).
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/lib/v3/llm/aisdk.ts, line 179:
<comment>Models in `PROMPT_JSON_FALLBACK_PATTERNS` (deepseek, kimi, glm) will now always make a wasted API call that fails before falling back to no-schema mode. Previously these models skipped straight to the no-schema path. This doubles latency and cost for every extract call on these providers.
Consider keeping the original structure where `needsPromptJsonFallback` is checked *before* the first call, and only use the try-then-fallback pattern for models that are *not* in the known fallback list (i.e., the `chatcompletions/` prefix models that aren't predictable from the model ID).</comment>
<file context>
@@ -173,19 +173,39 @@ You must respond in JSON format. respond WITH JSON. Do not include any other tex
+ // Try structured output first. If the provider doesn't support
+ // response_format (e.g. chatcompletions/ endpoints), this will throw
+ // and we fall back to no-schema mode with response coercion below.
+ objectResponse = await generateObject({
+ model: this.model,
+ messages: formattedMessages,
</file context>
- Skip schema attempt for chatcompletions/ models (provider: openai.chat) since they can't do structured output; avoids a wasted LLM call per extract
- Unify .chat() handling in getAISDKLanguageModel so chatcompletions/ works regardless of whether clientOptions are provided
- Guard second schema.parse() with safeParse + descriptive error message
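The first of those fixes (deciding before the first call whether a `schema:` attempt is worth making) might look like the sketch below. `shouldSkipSchemaAttempt` and the `usesChatCompletions` flag are illustrative names; the flag stands in for the real provider: openai.chat check.

```typescript
// Models known to fail schema: requests, matched by model ID.
const PROMPT_JSON_FALLBACK_PATTERNS = [/deepseek/i, /kimi/i, /glm/i];

// chatcompletions/ models can't do structured output at all, and
// fallback-pattern models fail it reliably, so neither should pay for a
// doomed first call.
function shouldSkipSchemaAttempt(
  modelId: string,
  usesChatCompletions: boolean,
): boolean {
  return (
    usesChatCompletions ||
    PROMPT_JSON_FALLBACK_PATTERNS.some((p) => p.test(modelId))
  );
}
```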
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/lib/v3/llm/aisdk.ts">
<violation number="1" location="packages/core/lib/v3/llm/aisdk.ts:291">
P1: Custom agent: **Exception and error message sanitization**
Generic `new Error()` with unsanitized Zod error message that may reflect sensitive prompt data back to the caller. Per the error-sanitization rule, use a typed error class and strip or redact the raw Zod message (which can contain actual field values from the model response).</violation>
</file>
// 4. Validate against schema
const secondTry = options.response_model.schema.safeParse(raw);
if (!secondTry.success) {
  throw new Error(
P1: Custom agent: Exception and error message sanitization
Generic new Error() with unsanitized Zod error message that may reflect sensitive prompt data back to the caller. Per the error-sanitization rule, use a typed error class and strip or redact the raw Zod message (which can contain actual field values from the model response).
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/lib/v3/llm/aisdk.ts, line 291:
<comment>Generic `new Error()` with unsanitized Zod error message that may reflect sensitive prompt data back to the caller. Per the error-sanitization rule, use a typed error class and strip or redact the raw Zod message (which can contain actual field values from the model response).</comment>
<file context>
@@ -172,115 +172,129 @@ You must respond in JSON format. respond WITH JSON. Do not include any other tex
+ // 4. Validate against schema
+ const secondTry = options.response_model.schema.safeParse(raw);
+ if (!secondTry.success) {
+ throw new Error(
+ `Model response could not be coerced into the expected schema: ${secondTry.error.message}`,
+ );
</file context>
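One way to address the sanitization finding above: a typed error whose message carries only structural information from the Zod issues (paths and issue codes), never the raw field values from the model response. The class name `SchemaCoercionError` is illustrative.

```typescript
// Typed error for schema-coercion failures. Only issue paths and codes
// reach the message; raw model-response values are deliberately omitted
// so prompt/response data is never reflected back to the caller.
class SchemaCoercionError extends Error {
  constructor(issues: { path: (string | number)[]; code: string }[]) {
    const summary = issues
      .map((i) => `${i.path.join(".") || "<root>"}: ${i.code}`)
      .join("; ");
    super(
      `Model response could not be coerced into the expected schema (${summary})`,
    );
    this.name = "SchemaCoercionError";
  }
}
```

The caller would pass `secondTry.error.issues` (Zod exposes `path` and `code` on each issue) instead of interpolating `secondTry.error.message`.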
…PI specs, and stainless config
✱ Stainless preview builds
This PR will update the … Edit this comment to update it. It will appear in the SDK's changelogs.
✅ stagehand-typescript studio · code · diff
✅ stagehand-openapi studio · code · diff
⚡ stagehand-ruby studio · conflict
✅ stagehand-php studio · code · diff
✅ stagehand-go studio · code · diff
⚡ stagehand-kotlin studio · conflict
⚡ stagehand-java studio · conflict
⚡ stagehand-python studio · conflict
✅ stagehand-csharp studio · code · diff
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
why
providers that only support /chat/completions are not supported
what changed
fallback-pattern models
sister python PR here: browserbase/stagehand-python#318
test plan
tested locally with ZhipuAI glm-4-flash: observe, act, extract, and agent execute all pass
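For reference, a client request exercising this path (per the sequence diagram: `/v1/sessions/start` with the `x-model-base-url` header and a `chatcompletions/` model name) might be built like this. The server URL, the `x-model-api-key` header name, and the ZhipuAI base URL are assumptions for illustration; only the endpoint, the `x-model-base-url` header, and the model prefix come from this PR.

```typescript
// Build the start-session request for a /chat/completions-only provider.
function buildStartSessionRequest(
  serverUrl: string,
  baseURL: string,
  apiKey: string,
  modelName: string,
): { url: string; method: string; headers: Record<string, string>; body: string } {
  return {
    url: `${serverUrl}/v1/sessions/start`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-model-base-url": baseURL, // read server-side by getModelBaseURL()
      "x-model-api-key": apiKey, // hypothetical header name for the API key
    },
    body: JSON.stringify({ modelName }),
  };
}

// Example: route through ZhipuAI's OpenAI-compatible endpoint (URL assumed).
const req = buildStartSessionRequest(
  "http://localhost:3000",
  "https://open.bigmodel.cn/api/paas/v4",
  "sk-placeholder",
  "chatcompletions/glm-4-flash",
);
```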