fix(codex): auth-aware default model — fixes codex empty builds (#82 Gap 3)#85
Merged
On ChatGPT-account auth the `-codex` models return HTTP 400 "not supported when using Codex with a ChatGPT account" in ~3s. SWE-AF defaulted every codex role to `gpt-5.3-codex` regardless of auth mode, so a ChatGPT-auth codex build hit that 400, the coder got an error result and returned `files_changed:[]`, and the foundation issue failed and cascaded into an empty build (#82 Gap 3). The same plan on claude_code worked, which is why it looked codex-specific.

Resolve the codex base model by auth mode instead: keep `gpt-5.3-codex` for API-key auth (where the `-codex` models are available), and use a ChatGPT-compatible model (`gpt-5.5`) when codex authenticates via a ChatGPT account (SWE_CODEX_AUTH_MODE=chatgpt, or auto with no OPENAI_API_KEY). Explicit model overrides (models.default / per-role / SWE_DEFAULT_MODEL) still win.

Verified live: gpt-5.3-codex 400s in 3s on ChatGPT auth; the resolved gpt-5.5 writes real code in ~8s.

Refs #82 (Gap 3).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
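The resolution rule in this commit message can be sketched roughly as below. This is an illustration only, not the code in `schemas.py`: the helper name `resolve_codex_default_model` and its parameters are hypothetical, and three behaviours are assumptions rather than facts from this PR: an unset `SWE_CODEX_AUTH_MODE` is treated as `auto`, an unrecognised mode falls back to the API-key default, and a per-call `explicit_model` outranks `SWE_DEFAULT_MODEL`.

```python
from typing import Mapping, Optional

# Model names taken from the commit message above.
API_KEY_DEFAULT = "gpt-5.3-codex"
CHATGPT_DEFAULT = "gpt-5.5"


def resolve_codex_default_model(
    env: Mapping[str, str],
    explicit_model: Optional[str] = None,
) -> str:
    """Hypothetical helper: pick the codex base model from the auth mode."""
    # Explicit overrides win over any auth-based default.
    # (The order between the two override sources is an assumption.)
    override = explicit_model or env.get("SWE_DEFAULT_MODEL")
    if override:
        return override

    # Assumption: an unset SWE_CODEX_AUTH_MODE behaves like "auto".
    mode = env.get("SWE_CODEX_AUTH_MODE", "auto").strip().lower()
    if mode == "auto":
        # "auto" means API-key auth only when a key is actually present.
        mode = "api_key" if env.get("OPENAI_API_KEY") else "chatgpt"

    # Assumption: anything other than "chatgpt" keeps the API-key default.
    return CHATGPT_DEFAULT if mode == "chatgpt" else API_KEY_DEFAULT
```

Passing the environment in as a mapping (for example `os.environ`) rather than reading it globally keeps the rule easy to unit-test under both auth modes.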
…coder

A codex model/auth 400 (e.g. a `-codex` model under ChatGPT-account auth, or a model needing a newer Codex CLI) is non-retryable: retrying with the same model and auth fails identically. Previously it was neither matched as fatal nor surfaced. The coder fell through to a bare `files_changed:[]` / "Coder agent failed" with no reason, burning the retry cap and cascading into a silent empty build.

- fatal_error.py: match "not supported when using Codex with a ChatGPT account" and "requires a newer version of Codex" so check_fatal_harness_error raises FatalHarnessError with the real message and short-circuits retries.
- run_coder: when the harness returns no parseable result, include the underlying error_message in the CoderResult summary so an empty result always carries *why*.

Refs #82 (Gap 3).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
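A minimal sketch of the two behaviours described in this commit, for illustration only. The names `check_fatal_harness_error` and `FatalHarnessError` come from the commit message, but the signature, the stand-in exception class, the case-insensitive matching, and the helper `summarize_empty_result` are assumptions, not the actual contents of `fatal_error.py` or `run_coder`.

```python
from typing import Optional

# The two substrings named in the commit message.
FATAL_CODEX_MARKERS = (
    "not supported when using Codex with a ChatGPT account",
    "requires a newer version of Codex",
)


class FatalHarnessError(RuntimeError):
    """Stand-in for the project's exception of the same name."""


def check_fatal_harness_error(error_message: Optional[str]) -> None:
    """Raise if the harness error can never succeed on retry."""
    if not error_message:
        return
    lowered = error_message.lower()  # case-insensitive match is an assumption
    for marker in FATAL_CODEX_MARKERS:
        if marker.lower() in lowered:
            # Carry the real message so the caller can surface it.
            raise FatalHarnessError(error_message)


def summarize_empty_result(error_message: Optional[str]) -> str:
    """Hypothetical helper: build a summary that says why nothing was produced."""
    base = "Coder agent failed"
    return f"{base}: {error_message}" if error_message else base
```

Raising immediately on a matched message is what lets a caller skip the retry loop, since retrying with the same model and auth would fail the same way.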
Summary

Root-causes and fixes the codex empty-builds observation in #82 (Gap 3). It is not a structured-output problem; it is a model/auth mismatch, confirmed live.

SWE-AF defaulted every codex role to `gpt-5.3-codex`. The `-codex` models are only available with OpenAI API-key auth; under ChatGPT-account auth they return:

…

in ~3 seconds. The coder turned that fast error into `files_changed: []` → "Coder agent failed" → the foundation issue failed → cascade → empty build. That's the reporter's exact symptom (~9–11s, codex-only; `claude_code` fine, because it never touched codex).

Fix
1. Auth-aware codex default model (`schemas.py`, `fast/schemas.py`)

Resolve the codex base model by auth mode instead of a constant:

- API-key auth (`SWE_CODEX_AUTH_MODE=api_key`, or `auto` with `OPENAI_API_KEY` set) → `gpt-5.3-codex` (unchanged).
- ChatGPT-account auth (`chatgpt`, or `auto` with no key) → `gpt-5.5` (a ChatGPT-compatible model).

Explicit overrides (`models.default` / per-role / `SWE_DEFAULT_MODEL`) still win.

2. Codex model/auth errors are fatal + surfaced (`fatal_error.py`, `execution_agents.py`)

A model/auth 400 is non-retryable, so:

- `check_fatal_harness_error` now matches "not supported when using Codex with a ChatGPT account" and "requires a newer version of Codex" → raises `FatalHarnessError` with the real message, short-circuiting the retry cap.
- `run_coder` includes the underlying `error_message` in the `CoderResult` summary when no result parses, so an empty result always carries why (the reporter saw a bare "Coder agent failed" with no reason).

Validation Contract
- ChatGPT-auth codex default = `gpt-5.5`; API-key-auth codex default = `gpt-5.3-codex`; explicit overrides win; `claude_code` / `open_code` unaffected.

Test Plan: verified live (codex CLI 0.142.4, ChatGPT auth)

- `gpt-5.3-codex` → 400 "not supported … ChatGPT account" in 3s; resolved `gpt-5.5` wrote a real `hello.py` in 8s.
- `resolve_runtime_models` / `fast_resolve_models` return `gpt-5.5` under chatgpt env, `gpt-5.3-codex` under api_key env (new unit tests).
- `make check` on py3.12: 1010 passed, 1 skipped.

Scope
SWE-AF-only; independent of #84 (Gaps 1 & 2) and of the agentfield SDK, so there is no cross-repo dependency.

Tip for ChatGPT-plan users: `gpt-5.5` runs at high reasoning effort (slower per call); set `models.coder` or `SWE_DEFAULT_MODEL` to override if you want lower latency.

Refs #82 (Gap 3).
🤖 Generated with Claude Code