fix(build): predictable error handling + transient-read retry for build logs by kristof-siket · Pull Request #104 · prisma/prisma-cli

kristof-siket · 2026-06-30T13:52:17Z

What

Two fixes to the build logs <build_id> command's failure behavior, stacked on top of #102 (now merged to main).

1. Predictable error handling — 401 maps to the shared auth error

The command streams GET /v1/builds/{buildId}/logs through openapi-fetch, which does not throw on non-2xx — a 401 arrives as response.status === 401. The command-runner's SDKAuthError → authRequiredError mapping therefore never fired for this path, so an expired/invalid CLI session surfaced as a generic BUILD_LOGS_FAILED ("Failed to read build logs… / Retry").

A 401 now maps to the shared authRequiredError(["prisma-cli auth login"]), so the output is consistent with the rest of the CLI and tells the user what to actually do. 404 → BUILD_NOT_FOUND and the generic fallback are unchanged; 403 is intentionally not special-cased (the API uses 404-collapse for unauthorized access).
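
As an illustration of the mapping above, here is a minimal sketch. The names `classifyOpenStatus`, `OpenDisposition`, and `TRANSIENT_STATUSES` are hypothetical and are not the identifiers used in build.ts; the transient status set is the one listed in section 2 of this description. Because openapi-fetch resolves with the response instead of throwing on non-2xx, the status has to be inspected explicitly:

```typescript
// Hypothetical names; only the status-to-behavior mapping comes from this PR description.
type OpenDisposition =
  | { kind: "ok" }
  | { kind: "retry" }
  | { kind: "fail"; code: "AUTH_REQUIRED" | "BUILD_NOT_FOUND" | "BUILD_LOGS_FAILED" };

// Statuses the description treats as transient when opening the stream.
const TRANSIENT_STATUSES: ReadonlySet<number> = new Set([408, 429, 500, 502, 503, 504]);

function classifyOpenStatus(status: number): OpenDisposition {
  if (status >= 200 && status < 300) return { kind: "ok" };
  if (status === 401) return { kind: "fail", code: "AUTH_REQUIRED" };
  if (status === 404) return { kind: "fail", code: "BUILD_NOT_FOUND" };
  if (TRANSIENT_STATUSES.has(status)) return { kind: "retry" };
  // 403 is deliberately not special-cased, so it lands in the generic failure.
  return { kind: "fail", code: "BUILD_LOGS_FAILED" };
}
```

Keeping the classification in one pure function makes the "surface immediately" versus "retry" decision easy to test without a network stub.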

2. Bounded retry honoring the API's `retryable` signal, resuming from cursor

The read (stream open + NDJSON consumption) now runs inside a bounded retry loop:

- 3 attempts max, backing off ~500ms then ~1500ms, abort-aware (a cancel during the wait returns cleanly, not as a failure).
- Retries on: a transient open status (408, 429, 500, 502, 503, 504), a non-Abort network-style error from the GET/stream read, or the Management API's existing terminal error record with retryable: true.
- Surfaces immediately (no retry): 401 → authRequiredError, 404 → BUILD_NOT_FOUND, any other non-transient status → BUILD_LOGS_FAILED, or a terminal error with retryable: false.
- Resumes from the last cursor (tracked across log records and the terminal record) on every reconnect, so resumed reads don't reprint already-emitted output.
- On exhaustion after a retryable failure, behavior is unchanged from before: the terminal error message prints to stderr and process.exitCode = 1 (terminal-record path), or the generic CliError is thrown (transient-open / network path).
- --follow keeps working: a retryable drop reconnects from cursor through the same bounded loop (the attempt count stays bounded; follow does not retry unbounded).
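
The loop described in the list above could be sketched roughly as follows. This is not the code in build.ts: `readWithRetry`, `ReadResult`, `RetryDeps`, and `LoopResult` are hypothetical names, and the sketch collapses the two exhaustion paths (stderr plus exit code versus a thrown CliError) into a single `failed` result for brevity. The sleep and the backoff schedule are injected, as the tests in this PR do with zeroed backoff.

```typescript
// Hypothetical shapes; the controller's real types are not shown in this description.
type ReadResult =
  | { kind: "done"; cursor?: string }
  | { kind: "retryable"; cursor?: string; message: string }
  | { kind: "fatal"; cursor?: string; message: string };

interface RetryDeps {
  // One read attempt (open + consume), resuming from `cursor` when one is known.
  readOnce: (cursor: string | undefined) => Promise<ReadResult>;
  // Waits before the next attempt; resolves false if cancelled during the wait.
  sleep: (ms: number) => Promise<boolean>;
  // Delays between attempts, so the attempt budget is backoffMs.length + 1.
  backoffMs: readonly number[];
}

type LoopResult =
  | { kind: "done" }
  | { kind: "cancelled" }
  | { kind: "failed"; message: string };

async function readWithRetry(deps: RetryDeps): Promise<LoopResult> {
  const maxAttempts = deps.backoffMs.length + 1;
  let cursor: string | undefined;
  let lastMessage = "";
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await deps.readOnce(cursor);
    // Keep the newest cursor so a reconnect does not replay emitted output.
    cursor = result.cursor ?? cursor;
    if (result.kind === "done") return { kind: "done" };
    if (result.kind === "fatal") return { kind: "failed", message: result.message };
    lastMessage = result.message;
    if (attempt < maxAttempts - 1) {
      const completed = await deps.sleep(deps.backoffMs[attempt] ?? 0);
      // A cancel during the wait ends the loop cleanly rather than as a failure.
      if (!completed) return { kind: "cancelled" };
    }
  }
  return { kind: "failed", message: lastMessage };
}
```

With `backoffMs: [500, 1500]` this gives the three-attempt budget from the description; passing `[0, 0]` and a no-op sleep keeps tests deterministic.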

This fixes the intermittent 408-from-Durable-Streams "Failed to read build logs." seen right after a build completes.

Trace evidence

A real failure showed the two-failure pattern this PR targets:

- First attempt: 401; the CLI JWT had expired. (Now → an actionable authRequiredError instead of a generic read failure.)
- Second attempt: the route returned 200, but the streams server (GET …/v1/stream/build-logs) returned HTTP 408 after ~5.2s. (Now → a retryable transient; the loop reconnects from the last cursor.)
- Third attempt: succeeded.

Tests

New packages/cli/tests/build-logs-controller.test.ts (deterministic, backoff injected as zeros — no real timers):

- 401 → authRequiredError (asserts AUTH_REQUIRED, not BUILD_LOGS_FAILED).
- Retryable terminal error on attempt 1, success on attempt 2 → succeeds, prints both log batches, and the second read carries the resumed cursor.
- Retryable failure on all 3 attempts → process.exitCode = 1, message surfaced exactly once.
- Transient open status (503) then success → retried.
- --follow drop → reconnects from cursor with follow=true preserved.
- 404 and non-retryable terminal end/no_logs → not retried (unchanged).

pnpm --filter @prisma/cli test (562 tests) green; tsc --noEmit, tsdown build, and biome check all clean.

Notes

- This was originally branched off feat/build-logs-command (feat(build): add build logs <build_id> command #102). Since #102 merged to main first, the branch is rebased onto main and targets main; the diff is exactly these two files (no #102 diff). It can still be folded into the #102 line of work if preferred.
- No new generic retry framework: the loop, the abort-aware sleep, and the small phase helpers (openBuildLogStream / consumeBuildLogStream) are local to build.ts.
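
For readers unfamiliar with the pattern, an abort-aware sleep can be sketched as below. This is an assumption about the shape of such a helper, not the one in build.ts: here the promise resolves to `false` on cancellation, whereas the real helper may instead reject with an AbortError that the caller handles.

```typescript
// A sketch only: resolves true when the delay elapses, false when the signal aborts first.
function sleep(ms: number, signal: AbortSignal): Promise<boolean> {
  return new Promise((resolve) => {
    // Already cancelled: do not start a timer at all.
    if (signal.aborted) {
      resolve(false);
      return;
    }
    const onAbort = () => {
      clearTimeout(timer); // stop the pending wait so the process can exit promptly
      resolve(false);
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve(true);
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}
```

Clearing the timer on abort matters for a CLI: a leftover 1500ms timer would otherwise keep the process alive after the user cancels.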

🤖 Generated with Claude Code

…ld logs

`build logs` now maps a 401 from the streaming endpoint to the shared authRequiredError ("run prisma-cli auth login") instead of a generic "Failed to read build logs". The SDK returns the 401 as a response, so the command-runner's SDKAuthError mapping never fired for this path.

The read (open + NDJSON consumption) now runs inside a bounded retry loop: at most 3 attempts, backing off ~500ms then ~1500ms, abort-aware. It retries a transient open status (408/429/5xx), a non-Abort network error, or the Management API's existing retryable terminal error record, resuming each reconnect from the last cursor so output isn't reprinted. 401, 404, other non-transient statuses, and non-retryable terminals are surfaced immediately. --follow reconnects through the same bounded loop.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

kristof-siket · 2026-06-30T14:04:56Z

@CodeRabbit review

coderabbitai · 2026-06-30T14:05:04Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2026-06-30T14:05:30Z

Summary by CodeRabbit

Bug Fixes
- Improved build log streaming so it can reconnect after temporary network or server issues without duplicating output.
- Preserved the latest log position during retries, helping follow-mode stay in sync after interruptions.
- Made error messages clearer when logs can’t be read after multiple attempts.
Tests
- Added coverage for authentication errors, missing builds, transient failures, retry exhaustion, and reconnect behavior.

Walkthrough

runBuildLogs in packages/cli/src/controllers/build.ts is refactored from a single-shot stream read into a retryable loop. It gains an injectable BuildLogsDeps parameter (carrying backoffMs), a ReadOutcome discriminated union classifying open and consume results, cursor-based resumption across reconnects, cancellable backoff via a new sleep helper, and outcome-based retry/exhaustion logic including process.exitCode assignment. buildLogsRequestError's why message is adjusted for HTTP vs non-HTTP failures. A new Vitest suite covering all retry/no-retry scenarios is added.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly matches the main change: build log error handling and transient-read retries.
Description check	✅ Passed	The description is directly related to the build logs retry and auth-handling changes.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/cli/src/controllers/build.ts`:
- Around line 251-274: The build flow in build() currently lets terminal errors
with retryable false fall through to writeBuildLogRecord() and then return done,
which can incorrectly succeed. Update the terminal-record handling branch in the
record-processing loop to detect non-retryable terminal errors (using
record.type, record.kind, and record.retryable) and immediately fail via a
distinct fatal-terminal outcome or by setting process.exitCode = 1, while
keeping retryable terminals in the existing retryable-terminal path.

In `@packages/cli/tests/build-logs-controller.test.ts`:
- Around line 87-212: The build logs controller tests are missing coverage for
two retry paths in the controller logic: a rejected GET that is not an
AbortError, and an AbortError raised during the retry backoff. Add one focused
test for each branch in build-logs-controller.test.ts, using runWithClient and
the existing get stub to drive the specific failure mode and assert the expected
retry/exit behavior so regressions in the controller’s retry handling are
caught.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6b386456-f543-43a0-96fd-4b8d48a6b376

📥 Commits

Reviewing files that changed from the base of the PR and between 0d559ab and 912d9d3.

📒 Files selected for processing (2)

packages/cli/src/controllers/build.ts
packages/cli/tests/build-logs-controller.test.ts

coderabbitai · 2026-06-30T14:09:06Z

+      if (
+        record.type === "terminal" &&
+        record.kind === "error" &&
+        record.retryable
+      ) {
+        retryableTerminal = record;
+        return;
+      }
+      writeBuildLogRecord(context, record);
+    });
+  } catch (error) {
+    if (isAbortError(error) || context.runtime.signal.aborted) {
+      throw error;
    }
-    writeBuildLogRecord(context, record);
-  });
+    return { outcome: { kind: "retryable-network" }, cursor: latestCursor };
+  }

-  if (sawError) {
-    process.exitCode = 1;
+  if (retryableTerminal) {
+    return {
+      outcome: { kind: "retryable-terminal", record: retryableTerminal },
+      cursor: latestCursor,
+    };
  }
+  return { outcome: { kind: "done" }, cursor: latestCursor };


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Make non-retryable terminal errors fail the command.

A terminal { kind: "error", retryable: false } falls through to writeBuildLogRecord() and then returns done, so the command can exit 0 after an error terminal. Return a distinct fatal-terminal outcome or set process.exitCode = 1 for non-retryable terminal errors. As per PR objectives, “Non-transient cases like 404 and non-retryable terminal errors remain immediate failures.”
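
One way the suggested distinction could look, as a sketch under assumptions: the record shapes below are inferred from the fields used in this PR's tests (the `log` record shape in particular is a guess), and `classifyRecord` and `RecordDisposition` are hypothetical names rather than code from build.ts.

```typescript
// Hypothetical record shapes, inferred from the fields the tests in this PR use.
interface TerminalRecord {
  type: "terminal";
  kind: "end" | "error";
  code: string;
  retryable: boolean;
  cursor: string;
  message: string;
}

interface LogRecord {
  type: "log";
  cursor: string;
  message: string;
}

type BuildLogRecord = TerminalRecord | LogRecord;

type RecordDisposition = "write" | "retryable-terminal" | "fatal-terminal";

function classifyRecord(record: BuildLogRecord): RecordDisposition {
  if (record.type === "terminal" && record.kind === "error") {
    // Error terminals never count as a clean finish; `retryable` only picks the failure path.
    return record.retryable ? "retryable-terminal" : "fatal-terminal";
  }
  // Log lines and non-error terminals (for example end/no_logs) are written as before.
  return "write";
}
```

A caller would treat `fatal-terminal` as an immediate failure (for example by setting process.exitCode = 1), leaving the existing retryable path untouched.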

coderabbitai · 2026-06-30T14:09:06Z

+describe("build logs controller", () => {
+  it("maps a 401 to the shared auth-required error", async () => {
+    const get = vi.fn().mockResolvedValue(errorResult(401));
+    const { run } = await runWithClient(get);
+
+    await expect(run).rejects.toMatchObject({
+      code: "AUTH_REQUIRED",
+      domain: "auth",
+    });
+    expect(get).toHaveBeenCalledTimes(1);
+  });
+
+  it("retries a retryable terminal error and resumes from the cursor", async () => {
+    const get = vi
+      .fn()
+      .mockResolvedValueOnce(
+        streamResult([logLine("first", "c1"), retryableTerminal("c1")]),
+      )
+      .mockResolvedValueOnce(
+        streamResult([
+          logLine("second", "c2"),
+          {
+            type: "terminal",
+            kind: "end",
+            code: "end",
+            retryable: false,
+            cursor: "c2",
+            message: "",
+          },
+        ]),
+      );
+    const { run, stdout } = await runWithClient(get);
+
+    await run;
+
+    expect(process.exitCode).toBeUndefined();
+    expect(stdout.buffer).toContain("first");
+    expect(stdout.buffer).toContain("second");
+    expect(get).toHaveBeenCalledTimes(2);
+    expect(queryOf(get.mock.calls[0])).not.toHaveProperty("cursor");
+    expect(queryOf(get.mock.calls[1])).toMatchObject({ cursor: "c1" });
+  });
+
+  it("exits non-zero and surfaces the message once when every attempt fails", async () => {
+    const get = vi
+      .fn()
+      .mockImplementation(async () => streamResult([retryableTerminal("c1")]));
+    const { run, stderr } = await runWithClient(get);
+
+    await run;
+
+    expect(process.exitCode).toBe(1);
+    expect(get).toHaveBeenCalledTimes(3);
+    const occurrences =
+      stderr.buffer.split("Failed to read build logs.").length - 1;
+    expect(occurrences).toBe(1);
+  });
+
+  it("reconnects a dropped --follow stream from the cursor", async () => {
+    const get = vi
+      .fn()
+      .mockResolvedValueOnce(
+        streamResult([logLine("a", "c1"), retryableTerminal("c1")]),
+      )
+      .mockResolvedValueOnce(streamResult([logLine("b", "c2")]));
+    const { run, stdout } = await runWithClient(get, { follow: true });
+
+    await run;
+
+    expect(process.exitCode).toBeUndefined();
+    expect(get).toHaveBeenCalledTimes(2);
+    expect(queryOf(get.mock.calls[0])).toMatchObject({ follow: "true" });
+    expect(queryOf(get.mock.calls[1])).toMatchObject({
+      follow: "true",
+      cursor: "c1",
+    });
+    expect(stdout.buffer).toContain("a");
+    expect(stdout.buffer).toContain("b");
+  });
+
+  it("retries a transient open status and then succeeds", async () => {
+    const get = vi
+      .fn()
+      .mockResolvedValueOnce(errorResult(503))
+      .mockResolvedValueOnce(streamResult([logLine("ok", "c1")]));
+    const { run, stdout } = await runWithClient(get);
+
+    await run;
+
+    expect(process.exitCode).toBeUndefined();
+    expect(get).toHaveBeenCalledTimes(2);
+    expect(stdout.buffer).toContain("ok");
+  });
+
+  it("does not retry a 404 and surfaces BUILD_NOT_FOUND", async () => {
+    const get = vi.fn().mockResolvedValue(errorResult(404));
+    const { run } = await runWithClient(get);
+
+    await expect(run).rejects.toMatchObject({ code: "BUILD_NOT_FOUND" });
+    expect(get).toHaveBeenCalledTimes(1);
+  });
+
+  it("does not retry a non-retryable terminal end", async () => {
+    const get = vi.fn().mockResolvedValue(
+      streamResult([
+        logLine("only", "c1"),
+        {
+          type: "terminal",
+          kind: "end",
+          code: "no_logs",
+          retryable: false,
+          cursor: "c1",
+          message: "No logs were produced.",
+        },
+      ]),
+    );
+    const { run, stdout, stderr } = await runWithClient(get);
+
+    await run;
+
+    expect(process.exitCode).toBeUndefined();
+    expect(get).toHaveBeenCalledTimes(1);
+    expect(stdout.buffer).toContain("only");
+    expect(stderr.buffer).toContain("No logs were produced.");
+  });
+});


📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add coverage for the two untested retry branches.

This suite still never drives a rejected GET (non-AbortError) or an abort during the retry backoff, even though the controller now has dedicated logic for both. A regression in either path would currently pass unnoticed; please add one test for each branch.

coderabbitai Bot requested changes Jun 30, 2026

kristof-siket wants to merge 1 commit into main from feat/build-logs-retry-auth
