perf(sandbox): reduce image build time by skipping broad permission repair #4018

Merged
ericksoa merged 7 commits into main from fix/sandbox-build-perf
May 22, 2026
Conversation

@zyang-dev
Contributor

@zyang-dev zyang-dev commented May 21, 2026

Summary

This PR reduces sandbox image build time by avoiding broad recursive .openclaw permission repair on current unified-layout sandbox bases. It keeps the conservative repair path for legacy .openclaw-data migrations and adds Docker build step timing to make future build bottlenecks visible.
DGX Spark build-time comparison:

  • Original/restored Dockerfile: 178.1s
  • Optimized Dockerfile: 68.0s

Changes

  • Gate legacy .openclaw-data symlink verification so it only runs when legacy migration occurs.
  • Replace unconditional recursive .openclaw permission repair with a legacy-only broad repair and a targeted fast path for current unified layouts. The fast path relies on the current unified .openclaw layout being provisioned earlier in the image: Dockerfile.base creates the known state directories as sandbox:sandbox with group-write/setgid permissions, and openclaw plugins install runs as USER sandbox, so plugin-runtime-deps contents are already owned by sandbox:sandbox before the final repair step.
  • Add behavior-based regression coverage for the .openclaw repair path: current layouts use targeted permission repair, while legacy .openclaw-data migrations still use broad recursive repair.
  • Add an explicit .openclaw layout contract check so future directory/file layout changes fail tests until the targeted permission repair assumptions are reviewed.
  • Add classic Docker build step timing to sandbox create progress output.
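
The branch between the broad and targeted repair can be pictured with a minimal sketch. This is not the PR's Dockerfile code: the function name `repair_openclaw_perms`, its arguments, and the exact legacy-path commands are assumptions for illustration, and `chown` is left out because it needs root. Only the 2770 and 660 modes on the fast path come from the review discussion of this PR.

```shell
#!/bin/sh
# Hypothetical sketch of marker-gated permission repair.
# repair_openclaw_perms ROOT MARKER
#   ROOT   - path of the .openclaw directory
#   MARKER - file whose presence means a legacy migration ran
# chown is omitted (it needs root); only mode bits are changed here.
repair_openclaw_perms() {
    root=$1
    marker=$2
    if [ -e "$marker" ]; then
        # Legacy path (assumed commands): walk the whole tree.
        # This is the slow step on large plugin trees.
        chmod -R g+rwX "$root"
        find "$root" -type d -exec chmod g+s {} +
        echo "broad"
    else
        # Fast path: touch only the known top-level entries and rely on
        # earlier image layers for everything nested below them.
        chmod 2770 "$root" "$root/plugin-runtime-deps"
        chmod 660 "$root/openclaw.json"
        echo "targeted"
    fi
}

# Demo on a throwaway tree with no marker file, so the fast path runs.
demo=$(mktemp -d)
mkdir -p "$demo/.openclaw/plugin-runtime-deps"
: > "$demo/.openclaw/openclaw.json"
repair_openclaw_perms "$demo/.openclaw" "$demo/legacy-marker"
rm -rf "$demo"
```

The fast path does a fixed, small number of `chmod` calls regardless of how many files the plugin install produced, which is where the build-time saving would come from under these assumptions.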

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Additional validation:

  • Verified in a freshly built sandbox that /sandbox/.openclaw/plugin-runtime-deps and sampled contents are owned by sandbox:sandbox; no non-sandbox:sandbox entries were found, and a sandbox-user write probe in plugin-runtime-deps succeeded.
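
A check of this kind could look roughly like the sketch below. The helper names `audit_owner` and `write_probe` are hypothetical and the commands actually used for the validation are not shown in this PR; in a real sandbox the arguments would presumably be `/sandbox/.openclaw/plugin-runtime-deps`, `sandbox`, and `sandbox`.

```shell
#!/bin/sh
# Hypothetical helpers for an ownership audit plus a write probe.

# audit_owner DIR USER GROUP
# Prints the first entry under DIR not owned by USER:GROUP (empty if none).
audit_owner() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print | head -n 1
}

# write_probe DIR
# Succeeds if the current user can create and then remove a file in DIR.
write_probe() {
    probe="$1/.write-probe.$$"
    ( : > "$probe" ) 2>/dev/null && rm -f "$probe"
}

# Demo against a throwaway directory, using the current user's numeric IDs
# as stand-ins for sandbox:sandbox.
demo=$(mktemp -d)
: > "$demo/example.txt"
offender=$(audit_owner "$demo" "$(id -u)" "$(id -g)")
if [ -z "$offender" ] && write_probe "$demo"; then
    echo "ownership and write probe OK"
else
    echo "unexpected owner or unwritable: ${offender:-write probe failed}"
fi
rm -rf "$demo"
```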

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>

Summary by CodeRabbit

  • New Features

    • Added build timing information to Docker build output for improved transparency.
  • Bug Fixes

    • Improved legacy data migration cleanup with stricter validation and error handling.
    • Enhanced initialization and permission management for more consistent behavior.
  • Tests

    • Added regression tests for permission repair logic.
    • Refactored port validation tests for better maintainability.


Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
@zyang-dev zyang-dev self-assigned this May 21, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ce5333d1-f2c4-4363-8de7-58a25515d718

📥 Commits

Reviewing files that changed from the base of the PR and between 42e229c and 38f860e.

📒 Files selected for processing (1)
  • Dockerfile

📝 Walkthrough


This pull request refactors the Dockerfile's legacy OpenClaw layout migration to track and conditionally apply permission repairs, adds build-step timing instrumentation to the sandbox stream processor, includes sandboxed regression tests for layout/permission handling, and centralizes entrypoint port validation test assertions.

Changes

Legacy OpenClaw Layout & Conditional Permissions

| Layer / File(s) | Summary |
| --- | --- |
| Legacy layout tracking and cleanup enforcement (Dockerfile) | Initialize legacy_layout state and marker file, record when legacy data directory exists, fail the build if legacy data persists post-cleanup, and scan .openclaw symlinks for unsafe pointers to legacy paths when migration occurred. |
| Directory initialization and conditional permission repair (Dockerfile) | Create plugin-runtime-deps subdirectories with install -d mode 2770, initialize metadata files via separate touch/chown/chmod steps, and apply marker-gated permissions: broad recursive repair with directory setgid when legacy cleanup ran, or targeted repair to .openclaw, openclaw.json, and plugin-runtime-deps for modern layout. |
| Sandboxed regression tests for cleanup and permission repair (test/sandbox-provisioning.test.ts) | Helper runOpenclawRepairLayoutCase extracts cleanup and permission-repair RUN blocks from Dockerfile, executes them in sandbox with optional legacy data seeding, captures filesystem command logs, and returns marker/directory/file state. Regression test runs both modern and legacy scenarios, asserting targeted vs broad chown/chmod/find sequences, marker lifecycle across phases, and expected directory/file structure. |
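
The legacy tracking and gated symlink scan summarized above can be illustrated with a rough sketch. The function names are hypothetical, the `rm -rf` is only a stand-in for the real migration and cleanup, and the marker file is left out; the `legacy_layout` flag and the `if [ "$legacy_layout" = "1" ]` gate follow the description in this PR.

```shell
#!/bin/sh
# Hypothetical sketch: track legacy layout, then verify cleanup only if
# a migration actually happened.

# scan_legacy_symlinks ROOT
# Lists symlinks under ROOT whose target mentions .openclaw-data.
scan_legacy_symlinks() {
    find "$1" -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        case $target in
            *.openclaw-data*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}

# verify_legacy_cleanup HOME_DIR
# Returns 1 if legacy data or legacy-pointing symlinks survive cleanup.
verify_legacy_cleanup() {
    home=$1
    legacy_layout=0
    if [ -e "$home/.openclaw-data" ]; then
        legacy_layout=1
        rm -rf "$home/.openclaw-data"   # stand-in for the real migration
    fi
    if [ "$legacy_layout" = "1" ]; then
        if [ -e "$home/.openclaw-data" ]; then
            echo "legacy data persists after cleanup" >&2
            return 1
        fi
        unsafe=$(scan_legacy_symlinks "$home/.openclaw")
        if [ -n "$unsafe" ]; then
            echo "unsafe legacy symlinks: $unsafe" >&2
            return 1
        fi
    fi
    echo "legacy_layout=$legacy_layout"
}

# Demo: a modern layout skips the symlink scan entirely.
demo=$(mktemp -d)
mkdir -p "$demo/.openclaw"
verify_legacy_cleanup "$demo"
rm -rf "$demo"
```

On a modern layout the `find` over `.openclaw` never runs, which is the point of gating the verification.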

Build Step Timing Instrumentation

| Layer / File(s) | Summary |
| --- | --- |
| Classic step regex and timing state (src/lib/sandbox/create-stream.ts) | Add CLASSIC_DOCKER_STEP_RE constant to recognize "Step X/Y : …" output lines and initialize build-timing state: overall build start timestamp, finalization flag, and active step tracking structure. |
| Duration formatting and step orchestration (src/lib/sandbox/create-stream.ts) | Add utilities to format duration strings, start/finish overall build timing, finish active classic steps with appropriate completion/stopped wording, and transition between steps on detected "Step X/Y" lines. |
| Stream integration and finalization (src/lib/sandbox/create-stream.ts) | Integrate timing into streamed output handling: mark overall build start when BUILD_PROGRESS_PATTERNS appear, parse classic step lines to record per-step timings, detect Successfully built or Built image to finalize overall timing, and ensure timing finalization at stream termination with exit-status-based wording. |
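
The step recognition and duration formatting can be sketched as follows. The PR implements this in TypeScript; the shell version below is only an illustration, and both function names and the one-decimal seconds format (modeled on the "68.0s" figures in the PR body) are assumptions rather than the actual `CLASSIC_DOCKER_STEP_RE` or formatter.

```shell
#!/bin/sh
# Hypothetical shell rendering of classic Docker step parsing.

# parse_classic_step LINE
# Prints "N M instruction" for a "Step N/M : instruction" line, else nothing.
parse_classic_step() {
    printf '%s\n' "$1" |
        sed -n 's|^Step \([0-9][0-9]*\)/\([0-9][0-9]*\) : \(.*\)$|\1 \2 \3|p'
}

# format_duration MILLISECONDS
# Prints seconds with one decimal place, truncated (assumed format).
format_duration() {
    ms=$1
    printf '%d.%ds\n' "$(( ms / 1000 ))" "$(( ms % 1000 / 100 ))"
}

parse_classic_step "Step 3/12 : RUN echo hi"   # prints: 3 12 RUN echo hi
format_duration 68000                          # prints: 68.0s
```

A line that does not match the `Step N/M :` shape yields no output, so a caller can treat empty output as "not a step line" and leave the active step unchanged.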

Entrypoint Port Rejection Tests

| Layer / File(s) | Summary |
| --- | --- |
| Port rejection helper and test refactoring (test/e2e-port-overrides.sh) | Add expect_entrypoint_rejects_port helper that executes the real entrypoint with a port override, asserts rejection (non-zero exit), and validates output contains "must be an integer between 1024 and 65535". Refactor four existing rejection test cases (non-numeric, privileged port 80, port above 65535, pattern injection) to call the helper instead of duplicating RC/output/grep assertions. |
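
A strict form of such a helper, where both the non-zero exit and the validation text are required, might look like the sketch below. `fake_entrypoint` is a hypothetical stand-in for the real entrypoint, and the PASS/FAIL `echo` lines replace whatever reporting functions the actual script uses; only the helper name and the expected message come from this PR.

```shell
#!/bin/sh
# Hypothetical stand-in for the real entrypoint: accepts ports 1024-65535.
fake_entrypoint() {
    case $1 in
        ''|*[!0-9]*) ;;   # empty or non-numeric: fall through to rejection
        *)
            if [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]; then
                echo "listening on $1"
                return 0
            fi
            ;;
    esac
    echo "port must be an integer between 1024 and 65535" >&2
    return 1
}

# expect_entrypoint_rejects_port LABEL PORT
# Passes only if the entrypoint exits non-zero AND prints the expected text.
expect_entrypoint_rejects_port() {
    label=$1
    port=$2
    out=$(fake_entrypoint "$port" 2>&1)
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "FAIL: $label accepted (exit 0)"
        return 1
    fi
    if ! printf '%s\n' "$out" | grep -q "must be an integer between 1024 and 65535"; then
        echo "FAIL: $label rejected (exit $rc) without expected text: $out"
        return 1
    fi
    echo "PASS: $label rejected by entrypoint (exit $rc)"
}

expect_entrypoint_rejects_port "privileged port 80" 80
expect_entrypoint_rejects_port "non-numeric" abc
```

Treating missing validation text as a failure keeps the test sensitive to a changed or absent rejection reason, not just to the exit code.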

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • cv

Poem

🐰 We cleanse the legacy paths with care,
Mark the phases, repair with flair,
Timing each step as Docker builds,
Test the layout, the port fulfills,
A tidy rabbit's perfect share! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main performance optimization: skipping unnecessary broad permission repair on unified-layout sandboxes while retaining it for legacy migrations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@zyang-dev zyang-dev added the v0.0.49 Release target label May 21, 2026
@zyang-dev zyang-dev changed the title perf(sandbox): skip broad openclaw permission repair on current layout perf(sandbox): reduce image build time by skipping broad permission repair May 21, 2026
@github-actions
Contributor

github-actions Bot commented May 21, 2026

E2E Advisor Recommendation

Required E2E: test-e2e-gateway-isolation, test-e2e-port-overrides, openclaw-onboard-security-posture-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e
Optional E2E: shields-config-e2e, runtime-overrides-e2e, onboard-repair-e2e

Dispatch hint: openclaw-onboard-security-posture-e2e,sandbox-operations-e2e,rebuild-openclaw-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • test-e2e-gateway-isolation (medium): Directly validates the built production image's gateway/sandbox user separation, mutable-default .openclaw permissions, config hash, state directories, and entrypoint hardening affected by the Dockerfile permission/layout changes.
  • test-e2e-port-overrides (low): Runs the touched port override E2E against the real built image and entrypoint, covering dashboard port validation and propagation through the runtime stack.
  • openclaw-onboard-security-posture-e2e (high): Exercises full OpenClaw install/onboard/inference with non-root-host security posture assertions, covering the Dockerfile's OpenClaw config permissions and the create-stream path used during real onboarding.
  • sandbox-operations-e2e (high): Validates real sandbox lifecycle operations including create, list, status, logs, destroy, gateway recovery, registry rebuild, and multi-sandbox isolation; this is high-signal for create-stream changes and image layout/permission regressions.
  • rebuild-openclaw-e2e (high): Covers upgrade/rebuild of an older OpenClaw sandbox and verifies workspace state survives, which is the closest existing full E2E for stale-base and OpenClaw state-layout migration risk introduced by the Dockerfile changes.

Optional E2E

  • shields-config-e2e (medium): Useful adjacent coverage for config mutability, shields up/down, and an in-sandbox assertion that the unified .openclaw layout has no legacy .openclaw-data mirror or symlink bridge.
  • runtime-overrides-e2e (medium): Provides additional confidence that Dockerfile/entrypoint runtime config mutation still works after the targeted OpenClaw permission repair changes, although the changed port override script is covered more directly by test-e2e-port-overrides.
  • onboard-repair-e2e (high): Adjacent confidence for interrupted onboarding and missing-sandbox recreation, both of which reuse the sandbox creation path touched in create-stream.ts.

New E2E recommendations

  • legacy .openclaw-data migration in real sandbox image (high): The PR adds unit-level Dockerfile block coverage for modern versus legacy layout repair, but existing full E2E coverage does not appear to build a stale production base containing .openclaw-data symlinks and then verify the resulting sandbox boots with no legacy symlinks and correct sandbox:gateway writable state.
    • Suggested test: Add a sandbox image E2E fixture that builds from a deliberately stale base containing /sandbox/.openclaw-data entries and symlink bridges, then runs the production Dockerfile and asserts cleanup, ownership, setgid, gateway write access, and sandbox boot.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: openclaw-onboard-security-posture-e2e,sandbox-operations-e2e,rebuild-openclaw-e2e

@github-actions
Contributor

github-actions Bot commented May 21, 2026

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: 38f860e926bb8ef5994014b33dd2d44c52f69f10
Findings: 4 blocker(s), 5 warning(s), 1 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used trusted metadata and the provided diff only; no PR scripts, package-manager commands, Docker builds, or tests were executed by this advisor. The diff was provided as truncated-if-large context; conclusions focus on visible changed hunks and trusted metadata. No full workflow logs were reviewed, so CI internals were not independently inspected. No passed required E2E runs for head 38f860e were available in trusted context. Selective E2E results for prior SHAs c21bb80 and 42e229c were not counted as passed evidence for this head SHA. PR body validation claims, build timing measurements, and freshly built sandbox validation claims were treated as untrusted evidence only. Review thread content from PR comments was treated as untrusted evidence; only trusted metadata that two threads remain unresolved was used as a hard gate.

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 38f860e926bb8ef5994014b33dd2d44c52f69f10
Recommendation: blocked
Confidence: high

Blocked by GitHub mergeStateStatus=BLOCKED, 2 unresolved review threads, missing required sandbox/OpenClaw E2E pass evidence for head 38f860e, and unoffset create-stream.ts monolith growth.

Gate status

  • CI: pass — 5 required status context(s) completed with no failures. Non-required contexts still pending: 3; failed: 0.
  • Mergeability: fail — mergeStateStatus=BLOCKED
  • Review threads: fail — 2 unresolved review thread(s).
  • Risky code tested: pass — No risky code areas detected by path heuristics.

🔴 Blockers

  • PR mergeability is blocked: GitHub reports mergeStateStatus=BLOCKED for the current head SHA. This is a hard gate independent of the code-level review.
    • Recommendation: Resolve the branch protection, required review, pending context, or other GitHub mergeability condition and re-check mergeability for head 38f860e.
    • Evidence: Trusted gate status: mergeability.status=fail, evidence='mergeStateStatus=BLOCKED'. GraphQL headRefOid is 38f860e.
  • Unresolved review threads remain: There are 2 unresolved review threads. The unresolved comments include a CodeRabbit concern about silently accepting missing validation text in the port override E2E helper and a separate unresolved CodeRabbit thread about built-in tool catalog ambiguity.
    • Recommendation: Resolve or explicitly address the review threads before merge consideration. For the touched port E2E helper, make the missing validation text a failure instead of an informational pass.
    • Evidence: Trusted gate status: reviewThreads.status=fail, evidence='2 unresolved review thread(s).' GraphQL reviewThreads.nodes contains two isResolved=false threads.
  • Required sandbox/OpenClaw E2E has not passed for this head SHA (Dockerfile:548): The PR changes sandbox image permissions, legacy .openclaw-data migration handling, port E2E coverage, and sandbox create-stream output. The E2E Advisor requires runtime/security suites, but no trusted evidence shows those required jobs passing for the current head SHA.
    • Recommendation: Obtain successful results for test-e2e-gateway-isolation, test-e2e-port-overrides, openclaw-onboard-security-posture-e2e, sandbox-operations-e2e, and rebuild-openclaw-e2e for exact head SHA 38f860e. Do not count selective E2E results from prior SHAs as current-head evidence.
    • Evidence: Latest E2E Advisor comment requires: test-e2e-gateway-isolation, test-e2e-port-overrides, openclaw-onboard-security-posture-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e. Selective E2E comments provided target c21bb80... and 42e229c..., not 38f860e...
  • Sandbox create-stream monolith grew beyond repository budget (src/lib/sandbox/create-stream.ts:1): Trusted monolith analysis reports src/lib/sandbox/create-stream.ts grew from 385 to 459 lines, a +74 line increase. The repository budget flags current monolith growth of 20 or more lines as a blocker, and this is a sandbox provisioning path.
    • Recommendation: Extract the Docker build timing parser/state machine/helpers into a focused module or otherwise offset the growth before merge.
    • Evidence: monolithDeltas: file=src/lib/sandbox/create-stream.ts, baseLines=385, headLines=459, delta=74, severity=blocker.

🟡 Warnings

  • Port rejection helper can pass when validation text is missing (test/e2e-port-overrides.sh:65): The new expect_entrypoint_rejects_port helper logs an informational message if the expected validation text is not captured, but still calls pass. That weakens the E2E as a guardrail because a changed or missing rejection reason would still be reported as passing.
    • Recommendation: Change the helper so missing 'must be an integer between 1024 and 65535' text calls fail with the label, exit code, and captured output, and only calls pass when the expected text is present.
    • Evidence: Diff adds: if ! echo "$out" | grep -q "must be an integer between 1024 and 65535"; then info ...; fi; pass "$label rejected by entrypoint (exit $rc)". CodeRabbit unresolved thread PRRT_kwDORnw8lM6EBYRS flags the same issue.
  • Build timing stream behavior lacks direct regression tests (src/lib/sandbox/create-stream.ts:52): The PR adds classic Docker step timing, total build timing, completed/stopped timing output, and interactions with failure and ready-detach paths, but no changed test directly covers create-stream timing behavior.
    • Recommendation: Add unit tests around streamSandboxCreate with a mocked child process and deterministic clock for classic Step N/M lines, Successfully built/Built image completion, nonzero close stopped output, pending-line flush, duplicate suppression, and readyCheck forced-ready detachment.
    • Evidence: Diff adds CLASSIC_DOCKER_STEP_RE, buildStartedAtMs, buildTimingFinished, activeBuildStep, formatDuration, finishBuildTiming, maybeStartClassicBuildStep, and finish-time behavior in src/lib/sandbox/create-stream.ts; changed tests are test/e2e-port-overrides.sh and test/sandbox-provisioning.test.ts.
  • Fast permission repair path needs live-image proof (Dockerfile:548): The previous broad recursive repair guaranteed the /sandbox/.openclaw tree was sandbox-owned with group-writable/setgid directories. The new current-layout fast path updates only selected top-level paths and relies on earlier layers and plugin installation ownership assumptions.
    • Recommendation: Back the optimization with real-image/container assertions for nested ownership, group, mode bits, setgid inheritance, and sandbox-user write probes under plugin-runtime-deps and mutable state directories; retain targeted recursive repair for generated contents if E2E exposes drift.
    • Evidence: Dockerfile fast path runs non-recursive chown on /sandbox/.openclaw, /sandbox/.openclaw/openclaw.json, and /sandbox/.openclaw/plugin-runtime-deps, plus chmod 2770 on only two directories and chmod 660 on openclaw.json. Required E2E evidence is missing for head 38f860e...
  • Permission regression test mocks ownership-changing commands (test/sandbox-provisioning.test.ts:113): The new test executes extracted Dockerfile RUN blocks, which is stronger than pure source-shape checking, but it mocks chown and wraps install/find/chmod. It proves branch selection and filesystem shape, not actual owner/group results inside a Docker image or nested plugin-runtime-deps modes after OpenClaw plugin installation.
    • Recommendation: Keep the snippet regression test, but supplement it with real-image or containerized assertions that verify owner, group, mode bits, setgid inheritance, legacy marker removal, and sandbox-user write access for modern-layout and legacy-migration scenarios.
    • Evidence: runOpenclawRepairLayoutCase defines chown() as a logger and install() as a mkdir wrapper, then asserts logged calls such as chown -R and chmod rather than real image ownership.
  • High active-work overlap on Dockerfile and sandbox provisioning paths (Dockerfile:434): The changed files still exist, but Dockerfile and sandbox provisioning tests overlap with multiple active PRs touching sandbox, policy, OpenClaw, and runtime-provider behavior. This increases drift/rebase risk in a security-sensitive image path.

🔵 Suggestions

  • Progress-output behavior is user-visible but not documented (src/lib/sandbox/create-stream.ts:52): The PR adds build timing information to sandbox create progress output. This is likely a small CLI progress improvement, but it is user-visible and no docs or release note file is included.
    • Recommendation: Consider adding a short release note or CLI docs note if sandbox create progress output is considered part of the user-facing surface.
    • Evidence: PR body verification has 'Docs updated for user-facing behavior changes' unchecked; changedFiles contains no docs; src/lib/sandbox/create-stream.ts adds timing lines such as 'Sandbox image build completed in ...'.

Acceptance coverage

  • partial — This PR reduces sandbox image build time by avoiding broad recursive .openclaw permission repair on current unified-layout sandbox bases.: Dockerfile replaces the unconditional recursive chown/chmod/find repair with a legacy-marker branch and a targeted current-layout branch. The actual build-time reduction is a PR-body claim and no trusted benchmark or current-head E2E evidence was reviewed.
  • partial — It keeps the conservative repair path for legacy .openclaw-data migrations and adds Docker build step timing to make future build bottlenecks visible.: Dockerfile keeps broad repair when /tmp/nemoclaw-legacy-openclaw-layout exists, and src/lib/sandbox/create-stream.ts adds step/overall timing output. Coverage is partial because create-stream timing lacks direct tests and required E2E is missing for the current head.
  • unknown — DGX Spark build-time comparison:: This is a PR-body performance claim. No trusted benchmark logs or reproduced build timing evidence were reviewed.
  • unknown — Original/restored Dockerfile: 178.1s: This timing appears only in the untrusted PR body.
  • unknown — Optimized Dockerfile: 68.0s: This timing appears only in the untrusted PR body.
  • met — Gate legacy .openclaw-data symlink verification so it only runs when legacy migration occurs.: Dockerfile introduces legacy_layout=0, sets it to 1 only when /sandbox/.openclaw-data exists, and wraps post-cleanup legacy symlink verification in if [ "$legacy_layout" = "1" ]; then ... fi. The added test exercises modern and legacy cases.
  • partial — Replace unconditional recursive .openclaw permission repair with a legacy-only broad repair and a targeted fast path for current unified layouts.: Dockerfile implements the legacy marker path with recursive chown/chmod/find and a non-recursive current-layout fast path. Coverage remains partial because trusted evidence does not show required live sandbox/OpenClaw E2E passing for head 38f860e.
  • partial — The fast path relies on the current unified .openclaw layout being provisioned earlier in the image: Dockerfile.base creates the known state directories as sandbox:sandbox with group-write/setgid permissions, and openclaw plugins install runs as USER sandbox, so plugin-runtime-deps contents are already owned by sandbox:sandbox before the final repair step.: The visible Dockerfile runs openclaw plugins install as USER sandbox before switching back to root, and the test asserts expected unified layout directories. However, nested plugin-runtime-deps ownership/modes are not proven by trusted live-image E2E for this head SHA.
  • partial — Add behavior-based regression coverage for the .openclaw repair path: current layouts use targeted permission repair, while legacy .openclaw-data migrations still use broad recursive repair.: test/sandbox-provisioning.test.ts adds 'uses targeted permission repair unless legacy migration ran' and executes extracted Dockerfile RUN blocks against fixture trees. It verifies logged calls and filesystem shape, but ownership-changing commands are mocked and no real image is executed.
  • met — Add an explicit .openclaw layout contract check so future directory/file layout changes fail tests until the targeted permission repair assumptions are reviewed.: The new test asserts exact dirsAfterCleanup and filesAfterCleanup lists for the modern layout, including expected state directories and exec-approvals.json/update-check.json.
  • partial — Add classic Docker build step timing to sandbox create progress output.: src/lib/sandbox/create-stream.ts adds CLASSIC_DOCKER_STEP_RE, active step tracking, duration formatting, and completed/stopped build timing lines. No direct changed test evidence covers this output, duplicate suppression, failure handling, or ready-detach interactions.
  • unknown — npx prek run --all-files passes: This is a PR body checkbox claim and is treated as untrusted evidence. No trusted local command log was reviewed.
  • met — npm test passes: Trusted CI for the current head shows unit-vitest-linux completed with SUCCESS and required 'checks' completed with SUCCESS.
  • partial — Tests added or updated for new or changed behavior: test/sandbox-provisioning.test.ts and test/e2e-port-overrides.sh are updated. No direct test was added for src/lib/sandbox/create-stream.ts build timing behavior, and required E2E is missing.
  • met — No secrets, API keys, or credentials committed: Visible diff adds no hardcoded tokens, PEM/key material, credential JSON, or secret literals; Dockerfile token-clearing step remains present.
  • missing — Docs updated for user-facing behavior changes: PR body leaves this unchecked and changedFiles contains no docs. The change adds progress-output behavior but no user-facing docs or release note file in the diff.
  • unknown — make docs builds without warnings (doc changes only): No doc changes are in changedFiles, and no trusted make docs log was reviewed.
  • unknown — Doc pages follow the style guide (doc changes only): No doc pages are in changedFiles.
  • unknown — New doc pages include SPDX header and frontmatter (new pages only): No new doc pages are in changedFiles.
  • unknown — Verified in a freshly built sandbox that /sandbox/.openclaw/plugin-runtime-deps and sampled contents are owned by sandbox:sandbox; no non-sandbox:sandbox entries were found, and a sandbox-user write probe in plugin-runtime-deps succeeded.: This validation appears only in the PR body, which is untrusted evidence. Trusted status/E2E evidence does not show the required live sandbox/OpenClaw jobs passing for head 38f860e.

Security review

  • pass — 1. Secrets and Credentials: No hardcoded secrets, API keys, passwords, PEM/key files, credential JSON, or token literals were introduced in the visible diff. The existing Dockerfile step that clears generated OpenClaw gateway auth token remains intact.
  • warning — 2. Input Validation and Data Sanitization: No new external URL parser, unsafe deserialization path, eval/new Function path, or obvious shell interpolation of user-controlled input was added. However, test/e2e-port-overrides.sh weakens the negative test guard by passing even when expected validation text is missing, reducing confidence in port input validation coverage.
  • pass — 3. Authentication and Authorization: No authentication/authorization endpoints, token validation logic, scopes, privilege checks, or access-control decisions were added or modified. Gateway token generation/clearing boundaries are not visibly weakened.
  • pass — 4. Dependencies and Third-Party Libraries: The visible changed files do not add new package dependencies or new third-party registry sources. The Dockerfile continues using existing OpenClaw/plugin installation paths.
  • warning — 5. Error Handling and Logging: Build timing repeats Docker step instruction text into progress output and appends synthetic completed/stopped lines. This does not create a new secret source beyond existing Docker output, but direct tests are needed to ensure log consumers, duplicate suppression, ready detection, and failure handling are not confused.
  • pass — 6. Cryptography and Data Protection: Not applicable — no cryptographic operations, algorithms, signatures, keys, hashes used for security decisions, or data-at-rest/in-transit protection mechanisms were changed.
  • warning — 7. Configuration and Security Headers: The Dockerfile changes security-sensitive container filesystem permissions. The current-layout fast path removes previous recursive repair and only updates selected top-level paths, so live-image evidence is required to confirm least-privilege, mutability, group-write, setgid, and non-root runtime expectations are preserved.
  • warning — 8. Security Testing: A behavior-style Dockerfile snippet test was added, but required security-relevant E2E jobs for gateway isolation, port overrides, OpenClaw onboarding security posture, sandbox operations, and rebuild upgrade have not passed for the current head SHA in trusted evidence. Ownership-changing commands are mocked in the new unit test.
  • warning — 9. Holistic Security Posture: The PR targets sandbox image permissions and sandbox lifecycle output, both high-risk NemoClaw areas. The legacy fallback is retained and the performance goal is reasonable, but merge should wait for mergeability, unresolved review threads, required E2E, and create-stream tests to avoid permission drift, sandbox startup failures, or false confidence from mocked assertions.

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: Dockerfile and src/lib/sandbox/create-stream.ts. Current unit coverage is helpful but mocks ownership-changing commands and does not directly cover create-stream timing behavior.
  • E2E Advisor: missing
  • Required E2E jobs: test-e2e-gateway-isolation, test-e2e-port-overrides, openclaw-onboard-security-posture-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e
  • Missing for analyzed SHA: test-e2e-gateway-isolation, test-e2e-port-overrides, openclaw-onboard-security-posture-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e

✅ What looks good

  • The changed files still exist on the active codebase; no rename drift was reported for the reviewed files.
  • Required CI contexts are passing for the current head SHA.
  • The Dockerfile retains a conservative recursive repair path for legacy .openclaw-data migration instead of deleting that safety path outright.
  • The cleanup block refuses symlinked legacy data entries and verifies legacy symlink removal only when migration ran.
  • The new sandbox-provisioning test executes extracted Dockerfile RUN snippets for modern and legacy layouts, which is stronger than a pure string-only assertion.
  • No new third-party package dependency entries or obvious committed secrets were introduced in the visible diff.
  • The E2E Advisor identified focused runtime/security coverage needed for sandbox permissions, gateway isolation, port overrides, OpenClaw onboarding, rebuild, and legacy migration behavior.

Review completeness

  • Review used trusted metadata and the provided diff only; no PR scripts, package-manager commands, Docker builds, or tests were executed by this advisor.
  • The diff was provided as truncated-if-large context; conclusions focus on visible changed hunks and trusted metadata.
  • No full workflow logs were reviewed, so CI internals were not independently inspected.
  • No passed required E2E runs for head 38f860e were available in trusted context.
  • Selective E2E results for prior SHAs c21bb80 and 42e229c were not counted as passed evidence for this head SHA.
  • PR body validation claims, build timing measurements, and freshly built sandbox validation claims were treated as untrusted evidence only.
  • Review thread content from PR comments was treated as untrusted evidence; only trusted metadata that two threads remain unresolved was used as a hard gate.
  • Human maintainer review required: yes

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
@zyang-dev zyang-dev added v0.0.49 Release target and removed v0.0.49 Release target labels May 21, 2026
Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
@wscurran wscurran added Docker Support for Docker containerization enhancement: performance fix Sandbox Use this label to identify issues related to the NemoClaw isolated environment based on OpenShell. labels May 21, 2026
@zyang-dev zyang-dev added the v0.0.49 Release target label May 21, 2026
@cv cv added v0.0.50 Release target and removed v0.0.49 Release target labels May 22, 2026
@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26269896680
Target ref: c21bb80f911e40efbbd262c57fab80e39cec5cb3
Workflow ref: main
Requested jobs: shields-config-e2e,sandbox-operations-e2e
Summary: 2 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| sandbox-operations-e2e | ✅ success |
| shields-config-e2e | ✅ success |

Contributor

@ericksoa ericksoa left a comment

Reviewed current head c21bb80.

I checked the Dockerfile .openclaw permission fast path, the legacy marker fallback, the create-stream timing changes, and the added sandbox-provisioning coverage. I did not find a PR-specific blocking issue.

Local validation passed:

  • npx vitest run src/lib/sandbox/create-stream.test.ts test/sandbox-provisioning.test.ts
  • npm run typecheck -- --pretty false
  • npm run checks
  • npm run build:cli
  • git diff --check origin/main...HEAD

Runtime/CI evidence I used:

  • self-hosted sandbox image build, arm64 build, sandbox E2E, gateway isolation E2E, and non-root smoke passed for this head.
  • manually dispatched shields-config-e2e and sandbox-operations-e2e passed for this head.

Two live failures remain, but I did not attribute either to this PR. test-e2e-port-overrides is currently failing with the same empty invalid-port output pattern on unrelated PR runs. openclaw-plugin-runtime-exdev-e2e fails because current main has reverted the OpenClaw runtime-deps EXDEV fix (c84b6f1); the log shows the raw cross-device rename error rather than a permissions regression from this diff.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
Dockerfile.base (1)

181-181: Run the recommended image-impact E2E suite before merge.

Because this changes Dockerfile.base, please run the selective nightly E2E jobs to validate runtime behavior in a real container build environment:
cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e.

As per coding guidelines: "Dockerfile.base: This file affects the sandbox container image... only testable with a real container build" and the listed gh workflow run nightly-e2e.yaml ... command.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.base` at line 181, This change updates Dockerfile.base (ARG
OPENCLAW_VERSION) and must be validated by running the selective nightly E2E
jobs before merging: trigger the nightly-e2e workflow and run the cloud-e2e,
sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e jobs against the
branch to validate the real container build/runtime; if any job fails, fix the
Dockerfile.base changes (related to ARG OPENCLAW_VERSION) and re-run the same
suite until all pass.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dea40e96-4cfb-4539-bb9f-f4610ebe1350

📥 Commits

Reviewing files that changed from the base of the PR and between 65e281a and 42e229c.

📒 Files selected for processing (7)
  • Dockerfile
  • Dockerfile.base
  • agents/openclaw/manifest.yaml
  • nemoclaw-blueprint/blueprint.yaml
  • scripts/patch-openclaw-tool-catalog.js
  • test/e2e-port-overrides.sh
  • test/openclaw-tool-catalog-patch.test.ts
✅ Files skipped from review due to trivial changes (1)
  • nemoclaw-blueprint/blueprint.yaml

Comment thread scripts/patch-openclaw-tool-catalog.js Outdated
Comment on lines +275 to +283
if (targetFiles.length === 0) {
  const builtInCatalogFile = selectionFiles.find((file) => {
    const text = fs.readFileSync(file, "utf-8");
    return hasBuiltInToolCatalog(text);
  });
  if (builtInCatalogFile) {
    return { status: "skipped-built-in", file: builtInCatalogFile, version };
  }
}
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce a single built-in candidate before returning skipped-built-in.

When targetFiles.length === 0, this path returns the first built-in match. If multiple selection-*.js files contain built-in catalog signatures, the script silently picks one and reports success, which can hide dist shape drift.

Suggested fix
   if (targetFiles.length !== 1) {
     if (targetFiles.length === 0) {
-      const builtInCatalogFile = selectionFiles.find((file) => {
+      const builtInCatalogFiles = selectionFiles.filter((file) => {
         const text = fs.readFileSync(file, "utf-8");
         return hasBuiltInToolCatalog(text);
       });
-      if (builtInCatalogFile) {
-        return { status: "skipped-built-in", file: builtInCatalogFile, version };
+      if (builtInCatalogFiles.length === 1) {
+        return { status: "skipped-built-in", file: builtInCatalogFiles[0], version };
+      }
+      if (builtInCatalogFiles.length > 1) {
+        throw new Error(
+          `Expected exactly one built-in selection target, found ${builtInCatalogFiles.length}`,
+        );
       }
     }
     throw new Error(`Expected exactly one selection-*.js target, found ${targetFiles.length}`);
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/patch-openclaw-tool-catalog.js` around lines 275 - 283, When
targetFiles.length === 0 you currently pick the first match from selectionFiles
via hasBuiltInToolCatalog and return {status: "skipped-built-in"}, which hides
cases where multiple selection-*.js files contain built-in catalogs; update the
logic around selectionFiles/hasBuiltInToolCatalog to collect all matching files,
and only return {status: "skipped-built-in", file, version} if exactly one match
is found; if more than one match is found return/throw an explicit ambiguous
result (e.g. {status: "ambiguous-built-in", files: [...], version} or throw a
clear error) so the caller can notice dist shape drift instead of silently
choosing the first file.
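
The "exactly one match" rule described in this thread can be sketched as a small pure function. This is an illustration only, not the code in scripts/patch-openclaw-tool-catalog.js: `resolveBuiltInCatalog` and the injected `readText` callback are hypothetical, introduced so the rule can be exercised without touching the filesystem, and `hasBuiltInToolCatalog` is passed in as a parameter rather than reimplemented.

```javascript
// Hypothetical stand-alone version of the selection rule, for illustration.
// readText and hasBuiltInToolCatalog are injected so no real files are needed.
function resolveBuiltInCatalog(selectionFiles, readText, hasBuiltInToolCatalog, version) {
  // Collect every candidate instead of stopping at the first match.
  const matches = selectionFiles.filter((file) => hasBuiltInToolCatalog(readText(file)));

  if (matches.length === 1) {
    return { status: "skipped-built-in", file: matches[0], version };
  }
  if (matches.length > 1) {
    // Ambiguity signals dist shape drift, so fail loudly rather than pick one.
    throw new Error(
      `Expected exactly one built-in selection target, found ${matches.length}`,
    );
  }
  // No built-in catalog found: the caller falls through to its own error path.
  return null;
}
```

Throwing on ambiguity, rather than returning a distinct status, keeps the caller from treating a drifted dist layout as a successful skip; returning an explicit ambiguous status would also work if callers are updated to check for it.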

Comment on lines +65 to +68
if ! echo "$out" | grep -q "must be an integer between 1024 and 65535"; then
  info "$label rejected with exit $rc but validation text was not captured; entrypoint script text is checked below"
fi
pass "$label rejected by entrypoint (exit $rc)"
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not pass when validation text is missing.

This helper can report PASS even when the rejection reason changes or disappears, which makes the test unreliable as a guardrail.

Suggested fix
-  if ! echo "$out" | grep -q "must be an integer between 1024 and 65535"; then
-    info "$label rejected with exit $rc but validation text was not captured; entrypoint script text is checked below"
-  fi
-  pass "$label rejected by entrypoint (exit $rc)"
+  if ! echo "$out" | grep -q "must be an integer between 1024 and 65535"; then
+    fail "$label rejected with exit $rc but missing expected validation text: $out"
+    return
+  fi
+  pass "$label rejected by entrypoint (exit $rc)"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-port-overrides.sh` around lines 65 - 68, The current check logs info
when the expected validation text isn't found but still calls pass("$label
rejected by entrypoint (exit $rc)"), allowing false positives; change the logic
so that after running grep on "$out" for "must be an integer between 1024 and
65535" you only call pass("$label rejected by entrypoint (exit $rc)") if the
grep succeeded, and call fail with a clear message (e.g., include $label, $rc
and $out) when the grep did not match; update the block around the grep, info,
pass functions to make failure conditional and reference the existing variables
out, rc, and label and helper functions info, pass, and fail.
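
The decision rule requested here is small enough to state as a pure function. The test itself is a shell script, so the JavaScript below is only a sketch of the same logic: `judgeRejection` is a hypothetical name, and the `rc === 0` branch is an assumption about how the surrounding helper treats an accepted value, since that part of the script is not shown in this thread.

```javascript
// Illustrative only: the real check lives in test/e2e-port-overrides.sh.
const EXPECTED_TEXT = "must be an integer between 1024 and 65535";

function judgeRejection(label, rc, out) {
  // Assumption: exit code 0 means the entrypoint accepted the invalid port.
  if (rc === 0) {
    return { ok: false, message: `${label} was accepted (exit 0)` };
  }
  // A non-zero exit alone is not enough; the rejection must be for the right reason.
  if (!out.includes(EXPECTED_TEXT)) {
    return {
      ok: false,
      message: `${label} rejected with exit ${rc} but missing expected validation text: ${out}`,
    };
  }
  return { ok: true, message: `${label} rejected by entrypoint (exit ${rc})` };
}
```

With this rule, an unrelated crash that happens to exit non-zero is reported as a failure instead of a pass.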

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26272360471
Target ref: 42e229c2d8bcce8c1f8844a3566150a57e002b6f
Workflow ref: main
Requested jobs: all (no filter)
Summary: 9 passed, 0 failed, 2 skipped

| Job | Result |
| --- | --- |
| bedrock-runtime-compatible-anthropic-e2e | ✅ success |
| brave-search-e2e | ✅ success |
| channels-add-remove-e2e | ⚠️ cancelled |
| channels-stop-start-e2e | ⚠️ cancelled |
| cloud-e2e | ⚠️ cancelled |
| cloud-inference-e2e | ⚠️ cancelled |
| cloud-onboard-e2e | ⚠️ cancelled |
| credential-migration-e2e | ⚠️ cancelled |
| credential-sanitization-e2e | ⚠️ cancelled |
| device-auth-health-e2e | ⚠️ cancelled |
| diagnostics-e2e | ⚠️ cancelled |
| docs-validation-e2e | ✅ success |
| double-onboard-e2e | ⚠️ cancelled |
| gpu-double-onboard-e2e | ⏭️ skipped |
| gpu-e2e | ⏭️ skipped |
| hermes-discord-e2e | ✅ success |
| hermes-e2e | ✅ success |
| hermes-inference-switch-e2e | ✅ success |
| hermes-onboard-security-posture-e2e | ✅ success |
| hermes-slack-e2e | ⚠️ cancelled |
| inference-routing-e2e | ⚠️ cancelled |
| issue-2478-crash-loop-recovery-e2e | ⚠️ cancelled |
| kimi-inference-compat-e2e | ⚠️ cancelled |
| launchable-smoke-e2e | ⚠️ cancelled |
| messaging-compatible-endpoint-e2e | ⚠️ cancelled |
| messaging-providers-e2e | ⚠️ cancelled |
| network-policy-e2e | ⚠️ cancelled |
| onboard-negative-paths-e2e | ⚠️ cancelled |
| onboard-repair-e2e | ⚠️ cancelled |
| onboard-resume-e2e | ⚠️ cancelled |
| openclaw-inference-switch-e2e | ⚠️ cancelled |
| openclaw-onboard-security-posture-e2e | ⚠️ cancelled |
| openclaw-slack-pairing-e2e | ⚠️ cancelled |
| openshell-gateway-upgrade-e2e | ⚠️ cancelled |
| overlayfs-autofix-e2e | ✅ success |
| rebuild-hermes-e2e | ⚠️ cancelled |
| rebuild-hermes-stale-base-e2e | ⚠️ cancelled |
| rebuild-openclaw-e2e | ⚠️ cancelled |
| runtime-overrides-e2e | ⚠️ cancelled |
| sandbox-operations-e2e | ⚠️ cancelled |
| sandbox-survival-e2e | ⚠️ cancelled |
| shields-config-e2e | ⚠️ cancelled |
| skill-agent-e2e | ⚠️ cancelled |
| snapshot-commands-e2e | ✅ success |
| state-backup-restore-e2e | ⚠️ cancelled |
| telegram-injection-e2e | ⚠️ cancelled |
| token-rotation-e2e | ⚠️ cancelled |
| tunnel-lifecycle-e2e | ⚠️ cancelled |
| upgrade-stale-sandbox-e2e | ⚠️ cancelled |

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26272729419
Target ref: 38f860e926bb8ef5994014b33dd2d44c52f69f10
Workflow ref: main
Requested jobs: all (no filter)
Summary: 47 passed, 0 failed, 2 skipped

| Job | Result |
| --- | --- |
| bedrock-runtime-compatible-anthropic-e2e | ✅ success |
| brave-search-e2e | ✅ success |
| channels-add-remove-e2e | ✅ success |
| channels-stop-start-e2e | ✅ success |
| cloud-e2e | ✅ success |
| cloud-inference-e2e | ✅ success |
| cloud-onboard-e2e | ✅ success |
| credential-migration-e2e | ✅ success |
| credential-sanitization-e2e | ✅ success |
| device-auth-health-e2e | ✅ success |
| diagnostics-e2e | ✅ success |
| docs-validation-e2e | ✅ success |
| double-onboard-e2e | ✅ success |
| gpu-double-onboard-e2e | ⏭️ skipped |
| gpu-e2e | ⏭️ skipped |
| hermes-discord-e2e | ✅ success |
| hermes-e2e | ✅ success |
| hermes-inference-switch-e2e | ✅ success |
| hermes-onboard-security-posture-e2e | ✅ success |
| hermes-slack-e2e | ✅ success |
| inference-routing-e2e | ✅ success |
| issue-2478-crash-loop-recovery-e2e | ✅ success |
| kimi-inference-compat-e2e | ✅ success |
| launchable-smoke-e2e | ✅ success |
| messaging-compatible-endpoint-e2e | ✅ success |
| messaging-providers-e2e | ✅ success |
| network-policy-e2e | ✅ success |
| onboard-negative-paths-e2e | ✅ success |
| onboard-repair-e2e | ✅ success |
| onboard-resume-e2e | ✅ success |
| openclaw-inference-switch-e2e | ✅ success |
| openclaw-onboard-security-posture-e2e | ✅ success |
| openclaw-slack-pairing-e2e | ✅ success |
| openshell-gateway-upgrade-e2e | ✅ success |
| overlayfs-autofix-e2e | ✅ success |
| rebuild-hermes-e2e | ✅ success |
| rebuild-hermes-stale-base-e2e | ✅ success |
| rebuild-openclaw-e2e | ✅ success |
| runtime-overrides-e2e | ✅ success |
| sandbox-operations-e2e | ✅ success |
| sandbox-survival-e2e | ✅ success |
| shields-config-e2e | ✅ success |
| skill-agent-e2e | ✅ success |
| snapshot-commands-e2e | ✅ success |
| state-backup-restore-e2e | ✅ success |
| telegram-injection-e2e | ✅ success |
| token-rotation-e2e | ✅ success |
| tunnel-lifecycle-e2e | ✅ success |
| upgrade-stale-sandbox-e2e | ✅ success |

@ericksoa ericksoa merged commit cccf874 into main May 22, 2026
29 checks passed
@ericksoa ericksoa deleted the fix/sandbox-build-perf branch May 22, 2026 13:52
Labels

  • Docker: Support for Docker containerization
  • enhancement: performance
  • fix
  • Sandbox: Use this label to identify issues related to the NemoClaw isolated environment based on OpenShell.
  • v0.0.50: Release target

4 participants