feat(security): sandbox measure/simulation subprocess execution #63
Merged
…ts 1-2)

Three exec sites (apply_measure, test_measure, sim_launch) now route through mcp_server/sandbox.py, gated by OSMCP_SANDBOX (default off = passthrough):
- clean-env allowlist strips host env (secrets) from measure code
- UID drop to unprivileged `sandbox` user + rlimits via _sandbox_exec shim
- unconditional is_path_allowed() check on apply_measure(measure_dir)
- OSMCP_SIM_TIMEOUT_SECONDS config added (enforcement pending)

Dockerfile bakes the `sandbox` user. Landlock FS rules + seccomp net-deny land next in _sandbox_exec. Security PoC suite kept local (git-excluded); standalone security.yml workflow added (manual, no-op when suite absent).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Full tier (OSMCP_SANDBOX=auto) closes the last two holes, on top of the POSIX floor, all via owned ctypes shims (no dependency, pure Python):
- _landlock.py: read-deny-by-default FS policy, with ro system roots and rw only the run dir (+ /dev for null/urandom). Blocks read-escape (even world-readable files) and write-escape, mount-independent.
- _seccomp.py: raw single-arch cBPF denying socket(AF_INET/AF_INET6) with EAFNOSUPPORT (AF_UNIX/local IPC unaffected). Blocks outbound IP exfil.

Applied in _sandbox_exec after no_new_privs, degrade-loudly; active_tier reports "landlock". Chose raw BPF over pyseccomp (broken wheel import).

Verified under auto: read/write escape + net exfil blocked, apply + real EnergyPlus sim still succeed; comstock/common + weather suites green.

Default stays off; flipping to auto (after a full-suite run under auto) is next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…unt)

Under clean-env, pass redirect_tmp=False for test_measure's pytest/ruby run so TMPDIR stays on /tmp. test_measure isn't Landlocked, and pytest's capture uses an unlinked tempfile that the Docker Desktop bind mount can't keep open when TMPDIR is run_dir/tmp -> FileNotFoundError on truncate. apply_measure/sim keep TMPDIR=run_dir/tmp (Landlock needs it; openstudio doesn't use unlinked tempfiles).

Fixes the only 2 failures in the full-suite-under-auto run (719 -> 721 pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…x increment 4)

Decouple the confinement layers and make auto the default:
- _sandbox_exec: drop uid only when root; the unprivileged layers (Landlock + seccomp + rlimits) still apply for a non-root local server.
- sandbox.py: degrade loudly on platforms with no kernel backend (macOS/Windows -> clean-env only + one-shot warning); chown run dir only when root; active_tier() reports "clean-env" where no backend.
- config: OSMCP_SANDBOX default off -> auto (off stays the escape hatch).

This protects LOCAL users: the server bind-mounts host dirs (/repo, /inputs); the sandbox makes them read-only so an LLM-authored measure can't write the user's real files. Full suite previously verified green under auto (721 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pull request overview
This PR introduces a new sandboxing layer for subprocesses that execute LLM-authored OpenStudio measures and run EnergyPlus simulations, aiming to reduce risk from arbitrary code execution by confining environment inheritance and (on Linux) applying kernel-backed restrictions.
Changes:
- Adds `mcp_server/sandbox.py` and routes measure/simulation subprocess execution through it (clean environment + privilege/rlimit wrapper).
- Introduces Linux confinement primitives (`_landlock.py`, `_seccomp.py`) and an exec shim (`_sandbox_exec.py`) to apply rlimits, no_new_privs, Landlock FS policy, and seccomp network deny.
- Adds new config knobs / defaults and supporting infrastructure (Docker `sandbox` user, manual security workflow, planning doc).
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| mcp_server/skills/simulation/operations.py | Runs simulation subprocess via sandbox env + wrapper. |
| mcp_server/skills/measures/operations.py | Adds path allowlisting for measure_dir and wraps measure execution with sandbox. |
| mcp_server/skills/measure_authoring/operations.py | Filters env for test subprocesses (but does not wrap/sandbox them). |
| mcp_server/sandbox.py | New central chokepoint for env filtering, command wrapping, and workdir preparation. |
| mcp_server/config.py | Adds sandbox/network knobs and sim timeout configuration. |
| mcp_server/_seccomp.py | New raw seccomp-BPF filter to deny AF_INET/AF_INET6 sockets. |
| mcp_server/_sandbox_exec.py | New exec shim to apply rlimits/no_new_privs/Landlock/seccomp and (when root) uid-drop. |
| mcp_server/_landlock.py | New Landlock ruleset builder via ctypes syscalls. |
| docs/plans/measure-exec-sandbox.md | Design/planning document for the sandbox approach and test strategy. |
| docker/Dockerfile | Adds sandbox user for uid-drop. |
| .github/workflows/security.yml | Adds manual-trigger workflow to run security tests when present. |
Comment on lines 1017 to 1021

    proc = subprocess.run(
        ["python3", "-m", "pytest", "tests/", "-v", "--tb=short"],
        cwd=str(mdir),
        env=sandbox.build_env(mdir, redirect_tmp=False),
        capture_output=True, text=True, timeout=60, check=False,
Comment on lines 1028 to 1032

    proc = subprocess.run(
        ["ruby", "-I", ".", str(test_files[0])],
        cwd=str(mdir),
        env=sandbox.build_env(mdir, redirect_tmp=False),
        capture_output=True, text=True, timeout=60, check=False,
Comment on lines +69 to +72

    SANDBOX_MODE = os.environ.get("OSMCP_SANDBOX", "auto").strip().lower()
    # Network policy for confined subprocesses: deny (default) blocks outbound TCP
    # once the seccomp backend lands; allow leaves it open (trusted/BCL deployments).
    SANDBOX_NET = os.environ.get("OSMCP_SANDBOX_NET", "deny").strip().lower()
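
A later commit in this PR states that an unknown `OSMCP_SANDBOX` value is normalized to `auto` rather than silently downgraded. A minimal sketch of that fail-closed parsing follows. The helper name `normalize_sandbox_mode` and the exact set of valid modes are assumptions for illustration, not code from the PR.

```python
# Assumed set of modes, taken from the sandbox.py docstring quoted in this review.
VALID_MODES = frozenset({"off", "posix", "auto"})


def normalize_sandbox_mode(raw):
    """Return a known mode; anything unrecognized becomes 'auto' (fail-closed)."""
    mode = (raw or "").strip().lower()
    # A typo such as "atuo" must not weaken confinement, so it maps to the
    # strictest automatic tier instead of a weaker one.
    return mode if mode in VALID_MODES else "auto"
```

With this shape, only the exact string `off` disables confinement, which matches the "explicit escape hatch" wording used elsewhere in the PR.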
Comment on lines +74 to +79

    # Wall-clock cap for a single simulation (run_osw/run_simulation). 0 = no cap.
    SIM_TIMEOUT_SECONDS = _safe_float(
        os.environ.get("OPENSTUDIO_MCP_SIM_TIMEOUT_SECONDS",
                       os.environ.get("OSMCP_SIM_TIMEOUT_SECONDS", "7200")),
        7200.0,
    )
Comment on lines +8 to +12

    `OSMCP_SANDBOX` (config.SANDBOX_MODE) selects the mode:
      off   — full passthrough (current behaviour / explicit escape hatch)
      posix — clean-env allowlist (this increment); UID drop + rlimits + Landlock
              FS policy + seccomp net-deny arrive in later increments, same knob
      auto  — best confinement available (currently == posix)
Comment on lines +192 to +196

    with contextlib.suppress(OSError):
        os.chown(work, SANDBOX_UID, SANDBOX_GID)
    for path in work.rglob("*"):
        with contextlib.suppress(OSError):
            os.chown(path, SANDBOX_UID, SANDBOX_GID)
Comment on lines +55 to +58

    class _PathBeneathAttr(ctypes.Structure):
        _pack_ = 1
        _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]
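
The PR description later calls the struct-packing comment a false positive because the kernel UAPI struct is `__attribute__((packed))`. The size difference can be checked directly with `ctypes`. This is a small illustration, not code from the PR, and the class names `PackedAttr` and `NaturalAttr` are made up for the comparison.

```python
import ctypes

FIELDS = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]


class PackedAttr(ctypes.Structure):
    # Same layout choice as the quoted _PathBeneathAttr: no padding at all.
    _pack_ = 1
    _fields_ = FIELDS


class NaturalAttr(ctypes.Structure):
    # Default alignment: the compiler-style layout may add tail padding.
    _fields_ = FIELDS


packed_size = ctypes.sizeof(PackedAttr)    # 8 + 4 = 12 bytes
natural_size = ctypes.sizeof(NaturalAttr)  # typically 16 on x86-64 / aarch64
print(packed_size, natural_size)
```

The packed variant is 12 bytes, matching the "12-byte Landlock struct" the description mentions; the unpacked size depends on the platform ABI.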
Comment on lines +9 to +13

    Runs as root, applies rlimits, drops to the unprivileged uid/gid, sets
    ``no_new_privs``, then ``execvp``s CMD. A standalone exec — NOT a Popen
    ``preexec_fn`` — so there is no fork-safety hazard in the threaded server, and
    the dropped image keeps the same pid (so the dispatcher's existing
    terminate/kill-by-pid path still works). Landlock FS rules + a seccomp net-deny
Comment on lines +121 to +129

    if redirect_tmp:
        work = Path(work_dir)
        env["HOME"] = str(work)
        tmp = work / "tmp"
        try:
            tmp.mkdir(parents=True, exist_ok=True)
            env["TMPDIR"] = str(tmp)
        except OSError:
            pass
Comment on lines +311 to +315

    3. **seccomp net-deny via libseccomp (`pyseccomp`), not hand-rolled BPF.**
       Security-critical filter, and we ship **amd64 + arm64** — raw BPF must
       hand-handle `AUDIT_ARCH` + per-arch syscall numbers for both, and a
       subtly-wrong filter fails open. libseccomp is what Docker itself uses,
       abstracts the arch mess, and `libseccomp2` is already in the base image. The
Planning/working doc — keep local-only, not in the repo. Removed from the tree (history retains it); now git-excluded so it won't be re-added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Never let a bad OSMCP_SANDBOX_UID override leave confined code as root. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…p (Codex H1) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dex H4/M1) Also documents the packed path_beneath struct (rebuts Copilot #7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… H2) Load the landlock/seccomp shims before applying the FS restriction; refresh docstring. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… env, drop /repo (Codex H3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ks=True (Codex C3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… arg code; confine reporting-measure test (Codex C5/C2/C1) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ation inputs; reject escaping OSW symlinks (Copilot+Codex C4) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sandbox # Conflicts: # mcp_server/skills/measures/operations.py
Vulnerabilities are fixed (this PR), so the PoC suite is no longer held back. Runs in the dedicated security.yml workflow under OSMCP_SANDBOX=auto (each test still pins its own tier). Reports how Landlock+seccomp behave on the GitHub runner. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ly; add sandbox user to arm64 image Both amd64 + arm64 integration test jobs now set OSMCP_SANDBOX=auto (no longer relying on the code default). Dockerfile.arm64 gains the unprivileged sandbox user for parity. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Validates the aarch64 seccomp BPF (different syscall numbers) and Landlock confinement on the native arm64 runner, not just amd64. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
arm64 now runs, under OSMCP_SANDBOX=auto: shard 1 (SEB4 sim/EUI + weather/loops) and shard 2 (SWIG-memleak + stdout suppression [deb-vs-wheel], measure apply/authoring [Ruby/Python exec + bundler], an HVAC EnergyPlus sim). Security PoC runs on arm64 via the security workflow. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…studio arm64-test shard 2 failed: 5 test_measure_authoring cases hit `require': cannot load such file -- openstudio (LoadError)`. test_measure runs Ruby minitest via raw `ruby -I .` (openstudio measure -r doesn't run minitest), so system ruby needs RUBYLIB pointing at the OpenStudio Ruby bindings. The amd64 nrel/openstudio base sets this; the arm64 (ubuntu+.deb) image never did, so the path was only exposed once arm64 shard 2 started running these tests. Symlink the install's Ruby dir to a stable /usr/local path and set RUBYLIB to it (parity with amd64). Both under /usr/local, which the sandbox Landlock policy already grants; RUBYLIB is already in the sandbox env allowlist. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +343 to +360

    for rec in expired:
        with contextlib.suppress(psutil.NoSuchProcess, Exception):
            p = psutil.Process(rec.pid)
            p.terminate()
            try:
                p.wait(timeout=5)
            except psutil.TimeoutExpired:
                p.kill()
        with _sim_lock:
            rec.status = "failed"
            rec.ended_at = _now()
            rec.exit_code = -1 if rec.exit_code is None else rec.exit_code
            rec.error = (f"Simulation exceeded the {SIM_TIMEOUT_SECONDS:.0f}s wall-clock "
                         "cap (OSMCP_SIM_TIMEOUT_SECONDS)")
            _RUNS[rec.run_id] = rec
        _persist_run_record(rec)
        audit("sim_timeout", run_id=rec.run_id, user=rec.user_key,
              ran_seconds=round(now - (rec.started_at or now), 1))
Comment on lines +34 to +36

    # Argument names become bare Ruby/Python identifiers in generated code — they must
    # be plain identifiers, never arbitrary text (else code injection at generation).
    _ARG_NAME_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,63}$")
Comment on lines +1 to +8

    """Confirm the measure-exec vulnerabilities that the sandbox (planned) will close.

    Context: docs/plans/measure-exec-sandbox.md. Today an applied measure runs as
    root, with the server's full environment, unconfined filesystem, and open
    network — see skills/measures/operations.py (`env=os.environ.copy()`, no UID
    drop, no FS policy). These tests PROVE each hole exists, safely, using canaries
    and decoys only (no real secret/file/host is touched, nothing leaves the box).

…v/timeout hardening

Codex gpt-5.5 + Copilot round-2 review fixes (low-risk):
- #1 (critical RCE): _escape_ruby_str now escapes '#'. A measure description/default/choice value with '#{...}' was live Ruby interpolation in a double-quoted literal, executed when `openstudio measure -u` runs (unsandboxed, as root).
- #6: unknown OSMCP_SANDBOX value (typo) normalized to 'auto' (fail-closed) instead of silently downgrading enabled-but-not-full to posix.
- #7: drop the bare 'LC_' env allowlist prefix; enumerate standard locale categories so an LC_-prefixed host secret (e.g. LC_API_TOKEN) can't leak to measure code.
- #9: _safe_float rejects NaN/inf so a non-finite SIM_TIMEOUT can't silently disable the wall-clock cap.
- Copilot: _ARG_NAME_RE requires lowercase-leading (uppercase -> Ruby dynamic constant assignment SyntaxError in generated arguments()).
- Copilot: _enforce_timeouts skips a dead-but-unreaped pid (_pid_alive) so a finished run isn't force-failed; left for the reaper to classify by exit code.

Tests: 6 added (test_sandbox.py x3, test_measure_authoring.py x2, test_sim_queue.py x1). Verified 6 new + full measure_authoring + sim_queue (38 passed) under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
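
The `#` fix in item #1 can be illustrated with a short sketch. The real `_escape_ruby_str` is not shown on this page, so the function below (`escape_ruby_str`, a hypothetical name) is only an assumed shape: it escapes the three characters that matter inside a Ruby double-quoted literal.

```python
def escape_ruby_str(text):
    """Escape text for embedding inside a Ruby double-quoted string literal."""
    # Backslashes go first so the escapes added below are not doubled.
    out = text.replace("\\", "\\\\")
    out = out.replace('"', '\\"')
    # In Ruby, "\#{...}" is literal text, while "#{...}" is evaluated.
    out = out.replace("#", "\\#")
    return out
```

For example, a description of `#{`id`}` becomes `\#{`id`}` in the generated source, which Ruby treats as plain characters instead of running the embedded command.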
… kill

Codex gpt-5.5 review fixes (medium):
- #4: replace the whole-/dev rw Landlock grant with per-file rules: /dev/null, /dev/zero (rw) and /dev/urandom, /dev/random (ro). A /dev dir grant also exposed writable /dev/shm + /dev/mqueue (shared storage/IPC outside the run dir) and permitted mknod. _landlock._add now masks a rule on a non-directory to file-only rights (a dir-only bit on a file makes add_rule return EINVAL -> fail-closed).
- #8: sims launch with start_new_session=True (own process-group leader) and timeout/cancel now kill the whole group via _kill_process_group(), reaping forked children (EnergyPlus, helpers) a single-pid kill would orphan. Only ever signals a group we created (pgid == pid), never the server's own group.

Tests: red-green verified (both fail on unfixed source, pass with fix). 2 added (test_sandbox.py wrap_cmd device rules, test_sim_queue.py group kill). Full test_sandbox + test_sim_queue + test_measures green under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
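
The group-kill rule in item #8 (signal only a group whose leader is the pid that was launched) can be sketched as follows. This is an illustration under POSIX assumptions; the PR's `_kill_process_group()` is not shown here, and `kill_process_group` below is a hypothetical stand-in.

```python
import os
import signal
import subprocess
import sys


def kill_process_group(pid, sig=signal.SIGKILL):
    """Signal the whole group led by pid; return True if a signal was sent."""
    try:
        pgid = os.getpgid(pid)
    except ProcessLookupError:
        return False  # process already gone
    # Only touch a group the child leads (pgid == pid), and never our own.
    if pgid != pid or pgid == os.getpgrp():
        return False
    try:
        os.killpg(pgid, sig)
    except ProcessLookupError:
        return False
    return True


def launch_in_own_group(argv):
    """Start argv as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)
```

A usage sketch: `proc = launch_in_own_group([sys.executable, "-c", "import time; time.sleep(60)"])`, then `kill_process_group(proc.pid)` followed by `proc.wait()`. Children forked by the launched process inherit its group, so they receive the same signal.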
…aging, drop /inputs RO

Codex gpt-5.5 review fixes (the staging/policy refactors):
- #2 (critical, cross-tenant read): the PUBLIC run_osw now is_path_allowed-gates the OSW (and EPW) BEFORE reading/copying. It copies the OSW's whole parent dir into the run dir, so an un-gated path let a client read another tenant's run or host files (e.g. run_osw(/runs/<other>/<id>/in.osw)). A keyword-only _internal=True (not exposed via MCP) is the trusted path for run_simulation, whose temp OSW is server-built and whose inputs it already validated.
- #3 (high, shared-content mutation): test_measure copied test_model.osm into, chowned, and Landlock-granted the caller's measure dir. For a measure under a shared read-only root (e.g. a bundled measure) that corrupted shared content. Now: measures outside the caller's own writable run root are copied into a private run dir and tested there; own-root measures still test in place (xml update persists). Private copy is cleaned up.
- #5 (high, cross-tenant read): drop INPUT_ROOT (/inputs) from the Landlock RO allowlist; it's shared/multi-tenant and inputs are staged into the run dir anyway. Documented the DAC rationale for keeping /etc,/proc,/sys (uid-1001 already blocks secret reads).

Tests: red-green verified per fix (each fails on unfixed source). run_osw gate + internal-path (test_path_safety), test_measure-no-shared-mutation + /inputs-not-RO (test_sandbox). 107 passed across path_safety/sandbox/measures/measure_authoring/sim_queue under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
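
The gate in item #2 has to run before any read or copy, and it has to compare resolved paths so that `..` segments and symlinks cannot step outside the caller's root. A minimal sketch of such a check follows. The function name `path_is_within` and its signature are assumptions; the PR's `is_path_allowed()` is not shown on this page. It needs Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def path_is_within(path, allowed_roots):
    """True if path, after resolving symlinks and '..', lies under a root."""
    resolved = Path(path).resolve()
    for root in allowed_roots:
        # Resolve the root too, so a symlinked root compares consistently.
        if resolved.is_relative_to(Path(root).resolve()):
            return True
    return False
```

A check like this still leaves a time-of-check/time-of-use window if the path can be swapped between the check and the copy, which is why the PR pairs path gates with kernel-level confinement.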
Fixup on the post-test measure-metadata-refresh hardening:
1. Drop two raw codex-review dumps committed to repo root (codex-security-review-*.MD); docs/review/ is already gitignored for exactly these artifacts.
2. Bounded, no-follow copy-back (Copilot C1 + TOCTOU): new util.read_file_bounded reads measure.xml/README.md with O_NOFOLLOW and caps the read at max_bytes+1, so a confined-but-untrusted 'measure -u' cannot slurp a giant file into the unconfined server nor swap the output for a symlink to a host secret.
3. chmod best-effort (Copilot C2): replace() is the critical step; suppress OSError so a restrictive mount/non-POSIX host does not fail the whole refresh.
4. test_measure: post-test refresh is now best-effort: a failure is surfaced as metadata_warning instead of masking a passing (confined) test. Confinement, not the refresh exit code, is the security boundary. create/edit keep the raise (there the xml IS the product).

Tests (red-green verified): test_path_safety read_file_bounded (oversize/symlink/non-regular); test_measure_authoring passing-test-not-masked (RED on revert: ok:False 'Symlink escapes...').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
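
Item 2 describes a bounded, no-follow read. The sketch below shows one way to do that on a POSIX host. It is an assumed implementation, not the PR's `util.read_file_bounded`, and the error types are chosen for the example. Note that `O_NOFOLLOW` only rejects a symlink in the final path component.

```python
import os
import stat


def read_file_bounded(path, max_bytes):
    """Read a regular file without following a final symlink, capped in size."""
    # O_NONBLOCK keeps open() from hanging if the path is a FIFO.
    flags = os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK
    fd = os.open(path, flags)  # raises OSError if path itself is a symlink
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise ValueError("not a regular file")
        chunks = []
        remaining = max_bytes + 1  # one extra byte reveals an oversize file
        while remaining > 0:
            chunk = os.read(fd, remaining)
            if not chunk:
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    finally:
        os.close(fd)
    data = b"".join(chunks)
    if len(data) > max_bytes:
        raise ValueError("file exceeds max_bytes")
    return data
```

Checking the type with `fstat` on the already-open descriptor, rather than `stat` on the path, avoids a second lookup that an untrusted process could race.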
…ning Harden post-test measure metadata refresh
…/M1/M2)

Post-merge hardening of the measure/sim sandbox for multi-tenant HTTP.

H1 (secret leak): /proc was a Landlock RO root, so a confined measure could read /proc/<pid>/environ; on a non-root server (measure uid == server uid) that recovers the server secrets clean-env stripped, incl. HTTP auth tokens. Drop /proc from the RO allowlist; Landlock now denies it for everyone. Verified EnergyPlus/OpenStudio still run (test_mcp_seb4 + full sandbox suite green).

M1 (fork-bomb DoS) + M2 (posix no FS isolation): every tenant shared uid 1001, so RLIMIT_NPROC (per-uid) was one shared budget (one tenant starves all) and posix-tier run dirs were mutually readable by DAC. Add config.sandbox_ids(): each remote tenant (HTTP session / auth principal) derives its own stable uid via hashlib (2000..61999, gid==uid); LOCAL keeps the baked 1001. Wired into wrap_cmd + prepare_workdir so chown and setuid agree. Per-uid NPROC budgets + DAC isolation even without Landlock.

Tests (red-green verified, all RED on unfixed code): test_path_safety TestPerTenantSandboxUid (distinct/stable uids, LOCAL base, uid<=0 clamp); test_sandbox proc-not-in-RO-allowlist, full-tier-denies-/proc/self/environ, per-tenant-uid DAC isolation. Updated test_posix/test_h5 uid assertions to the per-tenant contract (HTTP fixtures now derive a uid); base-uid clamp moved to a unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
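
The per-tenant uid derivation can be sketched as below. The range 2000..61999, gid == uid, the base uid 1001, and the use of `hashlib` come from the commit message. The choice of SHA-256, the modulo mapping, and representing the local user as `None` are assumptions made for this example.

```python
import hashlib

BASE_UID = 1001                  # baked `sandbox` user, kept for LOCAL
UID_MIN, UID_MAX = 2000, 61999   # range stated in the commit message


def sandbox_ids(tenant):
    """Return (uid, gid) for a tenant key; None means the local user."""
    if tenant is None:
        return BASE_UID, BASE_UID
    # A cryptographic digest gives a stable value across processes, unlike
    # the built-in hash(), which is randomized per interpreter run.
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    span = UID_MAX - UID_MIN + 1
    uid = UID_MIN + int.from_bytes(digest[:8], "big") % span
    return uid, uid
```

With 60,000 slots, two tenants can still map to the same uid; a hash-based scheme like this one narrows the shared budget rather than guaranteeing uniqueness.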
Confines the subprocess that runs LLM-authored OpenStudio measures and EnergyPlus simulations, behind a new `OSMCP_SANDBOX` knob (default `auto`; `off` is the explicit escape hatch). Addresses external-review point #4 while keeping full functionality instead of disabling tools.
What it does
All arbitrary-code execution funnels through three subprocess sites (`apply_measure`, `test_measure`, `sim_launch`); each now routes through `mcp_server/sandbox.py`. Tiers (degrade-loudly, Codex-CLI model):
- UID drop to the unprivileged `sandbox` user (uid 1001) when root; FSIZE/NPROC caps (`_sandbox_exec.py` shim, pid-preserving exec)
- Landlock (`_landlock.py`, own ctypes): read-deny-by-default, with ro system roots and writable only the run dir (+ specific device files, not all of /dev)
- seccomp (`_seccomp.py`, raw cBPF): blocks `socket(AF_INET/INET6)`
- `is_path_allowed()` checks on attacker-controlled paths (measure dir, OSW, EPW, model)

The unprivileged layers (Landlock + seccomp + rlimits) apply even for a non-root local server; only the uid-drop is root-gated and skipped gracefully. On platforms with no kernel backend (macOS/Windows bare installs) it falls back to clean-env with a one-shot warning. This protects local users too: the server bind-mounts host dirs (`/repo`, `/inputs`), which confined measures cannot read.

Closes
runs-as-root · env-secret leak · filesystem read/write escape · network exfil · `measure_dir` path traversal · cross-tenant file disclosure · shared-content mutation.

Round-2 review fixes (Copilot + Codex gpt-5.5)
A second adversarial review pass (Codex `gpt-5.5`) plus Copilot found nine items; all triaged and fixed in three batches (each fix red-green verified: the test fails on unfixed source, passes with the fix):
- `#{}` interpolation RCE in generated measures (escape `#`); fail-closed on an unknown `OSMCP_SANDBOX` value (typo no longer downgrades to `posix`); explicit `LC_` locale allowlist (no `LC_*` secret leak); reject non-finite `SIM_TIMEOUT`; lowercase-leading arg-name regex (Ruby constant-assignment bug); `_pid_alive` guard so a finished-but-unreaped run isn't force-failed.
- Per-file `/dev` Landlock rules (no writable `/dev/shm`, `/dev/mqueue`, no mknod); process-group kill on timeout/cancel (reaps forked EnergyPlus children).
- `run_osw` access-gates the OSW/EPW path (was copying the OSW's whole parent dir → cross-tenant read; `run_simulation` uses a trusted internal path); `test_measure` copies a shared/bundled measure into a private run dir instead of writing/chowning the source; dropped shared `/inputs` from the Landlock RO allowlist.

The `__packed` 12-byte Landlock struct flagged by both review rounds is a confirmed false positive (the kernel UAPI struct is `__attribute__((packed))`); left as-is with a comment.
Validation
Full integration suite green under `OSMCP_SANDBOX=auto` on amd64 and arm64 (real EnergyPlus sims, bundled comstock/common measures, weather, HTTP/multi-user). The dedicated `security.yml` runs the PoC/confinement suite on both arches.

Security PoC suite (now included)
`tests/test_sandbox.py` (the confinement / exploit-PoC suite) is committed now that the fixes are shipped, and runs in CI (`security.yml`, both arches) plus the batch fixes' regression tests. It was kept local/git-excluded only while the holes were still open.
Suggested review focus
- `_sandbox_exec.py`: privilege-drop ordering (rlimits → no_new_privs → Landlock → seccomp → setuid-if-root)
- `_landlock.py`: policy correctness, struct packing, file-vs-dir rule masking
- `_seccomp.py`: hand-written cBPF correctness / bypass
- `sandbox.py`: RO allowlist breadth, clean-env allowlist, `/dev` device rules
- `operations.py` exec sites + `run_osw`/`test_measure` access gates

🤖 Generated with Claude Code