feat(security): sandbox measure/simulation subprocess execution by brianlball · Pull Request #63 · NatLabRockies/openstudio-mcp

brianlball · 2026-06-06T18:27:00Z

Confines the subprocess that runs LLM-authored OpenStudio measures and EnergyPlus
simulations, behind a new OSMCP_SANDBOX knob (default auto; off is the
explicit escape hatch). Addresses external-review point #4 — keeps full
functionality instead of disabling tools.

What it does

All arbitrary-code execution funnels through three subprocess sites
(apply_measure, test_measure, sim _launch); each now routes through
mcp_server/sandbox.py. Tiers (degrade-loudly, Codex-CLI model):

- clean-env: allowlist strips host env (no secrets reach measure code)
- UID drop + rlimits: runs as unprivileged sandbox user (uid 1001) when root; FSIZE/NPROC caps (_sandbox_exec.py shim, pid-preserving exec)
- Landlock FS (_landlock.py, own ctypes): read-deny-by-default; ro system roots, writable only the run dir (+ specific device files, not all of /dev)
- seccomp net-deny (_seccomp.py, raw cBPF): blocks socket(AF_INET/INET6)
- is_path_allowed() checks on attacker-controlled paths (measure dir, OSW, EPW, model)

The unprivileged layers (Landlock + seccomp + rlimits) apply even for a
non-root local server; only the uid-drop is root-gated and skipped gracefully.
On platforms with no kernel backend (macOS/Windows bare installs) it falls back
to clean-env with a one-shot warning. This protects local users too: the server
bind-mounts host dirs (/repo, /inputs), and confined measures cannot read them.
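
The clean-env tier can be pictured as an allowlist filter over the parent environment. The following is a minimal sketch only: `ALLOWED_ENV`, `LOCALE_VARS`, and `build_clean_env` are hypothetical names, and the allowlist contents are illustrative assumptions rather than the list in `mcp_server/sandbox.py`.

```python
import os

# Illustrative allowlist (assumption): the real list in sandbox.py is not shown in this PR text.
ALLOWED_ENV = {"PATH", "LANG", "RUBYLIB", "TZ"}
# Standard locale categories, enumerated explicitly instead of matching a bare "LC_" prefix.
LOCALE_VARS = {
    "LC_ALL", "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
    "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
}


def build_clean_env(host_env=None):
    """Return a new dict holding only allowlisted variables from the host environment."""
    source = os.environ if host_env is None else host_env
    keep = ALLOWED_ENV | LOCALE_VARS
    return {name: value for name, value in source.items() if name in keep}


host = {
    "PATH": "/usr/bin",
    "LC_ALL": "C",
    "LC_API_TOKEN": "s3cret",          # LC_-prefixed secret: dropped, not a locale category
    "AWS_SECRET_ACCESS_KEY": "hunter2",  # ordinary secret: dropped
}
clean = build_clean_env(host)
print(sorted(clean))
```

An allowlist fails safe when a new secret variable appears on the host: anything not named explicitly is simply never copied.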

Closes

runs-as-root · env-secret leak · filesystem read/write escape · network exfil ·
measure_dir path traversal · cross-tenant file disclosure · shared-content mutation.

Round-2 review fixes (Copilot + Codex gpt-5.5)

A second adversarial review pass (Codex gpt-5.5) plus Copilot found nine items;
all triaged and fixed in three batches (each fix red-green verified — the test
fails on unfixed source, passes with the fix):

- Batch 1: Ruby #{} interpolation RCE in generated measures (fixed by escaping #); fail-closed on an unknown OSMCP_SANDBOX value (a typo no longer downgrades to posix); explicit LC_ locale allowlist (no LC_* secret leak); reject non-finite SIM_TIMEOUT; lowercase-leading arg-name regex (Ruby constant-assignment bug); _pid_alive guard so a finished-but-unreaped run isn't force-failed.
- Batch 2: per-file /dev Landlock rules (no writable /dev/shm or /dev/mqueue, no mknod); process-group kill on timeout/cancel (reaps forked EnergyPlus children).
- Batch 3: run_osw access-gates the OSW/EPW path (it was copying the OSW's whole parent dir, which allowed a cross-tenant read; run_simulation uses a trusted internal path); test_measure copies a shared/bundled measure into a private run dir instead of writing/chowning the source; dropped shared /inputs from the Landlock RO allowlist.

The __packed 12-byte Landlock struct flagged by both review rounds is a confirmed
false positive (the kernel UAPI struct is __attribute__((packed))); left as-is
with a comment.

Validation

Full integration suite green under OSMCP_SANDBOX=auto on amd64 and arm64 (real
EnergyPlus sims, bundled comstock/common measures, weather, HTTP/multi-user). The
dedicated security.yml runs the PoC/confinement suite on both arches.

Security PoC suite (now included)

tests/test_sandbox.py (the confinement / exploit-PoC suite) is committed now
that the fixes are shipped. It runs in CI (security.yml, both arches) alongside
the regression tests for the batch fixes. It was kept local/git-excluded only
while the holes were still open.

Suggested review focus

- _sandbox_exec.py: privilege-drop ordering (rlimits → no_new_privs → Landlock → seccomp → setuid-if-root)
- _landlock.py: policy correctness, struct packing, file-vs-dir rule masking
- _seccomp.py: hand-written cBPF correctness / bypass
- sandbox.py: RO allowlist breadth, clean-env allowlist, /dev device rules
- the three operations.py exec sites + run_osw/test_measure access gates

🤖 Generated with Claude Code

…ts 1-2) Three exec sites (apply_measure, test_measure, sim _launch) now route through mcp_server/sandbox.py, gated by OSMCP_SANDBOX (default off = passthrough):
- clean-env allowlist strips host env (secrets) from measure code
- UID drop to unprivileged `sandbox` user + rlimits via _sandbox_exec shim
- unconditional is_path_allowed() check on apply_measure(measure_dir)
- OSMCP_SIM_TIMEOUT_SECONDS config added (enforcement pending)

Dockerfile bakes the `sandbox` user. Landlock FS rules + seccomp net-deny land next in _sandbox_exec. Security PoC suite kept local (git-excluded); standalone security.yml workflow added (manual, no-op when suite absent).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Full tier (OSMCP_SANDBOX=auto) closes the last two holes, on top of the POSIX floor, all via owned ctypes shims (no dependency, pure Python):
- _landlock.py: read-deny-by-default FS policy (ro system roots, rw only the run dir, plus /dev for null/urandom). Blocks read-escape (even world-readable files) and write-escape, mount-independent.
- _seccomp.py: raw single-arch cBPF denying socket(AF_INET/AF_INET6) with EAFNOSUPPORT (AF_UNIX/local IPC unaffected). Blocks outbound IP exfil.

Applied in _sandbox_exec after no_new_privs, degrade-loudly; active_tier reports "landlock". Chose raw BPF over pyseccomp (broken wheel import). Verified under auto: read/write escape + net exfil blocked, apply + real EnergyPlus sim still succeed; comstock/common + weather suites green. Default stays off; flipping to auto (after a full-suite run under auto) is next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…unt) Under clean-env, pass redirect_tmp=False for test_measure's pytest/ruby run so TMPDIR stays on /tmp. test_measure isn't Landlocked, and pytest's capture uses an unlinked tempfile that the Docker Desktop bind mount can't keep open when TMPDIR is run_dir/tmp -> FileNotFoundError on truncate. apply_measure/sim keep TMPDIR=run_dir/tmp (Landlock needs it; openstudio doesn't use unlinked tempfiles). Fixes the only 2 failures in the full-suite-under-auto run (719 -> 721 pass). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…x increment 4) Decouple the confinement layers and make auto the default:
- _sandbox_exec: drop uid only when root; the unprivileged layers (Landlock + seccomp + rlimits) still apply for a non-root local server.
- sandbox.py: degrade loudly on platforms with no kernel backend (macOS/Windows -> clean-env only + one-shot warning); chown run dir only when root; active_tier() reports "clean-env" where no backend.
- config: OSMCP_SANDBOX default off -> auto (off stays the escape hatch).

This protects LOCAL users: the server bind-mounts host dirs (/repo, /inputs); the sandbox makes them read-only so an LLM-authored measure can't write the user's real files. Full suite previously verified green under auto (721 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR introduces a new sandboxing layer for subprocesses that execute LLM-authored OpenStudio measures and run EnergyPlus simulations, aiming to reduce risk from arbitrary code execution by confining environment inheritance and (on Linux) applying kernel-backed restrictions.

Changes:

Adds mcp_server/sandbox.py and routes measure/simulation subprocess execution through it (clean environment + privilege/rlimit wrapper).
Introduces Linux confinement primitives (_landlock.py, _seccomp.py) and an exec shim (_sandbox_exec.py) to apply rlimits, no_new_privs, Landlock FS policy, and seccomp network deny.
Adds new config knobs / defaults and supporting infrastructure (Docker sandbox user, manual security workflow, planning doc).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 11 comments.

Show a summary per file

| File | Description |
| --- | --- |
| mcp_server/skills/simulation/operations.py | Runs simulation subprocess via sandbox env + wrapper. |
| mcp_server/skills/measures/operations.py | Adds path allowlisting for `measure_dir` and wraps measure execution with sandbox. |
| mcp_server/skills/measure_authoring/operations.py | Filters env for test subprocesses (but does not wrap/sandbox them). |
| mcp_server/sandbox.py | New central chokepoint for env filtering, command wrapping, and workdir preparation. |
| mcp_server/config.py | Adds sandbox/network knobs and sim timeout configuration. |
| mcp_server/_seccomp.py | New raw seccomp-BPF filter to deny AF_INET/AF_INET6 sockets. |
| mcp_server/_sandbox_exec.py | New exec shim to apply rlimits/no_new_privs/Landlock/seccomp and (when root) uid-drop. |
| mcp_server/_landlock.py | New Landlock ruleset builder via ctypes syscalls. |
| docs/plans/measure-exec-sandbox.md | Design/planning document for the sandbox approach and test strategy. |
| docker/Dockerfile | Adds `sandbox` user for uid-drop. |
| .github/workflows/security.yml | Adds manual-trigger workflow to run security tests when present. |

            proc = subprocess.run(
                ["python3", "-m", "pytest", "tests/", "-v", "--tb=short"],
                cwd=str(mdir),
+                env=sandbox.build_env(mdir, redirect_tmp=False),
                capture_output=True, text=True, timeout=60, check=False,


            proc = subprocess.run(
                ["ruby", "-I", ".", str(test_files[0])],
                cwd=str(mdir),
+                env=sandbox.build_env(mdir, redirect_tmp=False),
                capture_output=True, text=True, timeout=60, check=False,


+SANDBOX_MODE = os.environ.get("OSMCP_SANDBOX", "auto").strip().lower()
+# Network policy for confined subprocesses: deny (default) blocks outbound TCP
+# once the seccomp backend lands; allow leaves it open (trusted/BCL deployments).
+SANDBOX_NET = os.environ.get("OSMCP_SANDBOX_NET", "deny").strip().lower()


+# Wall-clock cap for a single simulation (run_osw/run_simulation). 0 = no cap.
+SIM_TIMEOUT_SECONDS = _safe_float(
+    os.environ.get("OPENSTUDIO_MCP_SIM_TIMEOUT_SECONDS",
+                   os.environ.get("OSMCP_SIM_TIMEOUT_SECONDS", "7200")),
+    7200.0,
+)
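
The round-2 notes say `_safe_float` was changed to reject NaN/inf so a non-finite timeout cannot silently disable the cap (a comparison such as `elapsed > float("nan")` is always false). A minimal sketch of that idea follows; `safe_float` here is a hypothetical stand-in, not the repository's implementation.

```python
import math


def safe_float(raw, default):
    """Parse raw as a finite float; fall back to default on junk, NaN, or infinity."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default
    # float("nan") and float("inf") parse successfully, so they need an explicit check.
    return value if math.isfinite(value) else default


print(safe_float("7200", 7200.0))
print(safe_float("nan", 7200.0))
print(safe_float("inf", 7200.0))
```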


+`OSMCP_SANDBOX` (config.SANDBOX_MODE) selects the mode:
+  off    — full passthrough (current behaviour / explicit escape hatch)
+  posix  — clean-env allowlist (this increment); UID drop + rlimits + Landlock
+           FS policy + seccomp net-deny arrive in later increments, same knob
+  auto   — best confinement available (currently == posix)
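
The later fail-closed fix (an unknown `OSMCP_SANDBOX` value normalizes to `auto` instead of downgrading) could look roughly like the sketch below. `VALID_MODES` and `normalize_mode` are hypothetical names used for illustration; only the three mode strings come from the docstring above.

```python
VALID_MODES = {"off", "posix", "auto"}


def normalize_mode(raw):
    """Map a raw OSMCP_SANDBOX value to a known mode; anything unrecognized becomes 'auto'."""
    mode = (raw or "auto").strip().lower()
    # Fail closed: a typo such as "of" must not weaken confinement.
    return mode if mode in VALID_MODES else "auto"


print(normalize_mode(" Auto "))
print(normalize_mode("of"))
print(normalize_mode("off"))
```

Only an exact `off` disables the sandbox, so the escape hatch stays explicit.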


+    with contextlib.suppress(OSError):
+        os.chown(work, SANDBOX_UID, SANDBOX_GID)
+    for path in work.rglob("*"):
+        with contextlib.suppress(OSError):
+            os.chown(path, SANDBOX_UID, SANDBOX_GID)


+class _PathBeneathAttr(ctypes.Structure):
+    _pack_ = 1
+    _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]
+
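
The effect of `_pack_ = 1` on this layout can be checked directly with `ctypes.sizeof`. The sketch below uses two illustrative classes (`PackedAttr` and `PaddedAttr` are hypothetical names) with the same fields as the hunk above; the unpacked size depends on the platform's alignment rules and is typically 16 on amd64/arm64.

```python
import ctypes


class PackedAttr(ctypes.Structure):
    # Same fields as the hunk above, packed: 8 + 4 bytes with no tail padding.
    _pack_ = 1
    _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]


class PaddedAttr(ctypes.Structure):
    # Natural alignment: the compiler-style layout may add tail padding.
    _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]


print("packed:", ctypes.sizeof(PackedAttr))
print("padded:", ctypes.sizeof(PaddedAttr))
```

The packed size matches the 12-byte struct the PR description refers to, which is why the review flag was treated as a false positive.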


+Runs as root, applies rlimits, drops to the unprivileged uid/gid, sets
+``no_new_privs``, then ``execvp``s CMD. A standalone exec — NOT a Popen
+``preexec_fn`` — so there is no fork-safety hazard in the threaded server, and
+the dropped image keeps the same pid (so the dispatcher's existing
+terminate/kill-by-pid path still works). Landlock FS rules + a seccomp net-deny


+    if redirect_tmp:
+        work = Path(work_dir)
+        env["HOME"] = str(work)
+        tmp = work / "tmp"
+        try:
+            tmp.mkdir(parents=True, exist_ok=True)
+            env["TMPDIR"] = str(tmp)
+        except OSError:
+            pass


+3. **seccomp net-deny via libseccomp (`pyseccomp`), not hand-rolled BPF.**
+   Security-critical filter, and we ship **amd64 + arm64** — raw BPF must
+   hand-handle `AUDIT_ARCH` + per-arch syscall numbers for both, and a
+   subtly-wrong filter fails open. libseccomp is what Docker itself uses,
+   abstracts the arch mess, and `libseccomp2` is already in the base image. The


Vulnerabilities are fixed (this PR), so the PoC suite is no longer held back. Runs in the dedicated security.yml workflow under OSMCP_SANDBOX=auto (each test still pins its own tier). Reports how Landlock+seccomp behave on the GitHub runner. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ly; add sandbox user to arm64 image Both amd64 + arm64 integration test jobs now set OSMCP_SANDBOX=auto (no longer relying on the code default). Dockerfile.arm64 gains the unprivileged sandbox user for parity. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

arm64 now runs, under OSMCP_SANDBOX=auto: shard 1 (SEB4 sim/EUI + weather/loops) and shard 2 (SWIG-memleak + stdout suppression [deb-vs-wheel], measure apply/authoring [Ruby/Python exec + bundler], an HVAC EnergyPlus sim). Security PoC runs on arm64 via the security workflow. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…studio arm64-test shard 2 failed: 5 test_measure_authoring cases hit `require': cannot load such file -- openstudio (LoadError)`. test_measure runs Ruby minitest via raw `ruby -I .` (openstudio measure -r doesn't run minitest), so system ruby needs RUBYLIB pointing at the OpenStudio Ruby bindings. The amd64 nrel/openstudio base sets this; the arm64 (ubuntu+.deb) image never did, so the path was only exposed once arm64 shard 2 started running these tests. Symlink the install's Ruby dir to a stable /usr/local path and set RUBYLIB to it (parity with amd64). Both under /usr/local, which the sandbox Landlock policy already grants; RUBYLIB is already in the sandbox env allowlist. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

+    for rec in expired:
+        with contextlib.suppress(psutil.NoSuchProcess, Exception):
+            p = psutil.Process(rec.pid)
+            p.terminate()
+            try:
+                p.wait(timeout=5)
+            except psutil.TimeoutExpired:
+                p.kill()
+        with _sim_lock:
+            rec.status = "failed"
+            rec.ended_at = _now()
+            rec.exit_code = -1 if rec.exit_code is None else rec.exit_code
+            rec.error = (f"Simulation exceeded the {SIM_TIMEOUT_SECONDS:.0f}s wall-clock "
+                         "cap (OSMCP_SIM_TIMEOUT_SECONDS)")
+            _RUNS[rec.run_id] = rec
+        _persist_run_record(rec)
+        audit("sim_timeout", run_id=rec.run_id, user=rec.user_key,
+              ran_seconds=round(now - (rec.started_at or now), 1))


+# Argument names become bare Ruby/Python identifiers in generated code — they must
+# be plain identifiers, never arbitrary text (else code injection at generation).
+_ARG_NAME_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,63}$")
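
The round-2 fix tightens this check to lowercase-leading names. A sketch of one way to express that is below; the exact pattern shipped is not shown in this PR text, so the character class (including allowing a leading underscore) and the names `ARG_NAME_RE` and `is_safe_arg_name` are assumptions. It uses `fullmatch` because `$` in Python also matches just before a trailing newline.

```python
import re

# Assumed pattern: lowercase letter or underscore first, then up to 63 identifier characters.
ARG_NAME_RE = re.compile(r"[a-z_][a-zA-Z0-9_]{0,63}")


def is_safe_arg_name(name):
    """True only if the whole string is a plain lowercase-leading identifier."""
    return ARG_NAME_RE.fullmatch(name) is not None


for candidate in ["wall_r_value", "WallR", "x; system('id')", "ok\n"]:
    print(repr(candidate), is_safe_arg_name(candidate))
```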


+"""Confirm the measure-exec vulnerabilities that the sandbox (planned) will close.
+
+Context: docs/plans/measure-exec-sandbox.md. Today an applied measure runs as
+root, with the server's full environment, unconfined filesystem, and open
+network — see skills/measures/operations.py (`env=os.environ.copy()`, no UID
+drop, no FS policy). These tests PROVE each hole exists, safely, using canaries
+and decoys only (no real secret/file/host is touched, nothing leaves the box).
+


…v/timeout hardening Codex gpt-5.5 + Copilot round-2 review fixes (low-risk):
- #1 (critical RCE): _escape_ruby_str now escapes '#'. A measure description/default/choice value with '#{...}' was live Ruby interpolation in a double-quoted literal, executed when `openstudio measure -u` runs (unsandboxed, as root).
- #6: unknown OSMCP_SANDBOX value (typo) normalized to 'auto' (fail-closed) instead of silently downgrading enabled-but-not-full to posix.
- #7: drop the bare 'LC_' env allowlist prefix; enumerate standard locale categories so an LC_-prefixed host secret (e.g. LC_API_TOKEN) can't leak to measure code.
- #9: _safe_float rejects NaN/inf so a non-finite SIM_TIMEOUT can't silently disable the wall-clock cap.
- Copilot: _ARG_NAME_RE requires lowercase-leading (uppercase -> Ruby dynamic constant assignment SyntaxError in generated arguments()).
- Copilot: _enforce_timeouts skips a dead-but-unreaped pid (_pid_alive) so a finished run isn't force-failed; left for the reaper to classify by exit code.

Tests: 6 added (test_sandbox.py x3, test_measure_authoring.py x2, test_sim_queue.py x1). Verified 6 new + full measure_authoring + sim_queue (38 passed) under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
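
To illustrate fix #1: inside a Ruby double-quoted literal, `#{...}` is evaluated, while `\#` is a literal hash sign. A minimal sketch of such an escaper follows; `escape_ruby_str` is a hypothetical stand-in for the repository's `_escape_ruby_str`, whose actual body is not shown here.

```python
def escape_ruby_str(text):
    """Escape text for embedding inside a Ruby double-quoted string literal."""
    return (
        text.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace('"', '\\"')         # prevent closing the literal early
        .replace("#", "\\#")         # neutralize #{...} interpolation
    )


payload = 'cost #{`id`} "quoted"'
print(escape_ruby_str(payload))
```

Escaping the backslash first matters: otherwise a value ending in `\` could cancel the escape that follows it.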

… kill Codex gpt-5.5 review fixes (medium):
- #4: replace the whole-/dev rw Landlock grant with per-file rules: /dev/null, /dev/zero (rw) and /dev/urandom, /dev/random (ro). A /dev dir grant also exposed writable /dev/shm + /dev/mqueue (shared storage/IPC outside the run dir) and permitted mknod. _landlock._add now masks a rule on a non-directory to file-only rights (a dir-only bit on a file makes add_rule return EINVAL -> fail-closed).
- #8: sims launch with start_new_session=True (own process-group leader) and timeout/cancel now kill the whole group via _kill_process_group(), reaping forked children (EnergyPlus, helpers) a single-pid kill would orphan. Only ever signals a group we created (pgid == pid), never the server's own group.

Tests: red-green verified (both fail on unfixed source, pass with fix). 2 added (test_sandbox.py wrap_cmd device rules, test_sim_queue.py group kill). Full test_sandbox + test_sim_queue + test_measures green under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
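
The group-kill behaviour in #8 can be sketched with the standard library on POSIX. `kill_process_group` below is a hypothetical illustration of the "only signal a group we created" guard, not the repository's `_kill_process_group`.

```python
import os
import signal
import subprocess
import sys


def kill_process_group(pid, sig=signal.SIGKILL):
    """Signal the whole process group led by pid, but only if pid is its own group leader."""
    try:
        pgid = os.getpgid(pid)
    except ProcessLookupError:
        return False  # process already gone
    if pgid != pid:
        return False  # not a group created for this child; never signal it
    try:
        os.killpg(pgid, sig)
    except ProcessLookupError:
        return False
    return True


# start_new_session=True makes the child a session and process-group leader (pgid == pid).
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    start_new_session=True,
)
killed = kill_process_group(proc.pid)
proc.wait(timeout=10)
print(killed, proc.returncode)
```

Because forked children inherit the group, one `killpg` reaches them all, whereas a single-pid kill would leave them running.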

…aging, drop /inputs RO Codex gpt-5.5 review fixes (the staging/policy refactors):
- #2 (critical, cross-tenant read): the PUBLIC run_osw now is_path_allowed-gates the OSW (and EPW) BEFORE reading/copying. It copies the OSW's whole parent dir into the run dir, so an un-gated path let a client read another tenant's run or host files (e.g. run_osw(/runs/<other>/<id>/in.osw)). A keyword-only _internal=True (not exposed via MCP) is the trusted path for run_simulation, whose temp OSW is server-built and whose inputs it already validated.
- #3 (high, shared-content mutation): test_measure copied test_model.osm into, chowned, and Landlock-granted the caller's measure dir. For a measure under a shared read-only root (e.g. a bundled measure) that corrupted shared content. Now: measures outside the caller's own writable run root are copied into a private run dir and tested there; own-root measures still test in place (xml update persists). Private copy is cleaned up.
- #5 (high, cross-tenant read): drop INPUT_ROOT (/inputs) from the Landlock RO allowlist: it's shared/multi-tenant and inputs are staged into the run dir anyway. Documented the DAC rationale for keeping /etc,/proc,/sys (uid-1001 already blocks secret reads).

Tests: red-green verified per fix (each fails on unfixed source). run_osw gate + internal-path (test_path_safety), test_measure-no-shared-mutation + /inputs-not-RO (test_sandbox). 107 passed across path_safety/sandbox/measures/measure_authoring/sim_queue under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
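
The kind of gate described in #2 can be sketched as "resolve first, then test containment". The helper below, `is_under_allowed_root`, is a hypothetical illustration and not the repository's `is_path_allowed`; the directory names are made up for the demo.

```python
import tempfile
from pathlib import Path


def is_under_allowed_root(candidate, allowed_roots):
    """True if candidate, after resolving '..' and symlinks, lies inside an allowed root."""
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


with tempfile.TemporaryDirectory() as base:
    own = Path(base, "runs", "alice")
    other = Path(base, "runs", "bob")
    own.mkdir(parents=True)
    other.mkdir(parents=True)

    ok = is_under_allowed_root(own / "in.osw", [own])
    # A '..' hop into another tenant's directory resolves outside the allowed root.
    traversal = is_under_allowed_root(own / ".." / "bob" / "in.osw", [own])

print(ok, traversal)
```

Checking the resolved path before any read or copy is what closes the cross-tenant case; a string-prefix comparison on the raw path would accept the traversal.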

Fixup on the post-test measure-metadata-refresh hardening:
1. Drop two raw codex-review dumps committed to repo root (codex-security-review-*.MD); docs/review/ is already gitignored for exactly these artifacts.
2. Bounded, no-follow copy-back (Copilot C1 + TOCTOU): new util.read_file_bounded reads measure.xml/README.md with O_NOFOLLOW and caps the read at max_bytes+1, so a confined-but-untrusted 'measure -u' cannot slurp a giant file into the unconfined server nor swap the output for a symlink to a host secret.
3. chmod best-effort (Copilot C2): replace() is the critical step; suppress OSError so a restrictive mount/non-POSIX host does not fail the whole refresh.
4. test_measure: post-test refresh is now best-effort: a failure is surfaced as metadata_warning instead of masking a passing (confined) test. Confinement, not the refresh exit code, is the security boundary. create/edit keep the raise (there the xml IS the product).

Tests (red-green verified): test_path_safety read_file_bounded (oversize/symlink/non-regular); test_measure_authoring passing-test-not-masked (RED on revert: ok:False 'Symlink escapes...').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
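
Item 2 describes a bounded, no-follow read. A POSIX-only sketch of that technique is below; this `read_file_bounded` is an assumption-laden illustration (the error types and the extra `O_NONBLOCK` flag are choices made here), not the code in the repository's `util` module.

```python
import os
import stat
import tempfile


def read_file_bounded(path, max_bytes):
    """Read a regular file without following a final symlink, refusing anything over max_bytes."""
    # O_NOFOLLOW: open fails if the last path component is a symlink.
    # O_NONBLOCK: keeps open() from hanging if the path turns out to be a FIFO.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK)
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise ValueError("not a regular file")
        chunks, total = [], 0
        # Read at most max_bytes + 1 so an oversize file is detected without slurping it.
        while total <= max_bytes:
            chunk = os.read(fd, max_bytes + 1 - total)
            if not chunk:
                break
            chunks.append(chunk)
            total += len(chunk)
        if total > max_bytes:
            raise ValueError("file exceeds size cap")
        return b"".join(chunks)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "measure.xml")
    with open(real, "wb") as fh:
        fh.write(b"<measure/>")
    link = os.path.join(tmp, "link.xml")
    os.symlink(real, link)

    data = read_file_bounded(real, max_bytes=1024)
    try:
        read_file_bounded(link, max_bytes=1024)
        symlink_rejected = False
    except OSError:
        symlink_rejected = True
    try:
        read_file_bounded(real, max_bytes=4)
        oversize_rejected = False
    except ValueError:
        oversize_rejected = True

print(data, symlink_rejected, oversize_rejected)
```

Checking the type with `fstat` on the already-open descriptor, rather than `stat` on the path, avoids a window in which the path could be swapped between check and read.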

…/M1/M2) Post-merge hardening of the measure/sim sandbox for multi-tenant HTTP.

H1 (secret leak): /proc was a Landlock RO root, so a confined measure could read /proc/<pid>/environ. On a non-root server (measure uid == server uid) that recovers the server secrets clean-env stripped, incl. HTTP auth tokens. Drop /proc from the RO allowlist; Landlock now denies it for everyone. Verified EnergyPlus/OpenStudio still run (test_mcp_seb4 + full sandbox suite green).

M1 (fork-bomb DoS) + M2 (posix no FS isolation): every tenant shared uid 1001, so RLIMIT_NPROC (per-uid) was one shared budget (one tenant starves all) and posix-tier run dirs were mutually readable by DAC. Add config.sandbox_ids(): each remote tenant (HTTP session / auth principal) derives its own stable uid via hashlib (2000..61999, gid==uid); LOCAL keeps the baked 1001. Wired into wrap_cmd + prepare_workdir so chown and setuid agree. Per-uid NPROC budgets + DAC isolation even without Landlock.

Tests (red-green verified, all RED on unfixed code): test_path_safety TestPerTenantSandboxUid (distinct/stable uids, LOCAL base, uid<=0 clamp); test_sandbox proc-not-in-RO-allowlist, full-tier-denies-/proc/self/environ, per-tenant-uid DAC isolation. Updated test_posix/test_h5 uid assertions to the per-tenant contract (HTTP fixtures now derive a uid); base-uid clamp moved to a unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
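
A stable per-tenant uid in the stated range can be derived from a digest of the tenant key. The sketch below is an assumption about how such a mapping might look: `tenant_uid`, the use of SHA-256, and the 8-byte truncation are illustrative choices, not the body of `config.sandbox_ids()`. `hashlib` is used rather than the built-in `hash()` because string hashes are salted per process and would not be stable across restarts.

```python
import hashlib

LOCAL_UID = 1001                 # baked uid kept for the local (non-remote) case
UID_MIN, UID_MAX = 2000, 61999   # range quoted in the commit message


def tenant_uid(tenant_key):
    """Map a tenant key to a stable uid in [UID_MIN, UID_MAX]; None means the local user."""
    if tenant_key is None:
        return LOCAL_UID
    digest = hashlib.sha256(tenant_key.encode("utf-8")).digest()
    span = UID_MAX - UID_MIN + 1
    return UID_MIN + int.from_bytes(digest[:8], "big") % span


print(tenant_uid(None))
print(tenant_uid("alice") == tenant_uid("alice"))
print(UID_MIN <= tenant_uid("alice") <= UID_MAX)
```

With 60,000 slots, two tenants can in principle hash to the same uid, so a mapping like this narrows sharing rather than guaranteeing uniqueness.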

brianlball and others added 4 commits June 6, 2026 12:33

brianlball requested a review from Copilot June 6, 2026 18:28

Copilot started reviewing on behalf of brianlball June 6, 2026 18:28 View session

Copilot AI reviewed Jun 6, 2026

brianlball and others added 16 commits June 6, 2026 17:28

chore: stop tracking docs/plans/measure-exec-sandbox.md

e75d0b8

Planning/working doc — keep local-only, not in the repo. Removed from the tree (history retains it); now git-excluded so it won't be re-added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

refactor(security): add reject_escaping_symlinks() staging guard

90e89c6

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): reject sandbox uid/gid <= 0 (Codex H5)

838b6c3

Never let a bad OSMCP_SANDBOX_UID override leave confined code as root. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): seccomp KILL unexpected arch + x32, deny io_uring_setu…

a27f41f

…p (Codex H1) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): Landlock IOCTL_DEV (ABI>=5) + fail-closed add_rule (Co…

cce1673

…dex H4/M1) Also documents the packed path_beneath struct (rebuts Copilot #7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): fail-closed when Landlock/seccomp cannot engage (Codex…

60373e8

… H2) Load the landlock/seccomp shims before applying the FS restriction; refresh docstring. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): respect OSMCP_SANDBOX_NET, symlink-safe chown, tighten…

b51d237

… env, drop /repo (Codex H3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): apply_measure rejects escaping symlinks, stages symlin…

cca9681

…ks=True (Codex C3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(security): validate test_measure paths; escape/validate generated…

37cd3b8

… arg code; confine reporting-measure test (Codex C5/C2/C1) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(security): enforce OSMCP_SIM_TIMEOUT_SECONDS; validate run_simul…

bfe7d21

…ation inputs; reject escaping OSW symlinks (Copilot+Codex C4) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/develop' into feat/measure-exec-…

be16d83

…sandbox # Conflicts: # mcp_server/skills/measures/operations.py

ci(security): run sandbox suite on arm64 too (matrix amd64+arm64)

a38a280

Validates the aarch64 seccomp BPF (different syscall numbers) and Landlock confinement on the native arm64 runner, not just amd64. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

brianlball requested a review from Copilot June 6, 2026 22:57

Copilot started reviewing on behalf of brianlball June 6, 2026 22:57 View session

Copilot AI reviewed Jun 6, 2026

brianlball and others added 4 commits June 6, 2026 20:45

Harden post-test measure metadata refresh

eb82af8

brianlball and others added 4 commits June 7, 2026 11:01

Document simulation sandbox security

9633194

Merge pull request #64 from NatLabRockies/codex/measure-sandbox-harde…

05e27fd

…ning Harden post-test measure metadata refresh

brianlball marked this pull request as ready for review June 7, 2026 17:04

brianlball merged commit 935bcaf into develop Jun 8, 2026
22 checks passed

brianlball deleted the feat/measure-exec-sandbox branch June 8, 2026 12:08

brianlball mentioned this pull request Jun 8, 2026

Merge develop into measure-updates (resolve PR #65 sandbox conflicts) #67

Merged

brianlball merged 28 commits into develop from feat/measure-exec-sandbox
