
feat(security): sandbox measure/simulation subprocess execution#63

Merged
brianlball merged 28 commits into develop from feat/measure-exec-sandbox
Jun 8, 2026

@brianlball brianlball commented Jun 6, 2026

Collaborator

Confines the subprocess that runs LLM-authored OpenStudio measures and EnergyPlus
simulations, behind a new OSMCP_SANDBOX knob (default auto; off is the
explicit escape hatch). Addresses external-review point #4 — keeps full
functionality instead of disabling tools.

What it does

All arbitrary-code execution funnels through three subprocess sites
(apply_measure, test_measure, sim _launch); each now routes through
mcp_server/sandbox.py. Tiers (degrade-loudly, Codex-CLI model):

  • clean-env — allowlist strips host env (no secrets reach measure code)
  • UID drop + rlimits — runs as unprivileged sandbox user (uid 1001) when
    root; FSIZE/NPROC caps (_sandbox_exec.py shim, pid-preserving exec)
  • Landlock FS (_landlock.py, own ctypes) — read-deny-by-default: ro system
    roots, writable only the run dir (+ specific device files, not all of /dev)
  • seccomp net-deny (_seccomp.py, raw cBPF) — blocks socket(AF_INET/INET6)
  • is_path_allowed() checks on attacker-controlled paths (measure dir, OSW, EPW, model)

The unprivileged layers (Landlock + seccomp + rlimits) apply even for a
non-root local server; only the uid-drop is root-gated and skipped gracefully.
On platforms with no kernel backend (macOS/Windows bare installs) it falls back
to clean-env with a one-shot warning. This protects local users too: the server
bind-mounts host dirs (/repo, /inputs), which confined measures cannot read.
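
The clean-env tier described above can be illustrated with a minimal sketch. The function name `build_clean_env` and the contents of the allowlist are assumptions made for this example (only `RUBYLIB` and the enumerated locale categories are mentioned in this PR); the real allowlist lives in mcp_server/sandbox.py.

```python
import os

# Hypothetical allowlist for illustration only; not the PR's actual list.
ALLOWED_EXACT = {"PATH", "LANG", "RUBYLIB"}
# Standard locale categories, enumerated instead of matching a bare "LC_" prefix.
ALLOWED_LOCALE = {
    "LC_ALL", "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
    "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
}


def build_clean_env(host_env, work_dir):
    """Return a new environment that holds only allowlisted variables."""
    allowed = ALLOWED_EXACT | ALLOWED_LOCALE
    env = {k: v for k, v in host_env.items() if k in allowed}
    # Point HOME and TMPDIR at the run dir so scratch files stay inside it.
    env["HOME"] = str(work_dir)
    env["TMPDIR"] = os.path.join(str(work_dir), "tmp")
    return env


host = {
    "PATH": "/usr/bin",
    "LC_TIME": "C",
    "LC_API_TOKEN": "s3cret",        # LC_-prefixed secret: must not pass
    "AWS_SECRET_ACCESS_KEY": "x",    # ordinary secret: must not pass
}
clean = build_clean_env(host, "/runs/demo")
print(sorted(clean))  # ['HOME', 'LC_TIME', 'PATH', 'TMPDIR']
```

Building the environment from an allowlist, rather than deleting known-bad names from a copy of `os.environ`, means a newly introduced host secret is dropped by default.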

Closes

runs-as-root · env-secret leak · filesystem read/write escape · network exfil ·
measure_dir path traversal · cross-tenant file disclosure · shared-content mutation.

Round-2 review fixes (Copilot + Codex gpt-5.5)

A second adversarial review pass (Codex gpt-5.5) plus Copilot found nine items;
all triaged and fixed in three batches (each fix red-green verified — the test
fails on unfixed source, passes with the fix):

  • Batch 1 — Ruby #{} interpolation RCE in generated measures (escape #);
    fail-closed on an unknown OSMCP_SANDBOX value (typo no longer downgrades to
    posix); explicit LC_ locale allowlist (no LC_* secret leak); reject non-finite
    SIM_TIMEOUT; lowercase-leading arg-name regex (Ruby constant-assignment bug);
    _pid_alive guard so a finished-but-unreaped run isn't force-failed.
  • Batch 2 — per-file /dev Landlock rules (no writable /dev/shm, /dev/mqueue,
    no mknod); process-group kill on timeout/cancel (reaps forked EnergyPlus children).
  • Batch 3 — run_osw access-gates the OSW/EPW path (was copying the OSW's whole
    parent dir → cross-tenant read; run_simulation uses a trusted internal path);
    test_measure copies a shared/bundled measure into a private run dir instead of
    writing/chowning the source; dropped shared /inputs from the Landlock RO allowlist.

The __packed 12-byte Landlock struct flagged by both review rounds is a confirmed
false positive (the kernel UAPI struct is __attribute__((packed))); left as-is
with a comment.
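
The size difference behind the packing question can be checked directly with ctypes. This is only a sketch: the two class names are made up for the comparison, and the field list mirrors the `_PathBeneathAttr` excerpt quoted in the review thread.

```python
import ctypes


class PathBeneathPacked(ctypes.Structure):
    # Packed layout: 8-byte access mask + 4-byte fd, no tail padding.
    _pack_ = 1
    _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]


class PathBeneathNatural(ctypes.Structure):
    # Same fields with natural alignment; most 64-bit ABIs pad this struct.
    _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]


packed_size = ctypes.sizeof(PathBeneathPacked)
natural_size = ctypes.sizeof(PathBeneathNatural)
print(packed_size, natural_size)
```

The packed variant is 12 bytes; the naturally aligned one is usually larger on 64-bit platforms, which is why a reviewer unfamiliar with the packed kernel struct might flag the 12-byte layout.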

Validation

Full integration suite green under OSMCP_SANDBOX=auto on amd64 and arm64 (real
EnergyPlus sims, bundled comstock/common measures, weather, HTTP/multi-user). The
dedicated security.yml runs the PoC/confinement suite on both arches.

Security PoC suite (now included)

tests/test_sandbox.py (the confinement / exploit-PoC suite) is committed now
that the fixes are shipped, and runs in CI (security.yml, both arches) plus the
batch fixes' regression tests. It was kept local/git-excluded only while the holes
were still open.

Suggested review focus

  • _sandbox_exec.py — privilege-drop ordering (rlimits → no_new_privs → Landlock
    → seccomp → setuid-if-root)
  • _landlock.py — policy correctness, struct packing, file-vs-dir rule masking
  • _seccomp.py — hand-written cBPF correctness / bypass
  • sandbox.py — RO allowlist breadth, clean-env allowlist, /dev device rules
  • the three operations.py exec sites + run_osw/test_measure access gates
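
For the file-vs-dir rule masking item, a minimal sketch of the idea follows. The bit values are the Landlock filesystem access flags from the Linux UAPI header; the helper name `mask_for_path` is an assumption and is not the PR's `_landlock._add`. A real implementation would additionally mask against the rights supported by the running kernel's Landlock ABI version.

```python
# Landlock filesystem access bits (linux/landlock.h).
FS_EXECUTE = 1 << 0
FS_WRITE_FILE = 1 << 1
FS_READ_FILE = 1 << 2
FS_READ_DIR = 1 << 3
FS_REMOVE_DIR = 1 << 4
FS_REMOVE_FILE = 1 << 5
FS_MAKE_CHAR = 1 << 6
FS_MAKE_DIR = 1 << 7
FS_MAKE_REG = 1 << 8
FS_TRUNCATE = 1 << 14
FS_IOCTL_DEV = 1 << 15

# Rights that are meaningful on a non-directory path.
FILE_ONLY = FS_EXECUTE | FS_WRITE_FILE | FS_READ_FILE | FS_TRUNCATE | FS_IOCTL_DEV


def mask_for_path(requested, is_dir):
    """Strip directory-only bits when the rule targets a regular or device file."""
    return requested if is_dir else requested & FILE_ONLY


# A read-write directory grant, reused for /dev/null, loses its dir-only bits.
rw_dir = FS_READ_FILE | FS_WRITE_FILE | FS_READ_DIR | FS_MAKE_CHAR | FS_MAKE_REG
dev_null_rights = mask_for_path(rw_dir, is_dir=False)
print(dev_null_rights == (FS_READ_FILE | FS_WRITE_FILE))  # True
```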

🤖 Generated with Claude Code

brianlball and others added 4 commits June 6, 2026 12:33
…ts 1-2)

Three exec sites (apply_measure, test_measure, sim _launch) now route through
mcp_server/sandbox.py, gated by OSMCP_SANDBOX (default off = passthrough):
- clean-env allowlist strips host env (secrets) from measure code
- UID drop to unprivileged `sandbox` user + rlimits via _sandbox_exec shim
- unconditional is_path_allowed() check on apply_measure(measure_dir)
- OSMCP_SIM_TIMEOUT_SECONDS config added (enforcement pending)

Dockerfile bakes the `sandbox` user. Landlock FS rules + seccomp net-deny
land next in _sandbox_exec. Security PoC suite kept local (git-excluded);
standalone security.yml workflow added (manual, no-op when suite absent).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Full tier (OSMCP_SANDBOX=auto) closes the last two holes, on top of the POSIX
floor, all via owned ctypes shims (no dependency, pure Python):
- _landlock.py: read-deny-by-default FS policy — ro system roots, rw only the
  run dir (+ /dev for null/urandom). Blocks read-escape (even world-readable
  files) and write-escape, mount-independent.
- _seccomp.py: raw single-arch cBPF denying socket(AF_INET/AF_INET6) with
  EAFNOSUPPORT (AF_UNIX/local IPC unaffected). Blocks outbound IP exfil.
Applied in _sandbox_exec after no_new_privs, degrade-loudly; active_tier reports
"landlock". Chose raw BPF over pyseccomp (broken wheel import).

Verified under auto: read/write escape + net exfil blocked, apply + real
EnergyPlus sim still succeed; comstock/common + weather suites green.
Default stays off; flipping to auto (after a full-suite run under auto) is next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
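
To make the raw cBPF approach in the commit above concrete, here is a hedged sketch of a net-deny program for amd64, plus a tiny pure-Python interpreter used only to check the jump offsets. It is not the PR's `_seccomp.py`; the function names are invented, and the constants are the Linux amd64 values (aarch64 would need its own `AUDIT_ARCH` value and socket syscall number). Installing the filter in a process (no_new_privs, then the seccomp filter mode) is out of scope here.

```python
import struct

# Linux amd64 constants.
AUDIT_ARCH_X86_64 = 0xC000003E
NR_SOCKET = 41
AF_INET, AF_INET6 = 2, 10
EAFNOSUPPORT = 97
RET_ALLOW = 0x7FFF0000
RET_ERRNO = 0x00050000
RET_KILL_PROCESS = 0x80000000
# cBPF opcodes: BPF_LD|BPF_W|BPF_ABS, BPF_JMP|BPF_JEQ|BPF_K, BPF_RET|BPF_K.
LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06


def build_filter():
    """Return (code, jt, jf, k) tuples for the net-deny program."""
    return [
        (LD_W_ABS, 0, 0, 4),                      # load seccomp_data.arch
        (JEQ_K, 1, 0, AUDIT_ARCH_X86_64),         # expected arch: skip the kill
        (RET_K, 0, 0, RET_KILL_PROCESS),          # foreign arch: fail closed
        (LD_W_ABS, 0, 0, 0),                      # load seccomp_data.nr
        (JEQ_K, 0, 4, NR_SOCKET),                 # not socket(): jump to allow
        (LD_W_ABS, 0, 0, 16),                     # low word of args[0] (domain)
        (JEQ_K, 1, 0, AF_INET),                   # AF_INET: jump to deny
        (JEQ_K, 0, 1, AF_INET6),                  # AF_INET6: deny, else allow
        (RET_K, 0, 0, RET_ERRNO | EAFNOSUPPORT),  # deny with EAFNOSUPPORT
        (RET_K, 0, 0, RET_ALLOW),
    ]


def pack_filter(prog):
    # struct sock_filter { __u16 code; __u8 jt; __u8 jf; __u32 k; }
    return b"".join(struct.pack("=HBBI", *ins) for ins in prog)


def simulate(prog, nr, arch, arg0):
    """Interpret the three opcodes used above against a fake seccomp_data."""
    # seccomp_data: int nr; u32 arch; u64 instruction_pointer; u64 args[6]
    data = struct.pack("<iIQ6Q", nr, arch, 0, arg0, 0, 0, 0, 0, 0)
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LD_W_ABS:
            acc = struct.unpack_from("<I", data, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k


prog = build_filter()
print(len(pack_filter(prog)))  # 80 (10 instructions x 8 bytes)
```

Simulating the program in a unit test is one way to catch an off-by-one jump target, the kind of mistake that would make a hand-written filter fail open.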
…unt)

Under clean-env, pass redirect_tmp=False for test_measure's pytest/ruby run so
TMPDIR stays on /tmp. test_measure isn't Landlocked, and pytest's capture uses an
unlinked tempfile that the Docker Desktop bind mount can't keep open when TMPDIR
is run_dir/tmp -> FileNotFoundError on truncate. apply_measure/sim keep
TMPDIR=run_dir/tmp (Landlock needs it; openstudio doesn't use unlinked tempfiles).
Fixes the only 2 failures in the full-suite-under-auto run (719 -> 721 pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…x increment 4)

Decouple the confinement layers and make auto the default:
- _sandbox_exec: drop uid only when root; the unprivileged layers (Landlock +
  seccomp + rlimits) still apply for a non-root local server.
- sandbox.py: degrade loudly on platforms with no kernel backend (macOS/Windows
  -> clean-env only + one-shot warning); chown run dir only when root;
  active_tier() reports "clean-env" where no backend.
- config: OSMCP_SANDBOX default off -> auto (off stays the escape hatch).

This protects LOCAL users: the server bind-mounts host dirs (/repo, /inputs);
the sandbox makes them read-only so an LLM-authored measure can't write the
user's real files. Full suite previously verified green under auto (721 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a new sandboxing layer for subprocesses that execute LLM-authored OpenStudio measures and run EnergyPlus simulations, aiming to reduce risk from arbitrary code execution by confining environment inheritance and (on Linux) applying kernel-backed restrictions.

Changes:

  • Adds mcp_server/sandbox.py and routes measure/simulation subprocess execution through it (clean environment + privilege/rlimit wrapper).
  • Introduces Linux confinement primitives (_landlock.py, _seccomp.py) and an exec shim (_sandbox_exec.py) to apply rlimits, no_new_privs, Landlock FS policy, and seccomp network deny.
  • Adds new config knobs / defaults and supporting infrastructure (Docker sandbox user, manual security workflow, planning doc).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| mcp_server/skills/simulation/operations.py | Runs simulation subprocess via sandbox env + wrapper. |
| mcp_server/skills/measures/operations.py | Adds path allowlisting for measure_dir and wraps measure execution with sandbox. |
| mcp_server/skills/measure_authoring/operations.py | Filters env for test subprocesses (but does not wrap/sandbox them). |
| mcp_server/sandbox.py | New central chokepoint for env filtering, command wrapping, and workdir preparation. |
| mcp_server/config.py | Adds sandbox/network knobs and sim timeout configuration. |
| mcp_server/_seccomp.py | New raw seccomp-BPF filter to deny AF_INET/AF_INET6 sockets. |
| mcp_server/_sandbox_exec.py | New exec shim to apply rlimits/no_new_privs/Landlock/seccomp and (when root) uid-drop. |
| mcp_server/_landlock.py | New Landlock ruleset builder via ctypes syscalls. |
| docs/plans/measure-exec-sandbox.md | Design/planning document for the sandbox approach and test strategy. |
| docker/Dockerfile | Adds sandbox user for uid-drop. |
| .github/workflows/security.yml | Adds manual-trigger workflow to run security tests when present. |

Comment on lines 1017 to 1021
proc = subprocess.run(
    ["python3", "-m", "pytest", "tests/", "-v", "--tb=short"],
    cwd=str(mdir),
    env=sandbox.build_env(mdir, redirect_tmp=False),
    capture_output=True, text=True, timeout=60, check=False,
Comment on lines 1028 to 1032
proc = subprocess.run(
    ["ruby", "-I", ".", str(test_files[0])],
    cwd=str(mdir),
    env=sandbox.build_env(mdir, redirect_tmp=False),
    capture_output=True, text=True, timeout=60, check=False,
Comment thread mcp_server/config.py Outdated
Comment on lines +69 to +72
SANDBOX_MODE = os.environ.get("OSMCP_SANDBOX", "auto").strip().lower()
# Network policy for confined subprocesses: deny (default) blocks outbound TCP
# once the seccomp backend lands; allow leaves it open (trusted/BCL deployments).
SANDBOX_NET = os.environ.get("OSMCP_SANDBOX_NET", "deny").strip().lower()
Comment thread mcp_server/config.py
Comment on lines +74 to +79
# Wall-clock cap for a single simulation (run_osw/run_simulation). 0 = no cap.
SIM_TIMEOUT_SECONDS = _safe_float(
    os.environ.get("OPENSTUDIO_MCP_SIM_TIMEOUT_SECONDS",
                   os.environ.get("OSMCP_SIM_TIMEOUT_SECONDS", "7200")),
    7200.0,
)
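
Two of the config-hardening fixes listed in the PR description (fail-closed handling of an unknown OSMCP_SANDBOX value, and rejecting a non-finite SIM_TIMEOUT) could look roughly like the sketch below. The function names are hypothetical and the mode set is taken from the sandbox.py docstring quoted in this review; the PR's own `_safe_float` may differ in detail.

```python
import math

# Modes named in the sandbox.py docstring: off, posix, auto.
VALID_MODES = {"off", "posix", "auto"}


def normalize_sandbox_mode(raw):
    """Map an unknown or mistyped mode to the strictest tier (fail closed)."""
    mode = (raw or "auto").strip().lower()
    return mode if mode in VALID_MODES else "auto"


def safe_float(raw, default):
    """Parse a float, falling back to the default on junk, NaN, or infinity."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default
    return value if math.isfinite(value) else default


print(normalize_sandbox_mode("of"))   # auto
print(safe_float("nan", 7200.0))      # 7200.0
```

The point of the `math.isfinite` check is that `float("nan")` and `float("inf")` parse successfully, so a comparison-based timeout check would otherwise never fire.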
Comment thread mcp_server/sandbox.py Outdated
Comment on lines +8 to +12
`OSMCP_SANDBOX` (config.SANDBOX_MODE) selects the mode:
    off   — full passthrough (current behaviour / explicit escape hatch)
    posix — clean-env allowlist (this increment); UID drop + rlimits + Landlock
            FS policy + seccomp net-deny arrive in later increments, same knob
    auto  — best confinement available (currently == posix)
Comment thread mcp_server/sandbox.py Outdated
Comment on lines +192 to +196
with contextlib.suppress(OSError):
    os.chown(work, SANDBOX_UID, SANDBOX_GID)
for path in work.rglob("*"):
    with contextlib.suppress(OSError):
        os.chown(path, SANDBOX_UID, SANDBOX_GID)
Comment thread mcp_server/_landlock.py
Comment on lines +55 to +58
class _PathBeneathAttr(ctypes.Structure):
    _pack_ = 1
    _fields_ = [("allowed_access", ctypes.c_uint64), ("parent_fd", ctypes.c_int32)]

Comment thread mcp_server/_sandbox_exec.py Outdated
Comment on lines +9 to +13
Runs as root, applies rlimits, drops to the unprivileged uid/gid, sets
``no_new_privs``, then ``execvp``s CMD. A standalone exec — NOT a Popen
``preexec_fn`` — so there is no fork-safety hazard in the threaded server, and
the dropped image keeps the same pid (so the dispatcher's existing
terminate/kill-by-pid path still works). Landlock FS rules + a seccomp net-deny
Comment thread mcp_server/sandbox.py
Comment on lines +121 to +129
if redirect_tmp:
    work = Path(work_dir)
    env["HOME"] = str(work)
    tmp = work / "tmp"
    try:
        tmp.mkdir(parents=True, exist_ok=True)
        env["TMPDIR"] = str(tmp)
    except OSError:
        pass
Comment thread docs/plans/measure-exec-sandbox.md Outdated
Comment on lines +311 to +315
3. **seccomp net-deny via libseccomp (`pyseccomp`), not hand-rolled BPF.**
Security-critical filter, and we ship **amd64 + arm64** — raw BPF must
hand-handle `AUDIT_ARCH` + per-arch syscall numbers for both, and a
subtly-wrong filter fails open. libseccomp is what Docker itself uses,
abstracts the arch mess, and `libseccomp2` is already in the base image. The
brianlball and others added 16 commits June 6, 2026 17:28
Planning/working doc — keep local-only, not in the repo. Removed from the tree
(history retains it); now git-excluded so it won't be re-added.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Never let a bad OSMCP_SANDBOX_UID override leave confined code as root.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…p (Codex H1)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dex H4/M1)

Also documents the packed path_beneath struct (rebuts Copilot #7).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… H2)

Load the landlock/seccomp shims before applying the FS restriction; refresh docstring.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… env, drop /repo (Codex H3)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ks=True (Codex C3)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… arg code; confine reporting-measure test (Codex C5/C2/C1)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ation inputs; reject escaping OSW symlinks (Copilot+Codex C4)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sandbox

# Conflicts:
#	mcp_server/skills/measures/operations.py
Vulnerabilities are fixed (this PR), so the PoC suite is no longer held back. Runs in the dedicated security.yml workflow under OSMCP_SANDBOX=auto (each test still pins its own tier). Reports how Landlock+seccomp behave on the GitHub runner.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ly; add sandbox user to arm64 image

Both amd64 + arm64 integration test jobs now set OSMCP_SANDBOX=auto (no longer relying on the code default). Dockerfile.arm64 gains the unprivileged sandbox user for parity.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Validates the aarch64 seccomp BPF (different syscall numbers) and Landlock confinement on the native arm64 runner, not just amd64.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
arm64 now runs, under OSMCP_SANDBOX=auto: shard 1 (SEB4 sim/EUI + weather/loops) and shard 2 (SWIG-memleak + stdout suppression [deb-vs-wheel], measure apply/authoring [Ruby/Python exec + bundler], an HVAC EnergyPlus sim). Security PoC runs on arm64 via the security workflow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…studio

arm64-test shard 2 failed: 5 test_measure_authoring cases hit
`require': cannot load such file -- openstudio (LoadError)`. test_measure runs
Ruby minitest via raw `ruby -I .` (openstudio measure -r doesn't run minitest),
so system ruby needs RUBYLIB pointing at the OpenStudio Ruby bindings. The amd64
nrel/openstudio base sets this; the arm64 (ubuntu+.deb) image never did, so the
path was only exposed once arm64 shard 2 started running these tests.

Symlink the install's Ruby dir to a stable /usr/local path and set RUBYLIB to it
(parity with amd64). Both under /usr/local, which the sandbox Landlock policy
already grants; RUBYLIB is already in the sandbox env allowlist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Comment on lines +343 to +360
for rec in expired:
    with contextlib.suppress(psutil.NoSuchProcess, Exception):
        p = psutil.Process(rec.pid)
        p.terminate()
        try:
            p.wait(timeout=5)
        except psutil.TimeoutExpired:
            p.kill()
    with _sim_lock:
        rec.status = "failed"
        rec.ended_at = _now()
        rec.exit_code = -1 if rec.exit_code is None else rec.exit_code
        rec.error = (f"Simulation exceeded the {SIM_TIMEOUT_SECONDS:.0f}s wall-clock "
                     "cap (OSMCP_SIM_TIMEOUT_SECONDS)")
        _RUNS[rec.run_id] = rec
        _persist_run_record(rec)
    audit("sim_timeout", run_id=rec.run_id, user=rec.user_key,
          ran_seconds=round(now - (rec.started_at or now), 1))
Comment on lines +34 to +36
# Argument names become bare Ruby/Python identifiers in generated code — they must
# be plain identifiers, never arbitrary text (else code injection at generation).
_ARG_NAME_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,63}$")
Comment thread tests/test_sandbox.py
Comment on lines +1 to +8
"""Confirm the measure-exec vulnerabilities that the sandbox (planned) will close.

Context: docs/plans/measure-exec-sandbox.md. Today an applied measure runs as
root, with the server's full environment, unconfined filesystem, and open
network — see skills/measures/operations.py (`env=os.environ.copy()`, no UID
drop, no FS policy). These tests PROVE each hole exists, safely, using canaries
and decoys only (no real secret/file/host is touched, nothing leaves the box).

brianlball and others added 4 commits June 6, 2026 20:45
…v/timeout hardening

Codex gpt-5.5 + Copilot round-2 review fixes (low-risk):
- #1 (critical RCE): _escape_ruby_str now escapes '#'. A measure description/
  default/choice value with '#{...}' was live Ruby interpolation in a double-quoted
  literal, executed when `openstudio measure -u` runs (unsandboxed, as root).
- #6: unknown OSMCP_SANDBOX value (typo) normalized to 'auto' (fail-closed) instead
  of silently downgrading enabled-but-not-full to posix.
- #7: drop the bare 'LC_' env allowlist prefix; enumerate standard locale categories
  so an LC_-prefixed host secret (e.g. LC_API_TOKEN) can't leak to measure code.
- #9: _safe_float rejects NaN/inf so a non-finite SIM_TIMEOUT can't silently disable
  the wall-clock cap.
- Copilot: _ARG_NAME_RE requires lowercase-leading (uppercase -> Ruby dynamic
  constant assignment SyntaxError in generated arguments()).
- Copilot: _enforce_timeouts skips a dead-but-unreaped pid (_pid_alive) so a finished
  run isn't force-failed; left for the reaper to classify by exit code.

Tests: 6 added (test_sandbox.py x3, test_measure_authoring.py x2, test_sim_queue.py x1).
Verified 6 new + full measure_authoring + sim_queue (38 passed) under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
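
The two generation-time fixes in this commit (escaping `#` in Ruby string literals, and lowercase-leading argument names) can be sketched as follows. The helper names are assumptions rather than the PR's actual functions, and the sketch covers only the characters discussed here.

```python
import re

# Lowercase letter or underscore first, so the name is a Ruby local variable,
# not a constant. fullmatch() is used so a trailing newline cannot slip through.
ARG_NAME_RE = re.compile(r"[a-z_][a-zA-Z0-9_]{0,63}")


def escape_ruby_str(text):
    """Escape text for embedding in a double-quoted Ruby string literal."""
    # Backslash first, so the escapes added afterwards are not doubled.
    text = text.replace("\\", "\\\\")
    text = text.replace('"', '\\"')
    # "\#" is a literal '#', which disables #{...}, #$var and #@ivar interpolation.
    return text.replace("#", "\\#")


def is_valid_arg_name(name):
    return ARG_NAME_RE.fullmatch(name) is not None


payload = '#{system("id")}'
print(escape_ruby_str(payload))       # \#{system(\"id\")}
print(is_valid_arg_name("Setpoint"))  # False
```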
… kill

Codex gpt-5.5 review fixes (medium):
- #4: replace the whole-/dev rw Landlock grant with per-file rules — /dev/null,
  /dev/zero (rw) and /dev/urandom, /dev/random (ro). A /dev dir grant also exposed
  writable /dev/shm + /dev/mqueue (shared storage/IPC outside the run dir) and
  permitted mknod. _landlock._add now masks a rule on a non-directory to file-only
  rights (a dir-only bit on a file makes add_rule return EINVAL -> fail-closed).
- #8: sims launch with start_new_session=True (own process-group leader) and
  timeout/cancel now kill the whole group via _kill_process_group(), reaping forked
  children (EnergyPlus, helpers) a single-pid kill would orphan. Only ever signals a
  group we created (pgid == pid) — never the server's own group.

Tests: red-green verified (both fail on unfixed source, pass with fix). 2 added
(test_sandbox.py wrap_cmd device rules, test_sim_queue.py group kill). Full
test_sandbox + test_sim_queue + test_measures green under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
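
The process-group kill from fix #8 can be sketched with the standard library alone (POSIX only). The function names below are hypothetical stand-ins for the PR's `_kill_process_group()`; the `pgid == pid` guard reflects the rule stated in the commit message that only a group the server created is ever signalled.

```python
import os
import signal
import subprocess


def launch_in_own_group(cmd):
    """Start cmd as the leader of a new session, and thus a new process group."""
    return subprocess.Popen(cmd, start_new_session=True)


def kill_process_group(proc, grace=5.0):
    """Terminate proc and its descendants; return False if nothing was signalled."""
    try:
        pgid = os.getpgid(proc.pid)
    except ProcessLookupError:
        return False  # already reaped
    if pgid != proc.pid:
        return False  # not a group we created: never signal it
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return False
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # Escalate if the group ignored SIGTERM within the grace period.
        os.killpg(pgid, signal.SIGKILL)
        proc.wait()
    return True
```

A typical call would be `proc = launch_in_own_group([...])` followed by `kill_process_group(proc)` from the timeout or cancel path.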
…aging, drop /inputs RO

Codex gpt-5.5 review fixes (the staging/policy refactors):
- #2 (critical, cross-tenant read): the PUBLIC run_osw now is_path_allowed-gates the
  OSW (and EPW) BEFORE reading/copying — it copies the OSW's whole parent dir into the
  run dir, so an un-gated path let a client read another tenant's run or host files
  (e.g. run_osw(/runs/<other>/<id>/in.osw)). A keyword-only _internal=True (not exposed
  via MCP) is the trusted path for run_simulation, whose temp OSW is server-built and
  whose inputs it already validated.
- #3 (high, shared-content mutation): test_measure copied test_model.osm into, chowned,
  and Landlock-granted the caller's measure dir. For a measure under a shared read-only
  root (e.g. a bundled measure) that corrupted shared content. Now: measures outside the
  caller's own writable run root are copied into a private run dir and tested there;
  own-root measures still test in place (xml update persists). Private copy is cleaned up.
- #5 (high, cross-tenant read): drop INPUT_ROOT (/inputs) from the Landlock RO allowlist
  — it's shared/multi-tenant and inputs are staged into the run dir anyway. Documented
  the DAC rationale for keeping /etc,/proc,/sys (uid-1001 already blocks secret reads).

Tests: red-green verified per fix (each fails on unfixed source). run_osw gate +
internal-path (test_path_safety), test_measure-no-shared-mutation + /inputs-not-RO
(test_sandbox). 107 passed across path_safety/sandbox/measures/measure_authoring/
sim_queue under OSMCP_SANDBOX=auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
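
A path gate of the kind applied to run_osw in fix #2 might look like the sketch below. This is not the project's `is_path_allowed()`, whose signature is not shown in this PR; the `allowed_roots` parameter is an assumption. It requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def is_path_allowed(path, allowed_roots):
    """True only if the fully resolved path sits under one of the allowed roots."""
    # resolve() collapses ".." and follows symlinks, so both traversal and
    # symlink escapes are judged by where the path really points.
    real = Path(path).resolve()
    return any(real.is_relative_to(Path(root).resolve()) for root in allowed_roots)
```

The check is only meaningful if it runs before any read or copy, and if the same resolved path is the one subsequently used; otherwise a path swapped between check and use can bypass it.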
brianlball and others added 4 commits June 7, 2026 11:01
Fixup on the post-test measure-metadata-refresh hardening:

1. Drop two raw codex-review dumps committed to repo root (codex-security-review-*.MD) — docs/review/ is already gitignored for exactly these artifacts.

2. Bounded, no-follow copy-back (Copilot C1 + TOCTOU): new util.read_file_bounded reads measure.xml/README.md with O_NOFOLLOW and caps the read at max_bytes+1, so a confined-but-untrusted 'measure -u' cannot slurp a giant file into the unconfined server nor swap the output for a symlink to a host secret.

3. chmod best-effort (Copilot C2): replace() is the critical step; suppress OSError so a restrictive mount/non-POSIX host does not fail the whole refresh.

4. test_measure: post-test refresh is now best-effort — a failure is surfaced as metadata_warning instead of masking a passing (confined) test. Confinement, not the refresh exit code, is the security boundary. create/edit keep the raise (there the xml IS the product).

Tests (red-green verified): test_path_safety read_file_bounded (oversize/symlink/non-regular); test_measure_authoring passing-test-not-masked (RED on revert: ok:False 'Symlink escapes...').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
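
The bounded, no-follow read described in item 2 can be approximated as below (POSIX only). This is a sketch of the technique, not the PR's `util.read_file_bounded`; the exception types and the extra `O_NONBLOCK` flag are choices made for this example.

```python
import os
import stat


def read_file_bounded(path, max_bytes):
    """Read a regular file without following a final symlink, capped in size."""
    # O_NOFOLLOW makes open() fail if the last path component is a symlink;
    # O_NONBLOCK keeps open() from hanging on a FIFO before fstat() rejects it.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK)
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise ValueError(f"not a regular file: {path}")
        chunks, total = [], 0
        # Ask for at most max_bytes + 1 bytes in total: one extra byte is
        # enough to prove the file is over the cap.
        while total <= max_bytes:
            chunk = os.read(fd, max_bytes + 1 - total)
            if not chunk:
                break
            chunks.append(chunk)
            total += len(chunk)
        if total > max_bytes:
            raise ValueError(f"file exceeds {max_bytes} bytes: {path}")
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Note that `O_NOFOLLOW` only guards the final path component; symlinked parent directories would need a separate check.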
…ning

Harden post-test measure metadata refresh
…/M1/M2)

Post-merge hardening of the measure/sim sandbox for multi-tenant HTTP.

H1 (secret leak): /proc was a Landlock RO root, so a confined measure could read /proc/<pid>/environ — on a non-root server (measure uid == server uid) that recovers the server secrets clean-env stripped, incl. HTTP auth tokens. Drop /proc from the RO allowlist; Landlock now denies it for everyone. Verified EnergyPlus/OpenStudio still run (test_mcp_seb4 + full sandbox suite green).

M1 (fork-bomb DoS) + M2 (posix no FS isolation): every tenant shared uid 1001, so RLIMIT_NPROC (per-uid) was one shared budget (one tenant starves all) and posix-tier run dirs were mutually readable by DAC. Add config.sandbox_ids(): each remote tenant (HTTP session / auth principal) derives its own stable uid via hashlib (2000..61999, gid==uid); LOCAL keeps the baked 1001. Wired into wrap_cmd + prepare_workdir so chown and setuid agree. Per-uid NPROC budgets + DAC isolation even without Landlock.

Tests (red-green verified, all RED on unfixed code): test_path_safety TestPerTenantSandboxUid (distinct/stable uids, LOCAL base, uid<=0 clamp); test_sandbox proc-not-in-RO-allowlist, full-tier-denies-/proc/self/environ, per-tenant-uid DAC isolation. Updated test_posix/test_h5 uid assertions to the per-tenant contract (HTTP fixtures now derive a uid); base-uid clamp moved to a unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
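
The per-tenant uid derivation in M1/M2 can be illustrated with a short sketch. The commit message states only that the uid is derived via hashlib into 2000..61999 with gid == uid and that LOCAL keeps 1001; the choice of SHA-256, the use of `None` to mean the local user, and the signature below are assumptions.

```python
import hashlib

LOCAL_UID = 1001                   # baked sandbox user for local sessions
UID_BASE, UID_SPAN = 2000, 60000   # yields uids in 2000..61999


def sandbox_ids(tenant_key):
    """Return a stable (uid, gid) pair for a tenant; None means the local user."""
    if tenant_key is None:
        return LOCAL_UID, LOCAL_UID
    digest = hashlib.sha256(tenant_key.encode("utf-8")).digest()
    uid = UID_BASE + int.from_bytes(digest[:8], "big") % UID_SPAN
    return uid, uid


print(sandbox_ids(None))  # (1001, 1001)
print(sandbox_ids("session-a") == sandbox_ids("session-a"))  # True
```

With 60000 slots, two tenants can in principle hash to the same uid, so a scheme like this reduces sharing rather than guaranteeing a unique uid per tenant.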
@brianlball brianlball marked this pull request as ready for review June 7, 2026 17:04
@brianlball brianlball merged commit 935bcaf into develop Jun 8, 2026
22 checks passed
@brianlball brianlball deleted the feat/measure-exec-sandbox branch June 8, 2026 12:08