Make memfd_create seccomp restriction configurable #1427

@rhuss

Problem

OpenShell's seccomp filter unconditionally blocks memfd_create, which breaks several commonly used developer tools inside sandboxes. JIT compilers, WASM runtimes, and container runtimes all call this syscall for legitimate purposes. Because OpenShell targets AI coding agent workloads, many of the affected tools are core to those workflows.

Affected Tools

High impact (direct memfd_create usage)

  • Wasmtime (WASM JIT runtime) — uses memfd_create for copy-on-write memory initialization and JIT code storage
  • runc/crun (OCI container runtimes) — self-clone into sealed memfd to prevent CVE-2019-5736 container escapes
  • Zellij < v0.44 — terminal multiplexer with WASM plugin system (used wasmtime); fixed in v0.44 by migrating to wasmi interpreter

Medium impact (indirect via Node.js/V8 JIT)

  • Node.js / V8 — JIT compiler uses memfd_create for W^X-compliant dual mapping of executable pages
  • Claude Code — runs on Node.js
  • OpenAI Codex CLI — runs on Node.js
  • VS Code Remote Server — runs on Node.js
  • GitHub Copilot CLI — runs on Node.js
  • Cursor — Electron/V8 based

Context: other sandbox implementations

  • Docker/Podman default seccomp profiles allow memfd_create
  • OpenAI Codex sandbox (Landlock + seccomp) does not block memfd_create
  • Claude Code's own bubblewrap sandbox does not block memfd_create
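  • systemd does not block memfd_create by default; its W^X hardening is opt-in via MemoryDenyWriteExecute=, and blocking memfd_create additionally requires an explicit SystemCallFilter=~memfd_create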

Reproduction

# Inside an OpenShell sandbox
$ zellij --layout my-layout  # with WASM plugin, Zellij < v0.44
# ERROR: cannot create a memfd: Operation not permitted (os error 1)
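
To check the restriction without installing Zellij, a small probe can call memfd_create directly. This is a minimal sketch, assuming Python 3.8+ on Linux and assuming the filter returns EPERM as in the error above (a filter configured to kill the process would behave differently). The function name probe_memfd_create is hypothetical.

```python
import errno
import os


def probe_memfd_create(name: str = "probe") -> str:
    """Return 'allowed', 'blocked', or 'unsupported' for the memfd_create syscall."""
    if not hasattr(os, "memfd_create"):
        # Not Linux, or a Python build older than 3.8.
        return "unsupported"
    try:
        fd = os.memfd_create(name, os.MFD_CLOEXEC)
    except OSError as exc:
        # Assumption: a seccomp deny rule surfaces as EPERM (or EACCES).
        if exc.errno in (errno.EPERM, errno.EACCES):
            return "blocked"
        # ENOSYS means the kernel itself does not provide the syscall.
        if exc.errno == errno.ENOSYS:
            return "unsupported"
        raise
    os.close(fd)
    return "allowed"


if __name__ == "__main__":
    print(f"memfd_create: {probe_memfd_create()}")
```

Running the script inside and outside the sandbox should make the difference visible: "blocked" would be the expected result wherever the current filter applies.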

Proposal

Make the memfd_create restriction configurable in the policy schema instead of blocking the syscall unconditionally. For example:

version: 1
process:
  run_as_user: sandbox
  run_as_group: sandbox
  allow_memfd: true  # or a broader syscall allowlist mechanism

Alternatively, a dedicated section (named seccomp_relaxations, syscall_allow, or simply seccomp with an allow list) could provide finer-grained control:

seccomp:
  allow:
    - memfd_create

This would let sandbox operators make informed risk/compatibility tradeoffs for their specific workloads, similar to how Docker/Podman allow users to provide custom seccomp profiles.
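
One way the allow-list variant could be resolved is sketched below. Everything here is hypothetical: the function name, the contents of DEFAULT_DENIED and RELAXABLE, and the already-parsed policy dict are illustrative assumptions, not OpenShell's actual filter or loader. The idea is that the default stays deny, and only syscalls explicitly marked as relaxable can be re-enabled by a policy.

```python
# Illustrative baseline only; NOT OpenShell's real deny list.
DEFAULT_DENIED = frozenset({"memfd_create", "ptrace", "bpf"})

# Hypothetical subset that operators are permitted to re-enable.
RELAXABLE = frozenset({"memfd_create"})


def effective_denylist(policy: dict) -> frozenset:
    """Compute the syscalls to deny after applying the policy's seccomp.allow list."""
    seccomp = policy.get("seccomp") or {}
    requested = set(seccomp.get("allow") or [])

    # Refuse to relax anything outside the vetted set instead of ignoring it silently.
    not_relaxable = requested - RELAXABLE
    if not_relaxable:
        raise ValueError(f"syscalls cannot be relaxed: {sorted(not_relaxable)}")

    return DEFAULT_DENIED - requested


if __name__ == "__main__":
    # Stand-in for the parsed form of the YAML fragment with seccomp.allow.
    policy = {"version": 1, "seccomp": {"allow": ["memfd_create"]}}
    print(sorted(effective_denylist(policy)))
```

Rejecting unknown or non-relaxable entries keeps a typo or an overly broad policy from weakening the sandbox unnoticed, while an empty or missing seccomp section preserves today's behavior.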

Workaround

For Zellij specifically: upgrade to v0.44.0 or later, which migrated from wasmtime (JIT) to wasmi (interpreter), eliminating the memfd_create dependency.

For Node.js: the --jitless flag disables the V8 JIT, but at a significant performance cost.

No workaround exists for runc/crun self-cloning.

Labels: state:triage-needed (opened without agent diagnostics and needs triage)
