fix(macos): don't kill the session on a transient reply error #137

Open: dlicudi wants to merge 1 commit into Sherlock-Holo:master from dlicudi:fix/macos-reply-error-resilience

dlicudi commented Jun 12, 2026

While porting XEarthLayer to macOS, I hit a bug where the mount would die mid-flight under X-Plane's concurrent read load. The text below is derived from Claude-assisted commits.

macFUSE can transiently reject a reply write to /dev/macfuse with EINVAL (or EAGAIN/EINTR) under heavy concurrent load, e.g. while Spotlight crawls the mount. The session loop treated any reply-send failure other than NotFound as fatal (send_failed.notify -> session returns Err), so a single transient failure tore down the whole mount and left it wedged, with every later operation failing with ENXIO.

Route the error handling in send1/send2/send3 through a shared handle_send_error that treats transient, per-request errors (ENOENT, EINVAL, EAGAIN, EINTR) as recoverable: drop that one reply and keep serving. Connection-gone errors (ENODEV/EBADF on unmount) stay fatal, so a clean unmount still stops the session. The policy is extracted into a pure is_recoverable_reply_error fn with unit tests.
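
For readers who want the policy in concrete form, here is a minimal sketch of how such a classification might look. It is not the code from this PR: the signatures of `is_recoverable_reply_error` and `handle_send_error`, the `SendOutcome` enum, and the decision to treat an error with no OS errno as fatal are all assumptions made for illustration. The errno constants are written out by hand so the snippet needs nothing beyond std; a real implementation would more likely take them from the `libc` crate.

```rust
use std::io;

// Hand-written errno values (assumption: Darwin or Linux target).
// Only EAGAIN differs between the two platforms.
const ENOENT: i32 = 2;
const EINTR: i32 = 4;
#[allow(dead_code)]
const EBADF: i32 = 9;
#[allow(dead_code)]
const ENODEV: i32 = 19;
const EINVAL: i32 = 22;
#[cfg(target_os = "macos")]
const EAGAIN: i32 = 35;
#[cfg(not(target_os = "macos"))]
const EAGAIN: i32 = 11;

/// Hypothetical result type: what the session loop should do after a
/// failed reply write.
#[derive(Debug, PartialEq, Eq)]
enum SendOutcome {
    /// Drop this one reply and keep serving requests.
    DropReply,
    /// The connection is gone (or the error is unknown): stop the session.
    StopSession,
}

/// Pure policy function: true only for the transient, per-request errnos
/// named in the description. Everything else, including ENODEV/EBADF and
/// errors that carry no OS errno, is treated as fatal (an assumption).
fn is_recoverable_reply_error(errno: Option<i32>) -> bool {
    matches!(errno, Some(ENOENT | EINVAL | EAGAIN | EINTR))
}

/// Hypothetical shared handler that each send path could call.
fn handle_send_error(err: &io::Error) -> SendOutcome {
    if is_recoverable_reply_error(err.raw_os_error()) {
        SendOutcome::DropReply
    } else {
        SendOutcome::StopSession
    }
}
```

Keeping the policy in a function over a plain `Option<i32>` is what makes it unit-testable without a mounted filesystem: tests can pass errno values directly rather than provoking real write failures.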

dlicudi added a commit to dlicudi/xearthlayer that referenced this pull request Jun 12, 2026
The reply-resilience patch is now submitted upstream as
Sherlock-Holo/fuse3#137. Once merged, repoint
the dependency to Sherlock-Holo/fuse3 (which also retires main's
samsoir/fuse3 branch pin, merged upstream as samsoir#136).
dlicudi added a commit to dlicudi/xearthlayer that referenced this pull request Jun 12, 2026
Update platform support in README and CLAUDE.md: macOS works with
macFUSE (Apple Silicon tested, multi-hour X-Plane flights). Notes the
fuse3 fork rev pin and its pending upstream PR (Sherlock-Holo/fuse3#137).