Coalesce watcher bursts before daemon reconciliation#73
Conversation
The daemon now merges filesystem event bursts into one logical wakeup window, keeps only one pending rerun while a tick is in flight, and exposes coalescing state in daemon_status. This brings the watcher backend in line with the next behavioral donor lane from watchexec without changing the public MCP surface. Constraint: Keep daemon_tick as the single reconciliation implementation while reducing watcher burst amplification Rejected: A separate coalescing tool or queue API | unnecessary surface area for internal daemon behavior Rejected: Unbounded pending reruns | risks event storms and duplicate work under bursty filesystems Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep watcher burst handling single-flight until a richer scheduler policy is proven with production telemetry Tested: cargo fmt --check; cargo test -p m1nd-mcp run_daemon_tick_marks_pending_rerun_when_already_in_flight -- --nocapture; cargo test -p m1nd-mcp daemon_ -- --nocapture; MCP burst smoke Not-tested: Multi-client contention across long-lived sessions
There was a problem hiding this comment.
Pull request overview
This PR introduces burst coalescing for filesystem watcher events and adds a single-flight mechanism so daemon reconciliation (“tick”) runs at most one at a time with a pending rerun signal.
Changes:
- Extend daemon runtime state with coalescing + single-flight status fields (and expose them via
daemon_status). - Coalesce
WatchNoticebursts using a configurable time window before triggering reconciliation. - Add
tick_in_flight/pending_rerungating inrun_daemon_tick, plus a unit test for the in-flight behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
m1nd-mcp/src/session.rs |
Adds new persisted daemon runtime state fields for coalescing and single-flight tracking. |
m1nd-mcp/src/server.rs |
Implements watcher burst coalescing and single-flight tick execution logic; adds a unit test. |
m1nd-mcp/src/daemon_handlers.rs |
Initializes and reports new coalescing/single-flight fields in daemon_start and daemon_status, and updates tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #[derive(Clone, Debug, Default, Serialize, Deserialize)] | ||
| pub struct DaemonRuntimeState { | ||
| pub active: bool, | ||
| pub started_at_ms: Option<u64>, | ||
| pub last_tick_ms: Option<u64>, | ||
| pub last_tick_trigger: Option<String>, | ||
| pub watch_paths: Vec<String>, | ||
| pub poll_interval_ms: u64, | ||
| pub coalesce_window_ms: u64, | ||
| pub pending_rerun: bool, | ||
| pub tick_in_flight: bool, | ||
| pub last_coalesced_event_ms: Option<u64>, | ||
| pub coalesced_event_count: u64, | ||
| pub tracked_files: HashMap<String, DaemonTrackedFile>, | ||
| pub tick_count: u64, |
There was a problem hiding this comment.
DaemonRuntimeState is persisted to daemon_state.json. Adding new non-optional fields without #[serde(default)] makes deserialization of previously persisted state fail (missing fields), causing load_daemon_state() to fall back to Default for the entire daemon state and potentially lose active/paths/intervals after upgrade. Consider adding #[serde(default)] on DaemonRuntimeState (or per-field defaults) so older JSON can be read forward-compatibly.
| state.daemon_state.tick_in_flight = false; | ||
|
|
||
| if state.daemon_state.pending_rerun { | ||
| state.daemon_state.pending_rerun = false; | ||
| state.daemon_state.tick_in_flight = true; | ||
| state.daemon_state.last_tick_trigger = Some("reconciliation".into()); | ||
| let _ = crate::daemon_handlers::handle_daemon_tick( | ||
| state, | ||
| layers::DaemonTickInput { | ||
| agent_id: "daemon".into(), | ||
| max_files: 32, | ||
| }, | ||
| ); | ||
| state.daemon_state.tick_in_flight = false; | ||
| } |
There was a problem hiding this comment.
pending_rerun can be set again while the reconciliation rerun is executing (calls to run_daemon_tick while tick_in_flight is true). After the rerun finishes, the function exits without re-checking pending_rerun, so a burst that spans the rerun window can leave pending_rerun=true but no tick scheduled, potentially missing prompt reconciliation until some later periodic/traffic trigger. Consider draining pending reruns in a loop (while pending_rerun) or re-checking once more after the rerun completes.
| @@ -2193,6 +2199,21 @@ fn run_daemon_tick(state: &mut SessionState, trigger: &str) { | |||
| max_files: 32, | |||
| }, | |||
| ); | |||
| state.daemon_state.tick_in_flight = false; | |||
|
|
|||
There was a problem hiding this comment.
tick_in_flight is manually set/reset around handle_daemon_tick. If handle_daemon_tick panics, tick_in_flight will remain true and the daemon can get stuck permanently marking pending_rerun without running ticks. Consider using an RAII guard (e.g., scopeguard) or std::panic::catch_unwind to ensure tick_in_flight is cleared on unwind.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 078c8a283d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| state.daemon_state.tick_in_flight = true; | ||
| state.daemon_state.last_tick_trigger = Some(trigger.to_string()); | ||
| let _ = crate::daemon_handlers::handle_daemon_tick( |
There was a problem hiding this comment.
Persist tick_in_flight after clearing post-tick state
run_daemon_tick marks tick_in_flight = true before calling handle_daemon_tick, but handle_daemon_tick persists daemon_state during that call; the flag is cleared only afterward and never re-persisted. This means a normal tick can leave daemon_state.json with tick_in_flight: true, and after restart run_daemon_tick will keep returning early (setting pending_rerun) instead of reconciling, effectively stalling daemon ticks until daemon_start resets the state.
Useful? React with 👍 / 👎.
| match rx.recv_timeout(Duration::from_millis( | ||
| self.state.daemon_state.coalesce_window_ms.max(1), | ||
| )) { |
There was a problem hiding this comment.
Propagate StdinClosed out of watch-event coalescing loop
Using blocking recv_timeout inside the watch-notice coalescing loop creates a window where StdinClosed is consumed by the inner loop and dropped instead of terminating serve(). If EOF arrives during this debounce wait, the outer match never receives ServerEvent::StdinClosed, so the process can keep running indefinitely (the daemon runtime still holds a sender, so channel disconnect does not force exit).
Useful? React with 👍 / 👎.
Summary
daemon_tickas the only reconciliation implementationdaemon_statusValidation
Why this matters
This is the watchexec-style behavior layer on top of the native watcher backend. The daemon now reacts once per logical change burst instead of stampeding itself with redundant ticks.