Skip to content

Coalesce watcher bursts before daemon reconciliation#73

Open
maxkle1nz wants to merge 1 commit into
mainfrom
codex/m1nd-daemon-coalesce-v3
Open

Coalesce watcher bursts before daemon reconciliation#73
maxkle1nz wants to merge 1 commit into
mainfrom
codex/m1nd-daemon-coalesce-v3

Conversation

@maxkle1nz
Copy link
Copy Markdown
Owner

Summary

  • coalesce watcher bursts into one logical daemon wakeup window
  • add single-flight tick execution with at most one pending rerun
  • keep daemon_tick as the only reconciliation implementation
  • expose coalescing state in daemon_status

Validation

  • cargo fmt --check
  • cargo test -p m1nd-mcp run_daemon_tick_marks_pending_rerun_when_already_in_flight -- --nocapture
  • cargo test -p m1nd-mcp daemon_ -- --nocapture
  • MCP burst smoke for coalesced watcher events

Why this matters

This is the watchexec-style behavior layer on top of the native watcher backend. The daemon now reacts once per logical change burst instead of stampeding itself with redundant ticks.

The daemon now merges filesystem event bursts into one logical wakeup window, keeps only one pending rerun while a tick is in flight, and exposes coalescing state in daemon_status. This brings the watcher backend in line with the next behavioral donor lane from watchexec without changing the public MCP surface.

Constraint: Keep daemon_tick as the single reconciliation implementation while reducing watcher burst amplification
Rejected: A separate coalescing tool or queue API | unnecessary surface area for internal daemon behavior
Rejected: Unbounded pending reruns | risks event storms and duplicate work under bursty filesystems
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep watcher burst handling single-flight until a richer scheduler policy is proven with production telemetry
Tested: cargo fmt --check; cargo test -p m1nd-mcp run_daemon_tick_marks_pending_rerun_when_already_in_flight -- --nocapture; cargo test -p m1nd-mcp daemon_ -- --nocapture; MCP burst smoke
Not-tested: Multi-client contention across long-lived sessions
Copilot AI review requested due to automatic review settings April 5, 2026 21:32
@maxkle1nz maxkle1nz enabled auto-merge (squash) April 5, 2026 21:32
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces burst coalescing for filesystem watcher events and adds a single-flight mechanism so daemon reconciliation (“tick”) runs at most one at a time with a pending rerun signal.

Changes:

  • Extend daemon runtime state with coalescing + single-flight status fields (and expose them via daemon_status).
  • Coalesce WatchNotice bursts using a configurable time window before triggering reconciliation.
  • Add tick_in_flight/pending_rerun gating in run_daemon_tick, plus a unit test for the in-flight behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
m1nd-mcp/src/session.rs Adds new persisted daemon runtime state fields for coalescing and single-flight tracking.
m1nd-mcp/src/server.rs Implements watcher burst coalescing and single-flight tick execution logic; adds a unit test.
m1nd-mcp/src/daemon_handlers.rs Initializes and reports new coalescing/single-flight fields in daemon_start and daemon_status, and updates tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread m1nd-mcp/src/session.rs
Comment on lines 162 to 176
#[derive(Clone, Debug, Default, Serialize, Deserialize)]
pub struct DaemonRuntimeState {
pub active: bool,
pub started_at_ms: Option<u64>,
pub last_tick_ms: Option<u64>,
pub last_tick_trigger: Option<String>,
pub watch_paths: Vec<String>,
pub poll_interval_ms: u64,
pub coalesce_window_ms: u64,
pub pending_rerun: bool,
pub tick_in_flight: bool,
pub last_coalesced_event_ms: Option<u64>,
pub coalesced_event_count: u64,
pub tracked_files: HashMap<String, DaemonTrackedFile>,
pub tick_count: u64,
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DaemonRuntimeState is persisted to daemon_state.json. Adding new non-optional fields without #[serde(default)] makes deserialization of previously persisted state fail (missing fields), causing load_daemon_state() to fall back to Default for the entire daemon state and potentially lose active/paths/intervals after upgrade. Consider adding #[serde(default)] on DaemonRuntimeState (or per-field defaults) so older JSON can be read forward-compatibly.

Copilot uses AI. Check for mistakes.
Comment thread m1nd-mcp/src/server.rs
Comment on lines +2202 to +2216
state.daemon_state.tick_in_flight = false;

if state.daemon_state.pending_rerun {
state.daemon_state.pending_rerun = false;
state.daemon_state.tick_in_flight = true;
state.daemon_state.last_tick_trigger = Some("reconciliation".into());
let _ = crate::daemon_handlers::handle_daemon_tick(
state,
layers::DaemonTickInput {
agent_id: "daemon".into(),
max_files: 32,
},
);
state.daemon_state.tick_in_flight = false;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pending_rerun can be set again while the reconciliation rerun is executing (calls to run_daemon_tick while tick_in_flight is true). After the rerun finishes, the function exits without re-checking pending_rerun, so a burst that spans the rerun window can leave pending_rerun=true but no tick scheduled, potentially missing prompt reconciliation until some later periodic/traffic trigger. Consider draining pending reruns in a loop (while pending_rerun) or re-checking once more after the rerun completes.

Copilot uses AI. Check for mistakes.
Comment thread m1nd-mcp/src/server.rs
Comment on lines 2193 to +2203
@@ -2193,6 +2199,21 @@ fn run_daemon_tick(state: &mut SessionState, trigger: &str) {
max_files: 32,
},
);
state.daemon_state.tick_in_flight = false;

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tick_in_flight is manually set/reset around handle_daemon_tick. If handle_daemon_tick panics, tick_in_flight will remain true and the daemon can get stuck permanently marking pending_rerun without running ticks. Consider using an RAII guard (e.g., scopeguard) or std::panic::catch_unwind to ensure tick_in_flight is cleared on unwind.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 078c8a283d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread m1nd-mcp/src/server.rs
Comment on lines +2193 to 2195
state.daemon_state.tick_in_flight = true;
state.daemon_state.last_tick_trigger = Some(trigger.to_string());
let _ = crate::daemon_handlers::handle_daemon_tick(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Persist tick_in_flight after clearing post-tick state

run_daemon_tick marks tick_in_flight = true before calling handle_daemon_tick, but handle_daemon_tick persists daemon_state during that call; the flag is cleared only afterward and never re-persisted. This means a normal tick can leave daemon_state.json with tick_in_flight: true, and after restart run_daemon_tick will keep returning early (setting pending_rerun) instead of reconciling, effectively stalling daemon ticks until daemon_start resets the state.

Useful? React with 👍 / 👎.

Comment thread m1nd-mcp/src/server.rs
Comment on lines +2682 to +2684
match rx.recv_timeout(Duration::from_millis(
self.state.daemon_state.coalesce_window_ms.max(1),
)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Propagate StdinClosed out of watch-event coalescing loop

Using blocking recv_timeout inside the watch-notice coalescing loop creates a window where StdinClosed is consumed by the inner loop and dropped instead of terminating serve(). If EOF arrives during this debounce wait, the outer match never receives ServerEvent::StdinClosed, so the process can keep running indefinitely (the daemon runtime still holds a sender, so channel disconnect does not force exit).

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants