Add adaptive backoff to the daemon scheduler #68
The daemon scheduler now reports when the next tick is due and whether it is currently overdue, and the server loop uses remaining-time scheduling instead of sleeping the full poll interval every time.

Constraint: Improve scheduler visibility and efficiency without introducing new watcher dependencies
Rejected: Waiting for notify/watchman before adding scheduler timing signals | left the current polling scheduler opaque and less efficient than necessary
Rejected: A new scheduler tool | the timing state belongs in daemon_status and the existing server loop
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep next-tick timing derived from persisted daemon state so future watcher backends can share the same status contract
Tested: cargo fmt --check; cargo check -p m1nd-mcp -p m1nd-ingest; cargo test -p m1nd-mcp daemon_ -- --nocapture; cargo clippy -p m1nd-mcp -p m1nd-ingest -- -D warnings; MCP smoke for next_tick_due_ms / overdue_ms
Not-tested: Interaction with future native watcher backends and long-running background scheduler drift over many hours
The daemon scheduler now tracks idle streaks, scales its effective poll interval under quiet conditions, and reports that effective interval through daemon_status. This reduces wasted polling while keeping the daemon fully observable through the existing status contract.

Constraint: Improve daemon efficiency without adding watcher dependencies or changing the existing daemon tool surface
Rejected: Fixed-interval polling until native watchers land | wastes cycles and hides useful scheduler state during the interim phase
Rejected: A separate scheduler configuration tool | unnecessary surface growth before watcher backends exist
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep scheduler backoff state derived from daemon runtime state so native watcher backends can eventually bypass polling without breaking status semantics
Tested: cargo fmt --check; cargo check -p m1nd-mcp -p m1nd-ingest; cargo test -p m1nd-mcp daemon_ -- --nocapture; cargo clippy -p m1nd-mcp -p m1nd-ingest -- -D warnings; MCP smoke for effective_poll_interval_ms / idle_streak / next_tick_due_ms / overdue_ms
Not-tested: Very long idle periods across multiple days and interaction with future watcher-native backends
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 28d7059167
| pub idle_streak: u32, | ||
| pub max_backoff_multiplier: u32, |
Add defaults for new daemon-state fields
The new idle_streak and max_backoff_multiplier fields are required during deserialization, so pre-existing daemon_state.json files (written before this commit) will fail to parse in load_daemon_state and silently fall back to DaemonRuntimeState::default(). On upgrade, that drops persisted daemon runtime state (including active status, watch paths, tracked files, and counters), which breaks cross-session daemon persistence.
| state | ||
| .daemon_state | ||
| .last_tick_ms | ||
| .map(|last| last.saturating_add(state.daemon_state.poll_interval_ms)) |
Base daemon_status due time on effective interval
next_tick_due_ms is computed with last_tick_ms + poll_interval_ms, but the scheduler now waits on a backoff-expanded interval. When idle_streak > 0, daemon_status reports an earlier due time and premature overdue_ms, so monitoring/automation that trusts these fields will see false overdue conditions.
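One way to address this is to derive the due time from the same effective interval the scheduler waits on. The following is a minimal sketch under stated assumptions: `SchedulerState` and the three helper functions are hypothetical stand-ins for the relevant parts of the daemon state, and the backoff formula mirrors the exponent cap shown in the diff.

```rust
/// Hypothetical stand-in for the scheduler-related fields of the daemon state.
struct SchedulerState {
    active: bool,
    poll_interval_ms: u64,
    last_tick_ms: Option<u64>,
    idle_streak: u32,
    max_backoff_multiplier: u32,
}

/// Base interval scaled by 2^exponent, with the exponent capped as in the diff.
fn effective_poll_interval_ms(s: &SchedulerState) -> u64 {
    let exponent = s
        .idle_streak
        .min(s.max_backoff_multiplier.saturating_sub(1));
    s.poll_interval_ms
        .saturating_mul(2u64.saturating_pow(exponent))
}

/// Due time based on the effective (backed-off) interval, not the base interval.
fn next_tick_due_ms(s: &SchedulerState) -> Option<u64> {
    if s.active && s.poll_interval_ms > 0 {
        s.last_tick_ms
            .map(|last| last.saturating_add(effective_poll_interval_ms(s)))
    } else {
        None
    }
}

/// Milliseconds past the due time; 0 while the tick is not yet due.
fn overdue_ms(s: &SchedulerState, now_ms: u64) -> Option<u64> {
    next_tick_due_ms(s).map(|due| now_ms.saturating_sub(due))
}
```

With a 200 ms base interval and an idle streak of 3, the effective interval in this sketch is 1600 ms, so a status query 500 ms after the last tick reports zero overdue time rather than a false 300 ms.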
| } else { | ||
| effective_poll_interval_ms | ||
| .saturating_sub(elapsed) | ||
| .clamp(25, 1000) |
Remove 1s cap from adaptive daemon wait
Capping the remaining wait with .clamp(25, 1000) truncates backoff intervals above 1 second. After a few idle ticks (for example, 200ms base with idle_streak >= 3), the loop wakes every second and background_tick_if_due can tick immediately because the base poll interval has elapsed, so adaptive backoff never reaches its intended multi-second range.
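The effect of the cap can be checked with a small sketch. The function name `remaining_wait_ms` and the configurable `max_wait_ms` ceiling are assumptions made for illustration; only the 25 ms floor and the 1000 ms cap come from the diff.

```rust
/// Lower bound on the wait, taken from the `.clamp(25, 1000)` call in the diff.
const MIN_WAIT_MS: u64 = 25;

/// Hypothetical helper: time left until the next tick, bounded to
/// [MIN_WAIT_MS, max_wait_ms]. The ceiling is a parameter here so the
/// 1000 ms cap can be compared against a larger one.
fn remaining_wait_ms(effective_poll_interval_ms: u64, elapsed_ms: u64, max_wait_ms: u64) -> u64 {
    effective_poll_interval_ms
        .saturating_sub(elapsed_ms)
        // Guard against a ceiling below the floor, which would make clamp panic.
        .clamp(MIN_WAIT_MS, max_wait_ms.max(MIN_WAIT_MS))
}
```

For a 200 ms base interval with an idle streak of 3 (effective interval 200 × 2^3 = 1600 ms), `remaining_wait_ms(1600, 0, 1000)` yields 1000, so the loop wakes 600 ms early. Raising the ceiling to at least the effective interval, for example `remaining_wait_ms(1600, 0, 10_000)`, yields the full 1600 ms.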
Pull request overview
This PR adds idle-streak tracking and an adaptive (exponential) backoff mechanism to the daemon scheduler, and exposes additional scheduling telemetry (effective_poll_interval_ms, idle_streak, plus next/overdue tick timing) via daemon_status.
Changes:
- Extend DaemonRuntimeState with idle_streak and max_backoff_multiplier for adaptive scheduling.
- Add daemon_wait_duration_ms() and use it to drive the server loop's recv_timeout wait duration.
- Enhance daemon_status output and update tick logic to increment/reset idle_streak based on whether work happened.
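The increment/reset rule for the idle streak can be sketched as follows. This is an illustration only: the function name and the `did_work` flag are hypothetical, and what counts as "work" in the PR (for example changed or deleted files in a tick) is an assumption.

```rust
/// Hypothetical helper: reset the streak when a tick did work,
/// otherwise extend it without overflowing.
fn next_idle_streak(current: u32, did_work: bool) -> u32 {
    if did_work {
        0
    } else {
        current.saturating_add(1)
    }
}
```

Three quiet ticks in a row take the streak from 0 to 3, and a single tick that finds work returns it to 0, which restores the base poll interval.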
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| m1nd-mcp/src/session.rs | Adds persisted daemon runtime fields needed for backoff state tracking. |
| m1nd-mcp/src/server.rs | Introduces a helper to compute adaptive wait duration and applies it to the main server loop; adds tests. |
| m1nd-mcp/src/daemon_handlers.rs | Initializes/updates idle streak, and exposes scheduling/backoff fields in daemon_status. |
| pub last_tick_changed_files: usize, | ||
| pub last_tick_deleted_files: usize, | ||
| pub last_tick_alerts_emitted: usize, | ||
| pub idle_streak: u32, |
DaemonRuntimeState is persisted to disk; adding non-optional fields without #[serde(default)] will make older daemon_state.json files fail to deserialize, causing load_daemon_state() to fall back to a full default state (potentially disabling an active daemon and losing counters). Add #[serde(default)] (or #[serde(default = "...")]) for the new fields to keep backward compatibility.
| pub idle_streak: u32, | |
| pub idle_streak: u32, | |
| #[serde(default)] |
| let effective_poll_interval_ms = state.daemon_state.poll_interval_ms.saturating_mul( | ||
| 2u64.pow( | ||
| state | ||
| .daemon_state | ||
| .idle_streak | ||
| .min(state.daemon_state.max_backoff_multiplier.saturating_sub(1)), | ||
| ), | ||
| ); |
Using 2u64.pow(exponent) risks overflow/wrap (and can panic in debug) if idle_streak/max_backoff_multiplier allow exponent >= 64. Prefer u64::saturating_pow, checked_pow with a fallback, or cap exponent to a safe maximum (e.g. 0..=63) before exponentiation.
| let effective_poll_interval_ms = state.daemon_state.poll_interval_ms.saturating_mul( | |
| 2u64.pow( | |
| state | |
| .daemon_state | |
| .idle_streak | |
| .min(state.daemon_state.max_backoff_multiplier.saturating_sub(1)), | |
| ), | |
| ); | |
| let backoff_exponent = state | |
| .daemon_state | |
| .idle_streak | |
| .min(state.daemon_state.max_backoff_multiplier.saturating_sub(1)) | |
| .min(63); | |
| let effective_poll_interval_ms = state | |
| .daemon_state | |
| .poll_interval_ms | |
| .saturating_mul(2u64.pow(backoff_exponent)); |
| "idle_streak": state.daemon_state.idle_streak, | ||
| "max_backoff_multiplier": state.daemon_state.max_backoff_multiplier, |
max_backoff_multiplier is used as an exponent cap (2^exponent), so a value of 8 actually allows up to 128× backoff. Either rename this field to reflect exponent semantics (e.g. max_backoff_exponent) or change the math to cap the multiplier itself (e.g. multiplier = min(2^idle_streak, max_backoff_multiplier)).
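The difference between the two readings of the field can be made concrete with a short sketch. Both function names are hypothetical; the first mirrors the exponent-cap math in the diff, and the second is one possible form of the multiplier-cap alternative suggested here.

```rust
/// Exponent-cap semantics, as in the diff:
/// multiplier = 2^min(idle_streak, max_backoff_multiplier - 1).
fn multiplier_with_exponent_cap(idle_streak: u32, max_backoff_multiplier: u32) -> u64 {
    let exponent = idle_streak.min(max_backoff_multiplier.saturating_sub(1));
    2u64.saturating_pow(exponent)
}

/// Multiplier-cap semantics (assumed alternative):
/// multiplier = min(2^idle_streak, max_backoff_multiplier), never below 1.
fn multiplier_with_value_cap(idle_streak: u32, max_backoff_multiplier: u32) -> u64 {
    2u64.saturating_pow(idle_streak)
        .min(u64::from(max_backoff_multiplier).max(1))
}
```

With `max_backoff_multiplier = 8` and a long idle streak, the first function settles at 2^7 = 128 while the second settles at 8, which is the behavior the field name suggests.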
| effective_poll_interval_ms | ||
| .saturating_sub(elapsed) | ||
| .clamp(25, 1000) | ||
| } |
The remaining-wait clamp to <= 1000ms prevents adaptive backoff from ever sleeping longer than 1s; once the effective interval exceeds 1s, the loop will wake every second and background_tick_if_due() will see the base poll_interval_ms as due, effectively negating the backoff. Consider allowing the remainder up to the effective interval (or the same 10_000ms cap) and ensure the tick-due checks use the effective interval as well.
| let effective_poll_interval_ms = state | ||
| .daemon_state | ||
| .poll_interval_ms | ||
| .saturating_mul(2u64.pow(exponent)) |
Same overflow risk here: 2u64.pow(exponent) can overflow/wrap (and panic in debug) if exponent grows. Use a saturating/checked exponentiation approach or cap exponent before computing the multiplier.
| let effective_poll_interval_ms = state | |
| .daemon_state | |
| .poll_interval_ms | |
| .saturating_mul(2u64.pow(exponent)) | |
| let backoff_multiplier = 1u64.checked_shl(exponent).unwrap_or(u64::MAX); | |
| let effective_poll_interval_ms = state | |
| .daemon_state | |
| .poll_interval_ms | |
| .saturating_mul(backoff_multiplier) |
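The two overflow-safe approaches suggested in this review (capping the exponent at 63 before `pow`, and using `checked_shl` with a fallback) can be compared in a small sketch. The function names are hypothetical; the standard-library calls are `u64::pow`, `u64::checked_shl`, and `u64::saturating_mul`.

```rust
/// Cap the exponent at 63 so that 2^exponent always fits in a u64.
fn multiplier_capped_pow(exponent: u32) -> u64 {
    2u64.pow(exponent.min(63))
}

/// Shift-based variant: checked_shl returns None for shifts of 64 or more,
/// in which case the multiplier saturates to u64::MAX.
fn multiplier_checked_shl(exponent: u32) -> u64 {
    1u64.checked_shl(exponent).unwrap_or(u64::MAX)
}

/// Effective interval with a saturating multiply, as in the suggestions.
fn effective_interval_ms(poll_interval_ms: u64, multiplier: u64) -> u64 {
    poll_interval_ms.saturating_mul(multiplier)
}
```

Both variants agree for exponents 0 through 63 and differ only beyond that (2^63 versus `u64::MAX`). For any poll interval of 2 ms or more, the saturating multiply then yields `u64::MAX` in either case, so the choice is mostly a matter of readability.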
| let now = now_ms(); | ||
| let next_tick_due_ms = if state.daemon_state.active && state.daemon_state.poll_interval_ms > 0 { | ||
| state | ||
| .daemon_state | ||
| .last_tick_ms | ||
| .map(|last| last.saturating_add(state.daemon_state.poll_interval_ms)) | ||
| } else { | ||
| None | ||
| }; | ||
| let overdue_ms = next_tick_due_ms.map(|due| now.saturating_sub(due)); | ||
| let effective_poll_interval_ms = state.daemon_state.poll_interval_ms.saturating_mul( | ||
| 2u64.pow( | ||
| state | ||
| .daemon_state |
next_tick_due_ms/overdue_ms are computed using the base poll_interval_ms, but the scheduler uses an adaptive effective interval (idle backoff + clamping). This will make daemon_status diverge from actual scheduling; compute these based on the same effective interval logic used by the scheduler (and apply the same clamps).
Summary
- effective_poll_interval_ms and idle_streak in daemon_status
- next_tick_due_ms/overdue_ms aligned with the effective scheduler interval

Validation
- effective_poll_interval_ms/idle_streak/next_tick_due_ms/overdue_ms

Why this matters
This makes the daemon scheduler materially smarter before native watcher dependencies arrive. The daemon now backs off when the workspace is quiet, while still exposing exactly when it plans to wake up next.