Add adaptive backoff to the daemon scheduler#68

Open
maxkle1nz wants to merge 2 commits into
mainfrom
codex/m1nd-daemon-scheduler-v1

@maxkle1nz
Owner

Summary

  • add idle-streak tracking and adaptive poll backoff to the daemon scheduler
  • expose effective_poll_interval_ms and idle_streak in daemon_status
  • keep next_tick_due_ms / overdue_ms aligned with the effective scheduler interval

Validation

  • cargo fmt --check
  • cargo check -p m1nd-mcp -p m1nd-ingest
  • cargo test -p m1nd-mcp daemon_ -- --nocapture
  • cargo clippy -p m1nd-mcp -p m1nd-ingest -- -D warnings
  • MCP smoke for effective_poll_interval_ms / idle_streak / next_tick_due_ms / overdue_ms

Why this matters

This lets the daemon scheduler adapt to workspace activity before native watcher dependencies arrive. The daemon now backs off when the workspace is quiet, while still exposing exactly when it plans to wake up next.

max kle1nz added 2 commits April 5, 2026 22:27
The daemon scheduler now reports when the next tick is due and whether it is
currently overdue, and the server loop uses remaining-time scheduling instead
of sleeping the full poll interval every time.

Constraint: Improve scheduler visibility and efficiency without introducing new watcher dependencies
Rejected: Waiting for notify/watchman before adding scheduler timing signals | left the current polling scheduler opaque and less efficient than necessary
Rejected: A new scheduler tool | the timing state belongs in daemon_status and the existing server loop
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep next-tick timing derived from persisted daemon state so future watcher backends can share the same status contract
Tested: cargo fmt --check; cargo check -p m1nd-mcp -p m1nd-ingest; cargo test -p m1nd-mcp daemon_ -- --nocapture; cargo clippy -p m1nd-mcp -p m1nd-ingest -- -D warnings; MCP smoke for next_tick_due_ms / overdue_ms
Not-tested: Interaction with future native watcher backends and long-running background scheduler drift over many hours

The daemon scheduler now tracks idle streaks, scales its effective poll interval
under quiet conditions, and reports that effective interval through daemon_status.
This reduces wasted polling while keeping the daemon fully observable through the
existing status contract.

Constraint: Improve daemon efficiency without adding watcher dependencies or changing the existing daemon tool surface
Rejected: Fixed-interval polling until native watchers land | wastes cycles and hides useful scheduler state during the interim phase
Rejected: A separate scheduler configuration tool | unnecessary surface growth before watcher backends exist
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep scheduler backoff state derived from daemon runtime state so native watcher backends can eventually bypass polling without breaking status semantics
Tested: cargo fmt --check; cargo check -p m1nd-mcp -p m1nd-ingest; cargo test -p m1nd-mcp daemon_ -- --nocapture; cargo clippy -p m1nd-mcp -p m1nd-ingest -- -D warnings; MCP smoke for effective_poll_interval_ms / idle_streak / next_tick_due_ms / overdue_ms
Not-tested: Very long idle periods across multiple days and interaction with future watcher-native backends
Copilot AI review requested due to automatic review settings April 5, 2026 20:41
@maxkle1nz maxkle1nz enabled auto-merge (squash) April 5, 2026 20:42

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28d7059167

Comment thread m1nd-mcp/src/session.rs
Comment on lines +175 to +176
pub idle_streak: u32,
pub max_backoff_multiplier: u32,

P1 Badge Add defaults for new daemon-state fields

The new idle_streak and max_backoff_multiplier fields are required during deserialization, so pre-existing daemon_state.json files (written before this commit) will fail to parse in load_daemon_state and silently fall back to DaemonRuntimeState::default(). On upgrade, that drops persisted daemon runtime state (including active status, watch paths, tracked files, and counters), which breaks cross-session daemon persistence.

state
    .daemon_state
    .last_tick_ms
    .map(|last| last.saturating_add(state.daemon_state.poll_interval_ms))

P2 Badge Base daemon_status due time on effective interval

next_tick_due_ms is computed with last_tick_ms + poll_interval_ms, but the scheduler now waits on a backoff-expanded interval. When idle_streak > 0, daemon_status reports an earlier due time and premature overdue_ms, so monitoring/automation that trusts these fields will see false overdue conditions.
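A minimal sketch of the suggested fix, assuming the effective interval has already been computed by the same backoff logic the scheduler uses. The free functions and their parameters below are hypothetical stand-ins for the fields on the daemon state, not the PR's actual code.

```rust
/// Hypothetical helper: due time derived from the *effective* interval,
/// so status output matches what the scheduler actually waits for.
fn next_tick_due_ms(
    active: bool,
    last_tick_ms: Option<u64>,
    effective_interval_ms: u64,
) -> Option<u64> {
    if active && effective_interval_ms > 0 {
        last_tick_ms.map(|last| last.saturating_add(effective_interval_ms))
    } else {
        None
    }
}

/// Hypothetical helper: how far past the due time `now_ms` is (0 if not yet due).
fn overdue_ms(due_ms: Option<u64>, now_ms: u64) -> Option<u64> {
    due_ms.map(|due| now_ms.saturating_sub(due))
}
```

With a 200 ms base interval and three idle ticks (an effective interval of 1600 ms), a status query 500 ms after the last tick would report 300 ms overdue if the base interval were used, but 0 ms overdue with the effective interval.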

Comment thread m1nd-mcp/src/server.rs
} else {
    effective_poll_interval_ms
        .saturating_sub(elapsed)
        .clamp(25, 1000)

P2 Badge Remove 1s cap from adaptive daemon wait

Capping the remaining wait with .clamp(25, 1000) truncates backoff intervals above 1 second. After a few idle ticks (for example, 200ms base with idle_streak >= 3), the loop wakes every second and background_tick_if_due can tick immediately because the base poll interval has elapsed, so adaptive backoff never reaches its intended multi-second range.
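The truncation can be illustrated with a small sketch. `wait_capped_ms` mirrors the quoted expression; `wait_ms` is a hypothetical alternative that takes the upper bound as a parameter. The function names and the `max_wait_ms` parameter are assumptions made for illustration.

```rust
/// Remaining wait as in the quoted snippet: never longer than 1000 ms.
fn wait_capped_ms(effective_interval_ms: u64, elapsed_ms: u64) -> u64 {
    effective_interval_ms
        .saturating_sub(elapsed_ms)
        .clamp(25, 1000)
}

/// Hypothetical alternative: the caller chooses the upper bound.
/// `max_wait_ms.max(25)` keeps the clamp range valid (clamp panics if min > max).
fn wait_ms(effective_interval_ms: u64, elapsed_ms: u64, max_wait_ms: u64) -> u64 {
    effective_interval_ms
        .saturating_sub(elapsed_ms)
        .clamp(25, max_wait_ms.max(25))
}
```

For a 1600 ms effective interval with nothing elapsed, the capped version returns 1000 while the parameterized version with a 10 000 ms bound returns the full 1600.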

Copilot AI left a comment

Pull request overview

This PR adds idle-streak tracking and an adaptive (exponential) backoff mechanism to the daemon scheduler, and exposes additional scheduling telemetry (effective_poll_interval_ms, idle_streak, plus next/overdue tick timing) via daemon_status.

Changes:

  • Extend DaemonRuntimeState with idle_streak and max_backoff_multiplier for adaptive scheduling.
  • Add daemon_wait_duration_ms() and use it to drive the server loop’s recv_timeout wait duration.
  • Enhance daemon_status output and update tick logic to increment/reset idle_streak based on whether work happened.
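The increment/reset behaviour described in the last bullet might look roughly like the sketch below. `BackoffState`, `record_tick`, and the `did_work` flag are hypothetical names; the PR stores `idle_streak` on `DaemonRuntimeState`, and what counts as "work" is an assumption here.

```rust
/// Hypothetical, trimmed-down stand-in for the idle-streak part of the daemon state.
#[derive(Debug, Default)]
struct BackoffState {
    idle_streak: u32,
}

impl BackoffState {
    /// Record the outcome of one tick: a tick that did work resets the streak,
    /// an idle tick extends it (saturating, so it can never overflow).
    fn record_tick(&mut self, did_work: bool) {
        if did_work {
            self.idle_streak = 0;
        } else {
            self.idle_streak = self.idle_streak.saturating_add(1);
        }
    }
}
```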

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| m1nd-mcp/src/session.rs | Adds persisted daemon runtime fields needed for backoff state tracking. |
| m1nd-mcp/src/server.rs | Introduces a helper to compute adaptive wait duration and applies it to the main server loop; adds tests. |
| m1nd-mcp/src/daemon_handlers.rs | Initializes/updates idle streak, and exposes scheduling/backoff fields in daemon_status. |

Comment thread m1nd-mcp/src/session.rs
pub last_tick_changed_files: usize,
pub last_tick_deleted_files: usize,
pub last_tick_alerts_emitted: usize,
pub idle_streak: u32,
Copilot AI Apr 5, 2026

DaemonRuntimeState is persisted to disk; adding non-optional fields without #[serde(default)] will make older daemon_state.json files fail to deserialize, causing load_daemon_state() to fall back to a full default state (potentially disabling an active daemon and losing counters). Add #[serde(default)] (or #[serde(default = "...")]) for the new fields to keep backward compatibility.

Suggested change
- pub idle_streak: u32,
+ pub idle_streak: u32,
+ #[serde(default)]

Comment on lines +203 to +210
let effective_poll_interval_ms = state.daemon_state.poll_interval_ms.saturating_mul(
    2u64.pow(
        state
            .daemon_state
            .idle_streak
            .min(state.daemon_state.max_backoff_multiplier.saturating_sub(1)),
    ),
);
Copilot AI Apr 5, 2026

Using 2u64.pow(exponent) risks overflow/wrap (and can panic in debug) if idle_streak/max_backoff_multiplier allow exponent >= 64. Prefer u64::saturating_pow, checked_pow with a fallback, or cap exponent to a safe maximum (e.g. 0..=63) before exponentiation.

Suggested change
- let effective_poll_interval_ms = state.daemon_state.poll_interval_ms.saturating_mul(
-     2u64.pow(
-         state
-             .daemon_state
-             .idle_streak
-             .min(state.daemon_state.max_backoff_multiplier.saturating_sub(1)),
-     ),
- );
+ let backoff_exponent = state
+     .daemon_state
+     .idle_streak
+     .min(state.daemon_state.max_backoff_multiplier.saturating_sub(1))
+     .min(63);
+ let effective_poll_interval_ms = state
+     .daemon_state
+     .poll_interval_ms
+     .saturating_mul(2u64.pow(backoff_exponent));
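As a rough illustration of the overflow-safe alternatives mentioned in this comment, the sketch below uses the standard library's `saturating_pow` instead of capping the exponent at 63. The function name and its parameters are hypothetical; only the integer methods are real standard-library APIs.

```rust
/// Hypothetical helper: base interval scaled by 2^exponent, where the exponent
/// is the idle streak limited by `max_exponent`. Both the power and the
/// multiplication saturate at u64::MAX instead of panicking or wrapping.
fn effective_interval_ms(base_ms: u64, idle_streak: u32, max_exponent: u32) -> u64 {
    let exponent = idle_streak.min(max_exponent);
    let multiplier = 2u64.saturating_pow(exponent);
    base_ms.saturating_mul(multiplier)
}

/// The checked variants report overflow as `None` so the caller can pick a fallback.
fn power_of_two_checked(exponent: u32) -> Option<u64> {
    2u64.checked_pow(exponent)
}
```

Both approaches avoid the debug-build panic of plain `pow`; the saturating form needs no fallback branch, while the checked form makes the overflow case explicit.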

Comment on lines +227 to +228
"idle_streak": state.daemon_state.idle_streak,
"max_backoff_multiplier": state.daemon_state.max_backoff_multiplier,
Copilot AI Apr 5, 2026

max_backoff_multiplier is used as an exponent cap (2^exponent), so a value of 8 actually allows up to 128× backoff. Either rename this field to reflect exponent semantics (e.g. max_backoff_exponent) or change the math to cap the multiplier itself (e.g. multiplier = min(2^idle_streak, max_backoff_multiplier)).
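The two readings can be compared side by side. This is a sketch with hypothetical function names: the first follows the exponent-cap expression quoted in the review, the second caps the multiplier itself as the comment proposes. The `.max(1)` guard in the second function is an added assumption so that a configured cap of 0 cannot produce a zero interval.

```rust
/// Exponent-cap reading: exponent = min(idle_streak, max_backoff_multiplier - 1).
fn multiplier_exponent_capped(idle_streak: u32, max_backoff_multiplier: u32) -> u64 {
    let exponent = idle_streak.min(max_backoff_multiplier.saturating_sub(1));
    2u64.saturating_pow(exponent)
}

/// Multiplier-cap reading: multiplier = min(2^idle_streak, max_backoff_multiplier).
fn multiplier_capped(idle_streak: u32, max_backoff_multiplier: u32) -> u64 {
    let cap = u64::from(max_backoff_multiplier).max(1);
    2u64.saturating_pow(idle_streak).min(cap)
}
```

With `max_backoff_multiplier = 8` and a long idle streak, the first function yields 128 and the second yields 8, which is the discrepancy the comment points out.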

Comment thread m1nd-mcp/src/server.rs
Comment on lines +2189 to +2192
    effective_poll_interval_ms
        .saturating_sub(elapsed)
        .clamp(25, 1000)
}
Copilot AI Apr 5, 2026

The remaining-wait clamp to <= 1000ms prevents adaptive backoff from ever sleeping longer than 1s; once the effective interval exceeds 1s, the loop will wake every second and background_tick_if_due() will see the base poll_interval_ms as due, effectively negating the backoff. Consider allowing the remainder up to the effective interval (or the same 10_000ms cap) and ensure the tick-due checks use the effective interval as well.

Comment thread m1nd-mcp/src/server.rs
Comment on lines +2177 to +2180
let effective_poll_interval_ms = state
    .daemon_state
    .poll_interval_ms
    .saturating_mul(2u64.pow(exponent))
Copilot AI Apr 5, 2026

Same overflow risk here: 2u64.pow(exponent) can overflow/wrap (and panic in debug) if exponent grows. Use a saturating/checked exponentiation approach or cap exponent before computing the multiplier.

Suggested change
- let effective_poll_interval_ms = state
-     .daemon_state
-     .poll_interval_ms
-     .saturating_mul(2u64.pow(exponent))
+ let backoff_multiplier = 1u64.checked_shl(exponent).unwrap_or(u64::MAX);
+ let effective_poll_interval_ms = state
+     .daemon_state
+     .poll_interval_ms
+     .saturating_mul(backoff_multiplier)

Comment on lines +193 to +206
let now = now_ms();
let next_tick_due_ms = if state.daemon_state.active && state.daemon_state.poll_interval_ms > 0 {
    state
        .daemon_state
        .last_tick_ms
        .map(|last| last.saturating_add(state.daemon_state.poll_interval_ms))
} else {
    None
};
let overdue_ms = next_tick_due_ms.map(|due| now.saturating_sub(due));
let effective_poll_interval_ms = state.daemon_state.poll_interval_ms.saturating_mul(
    2u64.pow(
        state
            .daemon_state
Copilot AI Apr 5, 2026

next_tick_due_ms/overdue_ms are computed using the base poll_interval_ms, but the scheduler uses an adaptive effective interval (idle backoff + clamping). This will make daemon_status diverge from actual scheduling; compute these based on the same effective interval logic used by the scheduler (and apply the same clamps).
