Skip to content

Prefer merge-base baselines for Git-aware daemon startup#76

Open
maxkle1nz wants to merge 3 commits into
mainfrom
codex/m1nd-daemon-merge-base-v1
Open

Prefer merge-base baselines for Git-aware daemon startup#76
maxkle1nz wants to merge 3 commits into
mainfrom
codex/m1nd-daemon-merge-base-v1

Conversation

@maxkle1nz
Copy link
Copy Markdown
Owner

Summary

  • prefer a merge-base baseline against the configured upstream when a watched Git root has upstream tracking
  • fall back to HEAD when no upstream or merge-base is available
  • keep the baseline decision adapter-local while preserving the current daemon tool surface

Validation

  • cargo fmt --check
  • cargo test -p m1nd-mcp daemon_start_detects_git_root_and_head -- --nocapture
  • cargo test -p m1nd-mcp daemon_start_prefers_merge_base_when_upstream_exists -- --nocapture
  • MCP smoke for git_baseline_kind = merge_base

Why this matters

This improves the daemon's Git-aware changed-set quality for feature branches. The daemon now starts from a more meaningful baseline than HEAD when upstream context exists, which is the next step toward richer SCM-aware semantics.

max kle1nz added 3 commits April 5, 2026 23:42
This adds a Git-aware adapter layer to the daemon. When a watched root is inside a Git worktree and native watching is available, the daemon upgrades to  and uses Git changed sets to decide what to reconcile, while still falling back to filesystem scanning when Git lookup fails.

Constraint: Keep daemon_tick as the single reconciliation path and keep Git awareness advisory instead of mandatory
Rejected: A separate public SCM tool | unnecessary surface area because daemon_status already reports backend state and errors
Rejected: Replacing native watch wakeups with Git polling alone | loses the benefit of low-latency filesystem wakeups
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep SCM semantics in the adapter layer so future Watchman/Sapling integration can plug in without rewriting daemon core logic
Tested: cargo fmt --check; cargo test -p m1nd-mcp daemon_start_detects_git_root_and_head -- --nocapture; cargo test -p m1nd-mcp daemon_tick_uses_git_changed_set_when_available -- --nocapture; MCP smoke for git_native_fs success
Not-tested: Non-Git SCMs and long-running Git root changes during daemon lifetime
The daemon now detects in-progress Git operations such as merge, rebase, cherry-pick, and index-lock churn and defers Git-aware reconciliation instead of reacting mid-operation. This keeps the daemon from treating transient working-tree churn as meaningful drift while preserving the existing tick/status contract.

Constraint: Preserve Git-aware reconciliation while avoiding false work during unstable repository transitions
Rejected: Disabling Git-aware mode entirely during SCM operations | loses useful backend state and observability just when the daemon should explain itself
Rejected: Forcing raw filesystem fallback during active merge/rebase churn | still risks noisy, misleading reconciliation during unstable states
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep Git operation detection advisory and status-visible so future Watchman/Sapling adapters can reuse the same defer contract
Tested: cargo fmt --check; cargo test -p m1nd-mcp daemon_start_detects_git_root_and_head -- --nocapture; cargo test -p m1nd-mcp daemon_tick_uses_git_changed_set_when_available -- --nocapture; cargo test -p m1nd-mcp daemon_tick_defers_when_git_operation_is_in_progress -- --nocapture; MCP smoke for deferred merge state
Not-tested: Long-running rebases/checkouts under concurrent watcher traffic
The Git-aware daemon adapter now prefers a merge-base baseline against the configured upstream when available, falling back to HEAD otherwise. This makes the initial changed-set cursor upstream-aware without changing the daemon's public MCP surface.

Constraint: Improve SCM-aware changed-set quality without introducing a full Watchman protocol or altering daemon_tick authority
Rejected: Always baseline from HEAD | loses useful repo semantics for feature branches tracking an upstream branch
Rejected: Introducing a separate baseline-selection tool | unnecessary public surface for adapter-internal policy
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep baseline selection adapter-local until richer SCM backends share the same contract
Tested: cargo fmt --check; cargo test -p m1nd-mcp daemon_start_detects_git_root_and_head -- --nocapture; cargo test -p m1nd-mcp daemon_start_prefers_merge_base_when_upstream_exists -- --nocapture; MCP smoke for merge_base startup state
Not-tested: Non-Git SCMs and dynamic upstream changes during long-running daemon sessions
Copilot AI review requested due to automatic review settings April 5, 2026 22:04
@maxkle1nz maxkle1nz enabled auto-merge (squash) April 5, 2026 22:04
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f88a548e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

) -> Result<Vec<PathBuf>, String> {
let mut changed = Vec::new();
let diff_args: Vec<&str> = if let Some(reference) = since_ref {
vec!["diff", "--name-only", reference, "--"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Include untracked files in Git backend change detection

When since_ref is set, git_changed_absolute_paths uses git diff --name-only <ref> --, which does not report untracked files. I verified CLI semantics locally: git diff -h documents the one-commit form (git diff [<options>] [<commit>] [--] [<path>...]), and in a repro repo git diff --name-only HEAD -- returned nothing while git status --porcelain showed ?? new.txt. In this code path, new files in a normal (already-committed) repo never enter changed_entries, so they are not ingested until manually added to Git.

Useful? React with 👍 / 👎.

.find(|entry| entry.file_path == path_str)
.cloned()
{
changed_entries.push(entry);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid reingesting unchanged dirty files on every tick

The Git backend pushes every path returned by git diff directly into changed_entries without comparing against tracked_files. Because git_since_ref is then advanced to HEAD, any file left modified/staged (but not committed) keeps appearing in git diff HEAD on subsequent ticks, so the daemon repeatedly reingests the same file and re-emits proactive alerts. This can create continuous churn and alert noise while a developer is editing but has not committed yet.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates daemon startup and tick behavior to use a Git-aware baseline (prefer merge-base vs upstream when available) and to surface Git-related runtime state, improving changed-set detection for feature branches.

Changes:

  • Persist and expose additional Git-related daemon runtime fields (git root, baseline ref/kind, last scan stats, operation-in-progress state).
  • On daemon start, detect a Git root from watch paths and select an initial baseline (merge-base when upstream exists; otherwise HEAD).
  • On daemon ticks, optionally use git diff/status to drive the changed file set and defer ticks during active Git operations.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
m1nd-mcp/src/session.rs Extends persisted DaemonRuntimeState with Git baseline/scan fields.
m1nd-mcp/src/server.rs Adjusts reported watcher backend string when Git mode is active.
m1nd-mcp/src/daemon_handlers.rs Adds Git root/baseline detection, Git-based changed-set logic, deferral during Git operations, and tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread m1nd-mcp/src/session.rs
Comment on lines +188 to +197
pub git_root: Option<String>,
pub git_baseline_ref: Option<String>,
pub git_baseline_kind: Option<String>,
pub git_since_ref: Option<String>,
pub last_git_scan_ms: Option<u64>,
pub last_git_changed_files: usize,
pub git_backend_error: Option<String>,
pub git_operation_in_progress: bool,
pub git_operation_kind: Option<String>,
pub deferred_ticks: u64,
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DaemonRuntimeState is persisted to daemon_state.json; adding new non-optional fields without #[serde(default)] (on the struct or per-field) will cause deserializing older state files to fail, and load_daemon_state will then silently fall back to Default, effectively wiping prior daemon state on upgrade. Consider adding #[serde(default)] to DaemonRuntimeState (or each newly added field) to keep backwards compatibility while still allowing new fields to be introduced safely.

Copilot uses AI. Check for mistakes.
Comment on lines +149 to +153
let output = Command::new("git")
.args(["rev-parse", "--show-toplevel"])
.current_dir(&root_hint)
.output()
.ok()?;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git_root_for_watch_paths uses .output().ok()? inside the loop; if spawning git fails for the first watch path (e.g., git not on PATH / permission issue), the ? returns early and the function never checks the remaining watch paths. Handle the error per-path (e.g., if let Ok(output) = ...) and continue so other candidate roots can still be detected.

Suggested change
let output = Command::new("git")
.args(["rev-parse", "--show-toplevel"])
.current_dir(&root_hint)
.output()
.ok()?;
let output = if let Ok(output) = Command::new("git")
.args(["rev-parse", "--show-toplevel"])
.current_dir(&root_hint)
.output()
{
output
} else {
continue;
};

Copilot uses AI. Check for mistakes.
Comment on lines +324 to +343
state.daemon_state.git_root = git_root_for_watch_paths(&state.daemon_state.watch_paths)
.map(|root| root.to_string_lossy().to_string());
let (git_baseline_ref, git_baseline_kind, _git_upstream_ref) = state
.daemon_state
.git_root
.as_deref()
.map(|root| git_initial_baseline(Path::new(root)))
.unwrap_or((None, None, None));
state.daemon_state.git_baseline_ref = git_baseline_ref.clone();
state.daemon_state.git_baseline_kind = git_baseline_kind;
state.daemon_state.git_since_ref = git_baseline_ref;
state.daemon_state.last_git_scan_ms = None;
state.daemon_state.last_git_changed_files = 0;
state.daemon_state.git_backend_error = None;
state.daemon_state.git_operation_in_progress = false;
state.daemon_state.git_operation_kind = None;
state.daemon_state.deferred_ticks = 0;
if state.daemon_state.git_root.is_some() {
state.daemon_state.watch_backend = "git_native_fs".into();
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching to watch_backend == "git_native_fs" makes daemon_tick rely on a single detected git_root for change detection. If watch_paths contains multiple roots (or paths outside that git repo), changes in non-git roots will be ignored because only git diff/status results under git_root are considered. Consider only enabling the git changed-set path when all watch_paths are under the same git_root (or compute per-root git baselines), otherwise fall back to the existing inventory comparison for correctness.

Copilot uses AI. Check for mistakes.
Comment on lines +501 to +508
for path in paths {
let path_str = path.to_string_lossy().to_string();
if let Some(entry) = live_inventory
.values()
.find(|entry| entry.file_path == path_str)
.cloned()
{
changed_entries.push(entry);
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the git-native path, each changed file path is matched by scanning live_inventory.values().find(...), which is O(changed_files * inventory_size) and can get expensive for large repos. Consider building a temporary HashMap<file_path, FileInventoryEntry> (or HashMap<&str, &FileInventoryEntry>) once per tick so path lookups are O(1).

Suggested change
for path in paths {
let path_str = path.to_string_lossy().to_string();
if let Some(entry) = live_inventory
.values()
.find(|entry| entry.file_path == path_str)
.cloned()
{
changed_entries.push(entry);
let live_inventory_by_path: HashMap<&str, &FileInventoryEntry> = live_inventory
.values()
.map(|entry| (entry.file_path.as_str(), entry))
.collect();
for path in paths {
let path_str = path.to_string_lossy().to_string();
if let Some(entry) = live_inventory_by_path.get(path_str.as_str()) {
changed_entries.push((*entry).clone());

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants