feat: add Idle agent state — smart idle detection + wake-on-demand#792

Open
AL-ZiLLA wants to merge 1 commit into RightNow-AI:main from AL-ZiLLA:feat/idle-agent-state
Summary

  • Adds Idle variant to AgentState enum — agents with no pending work are marked Idle instead of Crashed when they time out
  • Smart idle detection in the heartbeat monitor checks for pending cron jobs, active background tasks, and non-reactive schedules before deciding whether to crash or idle an unresponsive agent
  • Wake-on-demand: idle agents automatically transition back to Running when a message arrives, a cron job fires, or a manual restart is triggered

The Problem

OpenFang treats "idle" and "crashed" identically. If an agent doesn't heartbeat within the timeout, it's marked Crashed and auto-recovered — even if it has nothing to do. This burns LLM credits on pointless recovery calls for agents that are simply idle between tasks.

The Fix

Before marking an agent as Crashed, the heartbeat checker now asks: does this agent have a reason to be alive right now?

An unresponsive agent is marked Crashed (and auto-recovered) if ANY of the following holds:

  • It has a cron job due within the timeout window
  • It has an active background task loop (Continuous/Periodic schedule)
  • It has a non-Reactive schedule mode

If NONE of those holds, the agent is marked Idle (the new state), NOT Crashed, and no auto-recovery runs.
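The decision above can be sketched as a single pure function. This is a minimal illustration, not the code in `kernel.rs`: `WorkSignals`, `TimeoutVerdict`, and `classify_unresponsive` are hypothetical names, and the real `ScheduleMode` may carry more data than the bare variants shown here.

```rust
/// Schedule modes named in the PR text. Assumption: the real enum in
/// OpenFang may have fields or further variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScheduleMode {
    Reactive,
    Continuous,
    Periodic,
}

/// Hypothetical outcome for an agent that missed its heartbeat deadline.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeoutVerdict {
    /// The agent should be working: mark Crashed and auto-recover.
    Crashed,
    /// Nothing is pending: mark Idle and skip recovery.
    Idle,
}

/// Hypothetical snapshot of the three checks listed above.
#[derive(Debug, Clone, Copy)]
pub struct WorkSignals {
    pub cron_due_within_timeout: bool,
    pub has_background_task: bool,
    pub schedule: ScheduleMode,
}

/// Crashed if ANY signal says the agent has a reason to be alive,
/// otherwise Idle.
pub fn classify_unresponsive(signals: WorkSignals) -> TimeoutVerdict {
    let has_reason_to_be_alive = signals.cron_due_within_timeout
        || signals.has_background_task
        || signals.schedule != ScheduleMode::Reactive;

    if has_reason_to_be_alive {
        TimeoutVerdict::Crashed
    } else {
        TimeoutVerdict::Idle
    }
}
```

Keeping the check free of side effects would make each of the three conditions easy to unit-test in isolation.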

An Idle agent wakes up when:

  • A message is sent to it (API, channel, or inter-agent)
  • A cron job fires for it
  • It is manually restarted via the API
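As a rough sketch of the wake-on-demand transition, assuming a reduced `AgentState` with only three variants (the real enum has more) and a hypothetical `WakeTrigger` type that does not appear in the PR:

```rust
/// Reduced stand-in for the real AgentState enum (assumption: only the
/// variants needed for this illustration are listed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Running,
    Idle,
    Crashed,
}

/// Hypothetical type naming the three wake-up events listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WakeTrigger {
    Message,
    CronFired,
    ManualRestart,
}

/// Returns the state after the trigger is handled. Only the Idle case is
/// modelled: any trigger moves Idle to Running, and every other state is
/// returned unchanged.
pub fn wake_on_demand(state: AgentState, trigger: WakeTrigger) -> AgentState {
    match (state, trigger) {
        (
            AgentState::Idle,
            WakeTrigger::Message | WakeTrigger::CronFired | WakeTrigger::ManualRestart,
        ) => AgentState::Running,
        (other, _) => other,
    }
}
```

The sketch deliberately leaves Crashed untouched. How a manual restart treats a crashed agent is existing behavior that this PR does not describe.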

Files Changed

| File | Change |
| --- | --- |
| openfang-types/src/agent.rs | Added Idle variant to AgentState enum |
| openfang-kernel/src/heartbeat.rs | Include Idle agents in the heartbeat scan, but never flag them as unresponsive |
| openfang-kernel/src/cron.rs | Added has_due_jobs_soon() method |
| openfang-kernel/src/background.rs | Added has_task() method |
| openfang-kernel/src/kernel.rs | Smart idle detection + wake-on-demand in message send, streaming, and cron dispatch |
| openfang-cli/src/tui/theme.rs | [IDL] badge for idle state |
| openfang-cli/src/tui/screens/comms.rs | Yellow color for idle state |
| openfang-api/static/js/pages/comms.js | Warning badge class for idle state |
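For reviewers who have not opened `cron.rs`, a check like `has_due_jobs_soon()` might look roughly like the sketch below. The signature, the `CronJob` struct, and its fields are assumptions made for illustration. Only the method name comes from this PR.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified cron entry: which agent it targets and when it
/// fires next.
pub struct CronJob {
    pub agent_id: u64,
    pub next_fire: Instant,
}

/// True if `agent_id` has at least one job firing no later than
/// `now + window`. Overdue jobs (next_fire before `now`) also count as due.
pub fn has_due_jobs_soon(
    jobs: &[CronJob],
    agent_id: u64,
    now: Instant,
    window: Duration,
) -> bool {
    let deadline = now + window;
    jobs.iter()
        .any(|job| job.agent_id == agent_id && job.next_fire <= deadline)
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic under test.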

Test plan

  • cargo build --workspace --lib — compiles clean
  • cargo test --workspace — all tests pass (2200+, 0 failures)
  • cargo clippy -p openfang-types -p openfang-kernel -p openfang-api --all-targets -- -D warnings — zero warnings
  • Verify an agent with no crons/messages goes Idle (not Crashed) after timeout
  • Verify an agent with an active cron stays Running and recovers if it crashes
  • Verify sending a message to an Idle agent wakes it up
  • Verify a cron firing on an Idle agent wakes it up
  • Verify the dashboard/API shows the Idle state correctly

Backward compatible

  • Existing recovery behavior unchanged for agents with active work
  • No config changes required
  • serde(rename_all = "snake_case") ensures Idle serializes as "idle" in JSON
  • Existing match arms with _ => ... catch-alls absorb the new variant without code changes
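To illustrate the catch-all point, here is a small sketch with a reduced `AgentState` and a hypothetical `needs_recovery` helper (neither is copied from the codebase). The `match` was written as if only Running and Crashed existed, yet it still compiles and gives a sensible answer once Idle is added:

```rust
/// Reduced stand-in for the real AgentState enum (assumption: the real one
/// has more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Running,
    Crashed,
    Idle, // the newly added variant
}

/// Hypothetical helper written before Idle existed. The `_` arm absorbs
/// Idle, so an idle agent is not treated as needing recovery.
pub fn needs_recovery(state: AgentState) -> bool {
    match state {
        AgentState::Crashed => true,
        _ => false,
    }
}
```

A `match` that lists every variant explicitly and has no catch-all would instead fail to compile until an Idle arm is added, which is standard Rust exhaustiveness checking.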

🤖 Generated with Claude Code

Idle agents (no pending crons, active schedules, or background tasks) are
now marked Idle instead of Crashed when they time out. This eliminates
unnecessary auto-recovery cycles that burn LLM credits on agents with no
work to do.

Smart detection: before marking an unresponsive agent as Crashed, the
heartbeat monitor checks for pending cron jobs, active background tasks,
and non-reactive schedules. Only agents with real work are recovered.

Wake-on-demand: idle agents automatically transition back to Running when
a message arrives (API, channel, or inter-agent), a cron job fires, or a
manual restart is triggered.

UI: TUI shows [IDL] badge (yellow), dashboard shows warning badge for
idle agents.

Backward compatible — agents with active work still crash and recover
normally. No config changes required.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>