feat(tasks): clean-sheet rebuild spec for the async tasks plugin by youssefhany-ixo · Pull Request #208 · ixoworld/qiforge

youssefhany-ixo · 2026-06-10T05:18:27Z

Adds specs/tasks-async-system.md as the active design for the bundled tasks
plugin. Supersedes ORA-165 (six task types, Y.Doc pages, 700-line sub-agent
prompt) and the TASK-31 port-as-is plan. Both prior docs gain a SUPERSEDED
header pointing at the new spec.

The new design collapses tasks to one primitive (TaskSpec as YAML+markdown),
puts all storage behind a TaskSpecStore port (Redis today, filesystem when
the runtime grows the API), keeps the approval gate entirely in the plugin
(replacing the deleted tryHandleApprovalResponse plumbing), and ships
preview-before-save plus conditional dedicated rooms in MVP.

Adds specs/tasks-async-system.md as the active design for the bundled tasks plugin. Supersedes ORA-165 (six task types, Y.Doc pages, 700-line sub-agent prompt) and the TASK-31 port-as-is plan. Both prior docs gain a SUPERSEDED header pointing at the new spec. The new design collapses tasks to one primitive (TaskSpec as YAML+markdown), puts all storage behind a TaskSpecStore port (Redis today, filesystem when the runtime grows the API), keeps the approval gate entirely in the plugin (replacing the deleted tryHandleApprovalResponse plumbing), and ships preview-before-save plus conditional dedicated rooms in MVP.

Amends specs/tasks-async-system.md with three simplifications driven by review: - Storage port reframed as a 4-method filesystem (TaskFs: read/write/delete/ list) with paths under /users/<did>/tasks/<id>/. Redis adapter today; a UcanFsAdapter (same auth model as sandbox) when the runtime grows the API. Ephemeral state (run locks, pending approvals, BullMQ) stays on direct Redis and never goes through the port. - The worker no longer rebuilds RuntimeContext. It calls MessagesService.invokeForAutomation — a single thin entry point on the runtime — which reuses every existing agent path (auth, credits middleware, capability gating, checkpointer). BackgroundContextBuilder and the attendant DI of UCAN/Secrets/Sessions/Matrix/AMBIENT are deleted. - Webhook + chain.after triggers deferred out of this rebuild to a follow-up. MVP ships time.once + time.cron only. Phase 1 effort drops from 6d to 5d; total from 8.5d to ~7d.

…istory Three MVP cuts driven by review: - No REST surface. Tasks are agent-managed; the 6 HTTP endpoints (list/get/ patch/pause/resume/cancel) and the controllers folder are gone. Re-add later only when a non-agent consumer wants it. (FOLLOWUP-5.) - Per-run cost/token meter pushed to "good to have". The credits middleware still deducts, but surfacing the numbers from MessagesService.invokeFor- Automation hasn't been verified, so the preview output schema and the AutomationResult shape drop the tokens/costUsd fields for now. (FOLLOWUP-2.) - Per-task monthly budget enforcement pushed (depends on the cost meter). budget.monthlyUsd removed from the TaskSpec frontmatter and example. (FOLLOWUP-4.) - Run history saving pushed. No runs.jsonl on disk in MVP. Matrix room messages are the natural run log; get_task returns a room link plus the last-error block. RunSummary type removed from folder layout. (FOLLOWUP-3.) Self-healing still works in MVP via a Redis failure counter (tasks:failures:<id> hash with count + lastError + lastFailedAt) — reset on success, drives the failed-pending-review threshold and feeds suggest_spec_fix. Sections 14 (REST) and 17 (Cost accounting) deleted; remaining sections renumbered (15→14, 16→15, 18→16, 19→17, 20→18, 21→19, 22→20). Phase 1 effort drops from 5d to 4d; total from ~7d to ~5.5d.

Replaces the BUNDLED_PLUGINS stub with a real TasksPlugin that ships: - TaskSpec — YAML frontmatter + markdown body; round-trips via gray-matter, validated by Zod. Stored at /users/<did>/tasks/<id>/spec.md. - TaskFs port — 4 methods (read / write / delete / list). RedisTaskFs is the only implementation today; UcanFsAdapter (FOLLOWUP-1) is the planned swap once the runtime grows a per-user UCAN filesystem. - EphemeralStateService — direct ioredis for run locks, failure counters (count + lastError + lastFailedAt), preview tokens, and pending approval payloads. Never goes through TaskFs because none of it is user-readable. - SchedulerService + 2 BullMQ queues — `task_run` (5 concurrency, 3 retries exponential) and `task_approval` (10 concurrency, fixed backoff). Cron next-fire computed via cron-parser. - TaskRunWorker — loads spec, calls MessagesService.sendMessage with msgFromMatrixRoom=true, hands result to delivery or approval gate. No parallel buildRuntimeContext, no createMainAgent call. - ApprovalService — owns request / approve / reject / handleTimeout, exposes APPROVAL_GATE_PORT for cross-plugin reuse. - ApprovalGateMiddleware — pre-LLM intercept with fast keyword path → cheap LLM fallback. Re-enqueues the resolution as a BullMQ job so the worker owns the post-message work. Replaces the deleted tryHandleApproval- Response plumbing with zero edits to MessagesService. - IntentClassifier — fast path covers yes/no/ok/cancel/approve/reject and short-prefix variants; slow path is utility-tier LLM with a 1-word return. - RoomResolver + DedicatedRoomService — heuristic decides whether a task gets its own [Task] <title> room (sub-day cron, webhook/chain, "monitor" intents, or explicit). Spec.md posted as opening message. - AutomationInvoker — single call into MessagesService; resolves the user's main room via MatrixManager.getOracleRoomIdWithHomeServer. - 9 main-agent tools — preview_task (with required previewToken handshake), create_task, list_my_tasks, get_task, update_task, pause/resume/cancel _task, suggest_spec_fix. - 3 starter templates — morning-brief, url-monitor, weekly-report. - 20 unit tests covering manifest, auto-detect, configSchema, tool set, spec round-trip, schema validation, paths, hashing, trigger summary, intent classifier fast path, and room resolver truth table. Deps added to oracle-runtime: cron-parser, gray-matter. `apps/app/src/tasks/` is NOT removed in this change — that's coupled to the broader TASK-32 apps/app deletion. The new plugin lives independently.

… plumbing Full review of the first implementation found a fatal bug plus systemic over-engineering, so this is a clean rewrite (24 internal files → 14). The fatal bug: the worker invoked the agent with a synthetic sessionId, but RequestPreparer.prepare() throws NotFoundException for unknown sessions — every task run would have failed, pushing every task to failed-pending-review. Fixed properly: each task now owns ONE real session (SessionManagerService.createSession, anchored in the task's dedicated room when there is one), stored as `sessionId` in the spec frontmatter. Every run continues that LangGraph thread, which also gives tasks memory across runs ("compare to what you reported yesterday") via the checkpointer, for free. No new MessagesService surface needed. Other v1 problems fixed by the rebuild: - Dedicated rooms never worked: matrix-bot-sdk's createRoom returns the room id string, and v1 cast the result to `{room_id?}` — always undefined. (The cast hid it; removed.) - Failure counting was per BullMQ attempt, so one flaky run (3 retries) hit the 3-failure threshold instantly. Now @OnWorkerEvent('failed') records one failure per exhausted run and reschedules the next cron occurrence so transient failures don't kill the task. - The middleware created its own Redis client + Queue via a module-level side-channel map, and resolution took 3 hops (middleware → BullMQ "resolve" job → worker → service). Now: the Nest module hands its wired service bundle back to the plugin instance via onReady; the middleware shares those services and resolves approvals in one call. - The "cheap LLM classifier" slow path was dead code — never wired. The design is now honest: keyword fast path resolves yes/no without any model call (wrapModelCall returns the ack AIMessage directly); nuanced replies go to the main agent with a per-call system-prompt hint and a new resolve_pending_approval tool. The agent IS the classifier. - Dead config and decorative fields removed: TASKS_DEFAULT_TIMEZONE (never read), modelTier (sendMessage has no model override — storing it would lie), delivery.format (the body's "How to report" section is the contract), draft status (nothing ever was one), preview expiresAt (redundant with Redis TTL). TASKS_MAX_PER_USER is now actually enforced in create_task. - Redis key names, spec parse/render, and failed-pending-review transitions each live in exactly one place now (RedisState, TaskStore, and the two services that own them) instead of being duplicated across files. - An 8-key service-locator registry is gone; preview tokens are single- use (GETDEL); run job ids are per-fire-time (taskId:runAtIso) so re-enqueues are idempotent without fighting BullMQ's completed-job id memory. Tools go from 9 to 10 (resolve_pending_approval). Tests go from 20 to 29 and now cover TaskStore against an in-memory TaskFs, nextRunAtFor, and the classifier table. specs/tasks-async-system.md updated to match the implemented design (frontmatter, approval flow, worker shape, folder layout, follow-ups).

Full end-to-end trace of every code path turned up six real bugs. Each fix below is independently scoped; together they fall out of one architectural change. Critical: 1. **Preview pollutes the user's main Matrix room.** preview_task created a session via SessionManagerService.createSession, which posts "New Conversation Started" to the resolved room. With no roomId passed, that's the user's main oracle room. Every preview leaked. 2. **Tasks without a dedicated room appear in the user's session list.** The persistent task session was anchored in the main room when the user opted out of a dedicated room, so SessionsService.listSessions surfaced it. 3. **Dedicated room posted with sessionId: ''.** create_task built the spec, created the dedicated room (posting renderSpec(spec)), THEN filled sessionId. Empty string leaked into the room. Fix for 1/2/3: every run (and every preview) uses AgentInvoker.runOnce, which creates a synthetic session with overrideEventId (skips the Matrix post per SessionManagerService.createSession's `??` short-circuit), runs one agent turn, deletes the session. No persistent task session, no sessionId in the spec frontmatter. The user's main room and session list stay clean. We lose "memory across runs" — accept as a non-feature for MVP (file as FOLLOWUP if ever needed). 4. **Preview token consumed before validation.** consumePreview was a destructive GETDEL called before the task-limit / next-run checks. A user hitting the limit lost their token. Split into peekPreview (non-destructive) + deletePreview (destructive); peek up-front, delete only after the spec is saved and the run enqueued. 5. **delivery.post swallowed errors.** On Matrix outage, the worker thought the run succeeded — resetFailures, schedule next cron — while the user never saw the result. post now throws on failure; the worker bubbles it as a run failure (BullMQ retries; consecutive counter advances on the final attempt). Non-critical posts (approval prompts, reminders, failed-pending-review notice) call delivery.safePost, which keeps the swallow-and-log behaviour where surrounding state shouldn't roll back. 6. **claimApprovalResolution TTL was 5 min while approvals last 49 h.** A crash between claim and clear let the claim expire long before the pending payload — a later resolver could double-deliver. TTL is now passed in; takePending uses APPROVAL_TTL_SEC. Minor: - get_task now returns a curated subset instead of spreading all frontmatter (drops owner and raw trigger). - Dedicated-room post is a friendly markdown summary (title, trigger, approval, body) instead of raw YAML frontmatter. All 654 tests still pass; typecheck and lint clean. Spec updated to match: the frontmatter schema drops sessionId, §10 documents the synthetic-session approach, the glossary swaps "Task session" for "runOnce".

- Introduced `synthetic-session.ts` and `synthetic-session.test.ts` to manage synthetic session IDs for background tasks. - Updated `MessagesService` to handle synthetic session IDs correctly, ensuring they do not interfere with thread relations and post-sync operations. - Enhanced the `TasksPlugin` to include a new approval flow, allowing tasks to wait for user approval before executing actions. - Added middleware for task approval, enabling fast classification of user replies and integration with the approval flow. - Updated task specifications to support the new approval mechanism and ensure proper handling of task states. - Refactored related components to streamline the task execution process and improve overall reliability. All tests have been updated to reflect these changes and ensure functionality remains intact.

- Updated tests for `acquireToolLock` to assert that releasing locks does not throw exceptions. - Added checks for independent lock keys to ensure isolation between different keys. - Improved overall test coverage for lock functionality.

claude added 6 commits June 10, 2026 19:22

youssefhany-ixo force-pushed the claude/bold-johnson-zfUMl branch from 36c99b1 to c4fb104 Compare June 10, 2026 16:22

youssefhany-ixo added 4 commits June 12, 2026 14:32

feat: tasks

137e518

chore: formating

9adbffe

youssefhany-ixo changed the title ~~docs(tasks): clean-sheet rebuild spec for the async tasks plugin~~ feat(tasks): clean-sheet rebuild spec for the async tasks plugin Jun 16, 2026

chore: update pkg version

79c7fdf

youssefhany-ixo merged commit 0bef392 into main Jun 16, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(tasks): clean-sheet rebuild spec for the async tasks plugin#208

feat(tasks): clean-sheet rebuild spec for the async tasks plugin#208
youssefhany-ixo merged 11 commits into
mainfrom
claude/bold-johnson-zfUMl

youssefhany-ixo commented Jun 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants