Small, embeddable agent loop core for any environment that talks to an OpenAI-compatible chat completions endpoint (OpenAI, vLLM, Chutes, SGLang, OpenRouter, Anthropic-via-OpenAI-compat, ...).
Drop into different scenarios that share a single loop:
- `affine-agents`: multi-user web platform with chat UI, per-session Docker sandboxes, schedules, Postgres + Redis.
- training / RL rigs that need an agent rollout
- batch eval pipelines (SWE-bench-style harness etc.)
- one-off scripts via `affentctl`
Tiny dep graph: uuid + zerolog + stdlib.
The loop (loop.go)
- streaming LLM responses with first-class reasoning channel (`reasoning_content`), persisted on `ChatMessage.ReasoningContent` and surfaced as separate `thinking.delta` + `thinking.done` SSE events. Reasoning is local-only: it is stripped from outbound requests via `wireMessage`, since DeepSeek / Kimi / GLM emit it on responses but reject it on inbound.
- cancellation mid-turn (parent ctx cancel beats any in-flight retry; visible content already streamed makes a stream-cut error non-retryable to keep the UI's delta accumulator coherent)
- transient-error retry for HTTP 408 / 429 / 5xx, network resets, mid-stream EOF, per-call timeouts; honors the server's `Retry-After` header (capped at `MaxRespectedRetryAfter = 5m`)
- stream watchdog: `StreamIdleTimeout=60s` between chunks, `StreamPostFinishTimeout=5s` after `finish_reason` to defend against upstream proxies that forget to send `[DONE]`
- budgets: `MaxTurnSteps` (assistant↔tool round trips per user turn, default 10), `PerCallTimeout` (per chat completion, default 3m), `MaxTransientRetries` (default 3, exponential backoff)
- tool result caps: `MaxToolResultBytesInContext=8KiB` (what the model sees), `MaxToolResultPreviewInEvent=4KiB` (what the SSE event carries); full bytes still go to consumers that care
- monotonic event ids: every SSE event carries a per-loop sequential `id` so trace consumers can detect drops, order events, and tell when filtered events were skipped (see `--trace-skip-deltas`)
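The retry rules in the list above can be sketched in a few lines. This is an illustration, not affent's code: `isTransientStatus`, `retryDelay` and the 4s base (borrowed from the `--retry-backoff` CLI default) are assumptions, and only the delta-seconds form of `Retry-After` is handled.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

const (
	maxRespectedRetryAfter = 5 * time.Minute // cap named in the list above
	baseBackoff            = 4 * time.Second // assumed base delay
)

// isTransientStatus reports whether an HTTP status is worth retrying.
func isTransientStatus(code int) bool {
	return code == 408 || code == 429 || (code >= 500 && code <= 599)
}

// retryDelay picks the wait before retry number attempt (0-based).
// A parseable Retry-After value (seconds) wins over backoff but is capped.
func retryDelay(attempt int, retryAfter string) time.Duration {
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
		d := time.Duration(secs) * time.Second
		if d > maxRespectedRetryAfter {
			return maxRespectedRetryAfter
		}
		return d
	}
	return baseBackoff << uint(attempt) // 4s, 8s, 16s, ...
}

func main() {
	fmt.Println(isTransientStatus(429), isTransientStatus(404)) // true false
	fmt.Println(retryDelay(0, ""), retryDelay(2, ""))           // 4s 16s
	fmt.Println(retryDelay(0, "30"), retryDelay(0, "900"))      // 30s 5m0s
}
```

A real client would also need the HTTP-date form of `Retry-After` and a check that the parent context has not been cancelled before sleeping.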
Context compaction (compaction.go)
- `Compactor` interface; default `LLMSummaryCompactor` implements rolling summarization using OpenHands V1's `summarizing_prompt.j2` verbatim: structured fields (`USER_CONTEXT` / `TASK_TRACKING` / `COMPLETED` / `PENDING` / `CODE_STATE` / `TESTS` / `CHANGES` / `DEPS` / `VERSION_CONTROL_STATUS`), example-driven, with hard `PRESERVE TASK IDs` semantics.
- two activation paths: proactive (msg count > `TriggerMsgs`, default 240) and reactive (upstream returns a context-overflow 4xx, matched against a keyword set covering OpenAI / DeepSeek / Kimi / Anthropic phrasings, which triggers an emergency compact + retry outside the transient-retry budget).
- preserves head (`KeepFirst=2`) + rolling summary (single user message tagged `[summary of earlier work]`) + tail (`KeepLast=10`), with a boundary fixer that refuses to sever an `assistant.tool_calls` from its `role=tool` replies.
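To make the boundary fixer concrete, here is a minimal sketch of how the tail cut could be chosen. The `msg` type and `tailStart` function are hypothetical stand-ins, not the types in `compaction.go`; the only rule taken from the text is that a cut must not land between an assistant tool call and its tool replies.

```go
package main

import "fmt"

// msg is a pared-down stand-in for a chat message (hypothetical type).
type msg struct {
	Role      string // "system", "user", "assistant" or "tool"
	ToolCalls bool   // true on an assistant message that requested tools
}

// tailStart returns the index where the kept tail begins. It starts at
// len(msgs)-keepLast and walks backwards while the cut would land on a
// tool reply, so the assistant message that issued the call stays with it.
// Callers are expected to compact only when len(msgs) > keepFirst+keepLast.
func tailStart(msgs []msg, keepFirst, keepLast int) int {
	start := len(msgs) - keepLast
	if start <= keepFirst {
		return keepFirst // nothing in the middle to summarize
	}
	for start > keepFirst && start < len(msgs) && msgs[start].Role == "tool" {
		start--
	}
	return start
}

func main() {
	history := []msg{
		{Role: "system"}, {Role: "user"}, // head (keepFirst=2)
		{Role: "assistant"}, {Role: "user"}, // middle: would be summarized
		{Role: "assistant", ToolCalls: true}, {Role: "tool"}, {Role: "tool"},
		{Role: "assistant"}, {Role: "user"},
	}
	cut := tailStart(history, 2, 3)
	// The naive cut at index 6 falls on a tool reply, so it moves back to 4.
	fmt.Printf("summarize [2:%d], keep tail [%d:%d]\n", cut, cut, len(history))
}
```

Moving the cut backwards grows the tail rather than shrinking it, which errs on the side of keeping context instead of dropping half of a tool exchange.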
Tools
- builtin: `shell`, `read_file`, `write_file`, `edit_file`, `list_files`. File tools are sandboxed by `safeWorkspacePath`: relative paths join onto the workspace, absolute paths are taken literally and must fall inside the workspace (no sentinel / trim-the-leading-slash hacks).
- `shell` runs through an `executor.Executor` interface: `LocalExecutor` for in-process / scripts, your own impl for Docker / Firecracker / remote.
- MCP: stdio + streamable-http (spec rev 2025-03-26). Plug in any number of MCP servers; their tools surface as `<server>_<tool>` alongside the builtins.
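The path rule described for the file tools can be illustrated as follows. `resolveInWorkspace` is a hypothetical helper written for this example, not the body of `safeWorkspacePath`; it assumes an absolute workspace path and does not resolve symlinks.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errOutsideWorkspace = errors.New("path escapes workspace")

// resolveInWorkspace joins relative paths onto the workspace, takes
// absolute paths literally, and rejects anything that ends up outside.
func resolveInWorkspace(workspace, p string) (string, error) {
	full := p
	if !filepath.IsAbs(p) {
		full = filepath.Join(workspace, p)
	}
	full = filepath.Clean(full)
	rel, err := filepath.Rel(workspace, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideWorkspace
	}
	return full, nil
}

func main() {
	for _, p := range []string{"src/main.go", "/tmp/task/notes.txt", "../secrets", "/etc/passwd"} {
		full, err := resolveInWorkspace("/tmp/task", p)
		fmt.Println(p, "->", full, err)
	}
}
```

Checking the relative path instead of a string prefix avoids accepting sibling directories such as `/tmp/task-other`.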
Project context (project_context.go)
- User-authored project knowledge files, auto-loaded from the workspace and inlined into the system prompt at session start. Read-only; affent never writes to them.
- Recognized filenames (in load order): `AGENTS.md`, `CLAUDE.md`, `CONVENTIONS.md`, `.cursorrules`, `.clinerules`, `.clinerules.md`, `GEMINI.md`. Multiple files concatenate, each under a `## <filename>` header. Total cap `MaxProjectContextBytes = 32 KiB` (per-file truncation past the budget).
- Enabled by default; toggle via `affentctl --project-context=false`, or set `Loop.ProjectContextDir = ""` when embedding.
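As a rough illustration of the load order and byte budget above, the sketch below concatenates whichever recognized files exist under a `## <filename>` header. `loadProjectFiles` and `demo` are hypothetical names, and the exact truncation behavior of the real `LoadProjectContext` may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

const maxProjectContextBytes = 32 * 1024

// recognized lists the filenames in load order, as documented above.
var recognized = []string{
	"AGENTS.md", "CLAUDE.md", "CONVENTIONS.md",
	".cursorrules", ".clinerules", ".clinerules.md", "GEMINI.md",
}

// loadProjectFiles concatenates existing files, truncating the file that
// crosses the budget and skipping everything after it.
func loadProjectFiles(dir string, budget int) string {
	var b strings.Builder
	for _, name := range recognized {
		data, err := os.ReadFile(filepath.Join(dir, name))
		if err != nil {
			continue // missing or unreadable: skip
		}
		header := "## " + name + "\n"
		room := budget - b.Len() - len(header) - 1 // 1 for the trailing newline
		if room <= 0 {
			break
		}
		if len(data) > room {
			data = data[:room] // byte cut; may split a multi-byte rune
		}
		b.WriteString(header)
		b.Write(data)
		b.WriteByte('\n')
	}
	return b.String()
}

// demo writes two files into a temp dir and loads them.
func demo() string {
	dir, err := os.MkdirTemp("", "projctx")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	files := map[string]string{"GEMINI.md": "be brief", "AGENTS.md": "use tabs"}
	for name, body := range files {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(body), 0o644); err != nil {
			panic(err)
		}
	}
	return loadProjectFiles(dir, maxProjectContextBytes)
}

func main() { fmt.Print(demo()) }
```

`AGENTS.md` is emitted before `GEMINI.md` regardless of the order the files were created in, because the loop follows the fixed name list.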
Persistent memory (memory.go)
- Off by default. Opt in via `Loop.Memory = affent.NewFileMemoryStore(workspace)` (or `affentctl --memory`).
- Two stores: `MEMORY.md` for agent notes (env, conventions, lessons learned; workspace-scoped) and `USER.md` for user profile (preferences, communication style; user-scoped, default `$XDG_CONFIG_HOME/affent/USER.md`).
- Single `memory` tool with `action ∈ {add, replace, remove}` and `target ∈ {memory, user}`. `replace` / `remove` use a short unique substring (`old_text`) to identify the entry; there are no IDs.
- Frozen-snapshot semantics: at session start, `MemoryStore.Snapshot()` composes the on-disk state into the system prompt once. Mid-session writes update on-disk + live tool responses but do NOT re-snapshot, keeping the prefix cache stable for the rest of the session.
- Char-bounded (default `MEMORY=2200`, `USER=1375`; ~800 / ~500 tokens). On overflow, the tool returns `ok=false` with `entries` listing the current state so the agent can consolidate in the same turn.
- Atomic writes (tempfile + rename). Minimal security scan blocks invisible/bidi-override unicode, the literal delimiter sequence, and `authorized_keys` substrings. It is not a full prompt-injection regex list (those are mostly performative).
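The character cap and the substring-addressed `replace` can be sketched with an in-memory store. `boundedStore` and its methods are hypothetical and exist only for this example; the real `FileMemoryStore` also persists to disk and runs the security scan.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// boundedStore keeps entries under a total character cap.
type boundedStore struct {
	entries  []string
	maxChars int
}

func (s *boundedStore) chars() int {
	n := 0
	for _, e := range s.entries {
		n += utf8.RuneCountInString(e)
	}
	return n
}

// add appends text unless the cap would be exceeded. On overflow it
// returns ok=false plus the current entries so the caller can consolidate.
func (s *boundedStore) add(text string) (ok bool, current []string) {
	if s.chars()+utf8.RuneCountInString(text) > s.maxChars {
		return false, s.entries
	}
	s.entries = append(s.entries, text)
	return true, s.entries
}

// replace swaps the single entry containing oldText. Zero matches or
// more than one match is an error, since oldText must be unique.
// (A fuller version would re-check the cap for the new text.)
func (s *boundedStore) replace(oldText, newText string) error {
	idx := -1
	for i, e := range s.entries {
		if strings.Contains(e, oldText) {
			if idx >= 0 {
				return errors.New("old_text matches more than one entry")
			}
			idx = i
		}
	}
	if idx < 0 {
		return errors.New("old_text matches no entry")
	}
	s.entries[idx] = newText
	return nil
}

func main() {
	s := &boundedStore{maxChars: 30}
	fmt.Println(s.add("tests run with `make check`"))
	fmt.Println(s.add("deploys happen on Fridays")) // over the cap
	err := s.replace("make check", "tests run with `go test`")
	fmt.Println(err, s.entries)
}
```

Returning the current entries on overflow is what lets the agent rewrite or drop an older note in the same turn instead of issuing a separate read.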
Session search (session_search.go)
- Registered as the `session_search` tool when `BuiltinDeps.SessionsDir` is set (affentctl wires it automatically). The agent searches its own past conversation logs in the workspace for `did we discuss X` / `what was that command` / `last week's conclusion` questions.
- Term-overlap scoring over JSONL session logs; user + assistant messages only (system and tool results are skipped). The current session is excluded so the agent doesn't match its own in-flight turns.
- Memory and session_search are complementary: memory holds compact facts always present in the system prompt; session_search returns full snippets on demand without paying per-turn token cost.
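Term-overlap scoring is simple enough to show in full. The sketch below is one plausible reading of it (count of distinct query terms that appear in a message); `overlap`, `search` and `logMsg` are hypothetical names, and the real tool may tokenize or weight terms differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

type logMsg struct{ Role, Content string }

// terms lowercases s and splits it into a set of alphanumeric words.
func terms(s string) map[string]bool {
	set := map[string]bool{}
	words := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, w := range words {
		set[w] = true
	}
	return set
}

// overlap counts the distinct query terms that also occur in text.
func overlap(query, text string) int {
	q, t := terms(query), terms(text)
	n := 0
	for w := range q {
		if t[w] {
			n++
		}
	}
	return n
}

// search ranks user and assistant messages by overlap, best first.
func search(query string, msgs []logMsg, limit int) []logMsg {
	type hit struct {
		m logMsg
		s int
	}
	var hits []hit
	for _, m := range msgs {
		if m.Role != "user" && m.Role != "assistant" {
			continue // system prompts and tool results are skipped
		}
		if s := overlap(query, m.Content); s > 0 {
			hits = append(hits, hit{m, s})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].s > hits[j].s })
	if len(hits) > limit {
		hits = hits[:limit]
	}
	out := make([]logMsg, len(hits))
	for i, h := range hits {
		out[i] = h.m
	}
	return out
}

func main() {
	logs := []logMsg{
		{"user", "what was the docker build command?"},
		{"assistant", "Use docker build -t app ."},
		{"tool", "docker build output ..."},
	}
	for _, m := range search("docker build command", logs, 5) {
		fmt.Println(m.Role+":", m.Content)
	}
}
```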
Persistence
`Conversation`: append-only JSONL chat log on disk, includes system prompt + user/assistant/tool messages with `tool_call_id` preserved for resume. `Replace()` rewrites atomically (used by the compactor after summarizing earlier turns).
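The append-only log and the atomic rewrite follow a common pattern, sketched below with the standard library only. `chatMessage`, `appendMessage` and `replaceAll` are hypothetical stand-ins for the types in `conversation.go`, and the JSON field names are assumptions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

type chatMessage struct {
	Role       string `json:"role"`
	Content    string `json:"content"`
	ToolCallID string `json:"tool_call_id,omitempty"`
}

// appendMessage adds one JSON line to the end of the log.
func appendMessage(path string, m chatMessage) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	return json.NewEncoder(f).Encode(m) // Encode writes the value plus "\n"
}

// replaceAll writes msgs to a temp file in the same directory, then
// renames it over path so readers never observe a half-written log.
func replaceAll(path string, msgs []chatMessage) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".conv-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename succeeded
	enc := json.NewEncoder(tmp)
	for _, m := range msgs {
		if err := enc.Encode(m); err != nil {
			tmp.Close()
			return err
		}
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func countLines(path string) int {
	data, err := os.ReadFile(path)
	if err != nil {
		panic(err)
	}
	return bytes.Count(data, []byte("\n"))
}

// demo appends three messages, then rewrites the log with two.
func demo() (before, after int) {
	dir, err := os.MkdirTemp("", "conv")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "session-1.jsonl")
	msgs := []chatMessage{
		{Role: "system", Content: "you are an agent"},
		{Role: "user", Content: "list files"},
		{Role: "tool", Content: "a.txt", ToolCallID: "call_1"},
	}
	for _, m := range msgs {
		if err := appendMessage(path, m); err != nil {
			panic(err)
		}
	}
	before = countLines(path)
	if err := replaceAll(path, msgs[:2]); err != nil {
		panic(err)
	}
	return before, countLines(path)
}

func main() { fmt.Println(demo()) }
```

Creating the temp file in the same directory matters: a rename is only atomic within one filesystem. A durable version would also `Sync` the file before renaming.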
Observability
- 13 SSE event types streamed on a single channel — see below.
- `affentctl --trace <path>` mirrors every event into a JSONL file for replay / regression diffing (`-` for stdout, empty for stderr). `--trace-skip-deltas` drops `thinking.delta` / `message.delta` from the trace (skipped events still consume sequence ids so consumers can tell what was filtered). Useful for batch eval / training where token-level replay isn't needed; the final text is in `thinking.done` / `message.done` regardless.
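Because skipped events still consume sequence ids, a trace consumer can report exactly which ids are absent. The sketch below assumes each JSONL line carries integer `id` and string `type` fields; those field names and the `missingIDs` helper are assumptions for illustration.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

type traceEvent struct {
	ID   int64  `json:"id"`
	Type string `json:"type"`
}

// missingIDs scans a JSONL trace and returns every id that is absent
// between two consecutive events.
func missingIDs(trace string) ([]int64, error) {
	var missing []int64
	var prev int64
	first := true
	sc := bufio.NewScanner(strings.NewReader(trace))
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20) // allow long .done payloads
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var ev traceEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			return nil, err
		}
		if !first {
			for id := prev + 1; id < ev.ID; id++ {
				missing = append(missing, id)
			}
		}
		prev, first = ev.ID, false
	}
	return missing, sc.Err()
}

func main() {
	trace := `{"id":1,"type":"turn.start"}
{"id":2,"type":"user.message"}
{"id":5,"type":"message.done"}
{"id":6,"type":"turn.end"}`
	gaps, err := missingIDs(trace)
	fmt.Println(gaps, err) // [3 4] <nil>
}
```

With `--trace-skip-deltas` enabled such gaps are expected; without it, a gap would point to dropped events.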
loop.go agent loop (LLM <-> tools, streaming, reasoning, cancel)
llm.go OpenAI-compat streaming client (incl. reasoning_content, watchdog, retry classification)
builtins.go shell, read_file, write_file, edit_file, list_files (workspace-sandboxed)
project_context.go LoadProjectContext: read AGENTS.md / CONVENTIONS.md / .cursorrules / .clinerules / CLAUDE.md / GEMINI.md
memory.go MemoryStore + FileMemoryStore (workspace MEMORY.md + user USER.md)
memory_tool.go the single `memory` tool (action × target dispatch)
session_search.go session_search tool: term-overlap retrieval over past JSONL session logs
compaction.go Compactor interface + LLMSummaryCompactor (OpenHands V1 prompt)
conversation.go JSONL-on-disk chat log, append-only + atomic Replace
tool.go Tool + Registry
executor/ Executor interface + LocalExecutor (in-process)
sse/ canonical event type constants + payload structs
mcp/ stdio + streamable-http MCP client; Registry adapter
cmd/affentctl/ CLI: run / chat / sessions
extras/ opt-in helper packages, separate sub-modules
web/ web_fetch + web_search (HTML→markdown, Tavily search default)
The extras/ directory holds opt-in helper packages as separate Go
sub-modules. The root affent library does not import them; callers
choose which extras to register, and consumers that don't import them
don't pull their transitive deps.
Every loop emits these on Loop.Events. UIs, trace files, and tests
all consume the same stream.
Naming: .done means a streaming accumulator is complete (more
events for the same turn may still follow). .end is reserved for
turn-level boundaries (no more events for that turn).
| type | when |
|---|---|
| `turn.start` | user message accepted, turn starts |
| `user.message` | echoes the user's text (so SSE replays are full) |
| `thinking.delta` | model's reasoning channel, token by token |
| `thinking.done` | reasoning accumulation complete (full text) |
| `message.delta` | model's visible content, token by token |
| `message.done` | assistant message complete (full text) |
| `tool.request` | model called a tool: name + args |
| `tool.output` | (gateway-only today) live stdout/stderr stream |
| `tool.result` | tool finished: exit code + truncated preview |
| `file.changed` | filesystem mutation (gateway watcher hook) |
| `usage` | input / output token totals for the turn |
| `turn.end` | reason: completed / cancelled / error |
| `error` | transient (recoverable=true) or terminal failure |
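A consumer of this stream typically accumulates deltas and stops at `turn.end`. The sketch below shows that shape with a local `event` struct; the real payloads live in the `sse/` package, so the `Type`, `TurnID` and `Text` fields here are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a simplified stand-in for an SSE event (hypothetical shape).
type event struct {
	Type   string
	TurnID string
	Text   string // delta text, full text, or the turn.end reason
}

// collectTurn drains events for one turn. It returns the last assistant
// message and the reason carried by turn.end.
func collectTurn(events <-chan event, turnID string) (text, reason string) {
	var b strings.Builder
	for ev := range events {
		if ev.TurnID != turnID {
			continue // belongs to another turn
		}
		switch ev.Type {
		case "message.delta":
			b.WriteString(ev.Text)
		case "message.done":
			// .done carries the full text, so it replaces the accumulator.
			b.Reset()
			b.WriteString(ev.Text)
		case "turn.end":
			return b.String(), ev.Text
		}
	}
	return b.String(), "" // channel closed before turn.end
}

func main() {
	events := make(chan event, 8)
	for _, ev := range []event{
		{Type: "turn.start", TurnID: "t1"},
		{Type: "message.delta", TurnID: "t1", Text: "hel"},
		{Type: "message.delta", TurnID: "t1", Text: "lo"},
		{Type: "message.done", TurnID: "t1", Text: "hello"},
		{Type: "turn.end", TurnID: "t1", Text: "completed"},
	} {
		events <- ev
	}
	close(events)
	text, reason := collectTurn(events, "t1")
	fmt.Println(text, reason) // hello completed
}
```

Treating `.done` as authoritative means a consumer that missed some deltas still ends up with the complete text.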
affentctl run --prompt "..." --workspace ./task # one-shot
affentctl chat --workspace ./task # REPL
affentctl sessions --workspace ./task # list past sessions
run and chat accept --session-id <id> or --continue to resume
an existing conversation. Logs persist as JSONL under
<workspace>/.affentctl/<session_id>.jsonl.
| flag | default | env |
|---|---|---|
| `--config` | | `AFFENTCTL_CONFIG` |
| `--workspace` | `./affent-workspace` | |
| `--base-url` | | `AFFENTCTL_BASE_URL` |
| `--api-key` | | `AFFENTCTL_API_KEY` |
| `--model` | | `AFFENTCTL_MODEL` |
| `--prompt` | (run) literal, `-` for stdin, `@file` | |
| `--max-turns` | 10 | |
| `--max-call-timeout` | 3m | |
| `--retry-transient` | 3 | |
| `--retry-backoff` | 4s (doubles each attempt) | |
| `--trace` | stderr; `-` stdout, `<path>` JSONL file | |
| `--trace-skip-deltas` | false (set true for batch eval: drops deltas) | |
| `--system-prompt` | builtin (dev-box flavored); `-` / file / literal | |
| `--quiet` | false | |
| `--project-context` | true (auto-loads AGENTS.md / CLAUDE.md / etc.) | |
| `--memory` | false (opt in) | |
| `--memory-only` | false (implies `--memory`, forces `--project-context=false`, rejects `--mcp-config`) | |
| `--memory-workspace-store` | `<workspace>/.affent/MEMORY.md` | |
| `--memory-user-store` | `$XDG_CONFIG_HOME/affent/USER.md` | |
| `--memory-max-chars` | `2200,1375` (MEM,USER) | |
| `--session-id` | new session | |
| `--continue` | resume newest session under `--workspace` | |
| `--mcp-config` | | `AFFENTCTL_MCP_CONFIG` |
| `--compact-trigger` | 240 (matches OpenHands V1 `max_size`); 0 disables | |
| `--compact-keep-last` | 10 | |
affentctl auto-loads recognized project knowledge files from
--workspace and inlines them into the system prompt at session
start. Files are user-authored and read-only; affent never writes
to them. Recognized names (concatenated in this order if multiple
exist):
- `AGENTS.md`
- `CLAUDE.md`
- `CONVENTIONS.md`
- `.cursorrules`
- `.clinerules`
- `.clinerules.md`
- `GEMINI.md`
Default on. Disable with affentctl --project-context=false for
runs that need a clean baseline.
Memory is off by default so existing one-shot and dev-box workflows keep their current tool surface until the caller opts in.
# Real user: memory on. Agent persists notes in <workspace>/.affent/MEMORY.md
# and user profile in $XDG_CONFIG_HOME/affent/USER.md.
affentctl chat --workspace ./project --memory
# Controlled memory run: only the `memory` tool, no shell/file/MCP escape hatches.
affentctl run --memory-only --prompt @question.txt

Two stores: `memory` holds the agent's own notes (environment facts,
project conventions, lessons learned) and travels with the workspace.
`user` holds what the agent knows about the user (preferences,
communication style) and crosses workspaces.
Each store has a character cap (default 2200 / 1375, ~800 /
500 tokens). On overflow the tool returns the current entries so
the agent can consolidate without an extra read. The frozen snapshot
goes into the system prompt at session start; mid-session writes
don't re-snapshot, so the prefix cache stays stable.
--config FILE loads JSON configuration before building the loop. CLI
flags override values from the config file.
{
"workspace": "./task",
"base_url": "https://api.openai.com/v1",
"model": "gpt-4o-mini",
"max_turns": 8,
"trace_skip_deltas": true,
"project_context": true,
"memory": {
"enabled": true,
"only": false,
"workspace_store": ".affent/MEMORY.md",
"user_store": "",
"max_chars": "2200,1375"
},
"compact": {
"trigger": 240,
"keep_last": 10
}
}

`--mcp-config FILE` plugs in any number of MCP servers; their tools
are exposed alongside the builtins, namespaced `<server>_<tool>`.
Each server picks its transport from whichever field is set: `url`
for streamable-http, `command` for stdio. Setting both is an error.
{
"servers": [
{
"name": "fs",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/task"]
},
{
"name": "git",
"command": "uvx",
"args": ["mcp-server-git", "--repository", "/tmp/task"]
},
{
"name": "verify",
"url": "http://host.docker.internal:8123/mcp",
"headers": {"X-Auth-Token": "secret"}
}
]
}

`headers` (HTTP only) layers extra HTTP headers onto every request,
which is useful for auth tokens or version pinning. Sessions are
tracked via `Mcp-Session-Id` automatically.
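The transport rule for the MCP config (one of `url` or `command`, never both) can be expressed as a small validation pass. The struct below mirrors the JSON keys shown in the example config; `transportFor`, `parseMCPConfig`, and treating "neither field set" as an error are assumptions of this sketch, not confirmed affentctl behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type serverConfig struct {
	Name    string            `json:"name"`
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	URL     string            `json:"url"`
	Headers map[string]string `json:"headers"`
}

type mcpConfig struct {
	Servers []serverConfig `json:"servers"`
}

// transportFor picks the transport from whichever field is set.
func transportFor(s serverConfig) (string, error) {
	switch {
	case s.URL != "" && s.Command != "":
		return "", fmt.Errorf("server %q: set either url or command, not both", s.Name)
	case s.URL != "":
		return "streamable-http", nil
	case s.Command != "":
		return "stdio", nil
	default:
		return "", fmt.Errorf("server %q: neither url nor command is set", s.Name)
	}
}

// parseMCPConfig returns a map from server name to chosen transport.
func parseMCPConfig(data []byte) (map[string]string, error) {
	var cfg mcpConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(cfg.Servers))
	for _, s := range cfg.Servers {
		t, err := transportFor(s)
		if err != nil {
			return nil, err
		}
		out[s.Name] = t
	}
	return out, nil
}

func main() {
	cfg := `{"servers":[
	  {"name":"fs","command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","/tmp/task"]},
	  {"name":"verify","url":"http://host.docker.internal:8123/mcp","headers":{"X-Auth-Token":"secret"}}
	]}`
	fmt.Println(parseMCPConfig([]byte(cfg)))
}
```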
go build -o /tmp/affentctl ./cmd/affentctl
/tmp/affentctl run \
--workspace /tmp/task \
--base-url https://api.openai.com/v1 \
--api-key "$OPENAI_API_KEY" \
--model gpt-4o-mini \
  --prompt "list files in /tmp/task and summarize what's there"

import (
"github.com/affinefoundation/affent"
"github.com/affinefoundation/affent/executor"
"github.com/affinefoundation/affent/sse"
)
// Optional persistent memory. nil = disabled (default).
mem := affent.NewFileMemoryStore("/tmp/task")
reg := affent.NewRegistry()
affent.RegisterBuiltins(reg, affent.BuiltinDeps{
Executor: executor.NewLocalExecutor("session-1", "/tmp/task"),
HostWorkspaceDir: "/tmp/task",
Memory: mem, // registers the `memory` tool too
})
conv, _ := affent.NewConversation("/tmp/task", "session-1")
events := make(chan sse.Event, 256)
llm := affent.NewLLMClient(baseURL, apiKey, model)
loop := &affent.Loop{
LLM: llm,
Tools: reg,
Conv: conv,
Events: events,
Memory: mem, // composes MEMORY.md / USER.md into the system prompt at session start
// Optional: shrink history when it grows beyond TriggerMsgs.
Compactor: &affent.LLMSummaryCompactor{
LLM: llm,
TriggerMsgs: 240,
KeepFirst: 2,
KeepLast: 10,
},
// MaxTurnSteps, PerCallTimeout, MaxTransientRetries are all
// optional — zero falls back to the documented defaults.
}
_ = loop.EnsureSystemPrompt("") // "" = use DefaultSystemPrompt
turnID, err := loop.SendUser(ctx, "list files and summarize")
// drain events until you see turn.end with matching turn_id

The default system prompt assumes a "dev box" environment (a
`/home/agent` + `/workspace` bind-mounted into a container) and
mentions `schedule_*` tools that the gateway registers. If you're
embedding affent outside that environment, pass your own prompt to
`EnsureSystemPrompt`.
`extras/web` ships `web_fetch` and `web_search` as opt-in tools. It's a
separate Go sub-module: `go get github.com/affinefoundation/affent`
won't pull any HTML-processing or search-backend deps unless you also
`go get .../extras/web`.
`web_fetch` runs the standard reader pipeline:
`go-shiori/go-readability` (Mozilla Readability Go port; extracts the
article main content, drops nav/header/footer/sidebar) →
`JohannesKaufmann/html-to-markdown` (commonmark-spec converter; handles
bold/italic/lists/code/tables/links/images). We don't roll our own
HTML processing.
`web_search` ships a `SearchProvider` interface with a Tavily-backed
default; swap to Brave / SearXNG / an internal index by implementing
the interface.
import (
"github.com/affinefoundation/affent"
affentweb "github.com/affinefoundation/affent/extras/web"
)
reg := affent.NewRegistry()
affent.RegisterBuiltins(reg, deps)
// Just the fetch tool — no external API key needed.
affentweb.RegisterFetch(reg, affentweb.FetchConfig{})
// Both fetch + search; default Tavily backend reads TAVILY_API_KEY.
affentweb.RegisterAll(reg, affentweb.Options{})
// Custom search provider (Brave, SearXNG, internal index, …):
tool, _ := affentweb.SearchTool(affentweb.SearchConfig{Provider: myProvider})
reg.Add(tool)

`SearchProvider` is the seam: implement `Search(ctx, query, n) ([]SearchResult, error)`
and pass it via `Options.SearchProvider` or `SearchConfig.Provider`.
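For a sense of how small a custom provider can be, here is a sketch of one backed by an in-memory list. The `SearchProvider` and `SearchResult` declarations below are local stand-ins so the example compiles on its own: the method shape follows the signature quoted above, but the parameter types and the `Title` / `URL` / `Snippet` fields are assumptions, and `staticProvider` is hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Local stand-ins for the extras/web types (field names are assumed).
type SearchResult struct{ Title, URL, Snippet string }

type SearchProvider interface {
	Search(ctx context.Context, query string, n int) ([]SearchResult, error)
}

// staticProvider searches a fixed document list by substring match.
type staticProvider struct{ docs []SearchResult }

func (p staticProvider) Search(ctx context.Context, query string, n int) ([]SearchResult, error) {
	var out []SearchResult
	q := strings.ToLower(query)
	for _, d := range p.docs {
		if err := ctx.Err(); err != nil {
			return nil, err // honor cancellation between documents
		}
		if len(out) == n {
			break
		}
		if strings.Contains(strings.ToLower(d.Title+" "+d.Snippet), q) {
			out = append(out, d)
		}
	}
	return out, nil
}

// demo runs one query against a two-document index.
func demo() []SearchResult {
	var p SearchProvider = staticProvider{docs: []SearchResult{
		{Title: "Retry policy", URL: "https://example.com/retry", Snippet: "how retries back off"},
		{Title: "Compaction", URL: "https://example.com/compact", Snippet: "rolling summaries"},
	}}
	res, err := p.Search(context.Background(), "retry", 1)
	if err != nil {
		panic(err)
	}
	return res
}

func main() {
	for _, r := range demo() {
		fmt.Println(r.Title, r.URL)
	}
}
```

An internal index or a Brave / SearXNG client would replace the loop body with an HTTP call while keeping the same method shape.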
Working:
- native loop with reasoning streaming (`thinking.delta` + `thinking.done`) + cancel; transient-error retry with `Retry-After` honor + stream watchdog
- multi-turn REPL + session resume
- MCP stdio + streamable-http (spec rev 2025-03-26)
- workspace path sandboxing in builtin file tools (relative + absolute, no sentinel hacks)
- context compaction (OpenHands V1 LLMSummarizingCondenser prompt verbatim, rolling summary with `[summary of earlier work]` marker, proactive + reactive paths, tool-call boundary safety)
- persistent memory: two-store `FileMemoryStore` (workspace MEMORY.md + user USER.md), single `memory` tool, frozen-snapshot system-prompt injection, char-bounded with overflow consolidation
- monotonic event ids + `--trace-skip-deltas` for batch-eval traces
- `wireMessage` strips `reasoning_content` from outbound requests (DeepSeek / Kimi / GLM compat)
Out of scope (intentional):
- TodoWrite / structured task tracking: Claude Code-style policy, belongs in the embedder. affent provides the `Registry` mechanism; embedders register their own todo tool with whatever state machine they prefer.
- IDE integration / GUI: affent is server-side / batch / training shaped, not IDE-shaped.
- OpenTelemetry traces: wrap externally if needed; the SSE event stream is the in-tree observability.