affent

Small, embeddable agent loop core for any environment that talks to an OpenAI-compatible chat completions endpoint (OpenAI, vLLM, Chutes, SGLang, OpenRouter, Anthropic-via-OpenAI-compat, ...).

It drops into several scenarios that all share a single loop:

affine-agents — multi-user web platform: chat UI, per-session Docker sandboxes, schedules, Postgres + Redis.
training / RL rigs that need an agent rollout
batch eval pipelines (SWE-bench-style harness etc.)
one-off scripts via affentctl

Tiny dep graph: uuid + zerolog + stdlib.

What you get

The loop (loop.go)

streaming LLM responses with first-class reasoning channel (reasoning_content), persisted on ChatMessage.ReasoningContent and surfaced as separate thinking.delta + thinking.done SSE events. Reasoning is local-only — stripped from outbound requests via wireMessage since DeepSeek / Kimi / GLM emit it on responses but reject it on inbound.
cancellation mid-turn (parent ctx cancel beats any in-flight retry; visible content already streamed makes a stream-cut error non-retryable to keep the UI's delta accumulator coherent)
transient-error retry for HTTP 408 / 429 / 5xx, network resets, mid-stream EOF, per-call timeouts; honors server Retry-After header (capped at MaxRespectedRetryAfter = 5m)
stream watchdog: StreamIdleTimeout=60s between chunks, StreamPostFinishTimeout=5s after finish_reason to defend against upstream proxies that forget to send [DONE]
budgets: MaxTurnSteps (assistant↔tool round trips per user turn, default 10), PerCallTimeout (per chat completion, default 3m), MaxTransientRetries (default 3, exponential backoff)
tool result caps: MaxToolResultBytesInContext=8KiB (what the model sees), MaxToolResultPreviewInEvent=4KiB (what the SSE event carries) — full bytes still go to consumers that care
monotonic event ids: every SSE event carries a per-loop sequential id so trace consumers can detect drops, order events, and tell when filtered events were skipped (see --trace-skip-deltas)
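
As an illustration of the retry rules above, here is a minimal sketch of how a retry delay could be chosen. The function and constant names are hypothetical, not affent's actual internals; the 4s base comes from the --retry-backoff default, and only the delta-seconds form of Retry-After is handled.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// Values taken from this README; the identifiers themselves are illustrative.
const (
	maxRespectedRetryAfter = 5 * time.Minute
	baseBackoff            = 4 * time.Second
)

// isTransientStatus reports whether an HTTP status is worth retrying:
// 408, 429, and any 5xx.
func isTransientStatus(code int) bool {
	return code == http.StatusRequestTimeout ||
		code == http.StatusTooManyRequests ||
		(code >= 500 && code <= 599)
}

// retryDelay returns the wait before retry number attempt (0-based).
// A Retry-After value given in seconds wins, capped at
// maxRespectedRetryAfter; otherwise the backoff doubles on each attempt.
// The HTTP-date form of Retry-After is not parsed in this sketch.
func retryDelay(attempt int, retryAfter string) time.Duration {
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs > 0 {
		d := time.Duration(secs) * time.Second
		if d > maxRespectedRetryAfter {
			d = maxRespectedRetryAfter
		}
		return d
	}
	return baseBackoff << uint(attempt)
}

func main() {
	fmt.Println(isTransientStatus(429), isTransientStatus(404)) // true false
	fmt.Println(retryDelay(0, ""), retryDelay(2, ""))           // 4s 16s
	fmt.Println(retryDelay(0, "900"))                           // 5m0s
}
```

The cap keeps a misbehaving upstream from parking a rollout for an arbitrary amount of time.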

Context compaction (compaction.go)

Compactor interface; default LLMSummaryCompactor implements rolling summarization using OpenHands V1's summarizing_prompt.j2 verbatim — structured fields (USER_CONTEXT / TASK_TRACKING / COMPLETED / PENDING / CODE_STATE / TESTS / CHANGES / DEPS / VERSION_CONTROL_STATUS), example-driven, with hard PRESERVE TASK IDs semantics.
two activation paths: proactive (msg count > TriggerMsgs, default 240) and reactive (upstream returns context-overflow 4xx — matched against keyword set covering OpenAI / DeepSeek / Kimi / Anthropic phrasings — emergency compact + retry outside the transient-retry budget).
preserves head (KeepFirst=2) + rolling summary (single user message tagged [summary of earlier work]) + tail (KeepLast=10), with a boundary fixer that refuses to sever an assistant.tool_calls from its role=tool replies.
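
The head + summary + tail split with a boundary fixer can be sketched roughly as follows. The Message type and compact function are simplified stand-ins rather than affent's real types, and the summary text is assumed to come from the compactor's LLM call.

```go
package main

import "fmt"

// Message is a pared-down stand-in for a chat message (hypothetical type).
type Message struct {
	Role    string // "system", "user", "assistant", or "tool"
	Content string
}

// compact keeps the first keepFirst messages, replaces the middle with one
// summary message, and keeps the last keepLast messages. If the tail would
// start on role=tool replies, the cut moves earlier so the assistant message
// that issued those tool calls stays attached to them.
func compact(msgs []Message, keepFirst, keepLast int, summary string) []Message {
	if len(msgs) <= keepFirst+keepLast {
		return msgs // nothing to summarize
	}
	cut := len(msgs) - keepLast
	for cut > keepFirst && msgs[cut].Role == "tool" {
		cut-- // walk back onto the assistant message that owns these replies
	}
	if cut <= keepFirst {
		return msgs // no middle left to replace
	}
	out := append([]Message{}, msgs[:keepFirst]...)
	out = append(out, Message{Role: "user", Content: "[summary of earlier work] " + summary})
	return append(out, msgs[cut:]...)
}

func main() {
	msgs := []Message{
		{"system", "prompt"}, {"user", "task"}, {"assistant", "plan"},
		{"user", "go on"}, {"assistant", "calling tools"},
		{"tool", "result 1"}, {"tool", "result 2"}, {"assistant", "done"},
	}
	for _, m := range compact(msgs, 2, 3, "planned the task") {
		fmt.Println(m.Role, "|", m.Content)
	}
}
```

With keepLast=3 the naive cut would land on the first tool reply; the loop moves it back one step so the tail begins with the assistant message that made the calls.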

Tools

builtin: shell, read_file, write_file, edit_file, list_files. File tools are sandboxed by safeWorkspacePath: relative paths join onto the workspace, absolute paths are taken literally and must fall inside the workspace (no sentinel / trim-the-leading-slash hacks).
shell runs through an executor.Executor interface — LocalExecutor for in-process / scripts, your own impl for Docker / Firecracker / remote.
MCP: stdio + streamable-http (spec rev 2025-03-26). Plug in any number of MCP servers — their tools surface as <server>_<tool> alongside the builtins.
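
To make the path rule concrete, here is a small sketch of the check described for the file tools. resolveInWorkspace is a hypothetical name and not the real safeWorkspacePath; the sketch assumes an absolute workspace path and does not resolve symlinks, which a hardened implementation would likely also need to consider.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// resolveInWorkspace joins relative paths onto the workspace, takes absolute
// paths literally, and rejects anything that ends up outside the workspace.
func resolveInWorkspace(workspace, p string) (string, error) {
	root := filepath.Clean(workspace)
	candidate := filepath.Clean(p)
	if !filepath.IsAbs(candidate) {
		candidate = filepath.Join(root, candidate)
	}
	rel, err := filepath.Rel(root, candidate)
	if err != nil {
		return "", err
	}
	// A relative form that starts with ".." points outside the workspace.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errors.New("path escapes workspace: " + p)
	}
	return candidate, nil
}

func main() {
	for _, p := range []string{"src/main.go", "/tmp/task/notes.md", "../other", "/etc/passwd"} {
		got, err := resolveInWorkspace("/tmp/task", p)
		fmt.Printf("%-20s -> %q err=%v\n", p, got, err)
	}
}
```

Comparing via filepath.Rel rather than a string prefix avoids accepting sibling directories such as /tmp/taskevil.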

Project context (project_context.go)

User-authored project knowledge files, auto-loaded from the workspace and inlined into the system prompt at session start. Read-only; affent never writes to them.
Recognized filenames (in load order): AGENTS.md, CLAUDE.md, CONVENTIONS.md, .cursorrules, .clinerules, .clinerules.md, GEMINI.md. Multiple files concatenate, each under a ## <filename> header. Total cap MaxProjectContextBytes = 32 KiB (per-file truncation past the budget).
Enabled by default; toggle via affentctl --project-context=false, or set Loop.ProjectContextDir = "" when embedding.

Persistent memory (memory.go)

Off by default. Opt in via Loop.Memory = affent.NewFileMemoryStore(workspace) (or affentctl --memory).
Two stores: MEMORY.md for agent notes (env, conventions, lessons learned — workspace-scoped) and USER.md for user profile (preferences, communication style — user-scoped, default $XDG_CONFIG_HOME/affent/USER.md).
Single memory tool with action ∈ {add, replace, remove} and target ∈ {memory, user}. replace/remove use a short unique substring (old_text) to identify the entry — no IDs.
Frozen-snapshot semantics: at session start, MemoryStore.Snapshot() composes the on-disk state into the system prompt once. Mid-session writes update on-disk + live tool responses but do NOT re-snapshot, keeping the prefix cache stable for the rest of the session.
Char-bounded (default MEMORY=2200, USER=1375; ~800 / ~500 tokens). On overflow, the tool returns ok=false with entries listing the current state so the agent can consolidate in the same turn.
Atomic writes (tempfile + rename). Minimal security scan blocks invisible/bidi-override unicode, the literal delimiter sequence, and authorized_keys substrings — not a full prompt-injection regex list (those are mostly performative).
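
A rough sketch of the add / replace semantics with a character budget follows. The result type and function names are hypothetical, entries are modeled as a plain string slice, and how affent actually counts characters (for example, whether delimiters count) is not specified here, so the counting rule is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// result mirrors the idea of an ok flag plus the current entries.
type result struct {
	OK      bool
	Entries []string
	Reason  string
}

// totalChars counts runes across all entries (assumed counting rule).
func totalChars(entries []string) int {
	n := 0
	for _, e := range entries {
		n += utf8.RuneCountInString(e)
	}
	return n
}

// addEntry appends text unless that would exceed maxChars; on overflow it
// returns ok=false together with the unchanged entries.
func addEntry(entries []string, text string, maxChars int) result {
	next := append(append([]string{}, entries...), text)
	if totalChars(next) > maxChars {
		return result{OK: false, Entries: entries, Reason: "over budget; consolidate existing entries"}
	}
	return result{OK: true, Entries: next}
}

// replaceEntry swaps the single entry containing oldText for newText.
// Zero matches or more than one match is an error: the substring must be unique.
func replaceEntry(entries []string, oldText, newText string, maxChars int) result {
	match := -1
	for i, e := range entries {
		if !strings.Contains(e, oldText) {
			continue
		}
		if match >= 0 {
			return result{OK: false, Entries: entries, Reason: "old_text matches more than one entry"}
		}
		match = i
	}
	if match < 0 {
		return result{OK: false, Entries: entries, Reason: "old_text matches no entry"}
	}
	next := append([]string{}, entries...)
	next[match] = newText
	if totalChars(next) > maxChars {
		return result{OK: false, Entries: entries, Reason: "over budget; consolidate existing entries"}
	}
	return result{OK: true, Entries: next}
}

func main() {
	notes := []string{"tests run with go test ./...", "staging DB is read-only"}
	r := replaceEntry(notes, "staging DB", "staging DB accepts writes", 2200)
	fmt.Println(r.OK, r.Entries)
	r = addEntry(r.Entries, strings.Repeat("x", 3000), 2200)
	fmt.Println(r.OK, r.Reason)
}
```

Returning the current entries on failure lets the caller consolidate without a separate read.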

Session search (session_search.go)

Registered as the session_search tool when BuiltinDeps.SessionsDir is set (affentctl wires it automatically). The agent searches its own past conversation logs in the workspace to answer questions like "did we discuss X", "what was that command", or "what did we conclude last week".
Term-overlap scoring over JSONL session logs; user + assistant messages only (system and tool results are skipped). The current session is excluded so the agent doesn't match its own in-flight turns.
Memory and session_search are complementary: memory holds compact facts always present in the system prompt; session_search returns full snippets on demand without paying per-turn token cost.
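
Term-overlap scoring is simple enough to sketch in a few lines. The snippet type, the tokenizer, and the tie-breaking order below are assumptions made for illustration; the real session_search may weight or normalize differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// snippet is a stand-in for one logged message (hypothetical type).
type snippet struct {
	Role string
	Text string
}

// terms lowercases s and splits it into a set of alphanumeric words.
func terms(s string) map[string]bool {
	out := map[string]bool{}
	words := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, w := range words {
		out[w] = true
	}
	return out
}

// score counts how many distinct query terms also occur in text.
func score(query, text string) int {
	q, t := terms(query), terms(text)
	n := 0
	for w := range q {
		if t[w] {
			n++
		}
	}
	return n
}

// search ranks user and assistant messages by overlap and returns at most
// limit hits; system and tool messages are skipped, as are zero-score ones.
func search(query string, history []snippet, limit int) []snippet {
	type scored struct {
		s snippet
		n int
	}
	var hits []scored
	for _, m := range history {
		if m.Role != "user" && m.Role != "assistant" {
			continue
		}
		if n := score(query, m.Text); n > 0 {
			hits = append(hits, scored{m, n})
		}
	}
	// Stable sort keeps log order among equally scored messages.
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].n > hits[j].n })
	if len(hits) > limit {
		hits = hits[:limit]
	}
	out := make([]snippet, len(hits))
	for i, h := range hits {
		out[i] = h.s
	}
	return out
}

func main() {
	history := []snippet{
		{"user", "how do I run the migration command?"},
		{"tool", "migration command output"},
		{"assistant", "Use make migrate to run the migration."},
	}
	for _, h := range search("migration command", history, 5) {
		fmt.Println(h.Role, "|", h.Text)
	}
}
```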

Persistence

Conversation: append-only JSONL chat log on disk, includes system prompt + user/assistant/tool messages with tool_call_id preserved for resume. Replace() rewrites atomically (used by the compactor after summarizing earlier turns).

Observability

13 SSE event types streamed on a single channel — see below.
affentctl --trace <path> mirrors every event into a JSONL file for replay / regression diffing (- for stdout, empty for stderr).
--trace-skip-deltas drops thinking.delta / message.delta from the trace (skipped events still consume sequence ids so consumers can tell what was filtered). Useful for batch eval / training where token-level replay isn't needed; the final text is in thinking.done / message.done regardless.

Layout

loop.go            agent loop (LLM <-> tools, streaming, reasoning, cancel)
llm.go             OpenAI-compat streaming client (incl. reasoning_content, watchdog, retry classification)
builtins.go        shell, read_file, write_file, edit_file, list_files (workspace-sandboxed)
project_context.go LoadProjectContext: read AGENTS.md / CONVENTIONS.md / .cursorrules / .clinerules / CLAUDE.md / GEMINI.md
memory.go          MemoryStore + FileMemoryStore (workspace MEMORY.md + user USER.md)
memory_tool.go     the single `memory` tool (action × target dispatch)
session_search.go  session_search tool: term-overlap retrieval over past JSONL session logs
compaction.go      Compactor interface + LLMSummaryCompactor (OpenHands V1 prompt)
conversation.go    JSONL-on-disk chat log, append-only + atomic Replace
tool.go            Tool + Registry
executor/          Executor interface + LocalExecutor (in-process)
sse/               canonical event type constants + payload structs
mcp/               stdio + streamable-http MCP client; Registry adapter
cmd/affentctl/     CLI: run / chat / sessions
extras/            opt-in helper packages, separate sub-modules
  web/             web_fetch + web_search (HTML→markdown, Tavily search default)

The extras/ directory holds opt-in helper packages as separate Go sub-modules. The root affent library does not import them; callers choose which extras to register, and consumers that don't import them don't pull their transitive deps.

SSE events (the wire contract)

Every loop emits these on Loop.Events. UIs, trace files, and tests all consume the same stream.

Naming: .done means a streaming accumulator is complete (more events for the same turn may still follow). .end is reserved for turn-level boundaries (no more events for that turn).

type	when
`turn.start`	user message accepted, turn starts
`user.message`	echoes the user's text (so SSE replays are full)
`thinking.delta`	model's reasoning channel, token by token
`thinking.done`	reasoning accumulation complete (full text)
`message.delta`	model's visible content, token by token
`message.done`	assistant message complete (full text)
`tool.request`	model called a tool — name + args
`tool.output`	(gateway-only today) live stdout/stderr stream
`tool.result`	tool finished — exit code + truncated preview
`file.changed`	filesystem mutation (gateway watcher hook)
`usage`	input / output token totals for the turn
`turn.end`	reason: `completed` / `cancelled` / `error`
`error`	transient (recoverable=true) or terminal failure
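
The .done versus .end distinction matters when draining the stream: message.done closes one accumulator, while turn.end closes the turn. A consumer might look like the sketch below. The Event struct is a simplified stand-in with assumed fields (Type, TurnID, Text); the real payload structs live in the sse package and may be shaped differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a simplified stand-in for an SSE event (hypothetical fields).
type Event struct {
	Type   string
	TurnID string
	Text   string
}

// collectTurn reads events until turn.end arrives for turnID and returns the
// last assistant message seen. Deltas are accumulated as they stream in;
// message.done carries the full text and replaces the accumulated deltas.
// ended is false if the channel closed before turn.end.
func collectTurn(events <-chan Event, turnID string) (reply string, ended bool) {
	var b strings.Builder
	for ev := range events {
		if ev.TurnID != turnID {
			continue // belongs to another turn
		}
		switch ev.Type {
		case "message.delta":
			b.WriteString(ev.Text)
		case "message.done":
			b.Reset()
			b.WriteString(ev.Text)
		case "tool.request":
			b.Reset() // the next assistant message starts a fresh accumulator
		case "turn.end":
			return b.String(), true
		}
	}
	return b.String(), false
}

func main() {
	ch := make(chan Event, 8)
	ch <- Event{Type: "turn.start", TurnID: "t1"}
	ch <- Event{Type: "message.delta", TurnID: "t1", Text: "hel"}
	ch <- Event{Type: "message.delta", TurnID: "t1", Text: "lo"}
	ch <- Event{Type: "message.done", TurnID: "t1", Text: "hello"}
	ch <- Event{Type: "turn.end", TurnID: "t1"}
	close(ch)
	fmt.Println(collectTurn(ch, "t1")) // hello true
}
```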

affentctl (the CLI)

affentctl run --prompt "..." --workspace ./task        # one-shot
affentctl chat --workspace ./task                       # REPL
affentctl sessions --workspace ./task                   # list past sessions

run and chat accept --session-id <id> or --continue to resume an existing conversation. Logs persist as JSONL under <workspace>/.affentctl/<session_id>.jsonl.

Flags

flag	default	env
`--config`		`AFFENTCTL_CONFIG`
`--workspace`	`./affent-workspace`
`--base-url`		`AFFENTCTL_BASE_URL`
`--api-key`		`AFFENTCTL_API_KEY`
`--model`		`AFFENTCTL_MODEL`
`--prompt`	(run) literal, `-` for stdin, `@file`
`--max-turns`	10
`--max-call-timeout`	3m
`--retry-transient`	3
`--retry-backoff`	4s (doubles each attempt)
`--trace`	stderr; `-` stdout, `<path>` JSONL file
`--trace-skip-deltas`	false (set true for batch eval — drop deltas)
`--system-prompt`	builtin (dev-box flavored); `-` / file / literal
`--quiet`	false
`--project-context`	true (auto-loads AGENTS.md / CLAUDE.md / etc.)
`--memory`	false (opt in)
`--memory-only`	false (implies `--memory`, forces `--project-context=false`, rejects `--mcp-config`)
`--memory-workspace-store`	`<workspace>/.affent/MEMORY.md`
`--memory-user-store`	`$XDG_CONFIG_HOME/affent/USER.md`
`--memory-max-chars`	`2200,1375` (MEM,USER)
`--session-id`	new session
`--continue`	resume newest session under `--workspace`
`--mcp-config`		`AFFENTCTL_MCP_CONFIG`
`--compact-trigger`	240 (matches OpenHands V1 max_size); 0 disables
`--compact-keep-last`	10

Project context

affentctl auto-loads recognized project knowledge files from --workspace and inlines them into the system prompt at session start. Files are user-authored and read-only; affent never writes to them. Recognized names (concatenated in this order if multiple exist):

AGENTS.md
CLAUDE.md
CONVENTIONS.md
.cursorrules
.clinerules
.clinerules.md
GEMINI.md

Default on. Disable with affentctl --project-context=false for runs that need a clean baseline.

Persistent memory

Memory is off by default so existing one-shot and dev-box workflows keep their current tool surface until the caller opts in.

# Real user: memory on. Agent persists notes in <workspace>/.affent/MEMORY.md
# and user profile in $XDG_CONFIG_HOME/affent/USER.md.
affentctl chat --workspace ./project --memory

# Controlled memory run: only the `memory` tool, no shell/file/MCP escape hatches.
affentctl run --memory-only --prompt @question.txt

Two stores: memory holds the agent's own notes (environment facts, project conventions, lessons learned) and travels with the workspace. user holds what the agent knows about the user (preferences, communication style) and crosses workspaces.

Each store has a character cap (default 2200 / 1375, ~800 / 500 tokens). On overflow the tool returns the current entries so the agent can consolidate without an extra read. The frozen snapshot goes into the system prompt at session start; mid-session writes don't re-snapshot, so the prefix cache stays stable.

Config file

--config FILE loads JSON configuration before building the loop. CLI flags override values from the config file.

{
  "workspace": "./task",
  "base_url": "https://api.openai.com/v1",
  "model": "gpt-4o-mini",
  "max_turns": 8,
  "trace_skip_deltas": true,
  "project_context": true,
  "memory": {
    "enabled": true,
    "only": false,
    "workspace_store": ".affent/MEMORY.md",
    "user_store": "",
    "max_chars": "2200,1375"
  },
  "compact": {
    "trigger": 240,
    "keep_last": 10
  }
}
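
The "flags override config" rule can be implemented with the standard flag package by applying only the flags that were explicitly set. This is a sketch under assumptions: fileConfig covers just two of the fields shown above, and resolve is a hypothetical helper, not affentctl's actual code.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
)

// fileConfig mirrors two fields from the JSON example above.
type fileConfig struct {
	Model    string `json:"model"`
	MaxTurns int    `json:"max_turns"`
}

// resolve layers three sources: built-in defaults, then the config file,
// then any flag the user actually passed on the command line.
func resolve(configJSON []byte, args []string) (fileConfig, error) {
	cfg := fileConfig{MaxTurns: 10} // built-in default
	if len(configJSON) > 0 {
		if err := json.Unmarshal(configJSON, &cfg); err != nil {
			return cfg, err
		}
	}
	fs := flag.NewFlagSet("affentctl", flag.ContinueOnError)
	model := fs.String("model", "", "model name")
	maxTurns := fs.Int("max-turns", 0, "max steps per user turn")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	// Visit only walks flags that were set, so unset flags never clobber
	// values that came from the config file.
	fs.Visit(func(f *flag.Flag) {
		switch f.Name {
		case "model":
			cfg.Model = *model
		case "max-turns":
			cfg.MaxTurns = *maxTurns
		}
	})
	return cfg, nil
}

func main() {
	raw := []byte(`{"model": "gpt-4o-mini", "max_turns": 8}`)
	cfg, err := resolve(raw, []string{"--model", "other-model"})
	fmt.Println(cfg.Model, cfg.MaxTurns, err) // other-model 8 <nil>
}
```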

MCP config

--mcp-config FILE plugs in any number of MCP servers; their tools are exposed alongside the builtins, namespaced <server>_<tool>. Each server picks its transport from whichever field is set: url for streamable-http, command for stdio. Setting both is an error.

{
  "servers": [
    {
      "name": "fs",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/task"]
    },
    {
      "name": "git",
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "/tmp/task"]
    },
    {
      "name": "verify",
      "url": "http://host.docker.internal:8123/mcp",
      "headers": {"X-Auth-Token": "secret"}
    }
  ]
}

headers (HTTP only) layers extra HTTP headers onto every request, which is useful for auth tokens and version pinning. Sessions are tracked via Mcp-Session-Id automatically.
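
A minimal sketch of reading this file and choosing a transport per server is shown below. The struct tags follow the JSON example above; pickTransport and toolName are hypothetical helpers, and treating "neither field set" as an error is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// server mirrors one entry of the "servers" array shown above.
type server struct {
	Name    string            `json:"name"`
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	URL     string            `json:"url"`
	Headers map[string]string `json:"headers"`
}

type config struct {
	Servers []server `json:"servers"`
}

// pickTransport chooses the transport from whichever field is set.
func pickTransport(s server) (string, error) {
	switch {
	case s.URL != "" && s.Command != "":
		return "", fmt.Errorf("server %q: set url or command, not both", s.Name)
	case s.URL != "":
		return "streamable-http", nil
	case s.Command != "":
		return "stdio", nil
	default:
		return "", fmt.Errorf("server %q: neither url nor command is set", s.Name)
	}
}

// toolName applies the <server>_<tool> namespacing.
func toolName(serverName, tool string) string {
	return serverName + "_" + tool
}

func main() {
	raw := []byte(`{"servers": [
		{"name": "fs", "command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/task"]},
		{"name": "verify", "url": "http://host.docker.internal:8123/mcp"}
	]}`)
	var cfg config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		panic(err)
	}
	for _, s := range cfg.Servers {
		t, err := pickTransport(s)
		fmt.Println(s.Name, t, err)
	}
	fmt.Println(toolName("fs", "read_file")) // fs_read_file
}
```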

Quickstart

go build -o /tmp/affentctl ./cmd/affentctl

/tmp/affentctl run \
  --workspace /tmp/task \
  --base-url  https://api.openai.com/v1 \
  --api-key   "$OPENAI_API_KEY" \
  --model     gpt-4o-mini \
  --prompt    "list files in /tmp/task and summarize what's there"

Embedding affent in your own program

import (
    "github.com/affinefoundation/affent"
    "github.com/affinefoundation/affent/executor"
    "github.com/affinefoundation/affent/sse"
)

// Optional persistent memory. nil = disabled (default).
mem := affent.NewFileMemoryStore("/tmp/task")

reg := affent.NewRegistry()
affent.RegisterBuiltins(reg, affent.BuiltinDeps{
    Executor:         executor.NewLocalExecutor("session-1", "/tmp/task"),
    HostWorkspaceDir: "/tmp/task",
    Memory:           mem, // registers the `memory` tool too
})

conv, _ := affent.NewConversation("/tmp/task", "session-1")
events := make(chan sse.Event, 256)

llm := affent.NewLLMClient(baseURL, apiKey, model)
loop := &affent.Loop{
    LLM:    llm,
    Tools:  reg,
    Conv:   conv,
    Events: events,
    Memory: mem, // composes MEMORY.md / USER.md into the system prompt at session start
    // Optional: shrink history when it grows beyond TriggerMsgs.
    Compactor: &affent.LLMSummaryCompactor{
        LLM:         llm,
        TriggerMsgs: 240,
        KeepFirst:   2,
        KeepLast:    10,
    },
    // MaxTurnSteps, PerCallTimeout, MaxTransientRetries are all
    // optional — zero falls back to the documented defaults.
}
_ = loop.EnsureSystemPrompt("")  // "" = use DefaultSystemPrompt

turnID, err := loop.SendUser(ctx, "list files and summarize")
// drain events until you see turn.end with matching turn_id

The default system prompt assumes a "dev box" environment (a /home/agent + /workspace bind-mounted into a container) and mentions schedule_* tools that the gateway registers. If you're embedding affent outside that environment, pass your own prompt to EnsureSystemPrompt.

Using extras/web

extras/web ships web_fetch and web_search as opt-in tools. It's a separate Go sub-module — go get github.com/affinefoundation/affent won't pull any HTML-processing or search-backend deps unless you also go get .../extras/web.

web_fetch runs the standard reader pipeline: go-shiori/go-readability (a Go port of Mozilla Readability that extracts the main article content and drops nav/header/footer/sidebar) → JohannesKaufmann/html-to-markdown (a CommonMark-oriented converter that handles bold/italic/lists/code/tables/links/images). We don't roll our own HTML processing.

web_search ships a SearchProvider interface with a Tavily-backed default; swap to Brave / SearXNG / an internal index by implementing the interface.

import (
    "github.com/affinefoundation/affent"
    affentweb "github.com/affinefoundation/affent/extras/web"
)

reg := affent.NewRegistry()
affent.RegisterBuiltins(reg, deps)

// Just the fetch tool — no external API key needed.
affentweb.RegisterFetch(reg, affentweb.FetchConfig{})

// Both fetch + search; default Tavily backend reads TAVILY_API_KEY.
affentweb.RegisterAll(reg, affentweb.Options{})

// Custom search provider (Brave, SearXNG, internal index, …):
tool, _ := affentweb.SearchTool(affentweb.SearchConfig{Provider: myProvider})
reg.Add(tool)

SearchProvider is the seam — implement Search(ctx, query, n) ([]SearchResult, error) and pass it via Options.SearchProvider or SearchConfig.Provider.

Status

Working:

native loop with reasoning streaming (thinking.delta + thinking.done) + cancel; transient-error retry with Retry-After honor + stream watchdog
multi-turn REPL + session resume
MCP stdio + streamable-http (spec rev 2025-03-26)
workspace path sandboxing in builtin file tools (relative + absolute, no sentinel hacks)
context compaction (OpenHands V1 LLMSummarizingCondenser prompt verbatim, rolling summary with [summary of earlier work] marker, proactive + reactive paths, tool-call boundary safety)
persistent memory: two-store FileMemoryStore (workspace MEMORY.md + user USER.md), single memory tool, frozen-snapshot system-prompt injection, char-bounded with overflow consolidation
monotonic event ids + --trace-skip-deltas for batch-eval traces
wireMessage strips reasoning_content from outbound requests (DeepSeek / Kimi / GLM compat)

Out of scope (intentional):

TodoWrite / structured task tracking — Claude Code-style policy, belongs in the embedder. affent provides the Registry mechanism; embedders register their own todo tool with whatever state-machine they prefer.
IDE integration / GUI — affent is server-side / batch / training shaped, not IDE-shaped.
OpenTelemetry traces — wrap externally if needed; the SSE event stream is the in-tree observability.
