From 61db2558b782fe8f0c99da60616f03dfcb237ab9 Mon Sep 17 00:00:00 2001
From: Patrick Gray <pgrayy@amazon.com>
Date: Thu, 2 Apr 2026 15:09:57 -0400
Subject: [PATCH 1/5] docs: add state machine design doc (0005)

---
 designs/0005-state-machine.md | 356 ++++++++++++++++++++++++++++++++++
 1 file changed, 356 insertions(+)
 create mode 100644 designs/0005-state-machine.md
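Note on the Steps section: the doc states that `Step` provides `invoke` derived from `stream`. The following is a minimal sketch of how that derivation might look, assuming `stream` is an async generator whose return value is the step result. The exact signatures, and the `DemoContext`, `DemoState`, `EchoResult`, and `EchoStep` names, are hypothetical and not part of the SDK.

```typescript
// Hypothetical sketch: the design names `Step`, `invoke`, and `stream`;
// the signatures and the Demo*/Echo* types below are assumptions.
abstract class Step<TContext, TState, TEvent, TResult> {
  abstract readonly name: string

  // Subclasses implement only `stream`: yield events, then return a result.
  abstract stream(ctx: TContext, state: TState): AsyncGenerator<TEvent, TResult, undefined>

  // Request/response form: drain the stream, drop the events, keep the result.
  async invoke(ctx: TContext, state: TState): Promise<TResult> {
    const gen = this.stream(ctx, state)
    while (true) {
      const next = await gen.next()
      if (next.done) return next.value
    }
  }
}

interface DemoContext {
  readonly prefix: string
}

interface DemoState {
  messages: string[]
}

interface EchoResult {
  type: 'echo'
  count: number
}

// Toy step: yields one event per message and returns a typed summary.
class EchoStep extends Step<DemoContext, DemoState, string, EchoResult> {
  readonly name = 'echo'

  async *stream(ctx: DemoContext, state: DemoState): AsyncGenerator<string, EchoResult, undefined> {
    for (const message of state.messages) {
      yield `${ctx.prefix}${message}`
    }
    return { type: 'echo', count: state.messages.length }
  }
}
```

Callers that want events iterate `stream`; callers that only want the result call `invoke`, which consumes the same generator and discards what it yields.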

diff --git a/designs/0005-state-machine.md b/designs/0005-state-machine.md
new file mode 100644
index 000000000..d8539e60f
--- /dev/null
+++ b/designs/0005-state-machine.md
@@ -0,0 +1,356 @@
+# Strands: State Machine
+
+**Status**: Proposed
+
+**Date**: 2026-03-31
+
+## Overview
+
+This design restructures the Agent loop into discrete steps coordinated by an orchestrator. Today, the Agent class implements its loop in a single `_stream()` method that handles model calls, tool execution, structured output, telemetry, and routing together. Decomposing this into steps simplifies adding new steps, applying cross-cutting concerns uniformly, handling non-linear flow (interrupts, cancellation, async model polling), and checkpointing progress. The public API (`agent.invoke()`, `agent.stream()`, hooks) does not change.
+
+This design is a mental model as much as it is an implementation plan. The interfaces and layers don't need to be adopted wholesale; they can be applied incrementally. Even where we don't formalize them in code, this framing can help guide decisions about where new behavior belongs and how to keep the codebase organized as it grows.
+
+## Solution
+
+The agent loop is decomposed into five layers:
+
+- **Clients**: the I/O boundary (e.g., Model, Tool)
+- **Steps**: discrete units of work that use clients and produce typed results
+- **Middleware**: wraps steps with cross-cutting concerns (e.g., telemetry, checkpointing)
+- **Plugins**: register hook callbacks to observe and indirectly influence execution (e.g., cancel, retry)
+- **Orchestrators**: coordinate steps, handle routing, and can nest other orchestrators
+
+Steps and orchestrators share the same `invoke`/`stream` interface, enabling nesting and uniform wrapping. All layers operate on shared **state** (mutable invocation data) and **context** (read-only dependencies).
+
+### State and Context
+
+All layers receive state and context explicitly, giving them a clear, bounded data contract rather than reaching into the Agent instance for what they need.
+
+`AgentState` holds all mutable per-invocation data:
+
+```typescript
+interface AgentState {
+  messages: Message[]
+  metrics: AgentMetric[]
+  traces: AgentTrace[]
+
+  // Sub-state objects
+  interrupt?: InterruptState
+  app: StateStore  // user-facing key-value state
+
+  // Intra-loop temporaries (step-to-step communication)
+  lastModelResult?: StreamAggregatedResult
+  structuredOutputChoice?: ToolChoice
+  ...
+}
+```
+
+`AgentContext` holds read-only dependencies:
+
+```typescript
+interface AgentContext {
+  readonly model: Model
+  readonly toolRegistry: ToolRegistry
+  readonly systemPrompt?: SystemPrompt
+  readonly tracer: Tracer
+  readonly meter: Meter
+  readonly pluginRegistry: PluginRegistry
+  readonly name: string
+  readonly id: string
+}
+```
+
+See [0002-isolated-state](https://github.com/strands-agents/docs/pull/551) for a complementary proposal on AgentState lifecycle management (creation, persistence, invocation keys).
+
+### Clients
+
+The I/O boundary. Unchanged from today. Examples:
+
+| Client | What it does |
+|--------|-------------|
+| `Model` | Sends messages to an LLM, streams back a response |
+| `Tool` | Executes a single tool, streams progress |
+
+Clients are stateless, reusable, and unaware of the agent loop.
+
+### Steps
+
+`Step` is a generic base class for the smallest unit of work in the loop. It provides `invoke` (request/response) derived from `stream` (yields events, returns a result). Subclasses only implement `stream`. For the agent loop, steps extend `AgentStep`, which fills in the shared context and state types:
+
+```typescript
+type AgentStep<TEvent, TResult> = Step<AgentContext, AgentState, TEvent, TResult>
+```
+
+Steps write their full results into state (that's how data flows between steps). The `TResult` return value is a typed convenience that surfaces the notable parts, giving the orchestrator direct, namespaced access without digging through state fields. Here are two examples:
+
+**ModelStep**: calls the LLM, yields streaming events, and returns the stop reason and message.
+
+```typescript
+class ModelStep extends AgentStep<ModelStreamEvent, ModelStepResult> {
+  readonly name = 'model'
+
+  async *stream(ctx, state) {
+    const result = yield* ctx.model.streamAggregated(
+      state.messages,
+      this._buildStreamOptions(ctx, state)
+    )
+    state.lastModelResult = result
+    return { type: 'model', stopReason: result.stopReason, message: result.message }
+  }
+}
+```
+
+**ToolStep**: runs a single tool, yields progress events, and returns the tool result.
+
+```typescript
+class ToolStep extends AgentStep<ToolStreamEvent, ToolStepResult> {
+  readonly name = 'tool'
+
+  async *stream(ctx, state) {
+    const toolUse = state.currentToolUse!
+    const tool = ctx.toolRegistry.get(toolUse.name)
+    if (!tool) {
+      return { type: 'tool', result: this._errorResult(toolUse, 'not found') }
+    }
+    const result = yield* tool.stream({ toolUse, agent: state })
+    return { type: 'tool', result }
+  }
+}
+```
+
+### Middleware
+
+Middleware sits between the orchestrator and a step, wrapping the step's `stream` method with additional behavior. It directly controls execution: it can intercept, skip, retry, or transform the step's result. This is useful for cross-cutting concerns (behavior that applies uniformly across multiple steps, like telemetry or checkpointing) without duplicating logic in each step.
+
+There are two kinds:
+
+**Built-in middleware** ships with the SDK and is always present. It's configured through state or context at runtime. One possible way to manage built-in middleware is via decorator syntax (`@`) on step class methods, though the exact mechanism is an implementation detail. Examples:
+
+| Middleware | What it does |
+|-----------|-------------|
+| `@traced` | Creates a telemetry span around the step, records result or error |
+| `@retryable` | Retries the step on transient errors with configurable backoff |
+
+**Custom middleware** is user-provided via the `middleware` param on the Agent constructor. It implements the `Middleware` interface:
+
+```typescript
+interface Middleware {
+  wrap(step: Step): Step
+}
+```
+
+Example: a rate limiter that throttles step execution.
+
+```typescript
+class RateLimiter implements Middleware {
+  constructor(private _maxPerSecond: number) {}
+
+  wrap(step: Step): Step {
+    return {
+      ...step,
+      async *stream(ctx, state) {
+        await this._acquireToken()
+        return yield* step.stream(ctx, state)
+      },
+    }
+  }
+}
+
+const agent = new Agent({
+  middleware: [new RateLimiter(10)], // constructor takes maxPerSecond as a number
+})
+```
+
+### Plugins
+
+Plugins register hook callbacks to observe and indirectly influence step execution. The SDK fires lifecycle events (e.g., `BeforeModelCallEvent`, `AfterToolCallEvent`) at the appropriate points, and plugin callbacks react to them by setting flags like `retry` or `cancel` that the step or middleware responds to.
+
+This is the existing hook system, unchanged by this design.
+
+```typescript
+const agent = new Agent({
+  plugins: [myLoggingPlugin, myAnalyticsPlugin],
+})
+```
+
+### Orchestrators
+
+`Orchestrator` is a generic base class that coordinates steps and other orchestrators. Like `Step`, it provides `invoke` derived from `stream`. Orchestrators can nest: a parent orchestrator treats a sub-orchestrator the same as a step.
+
+For the agent loop, orchestrators extend `AgentOrchestrator`:
+
+```typescript
+type AgentOrchestrator = Orchestrator<AgentContext, AgentState, AgentStreamEvent>
+```
+
+**ToolOrchestrator**: runs `ToolStep` for each tool use block.
+
+```typescript
+class ToolOrchestrator extends AgentOrchestrator {
+  async *stream(ctx, state) {
+    const toolUseBlocks = this._extractToolUseBlocks(state)
+    for (const block of toolUseBlocks) {
+      yield* this._toolStep.stream(ctx, { ...state, currentToolUse: block })
+    }
+    return { type: 'tools' }
+  }
+}
+```
+
+**Agent**: the top-level orchestrator. Agent follows the orchestrator pattern internally but doesn't extend `Orchestrator` directly, since its public `stream` method takes `InvokeArgs` rather than `(ctx, state)` for backwards compatibility. It creates the context and state, then runs the loop.
+
+```typescript
+class Agent {
+  async *stream(args: InvokeArgs) {
+    const ctx = this._buildContext()
+    const state = this._buildState(args)
+
+    while (true) {
+      const result = yield* this._model.stream(ctx, state)
+
+      if (result.stopReason !== 'toolUse') {
+        return { type: 'done', result: this._buildResult(state) }
+      }
+
+      yield* this._toolOrchestrator.stream(ctx, state)
+    }
+  }
+}
+```
+
+The public API does not change: `agent.invoke()`, `agent.stream()`, `agent.addHook()`, `agent.messages`, and `agent.appState` all work as before.
+
+The full structure:
+
+```
+Agent (Orchestrator)
+├── ModelStep (Step)
+└── ToolOrchestrator (Sub-orchestrator)
+    ├── ToolStep (Step)
+    ├── ToolStep (Step)
+    └── ToolStep (Step)
+```
+
+
+## Capabilities
+
+The step/orchestrator decomposition enables several capabilities that benefit from discrete, well-bounded execution units.
+
+### Cross-Cutting Middleware
+
+Middleware applies behavior uniformly across steps without each step needing to know about it. Guardrails are a good example: a single middleware can evaluate the result of any step and decide whether to block or let it through.
+
+```typescript
+class GuardrailMiddleware implements Middleware {
+  constructor(private _evaluate: (result: unknown) => 'pass' | 'block') {}
+
+  wrap(step: Step): Step {
+    return {
+      ...step,
+      async *stream(ctx, state) {
+        const result = yield* step.stream(ctx, state)
+        if (this._evaluate(result) === 'block') {
+          throw new GuardrailError('blocked by guardrail')
+        }
+        return result
+      },
+    }
+  }
+}
+
+const agent = new Agent({
+  middleware: [new GuardrailMiddleware(myEvaluator)],
+})
+```
+
+Multiple middleware compose naturally. A guardrail, a rate limiter, and a custom logger can each be separate middleware applied to every step, rather than one monolithic wrapper or duplicated logic inside each step.
+
+### Checkpointing
+
+Because the agent loop is composed of discrete steps, the orchestrator can return after each step with a checkpoint token that records the current position. The caller reinvokes with that token to resume from where it left off. When checkpointing is not enabled, the loop runs normally.
+
+```typescript
+class Agent {
+  private _steps = [this._modelStep, this._toolOrchestrator]
+
+  async *stream(args: InvokeArgs) {
+    const ctx = this._buildContext()
+    const state = args.checkpoint?.state ?? this._buildState(args)
+    let stepIndex = args.checkpoint?.stepIndex ?? 0
+
+    while (true) {
+      const step = this._steps[stepIndex]
+      const result = yield* step.stream(ctx, state)
+
+      if (result.stopReason === 'done') {
+        return { type: 'done', result: this._buildResult(state) }
+      }
+
+      stepIndex = (stepIndex + 1) % this._steps.length
+
+      if (ctx.checkpointing) {
+        return { type: 'checkpoint', checkpoint: { stepIndex, state } }
+      }
+    }
+  }
+}
+```
+
+The checkpoint token is small: just a step index and the state, which needs to be serializable for the token to be persisted. The caller drives the loop externally:
+
+```typescript
+let result = await agent.invoke({ prompt: 'Hello', checkpointing: true })
+
+while (result.type === 'checkpoint') {
+  // persist state, hand off to another system, sleep, etc.
+  result = await agent.invoke({ checkpoint: result.checkpoint })
+}
+```
+
+This pattern enables durable execution with systems like [Temporal](https://temporal.io/), where each step becomes a separate Activity cached in Temporal's Event History. On crash recovery, completed steps replay from cache and the loop resumes from the last incomplete step. See the [checkpoint mode prototype](https://github.com/strands-agents/sdk-typescript/compare/main...pgrayy:strands-sdk-typescript:prototype/checkpoint-mode?expand=1) for a working reference implementation.
+
+### Sub-Orchestration
+
+Because orchestrators and steps share the same `invoke`/`stream` interface, any slot in the step sequence can be a sub-orchestrator that coordinates its own steps internally. The agent loop doesn't distinguish between the two.
+
+Tool execution is one example. The default `ToolOrchestrator` runs tools sequentially, but swapping in a `ConcurrentToolOrchestrator` changes the execution strategy without touching `ToolStep` or the agent loop:
+
+```typescript
+const agent = new Agent({
+  toolOrchestrator: new ConcurrentToolOrchestrator({ maxConcurrency: 3 }),
+})
+```
+
+The `ToolOrchestrator` is itself composed of `ToolStep` instances. From the agent loop's perspective, it's just another entry in the step sequence that happens to run sub-steps internally.
+
+### Isolated Invocation State
+
+Each invocation gets its own `AgentState` instance. Steps receive state explicitly, so concurrent invocations on the same agent don't share mutable data:
+
+```typescript
+// Each invocation creates its own state
+const [result1, result2] = await Promise.all([
+  agent.invoke({ prompt: 'Summarize this document' }),
+  agent.invoke({ prompt: 'Translate this to French' }),
+])
+// result1 and result2 operated on separate AgentState instances
+```
+
+The agent's context (model, tools, configuration) is shared and read-only. The state (messages, metrics, traces) is per-invocation. Steps don't reach into the agent instance for what they need, they operate on the state they're given. This is the same state/context split proposed in [0002-isolated-state](https://github.com/strands-agents/docs/pull/551), which we discussed in a previous meeting.
+
+## Guidelines
+
+When deciding where new behavior belongs:
+
+| Layer | Need | Role | Example |
+|-------|------|------|---------|
+| Client | External I/O | Talks to an external system | Model, Tool |
+| Step | Unit of work | Performs one discrete task in the loop | ModelStep, ToolStep |
+| Middleware | Wrapping | Intercepts, skips, retries, or transforms a step | `@traced`, `@retryable`, guardrails |
+| Plugin | Observation | Reacts to lifecycle events, signals intent via flags | Logging, cancel/retry via event flags |
+| Orchestrator | Coordination | Decides which steps run and in what order | ToolOrchestrator, Agent |
+
+## Resources
+
+- [0002-isolated-state](https://github.com/strands-agents/docs/pull/551): complementary proposal for state lifecycle management
+- [Durable Execution Provider Integration](https://github.com/strands-agents/docs/pull/584): durable execution proposal that this design enables

From 50c2184a7ecccfc3dbb55ac59b153c18d1facf42 Mon Sep 17 00:00:00 2001
From: Patrick Gray <pgrayy@amazon.com>
Date: Fri, 3 Apr 2026 12:43:13 -0400
Subject: [PATCH 2/5] docs: update cross-cutting middleware example to tracing

---
 designs/0005-state-machine.md | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)
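Note on "multiple middleware compose naturally": the doc does not specify how a list of middleware is applied to a step. One possible approach, sketched here under the assumption that the first middleware in the list ends up outermost, is to fold the list over the step. `applyMiddleware`, `LabelMiddleware`, and `drain` are hypothetical names; only `Middleware.wrap(step)` comes from the design. The wrapper captures its fields in locals rather than relying on `this`, because `this` inside the returned object's `stream` would refer to that object and not to the middleware instance.

```typescript
// Hypothetical minimal shapes: only `Middleware.wrap(step)` is from the design.
interface Step<TContext, TState> {
  readonly name: string
  stream(ctx: TContext, state: TState): AsyncGenerator<string, string, undefined>
}

interface Middleware<TContext, TState> {
  wrap(step: Step<TContext, TState>): Step<TContext, TState>
}

// Records entry and exit around the wrapped step, including when it throws.
class LabelMiddleware<TContext, TState> implements Middleware<TContext, TState> {
  constructor(private readonly label: string, private readonly log: string[]) {}

  wrap(step: Step<TContext, TState>): Step<TContext, TState> {
    // Capture fields in locals so the returned object does not depend on `this`.
    const { label, log } = this
    return {
      name: step.name,
      async *stream(ctx: TContext, state: TState): AsyncGenerator<string, string, undefined> {
        log.push(`${label}:enter`)
        try {
          return yield* step.stream(ctx, state)
        } finally {
          log.push(`${label}:exit`)
        }
      },
    }
  }
}

// Assumed ordering: the first middleware in the list becomes the outermost wrapper.
function applyMiddleware<TContext, TState>(
  step: Step<TContext, TState>,
  middleware: Middleware<TContext, TState>[],
): Step<TContext, TState> {
  return middleware.reduceRight((inner, m) => m.wrap(inner), step)
}

// Drains a generator and resolves with its return value.
async function drain<TEvent, TResult>(gen: AsyncGenerator<TEvent, TResult, undefined>): Promise<TResult> {
  while (true) {
    const next = await gen.next()
    if (next.done) return next.value
  }
}

const log: string[] = []
const base: Step<null, { input: string }> = {
  name: 'upper',
  async *stream(_ctx: null, state: { input: string }): AsyncGenerator<string, string, undefined> {
    yield 'working'
    return state.input.toUpperCase()
  },
}
const wrapped = applyMiddleware(base, [
  new LabelMiddleware<null, { input: string }>('outer', log),
  new LabelMiddleware<null, { input: string }>('inner', log),
])
```

Draining `wrapped.stream(null, { input: 'hi' })` enters `outer` before `inner` and exits them in reverse order, which is the usual onion-style layering for middleware.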

diff --git a/designs/0005-state-machine.md b/designs/0005-state-machine.md
index d8539e60f..c38120f0c 100644
--- a/designs/0005-state-machine.md
+++ b/designs/0005-state-machine.md
@@ -238,32 +238,37 @@ The step/orchestrator decomposition enables several capabilities that benefit fr
 
 ### Cross-Cutting Middleware
 
-Middleware applies behavior uniformly across steps without each step needing to know about it. Guardrails are a good example: a single middleware can evaluate the result of any step and decide whether to block or let it through.
+Middleware applies behavior uniformly across steps without each step needing to know about it. Tracing is a good example: a single middleware can create a telemetry span around any step, record its result or error, and emit metrics, all without touching the step's implementation.
 
 ```typescript
-class GuardrailMiddleware implements Middleware {
-  constructor(private _evaluate: (result: unknown) => 'pass' | 'block') {}
-
+class TracingMiddleware implements Middleware {
   wrap(step: Step): Step {
     return {
       ...step,
       async *stream(ctx, state) {
-        const result = yield* step.stream(ctx, state)
-        if (this._evaluate(result) === 'block') {
-          throw new GuardrailError('blocked by guardrail')
+        const span = ctx.tracer.startSpan(step.name)
+        try {
+          const result = yield* step.stream(ctx, state)
+          span.setStatus('ok')
+          return result
+        } catch (error) {
+          span.recordException(error)
+          span.setStatus('error')
+          throw error
+        } finally {
+          span.end()
         }
-        return result
       },
     }
   }
 }
 
 const agent = new Agent({
-  middleware: [new GuardrailMiddleware(myEvaluator)],
+  middleware: [new TracingMiddleware()],
 })
 ```
 
-Multiple middleware compose naturally. A guardrail, a rate limiter, and a custom logger can each be separate middleware applied to every step, rather than one monolithic wrapper or duplicated logic inside each step.
+Every step (model calls, tool calls, sub-orchestrators) gets traced with the same logic. Multiple middleware compose naturally: tracing, a rate limiter, and a guardrail can each be separate middleware applied to every step, rather than duplicated logic inside each one.
 
 ### Checkpointing
 

From d0f1bc8bf31d5c9cafaaaa64035c296f5359ad44 Mon Sep 17 00:00:00 2001
From: Patrick Gray <pgrayy@amazon.com>
Date: Fri, 3 Apr 2026 12:55:09 -0400
Subject: [PATCH 3/5] docs: update cross-cutting middleware example to caching

---
 designs/0005-state-machine.md | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)
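Note on the caching example: a truthiness check on the cached value treats legitimately cached falsy results (`0`, `''`, `false`) as misses, and `this` inside the returned object's `stream` does not point at the middleware instance. A possible variant that avoids both is sketched below. It assumes a caller-supplied `buildKey` function, since the doc leaves key construction open; `lengthStep`, `cachedStep`, `runTwice`, and `drain` are hypothetical names used only for the demonstration.

```typescript
// Hypothetical minimal shape: only `wrap(step)` is from the design.
interface Step<TContext, TState, TResult> {
  readonly name: string
  stream(ctx: TContext, state: TState): AsyncGenerator<string, TResult, undefined>
}

class CacheMiddleware<TContext, TState, TResult> {
  private readonly cache = new Map<string, TResult>()

  // `buildKey` is an assumed, caller-supplied function.
  constructor(private readonly buildKey: (stepName: string, state: TState) => string) {}

  wrap(step: Step<TContext, TState, TResult>): Step<TContext, TState, TResult> {
    // Capture fields in locals so the returned object does not depend on `this`.
    const { cache, buildKey } = this
    return {
      name: step.name,
      async *stream(ctx: TContext, state: TState): AsyncGenerator<string, TResult, undefined> {
        const key = buildKey(step.name, state)
        // `has` rather than a truthiness test, so cached 0, '' or false count as hits.
        if (cache.has(key)) {
          return cache.get(key) as TResult
        }
        const result = yield* step.stream(ctx, state)
        cache.set(key, result)
        return result
      },
    }
  }
}

// Drains a generator and resolves with its return value.
async function drain<TEvent, TResult>(gen: AsyncGenerator<TEvent, TResult, undefined>): Promise<TResult> {
  while (true) {
    const next = await gen.next()
    if (next.done) return next.value
  }
}

let calls = 0
const lengthStep: Step<null, { text: string }, number> = {
  name: 'length',
  async *stream(_ctx: null, state: { text: string }): AsyncGenerator<string, number, undefined> {
    calls += 1
    yield 'measuring'
    return state.text.length
  },
}

const cachedStep = new CacheMiddleware<null, { text: string }, number>(
  (stepName, state) => `${stepName}:${state.text}`,
).wrap(lengthStep)

// Runs the wrapped step twice with the same input; the result (0) is falsy.
async function runTwice(): Promise<number[]> {
  const first = await drain(cachedStep.stream(null, { text: '' }))
  const second = await drain(cachedStep.stream(null, { text: '' }))
  return [first, second]
}
```

Note that a cache hit in this sketch returns the result without replaying any events, so consumers of `stream` see an empty event sequence on hits. Whether that is acceptable depends on what downstream consumers expect.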

diff --git a/designs/0005-state-machine.md b/designs/0005-state-machine.md
index c38120f0c..be42a2e6c 100644
--- a/designs/0005-state-machine.md
+++ b/designs/0005-state-machine.md
@@ -238,37 +238,36 @@ The step/orchestrator decomposition enables several capabilities that benefit fr
 
 ### Cross-Cutting Middleware
 
-Middleware applies behavior uniformly across steps without each step needing to know about it. Tracing is a good example: a single middleware can create a telemetry span around any step, record its result or error, and emit metrics, all without touching the step's implementation.
+Middleware applies behavior uniformly across steps without each step needing to know about it. Caching is a good example: a middleware can check for a cached result before a step runs and store the result after it completes, without any step being aware of the cache.
 
 ```typescript
-class TracingMiddleware implements Middleware {
+class CacheMiddleware implements Middleware {
+  constructor(private _cache: Map<string, unknown> = new Map()) {}
+
   wrap(step: Step): Step {
     return {
       ...step,
       async *stream(ctx, state) {
-        const span = ctx.tracer.startSpan(step.name)
-        try {
-          const result = yield* step.stream(ctx, state)
-          span.setStatus('ok')
-          return result
-        } catch (error) {
-          span.recordException(error)
-          span.setStatus('error')
-          throw error
-        } finally {
-          span.end()
+        const key = this._buildKey(step.name, state)
+        const cached = this._cache.get(key)
+        if (cached) {
+          return cached
         }
+
+        const result = yield* step.stream(ctx, state)
+        this._cache.set(key, result)
+        return result
       },
     }
   }
 }
 
 const agent = new Agent({
-  middleware: [new TracingMiddleware()],
+  middleware: [new CacheMiddleware()],
 })
 ```
 
-Every step (model calls, tool calls, sub-orchestrators) gets traced with the same logic. Multiple middleware compose naturally: tracing, a rate limiter, and a guardrail can each be separate middleware applied to every step, rather than duplicated logic inside each one.
+Every step (model calls, tool calls, sub-orchestrators) gets the same caching logic. Multiple middleware compose naturally: a cache, a rate limiter, and a guardrail can each be separate middleware applied to every step, rather than duplicated logic inside each one.
 
 ### Checkpointing
 

From ea5c55f4c6ffdf9f0b7cef2e4c81a2b6dcd44054 Mon Sep 17 00:00:00 2001
From: Patrick Gray <pgrayy@amazon.com>
Date: Thu, 9 Apr 2026 10:24:06 -0400
Subject: [PATCH 4/5] refactor: merge AgentContext into AgentState, update code
 examples and prose

---
 designs/0005-state-machine.md | 87 +++++++++++++++++------------------
 1 file changed, 42 insertions(+), 45 deletions(-)
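Note on the merged state: with `AgentContext` folded into `AgentState`, every layer takes a single `state` argument that carries both dependencies and execution data. The sketch below illustrates that shape end to end with stand-in types. Steps are modeled as plain generator functions for brevity, and `DemoState`, `ModelReply`, the fake `model`, and the `tools` map are hypothetical; only the overall loop structure (model step, stop-reason check, tool orchestration) follows the doc.

```typescript
// Hypothetical stand-ins for illustration; not the SDK's types.
type StopReason = 'toolUse' | 'endTurn'

interface ModelReply {
  stopReason: StopReason
  text: string
}

interface DemoState {
  // Dependencies
  model: (messages: string[]) => ModelReply
  tools: Record<string, () => string>
  // Execution data
  messages: string[]
  lastModelResult?: ModelReply
}

// Calls the model, writes the full result into state, returns the notable parts.
async function* modelStep(state: DemoState): AsyncGenerator<string, ModelReply, undefined> {
  const reply = state.model(state.messages)
  yield `model:${reply.stopReason}`
  state.lastModelResult = reply
  state.messages.push(reply.text)
  return reply
}

// Runs the tool named by the last model reply and appends its output.
async function* toolOrchestrator(state: DemoState): AsyncGenerator<string, { type: 'tools' }, undefined> {
  const toolName = state.lastModelResult?.text ?? ''
  const tool = state.tools[toolName]
  const output = tool ? tool() : `tool not found: ${toolName}`
  yield `tool:${toolName}`
  state.messages.push(output)
  return { type: 'tools' }
}

// Top-level loop: alternate model and tools until the model stops asking for tools.
async function* agentLoop(state: DemoState): AsyncGenerator<string, string[], undefined> {
  while (true) {
    const result = yield* modelStep(state)
    if (result.stopReason !== 'toolUse') {
      return state.messages
    }
    yield* toolOrchestrator(state)
  }
}

const demoState: DemoState = {
  // Fake model: asks for the `clock` tool once, then finishes.
  model: (messages) =>
    messages.length === 1
      ? { stopReason: 'toolUse', text: 'clock' }
      : { stopReason: 'endTurn', text: 'done' },
  tools: { clock: () => '12:00' },
  messages: ['what time is it?'],
}

// Collects the streamed events and the final message list.
async function run(): Promise<{ events: string[]; messages: string[] }> {
  const events: string[] = []
  const gen = agentLoop(demoState)
  while (true) {
    const next = await gen.next()
    if (next.done) return { events, messages: next.value }
    events.push(next.value)
  }
}
```

One consequence worth keeping in mind: because dependencies now live on the same object as execution data, anything that copies or persists state (for example a checkpoint) also has to decide what to do with the dependency fields.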

diff --git a/designs/0005-state-machine.md b/designs/0005-state-machine.md
index be42a2e6c..8aab580f4 100644
--- a/designs/0005-state-machine.md
+++ b/designs/0005-state-machine.md
@@ -20,21 +20,30 @@ The agent loop is decomposed into five layers:
 - **Plugins**: register hook callbacks to observe and indirectly influence execution (e.g., cancel, retry)
 - **Orchestrators**: coordinate steps, handle routing, and can nest other orchestrators
 
-Steps and orchestrators share the same `invoke`/`stream` interface, enabling nesting and uniform wrapping. All layers operate on shared **state** (mutable invocation data) and **context** (read-only dependencies).
+Steps and orchestrators share the same `invoke`/`stream` interface, enabling nesting and uniform wrapping. All layers operate on shared **state** passed explicitly to each layer.
 
-### State and Context
+### State
 
-All layers receive state and context explicitly, giving them a clear, bounded data contract rather than reaching into the Agent instance for what they need.
+All layers receive state explicitly, giving them a clear, bounded data contract rather than reaching into the Agent instance for what they need.
 
-`AgentState` holds all mutable per-invocation data:
+`AgentState` holds all per-invocation data:
 
 ```typescript
 interface AgentState {
+  // Dependencies
+  model: Model
+  toolRegistry: ToolRegistry
+  systemPrompt?: SystemPrompt
+  tracer: Tracer
+  meter: Meter
+  pluginRegistry: PluginRegistry
+  name: string
+  id: string
+
+  // Execution data
   messages: Message[]
   metrics: AgentMetric[]
   traces: AgentTrace[]
-
-  // Sub-state objects
   interrupt?: InterruptState
   app: StateStore  // user-facing key-value state
 
@@ -45,22 +54,7 @@ interface AgentState {
 }
 ```
 
-`AgentContext` holds read-only dependencies:
-
-```typescript
-interface AgentContext {
-  readonly model: Model
-  readonly toolRegistry: ToolRegistry
-  readonly systemPrompt?: SystemPrompt
-  readonly tracer: Tracer
-  readonly meter: Meter
-  readonly pluginRegistry: PluginRegistry
-  readonly name: string
-  readonly id: string
-}
-```
-
-See [0002-isolated-state](https://github.com/strands-agents/docs/pull/551) for a complementary proposal on AgentState lifecycle management (creation, persistence, invocation keys).
+See [0002-isolated-state](https://github.com/strands-agents/docs/pull/551) for the complete proposal on AgentState lifecycle management (creation, persistence, invocation keys).
 
 ### Clients
 
@@ -75,10 +69,10 @@ Clients are stateless, reusable, and unaware of the agent loop.
 
 ### Steps
 
-`Step` is a generic base class for the smallest unit of work in the loop. It provides `invoke` (request/response) derived from `stream` (yields events, returns a result). Subclasses only implement `stream`. For the agent loop, steps extend `AgentStep`, which fills in the shared context and state types:
+`Step` is a generic base class for the smallest unit of work in the loop. It provides `invoke` (request/response) derived from `stream` (yields events, returns a result). Subclasses only implement `stream`. For the agent loop, steps extend `AgentStep`, which fills in the state type:
 
 ```typescript
-type AgentStep<TEvent, TResult> = Step<AgentContext, AgentState, TEvent, TResult>
+abstract class AgentStep<TEvent, TResult> extends Step<AgentState, TEvent, TResult> {}
 ```
 
 Steps write their full results into state (that's how data flows between steps). The `TResult` return value is a typed convenience that surfaces the notable parts, giving the orchestrator direct, namespaced access without digging through state fields. Here are two examples:
@@ -89,10 +83,10 @@ Steps write their full results into state (that's how data flows between steps).
 class ModelStep extends AgentStep<ModelStreamEvent, ModelStepResult> {
   readonly name = 'model'
 
-  async *stream(ctx, state) {
-    const result = yield* ctx.model.streamAggregated(
+  async *stream(state) {
+    const result = yield* state.model.streamAggregated(
       state.messages,
-      this._buildStreamOptions(ctx, state)
+      this._buildStreamOptions(state)
     )
     state.lastModelResult = result
     return { type: 'model', stopReason: result.stopReason, message: result.message }
@@ -106,9 +100,9 @@ class ModelStep extends AgentStep<ModelStreamEvent, ModelStepResult> {
 class ToolStep extends AgentStep<ToolStreamEvent, ToolStepResult> {
   readonly name = 'tool'
 
-  async *stream(ctx, state) {
+  async *stream(state) {
     const toolUse = state.currentToolUse!
-    const tool = ctx.toolRegistry.get(toolUse.name)
+    const tool = state.toolRegistry.get(toolUse.name)
     if (!tool) {
       return { type: 'tool', result: this._errorResult(toolUse, 'not found') }
     }
@@ -124,7 +118,7 @@ Middleware sits between the orchestrator and a step, wrapping the step's `stream
 
 There are two kinds:
 
-**Built-in middleware** ships with the SDK and is always present. It's configured through state or context at runtime. One possible way to manage built-in middleware is via decorator syntax (`@`) on step class methods, though the exact mechanism is an implementation detail. Examples:
+**Built-in middleware** ships with the SDK and is always present. It's configured through state at runtime. One possible way to manage built-in middleware is via decorator syntax (`@`) on step class methods, though the exact mechanism is an implementation detail. Examples:
 
 | Middleware | What it does |
 |-----------|-------------|
@@ -148,9 +142,9 @@ class RateLimiter implements Middleware {
   wrap(step: Step): Step {
     return {
       ...step,
-      async *stream(ctx, state) {
+      async *stream(state) {
         await this._acquireToken()
-        return yield* step.stream(ctx, state)
+        return yield* step.stream(state)
       },
     }
   }
@@ -180,39 +174,38 @@ const agent = new Agent({
 For the agent loop, orchestrators extend `AgentOrchestrator`:
 
 ```typescript
-type AgentOrchestrator = Orchestrator<AgentContext, AgentState, AgentStreamEvent>
+abstract class AgentOrchestrator extends Orchestrator<AgentState, AgentStreamEvent> {}
 ```
 
 **ToolOrchestrator**: runs `ToolStep` for each tool use block.
 
 ```typescript
 class ToolOrchestrator extends AgentOrchestrator {
-  async *stream(ctx, state) {
+  async *stream(state) {
     const toolUseBlocks = this._extractToolUseBlocks(state)
     for (const block of toolUseBlocks) {
-      yield* this._toolStep.stream(ctx, { ...state, currentToolUse: block })
+      yield* this._toolStep.stream({ ...state, currentToolUse: block })
     }
     return { type: 'tools' }
   }
 }
 ```
 
-**Agent**: the top-level orchestrator. Agent follows the orchestrator pattern internally but doesn't extend `Orchestrator` directly, since its public `stream` method takes `InvokeArgs` rather than `(ctx, state)` for backwards compatibility. It creates the context and state, then runs the loop.
+**Agent**: the top-level orchestrator. Agent follows the orchestrator pattern internally but doesn't extend `Orchestrator` directly, since its public `stream` method takes `InvokeArgs` rather than `(state)` for backwards compatibility. It creates the state, then runs the loop.
 
 ```typescript
 class Agent {
   async *stream(args: InvokeArgs) {
-    const ctx = this._buildContext()
     const state = this._buildState(args)
 
     while (true) {
-      const result = yield* this._model.stream(ctx, state)
+      const result = yield* this._modelStep.stream(state)
 
       if (result.stopReason !== 'toolUse') {
         return { type: 'done', result: this._buildResult(state) }
       }
 
-      yield* this._toolOrchestrator.stream(ctx, state)
+      yield* this._toolOrchestrator.stream(state)
     }
   }
 }
@@ -247,14 +240,14 @@ class CacheMiddleware implements Middleware {
   wrap(step: Step): Step {
     return {
       ...step,
-      async *stream(ctx, state) {
+      async *stream(state) {
         const key = this._buildKey(step.name, state)
         const cached = this._cache.get(key)
         if (cached) {
           return cached
         }
 
-        const result = yield* step.stream(ctx, state)
+        const result = yield* step.stream(state)
         this._cache.set(key, result)
         return result
       },
@@ -277,14 +270,18 @@ Because the agent loop is composed of discrete steps, the orchestrator can retur
 class Agent {
   private _steps = [this._modelStep, this._toolOrchestrator]
 
+  /**
+   * Variant of the agent loop that resolves steps by index, enabling checkpoint/resume
+   * at any position. The loop does not have to be structured this way; this form is
+   * for illustration.
+   */
   async *stream(args: InvokeArgs) {
-    const ctx = this._buildContext()
     const state = args.checkpoint?.state ?? this._buildState(args)
     let stepIndex = args.checkpoint?.stepIndex ?? 0
 
     while (true) {
       const step = this._steps[stepIndex]
-      const result = yield* step.stream(ctx, state)
+      const result = yield* step.stream(state)
 
       if (result.stopReason === 'done') {
         return { type: 'done', result: this._buildResult(state) }
@@ -292,7 +289,7 @@ class Agent {
 
       stepIndex = (stepIndex + 1) % this._steps.length
 
-      if (ctx.checkpointing) {
+      if (state.checkpointing) {
         return { type: 'checkpoint', checkpoint: { stepIndex, state } }
       }
     }
@@ -340,7 +337,7 @@ const [result1, result2] = await Promise.all([
 // result1 and result2 operated on separate AgentState instances
 ```
 
-The agent's context (model, tools, configuration) is shared and read-only. The state (messages, metrics, traces) is per-invocation. Steps don't reach into the agent instance for what they need, they operate on the state they're given. This is the same state/context split proposed in [0002-isolated-state](https://github.com/strands-agents/docs/pull/551), which we discussed in a previous meeting.
+The agent's dependencies and execution data all live in `AgentState`. Steps don't reach into the agent instance for what they need; they operate on the state they're given. See [0002-isolated-state](https://github.com/strands-agents/docs/pull/551) for the full proposal on state lifecycle management.
 
 ## Guidelines
 

From 0a83dff16840eea9aa47222a9c27d457c13eaf79 Mon Sep 17 00:00:00 2001
From: Patrick Gray <pgrayy@amazon.com>
Date: Thu, 9 Apr 2026 10:25:40 -0400
Subject: [PATCH 5/5] fix: remove optional markers from systemPrompt and
 interrupt in AgentState

---
 designs/0005-state-machine.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/designs/0005-state-machine.md b/designs/0005-state-machine.md
index 8aab580f4..a199df18d 100644
--- a/designs/0005-state-machine.md
+++ b/designs/0005-state-machine.md
@@ -33,7 +33,7 @@ interface AgentState {
   // Dependencies
   model: Model
   toolRegistry: ToolRegistry
-  systemPrompt?: SystemPrompt
+  systemPrompt: SystemPrompt
   tracer: Tracer
   meter: Meter
   pluginRegistry: PluginRegistry
@@ -44,7 +44,7 @@ interface AgentState {
   messages: Message[]
   metrics: AgentMetric[]
   traces: AgentTrace[]
-  interrupt?: InterruptState
+  interrupt: InterruptState
   app: StateStore  // user-facing key-value state
 
   // Intra-loop temporaries (step-to-step communication)