Closed

Changes from all commits (51 commits)
14d4d7e
Create OVHcloud AI Endpoints community documentation (#335)
eliasto Mar 20, 2026
58d1cd5
docs(hooks): document exception property on AfterToolCallEvent (#482)
charles-dyfis-net Mar 20, 2026
1dff940
docs: rename StructuredOutputException to StructuredOutputError for T…
pgrayy Mar 20, 2026
db2fcf4
docs: update TypeScript SDK import paths for model subpath exports (#…
pgrayy Mar 23, 2026
8a840a1
docs: update OpenAI TypeScript examples from gpt-4o to gpt-5.4 (#697)
pgrayy Mar 23, 2026
7c1b466
fix: convert metrics output section to collapsible details/summary (#…
zastrowm Mar 24, 2026
6b1bc83
docs: update Kiro references (#702)
awsarron Mar 24, 2026
97aa679
docs: add getTracer custom spans example to traces page (#700)
lizradway Mar 25, 2026
da05c5a
feat: add extension template docs (#704)
mkmeral Mar 25, 2026
10aabab
docs: add TypeScript coverage to contribute, examples sections (#707)
pgrayy Mar 25, 2026
2de3a24
docs: update user guide for TypeScript SDK 1.0 RC (#708)
pgrayy Mar 25, 2026
3156d90
docs: add strands-google and strands-perplexity to community tools (#…
agent-of-mkmeral Mar 26, 2026
a14c10d
docs: update A2AExpressServer import path to sdk/a2a/express (#695)
pgrayy Mar 26, 2026
0b64b59
docs: add TypeScript vended tools documentation (#685)
zastrowm Mar 27, 2026
4fa8c51
feat(telemetry): add local trace docs (#705)
lizradway Mar 27, 2026
9d1bfa9
feat: vercel model provider (#689)
awsarron Mar 27, 2026
10402e9
feat: redesign homepage with code-forward developer experience (#683)
ryanycoleman Mar 27, 2026
b57fe7f
docs: update Tools overview and Agent as tool multi agent page with A…
notowen333 Mar 30, 2026
47b57cb
feat: design doc for stateful model providers (#712)
pgrayy Mar 31, 2026
8827beb
docs: add OpenAI Responses API model provider page (#719)
pgrayy Mar 31, 2026
4363884
Remove community flag from get-featured.mdx (#720)
zastrowm Mar 31, 2026
dfe3561
docs(team): Add AGENT_GUIDELINES.md — conventions for agents on Stran…
mkmeral Mar 31, 2026
1c46a2e
Fix broken samples links (#722)
clareliguori Apr 1, 2026
2623401
ignore log output files (#724)
Unshure Apr 1, 2026
639b9e5
docs: apply prettier to typescript snippets; add ts summarization-con…
notowen333 Apr 2, 2026
6be74c5
docs: add Agent SOP blog post structure and authors (#723)
Unshure Apr 3, 2026
84b82f6
feat(blog): add Model-Driven Approach blog post (#730)
Unshure Apr 3, 2026
2fef973
feat: add TypeScript SDK announcement blog post (#741)
pgrayy Apr 7, 2026
4e42156
docs: add Agent SOP references to prompts page (#743)
Unshure Apr 7, 2026
e3634b3
docs: add wire-safe serialization section for TypeScript streaming ev…
agent-of-mkmeral Apr 8, 2026
56f1247
docs: add documentation for TS agent-as-tool and agent.cancel() (#744)
notowen333 Apr 8, 2026
0dca2d1
docs: add state machine design doc (0005) (#731)
pgrayy Apr 9, 2026
428cdb4
feat(blog): add Claude 4 interleaved thinking blog post (#737)
Unshure Apr 9, 2026
b2a12a7
feat(blog): add Physical AI blog post (#739)
Unshure Apr 9, 2026
0ce30d6
feat(community): add s3-vectors-memory plugin (#740)
nihilg Apr 9, 2026
9b8abae
docs: add Strands Agents 1.0 blog post (#738)
Unshure Apr 9, 2026
4d61cec
docs: add TypeScript API references alongside Python links (#747)
agent-of-mkmeral Apr 9, 2026
aed8b59
docs: update note formatting for default inference model (#729)
alesanfra Apr 9, 2026
74a83b6
feat(community/tools): add strands-sql — multi-dialect SQL tool for S…
NithiN-1808 Apr 11, 2026
827df6e
Update Graph Import in session-management.mdx (#710)
mvijil910 Apr 13, 2026
6582619
docs(designs): cedar auth plugin (#732)
lizradway Apr 15, 2026
3078b7d
fix: remove headings inside tabs that created broken table of content…
lizradway Apr 15, 2026
237bc0c
docs(bidi): migrate stop_conversation to strands_tools.stop and reque…
agent-of-mkmeral Apr 16, 2026
4a9cc77
docs: add correctness, goal success rate, coherence evaluator example…
ybdarrenwang Apr 17, 2026
4b412a3
docs: add required frontmatter guide for community catalog pages (#753)
agent-of-mkmeral Apr 17, 2026
feacfa1
feat: add multiagent session manager doc (#764)
JackYPCOnline Apr 17, 2026
a71c2a5
docs(designs): interventions primitive (#763)
lizradway Apr 20, 2026
893a317
fix: update ci for monorepo path (#773)
lizradway Apr 22, 2026
41e82d1
docs: add Ollama example to Vercel TS provider page (#771)
gautamsirdeshmukh Apr 22, 2026
121b22e
docs(simulator): updated tool_simulator docs (#752)
poshinchen Apr 22, 2026
20903dd
docs: add TypeScript examples to interrupts documentation
zastrowm Apr 23, 2026
7 changes: 7 additions & 0 deletions .github/workflows/ci.yml
@@ -32,6 +32,13 @@ jobs:
- name: Typecheck
run: npm run typecheck

- name: Build TypeScript SDK
run: npm install --ignore-scripts && npm run build
working-directory: .build/sdk-typescript

- name: Re-link SDK types
run: npm install

- name: Typecheck snippets
run: npm run typecheck:snippets

2 changes: 2 additions & 0 deletions .gitignore
@@ -25,3 +25,5 @@ __*__/
.build

CLAUDE.md
mise.toml
*.log
6 changes: 4 additions & 2 deletions AGENTS.md
@@ -124,8 +124,10 @@ All checks must pass before commit is allowed.
- No semicolons
- Single quotes
- Line length: 120 characters
- Line length for doc snippet files under `src/content/docs/`: 90 characters
- Tab width: 2 spaces
- Trailing commas in ES5 style
- Template literal contents in doc snippets must also stay under 90 characters per line. Prettier does not enforce this automatically.

**Example**: (collapsed in diff view)

@@ -288,8 +290,8 @@ const result = await agent.invoke('Hello')

```diff
 {
   "scripts": {
     "test": "tsc --noEmit",
-    "format": "prettier --write docs",
-    "format:check": "prettier --check docs"
+    "format": "prettier --write docs 'src/content/docs/**/*.ts'",
+    "format:check": "prettier --check docs 'src/content/docs/**/*.ts'"
   }
 }
```
10 changes: 10 additions & 0 deletions CONTRIBUTING.md
@@ -46,6 +46,16 @@ npm run format:check # formatting

Pre-commit hooks run these automatically.

### Sync Docs with Source Code Updates

After merging source code changes, run

```bash
npm run sdk:sync
```

to bring the doc types and generated API pages in sync with the new source code.
New implementations should link to the corresponding API page from the User Guide.

## Reporting Bugs/Feature Requests

223 changes: 223 additions & 0 deletions designs/0004-stateful-models.md
@@ -0,0 +1,223 @@
# Strands: Stateful Model Providers

**Status**: Proposed

**Date**: 2026-03-26

## Overview

We've been asked to add stateful model provider support to the Strands Python SDK, targeting the OpenAI Responses API on Amazon Bedrock (Project Mantle). The SDK already supports the Responses API in stateless mode via `OpenAIResponsesModel`. The ask is to enable stateful server-side conversation management: the server tracks context across turns, so the SDK sends only the latest message instead of the full history each time. The Responses API on Bedrock also brings compute environment selection, server-side context compaction, and reasoning effort control.

## Background

The OpenAI Responses API is hosted on AWS Bedrock's Mantle endpoint (`bedrock-mantle.{region}.api.aws`). It uses an OpenAI-compatible format and supports stateful server-side conversation management, where the server tracks context across turns so the client only sends the latest message.

### Features

- **Stateful conversations**: Server tracks context across turns (`previous_response_id`, `conversation`)
- **Context management**: Automatic truncation (`truncation`) and server-side compaction (`context_management`) for long conversations
- **Inference controls**: `temperature`, `top_p`, `max_output_tokens`
- **Reasoning**: Effort control from none to xhigh (`reasoning.effort`) with optional summaries (`reasoning.summary`)
- **Tools**: Function tools (client-side, same as today) plus server-side built-in tools like web search, file search, and code interpreter
- **Output format**: Plain text, JSON schema enforcement, JSON mode (`text.format`), verbosity control (`text.verbosity`)
- **Execution**: Streaming (`stream`) and background/async modes (`background`), parallel tool calls (`parallel_tool_calls`, `max_tool_calls`)
- **Storage**: Response persistence (`store`) and metadata tagging (`metadata`)
- **Caching**: Prompt caching (`prompt_cache_key`, `prompt_cache_retention`)
- **Service tiers**: Default, flex, priority (`service_tier`)
- **Compute environments**: e.g., AgentCore Runtime (`compute_environment`)

### Usage

```python
# Turn 1: No conversation ID yet, send full input
request = {
    "model": "us.anthropic.claude-sonnet-4-20250514",
    "input": [{"role": "user", "content": [{"type": "input_text", "text": "Hello"}]}],
    "instructions": "You are a helpful assistant.",
    "stream": True
}
# Server responds with id: "resp_abc123"

# Turn 2: Include previous_response_id, send only latest message
request = {
    "model": "us.anthropic.claude-sonnet-4-20250514",
    "previous_response_id": "resp_abc123",
    "input": [{"role": "user", "content": [{"type": "input_text", "text": "What did I just say?"}]}],
    "instructions": "You are a helpful assistant.",
    "stream": True
}
# Server rebuilds context from the chain, responds with id: "resp_def456"
```

The `previous_response_id` forms a linked list of turns. The server walks the chain to rebuild context. There is also a newer `conversation` parameter that provides a persistent container (similar to the old Assistants API threads), but `previous_response_id` is the established mechanism.
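The chain walk can be illustrated with a toy sketch. The stored-responses dict here is invented purely for illustration; a real server persists responses internally:

```python
# Toy model of server-side storage: each response records the ID of the
# response that preceded it, forming a linked list of turns.
responses = {
    "resp_def456": {"previous_response_id": "resp_abc123", "turn": "What did I just say?"},
    "resp_abc123": {"previous_response_id": None, "turn": "Hello"},
}

def rebuild_context(response_id):
    """Walk the previous_response_id chain back to the first turn."""
    turns = []
    while response_id is not None:
        node = responses[response_id]
        turns.append(node["turn"])
        response_id = node["previous_response_id"]
    return list(reversed(turns))  # oldest turn first
```

Here `rebuild_context("resp_def456")` recovers the full conversation in order, which is what lets the client send only the latest message.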

## Solution

What follows is the full vision for stateful model support in Strands. Some of this we may reach iteratively, for example starting with stateful mode on `OpenAIResponsesModel` and adding the `BedrockModel` subpackage later. The goal is to align the team on direction so that incremental work stays on track.

### Model Provider

`BedrockModel` is refactored from a single file (`bedrock.py`) into a subpackage:

```
strands/models/bedrock/
├── __init__.py # exports BedrockModel, backward-compatible imports
├── base.py # shared config, region resolution, boto session, facade logic
├── converse.py # current Converse/ConverseStream (extracted from bedrock.py)
└── responses.py # new Responses API implementation
```

`BedrockModel` becomes a facade. The `api` parameter controls dispatch:

```python
# Converse API (default, current behavior, nothing changes)
model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514")

# Responses API (new, targets Mantle endpoint)
model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514", api="responses")

# Responses API with compute environment
model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514",
    api="responses",
    compute_environment="agentcore",
)

# Pass-through for any Responses API parameter
model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514",
    api="responses",
    params={"reasoning": {"effort": "high"}, "truncation": "auto"},
)
```

- The Converse path uses boto3; the Responses path uses the OpenAI Python SDK with SigV4 signing via a custom httpx transport that resolves credentials from the same boto session
- Bedrock API key auth is also supported as a simpler alternative
- Request formatting and streaming event parsing are extracted into shared utilities used by both `bedrock/responses.py` and the existing `OpenAIResponsesModel`
- Provider-specific logic (auth, endpoint, client creation) stays in each provider
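Under the hood, the dispatch can be sketched roughly as follows. The backend class names are invented stand-ins, not part of the design; only `BedrockModel` and the `api` parameter come from the proposal:

```python
# Hypothetical facade dispatch sketch; real construction will pass model_id
# and kwargs through to the chosen backend.
class _ConverseBackend:
    """Stand-in for the boto3-based Converse implementation."""
    api = "converse"

class _ResponsesBackend:
    """Stand-in for the OpenAI-SDK-based Responses implementation."""
    api = "responses"

_BACKENDS = {"converse": _ConverseBackend, "responses": _ResponsesBackend}

def make_bedrock_model(model_id: str, api: str = "converse", **kwargs):
    """Facade constructor: default to Converse, dispatch on the api parameter."""
    return _BACKENDS[api]()
```

The point of the facade is that existing callers see no change: omitting `api` yields the Converse path exactly as today.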

### Model State

We introduce a new framework-managed dict called `model_state` that flows between the Agent and model provider. This keeps model providers stateless while enabling stateful conversation tracking.

- Owned by the Agent, not the model provider (providers remain stateless)
- Passed to `model.stream()` as a keyword argument (existing providers ignore it via `**kwargs`)
- Model reads `conversation_id` from `model_state` and writes the updated ID back after each response
- Persisted in sessions via `_internal_state` in `SessionAgent` (works with all session manager implementations)
- Accessible in hooks via `event.model_state`
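A rough sketch of this flow, using a synchronous stand-in for the async `stream()` and an invented `_send()` in place of the real API call:

```python
class StatefulResponsesModel:
    """Hypothetical sketch: the provider stays stateless by reading and
    writing the framework-owned model_state dict (names illustrative)."""

    def __init__(self):
        self.sent = []       # (previous_response_id, payload) pairs, recorded for illustration
        self._counter = 0

    def _send(self, payload, previous_response_id=None):
        # Stand-in for the real Responses API call; returns a fake response ID
        self.sent.append((previous_response_id, payload))
        self._counter += 1
        return f"resp_{self._counter}"

    def stream(self, messages, *, model_state=None, **kwargs):
        model_state = {} if model_state is None else model_state
        previous_id = model_state.get("conversation_id")  # None on the first turn
        # Once the server holds the context, send only the latest message
        payload = messages[-1:] if previous_id else messages
        model_state["conversation_id"] = self._send(payload, previous_response_id=previous_id)
```

Because the updated ID is written back into `model_state` rather than stored on the model instance, the same model object can serve many agents and sessions concurrently.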

### Messages

When `model_state` contains a conversation ID, the Agent clears `agent.messages` at the start of each top-level invocation. Within an invocation, messages are appended normally (the event loop needs them for tool execution). After the invocation, `agent.messages` contains only that invocation's messages.

```python
agent = Agent(model=BedrockModel(api="responses"))

result1 = agent("Hello")
# agent.messages has: [user: "Hello", assistant: "Hi there!"]

result2 = agent("What's the weather?")
# agent.messages has: [user: "What's the weather?", assistant: "Let me check..."]
# (previous invocation's messages are cleared)
# Server still has full context via previous_response_id
```

- The server owns conversation history in stateful mode, so clearing locally avoids confusion about what the model sees and prevents unbounded memory growth
- `MessageAddedEvent` hooks still fire for each message during the invocation
- Session managers persist messages as they happen via hooks
- Nothing changes within an invocation; only cross-invocation behavior differs

### Conversations

The Responses implementation maps user-defined conversation IDs to server-generated response IDs in `model_state`. Users work with their own meaningful IDs and never need to manage server-generated ones. By default, all invocations use a `"default"` conversation. Users who need multiple conversations pass their own `conversation_id` on invoke:

```python
agent = Agent(model=BedrockModel(api="responses"))

# Single conversation (uses "default" implicitly)
agent("Hello")
agent("What's the capital of France?")
agent("What river runs through it?") # server knows "it" = Paris

# Multi-conversation with user-defined IDs
agent("Help with billing", conversation_id="billing")
agent("What was my last charge?", conversation_id="billing")

agent("Track my order", conversation_id="orders")
agent("Any updates?", conversation_id="orders")

# Switch back
agent("One more billing question", conversation_id="billing")
```

- `model_state` maintains the mapping (e.g., `{"default": "resp_abc", "billing": "resp_def", "orders": "resp_xyz"}`)
- Session manager persists the mapping automatically, so all conversations survive restarts
- Users never need to capture or manage server-generated IDs
- The agent defaults to `NullConversationManager` when the model is operating in stateful mode
- If the user provides a different conversation manager, we emit a warning (not an exception)
- `ContextWindowOverflowException` is not retried client-side in stateful mode since the server handles context management
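A minimal sketch of the bookkeeping this implies. The helper names are hypothetical; the real logic would live inside the Responses implementation:

```python
# Maps user-defined conversation IDs to server-generated response IDs,
# stored under a "conversations" key in the framework-owned model_state.
def resolve_response_id(model_state: dict, conversation_id: str = "default"):
    """Look up the latest server response ID for a user-defined conversation."""
    return model_state.get("conversations", {}).get(conversation_id)

def record_response_id(model_state: dict, response_id: str, conversation_id: str = "default"):
    """Store the latest server-generated ID under the user's conversation key."""
    model_state.setdefault("conversations", {})[conversation_id] = response_id
```

On each turn, the model resolves the incoming `conversation_id` to a `previous_response_id` (or `None` for a fresh conversation) and records the new response ID afterward; the session manager then persists the whole mapping for free.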

### Session Management

`model_state` (including the full conversation ID mapping) is persisted in `_internal_state` within `SessionAgent`. On session restore, the Agent restores `model_state` and subsequent requests resume their server-side conversations.

```python
# Session 1: Start conversations
session_mgr = RepositorySessionManager(session_id="user-123", ...)
agent = Agent(model=BedrockModel(api="responses"), session_manager=session_mgr)
agent("Help with my order", conversation_id="support")
agent("Check my balance", conversation_id="billing")

# Session 2: Resume (maybe after process restart)
session_mgr = RepositorySessionManager(session_id="user-123", ...)
agent = Agent(model=BedrockModel(api="responses"), session_manager=session_mgr)
agent("Any update on my order?", conversation_id="support") # resumes support conversation
agent("What was my last charge?", conversation_id="billing") # resumes billing conversation
```

- All conversation mappings survive agent restarts
- All session manager implementations (file, S3, DynamoDB, custom) get this automatically since `_internal_state` is already serialized

### Multi-Agent

Each agent in a swarm or graph has its own independent `model_state` and conversation ID mapping. `model_state` is reset alongside `messages` and `state` in `reset_executor_state()`, following the existing reset pattern.

- When `model_state` is reset (no conversation ID), the first request sends the full message history (including prefilled messages and context summaries), starting a new server-side conversation
- Text-based context passing (`_build_node_input`) works unchanged in both swarm and graph
- In graph, `reset_executor_state()` only runs when `reset_on_revisit` is enabled and a node is revisited; on revisit without reset, the agent resumes its existing server-side conversation
- Parallel node execution in graph is safe since `model_state` is per-agent, not per-model
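Following the existing reset pattern, the change amounts to one more field cleared in `reset_executor_state()`. A toy sketch, with attribute handling simplified and names assumed:

```python
class NodeAgent:
    """Simplified stand-in for an agent used as a swarm/graph node."""

    def __init__(self):
        self.messages = []
        self.state = {}
        self.model_state = {}  # per-agent, so parallel graph nodes don't collide

    def reset_executor_state(self):
        self.messages = []
        self.state = {}
        self.model_state = {}  # no conversation ID -> next turn sends full history
```

With `model_state` emptied, the first request after a reset carries the complete local message history and starts a fresh server-side conversation, matching the stateless behavior of the other reset fields.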

### Plugin Pattern

Rather than the Agent having special-case `if stateful:` logic, the model provider could extend `Plugin` and register hooks for its lifecycle behaviors:

```python
class BedrockModel(Model, Plugin):
    name = "strands:bedrock-model"

    @hook
    def _on_before_invocation(self, event: BeforeInvocationEvent):
        if event.agent.model_state.get("conversation_id"):
            event.agent.messages.clear()
```

- Keeps the Agent generic with no stateful-mode special cases
- Any stateful provider can self-describe its behaviors through the existing hook/plugin system

## Questions

- **Background/async inference**: Should we support `background: true` (fire-and-forget with polling) in the initial release?
- **Mantle feature parity**: Which Converse features (guardrails, prompt caching) are NOT available through the Responses API?
- **Model availability**: Which models are available on the Mantle endpoint beyond OpenAI GPT OSS?
- **Conversation object**: Does Mantle support the `conversation` parameter, or only `previous_response_id`?
- **Conversation retention**: How long does the server maintain conversation state?

## Resources

- [AWS Bedrock Mantle docs](https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-mantle.html)
- [AWS Bedrock supported APIs](https://docs.aws.amazon.com/bedrock/latest/userguide/apis.html)
- [AWS Bedrock API key usage](https://docs.aws.amazon.com/bedrock/latest/userguide/api-keys-use.html)
- [OpenAI Responses API reference](https://platform.openai.com/docs/api-reference/responses/create)
- [OpenAI conversation state guide](https://platform.openai.com/docs/guides/conversation-state)
- [OpenAI Responses API background mode](https://platform.openai.com/docs/guides/background)
- [Exploring Mantle CLI (blog post)](https://dev.to/aws/exploring-the-openai-compatible-apis-in-amazon-bedrock-a-cli-journey-through-project-mantle-2114)