Releases: langwatch/scenario

python: v0.7.24

18 Apr 14:18
9fc103d

0.7.24 (2026-04-18)

Features

  • add GOAT strategy with dynamic technique selection for RedTeamAgent (#306) (e62c292)
  • python: add async-native scenario.arun for loop-bound resources (#369) (a797773)
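
The motivation for an async-native entry point can be illustrated with a minimal sketch. It does not use the scenario API, whose signature is not shown in these notes. `LoopBoundResource`, `run_in_fresh_loop`, and `arun_native` are hypothetical names that stand in for a resource tied to one event loop, a sync-style runner that starts its own loop, and an async-native runner that stays on the caller's loop.

```python
import asyncio
import concurrent.futures


class LoopBoundResource:
    """Hypothetical resource that remembers the event loop it was created on."""

    def __init__(self):
        self._loop = asyncio.get_running_loop()

    async def ping(self):
        # Refuse to work on any loop other than the one that created us.
        if asyncio.get_running_loop() is not self._loop:
            raise RuntimeError("resource used on a different event loop")
        return "pong"


def run_in_fresh_loop(coro_fn):
    """Hypothetical sync-style runner: runs the coroutine on a new loop in a worker thread."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(coro_fn())).result()


async def arun_native(coro_fn):
    """Hypothetical async-native runner: awaits on the caller's own loop."""
    return await coro_fn()


async def main():
    resource = LoopBoundResource()
    try:
        run_in_fresh_loop(resource.ping)
        fresh = "ok"
    except RuntimeError as exc:
        fresh = f"failed: {exc}"
    native = await arun_native(resource.ping)
    return fresh, native


result = asyncio.run(main())
print(result)
```

Under these assumptions the fresh-loop runner fails because the resource sees a different loop, while the async-native runner succeeds because it never leaves the loop that owns the resource.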

python: v0.7.23

10 Apr 11:11
52897ac

0.7.23 (2026-04-08)

Bug Fixes

  • force verdict on judge discovery exhaustion instead of hard-failing (#315) (197f567)
  • judge off-by-one, auto-run on script exhaustion, assertion criteria, marathon_script cleanup (#289) (91f76d1)

javascript: v0.4.10

10 Apr 11:12
ed92529

0.4.10 (2026-04-10)

Bug Fixes

  • default scenarioSetId to 'default' for all events (#305) (7bbc8c6)
  • default scenarioSetId to "default" when not provided (7bbc8c6), closes #304
  • force verdict on judge discovery exhaustion instead of hard-failing (#315) (197f567)
  • judge off-by-one, auto-run on script exhaustion, assertion criteria, marathon_script cleanup (#289) (91f76d1)
  • revert audio model and reduce multilingual test turns (#314) (177cdb6)
  • revert audio model to gpt-4o-audio-preview and reduce multilingual test turns (177cdb6)

Miscellaneous

  • use gpt-5-mini everywhere, enable telemetry, fix reasoning model compat (#311) (2384fb2)

python: v0.7.22

22 Mar 23:56
9ae6648

0.7.22 (2026-03-22)

Features

  • add scenario role and run_id attributes to agent spans (#294) (d7e31cc)

javascript: v0.4.9

22 Mar 23:57
46e0246

0.4.9 (2026-03-22)

Features

  • add scenario role and run_id attributes to agent spans (#294) (d7e31cc)

python: v0.7.21

13 Mar 13:04
00534da

0.7.21 (2026-03-13)

Features

  • dual conversation histories for RedTeamAgent (#282) (fa45876)

Bug Fixes

javascript: v0.4.8

13 Mar 12:45
e53bbd1

0.4.8 (2026-03-13)

Features

  • dual conversation histories for RedTeamAgent (#282) (fa45876)
  • support optional runId in RunOptions (#284) (d5fd769)

python: v0.7.20

10 Mar 09:26
a25d7be

0.7.20 (2026-03-10)

Features

  • backtracking on hard refusals for RedTeamAgent (#270) (62190a0)

Bug Fixes

  • align red teaming nomenclature between TypeScript and Python (62190a0)
  • resolve CI test failures blocking JS publish (#275) (049a13b)

javascript: v0.4.7

10 Mar 09:18
1125f83

0.4.7 (2026-03-10)

Features

  • add backtracking on hard refusals for RedTeamAgent (TypeScript) (#271) (79157cd)
  • backtracking on hard refusals for RedTeamAgent (#270) (62190a0)

Bug Fixes

  • align red teaming nomenclature between TypeScript and Python (62190a0)
  • resolve CI test failures blocking JS publish (#275) (049a13b)

python: v0.7.19

08 Mar 12:25
cc8ca90

0.7.19 (2026-03-07)

Features

  • add langwatch.origin="simulation" span attribute (#264) (30fbdf0)

Miscellaneous