
WIP: add partner seconds_stopped observation feature #471

Open
eugenevinitsky wants to merge 6 commits into emerge/temp_training from ev/stopped_feature


@eugenevinitsky eugenevinitsky commented Jun 2, 2026

WIP. Adds a per-partner "how long has this agent been stopped" signal to the
partner observation block.

…FEATURES 8->9)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 2, 2026 20:39

Copilot AI left a comment


Pull request overview

Adds a new per-partner observation feature to expose “how long this partner agent has been stopped” (normalized and capped), aligning partner observations with the existing ego seconds_stopped signal.

Changes:

  • Bump PARTNER_FEATURES from 8 → 9 to reflect the expanded partner observation vector.
  • Extend write_partner_obs to append fminf(1.0f, other->seconds_stopped / MAX_STOPPED_SECONDS) as the 9th partner feature.
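
The normalization described in the overview above can be sketched as follows. This is a minimal illustration, not the repository's code: the value of `MAX_STOPPED_SECONDS` is a placeholder assumption, `write_stopped_feature` is a hypothetical helper, and placing the new feature in the last slot of each partner's block is inferred from "append as the 9th partner feature". The clamp is written as a ternary, which gives the same result as `fminf(1.0f, x)` here.

```c
#include <assert.h>

/* Assumed placeholder: the real cap is defined in the repository. */
#define MAX_STOPPED_SECONDS 10.0f
/* Partner block width after this PR (8 -> 9). */
#define PARTNER_FEATURES 9

/* Map a stopped duration to [0, 1], saturating at the cap.
 * Same result as fminf(1.0f, seconds_stopped / MAX_STOPPED_SECONDS). */
static float normalized_seconds_stopped(float seconds_stopped) {
    assert(seconds_stopped >= 0.0f);
    float ratio = seconds_stopped / MAX_STOPPED_SECONDS;
    return ratio < 1.0f ? ratio : 1.0f;
}

/* Hypothetical helper: store the feature in the last slot of one
 * partner's block inside a flat partner observation buffer. */
static void write_stopped_feature(float *partner_obs, int partner_index,
                                  float seconds_stopped) {
    int slot = partner_index * PARTNER_FEATURES + (PARTNER_FEATURES - 1);
    partner_obs[slot] = normalized_seconds_stopped(seconds_stopped);
}
```

Capping at 1.0 keeps the feature in the same bounded range no matter how long a partner stays stopped, so any duration beyond the cap looks identical to the policy.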


The interactive obs viewer hardcoded the partner block stride as 8, so with
PARTNER_FEATURES=9 it mis-parsed partners and shifted every subsequent obs
block (lanes/boundaries/traffic). Add partner_features to the replay header and
use H.partner_features (mirroring how target_features is already handled).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
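
To illustrate why the hardcoded stride shifted every later block, here is a rough sketch of header-driven indexing. The struct layout, the `num_partners` field, and both helper names are assumptions made for this example. Only `partner_features` and `target_features` are named in the commit message, and the actual viewer may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replay header; field set is assumed for illustration. */
typedef struct {
    int target_features;
    int partner_features; /* 9 after this PR, previously a literal 8 */
    int num_partners;     /* assumed field */
} ReplayHeaderSketch;

/* Index of one feature of one partner, using the header's stride
 * instead of a hardcoded constant. */
static size_t partner_feature_index(const ReplayHeaderSketch *h,
                                    size_t partner_block_start,
                                    int partner, int feature) {
    assert(partner >= 0 && partner < h->num_partners);
    assert(feature >= 0 && feature < h->partner_features);
    return partner_block_start
         + (size_t)partner * (size_t)h->partner_features
         + (size_t)feature;
}

/* Index where the block following the partners begins. A wrong
 * stride here offsets every subsequent block. */
static size_t after_partner_block(const ReplayHeaderSketch *h,
                                  size_t partner_block_start) {
    return partner_block_start
         + (size_t)h->num_partners * (size_t)h->partner_features;
}
```

With 3 partners starting at index 7, a stride of 9 ends the partner block at index 34, whereas a stale stride of 8 would end it at 31, so every following block would be read 3 floats too early.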
Collaborator

@vcharraut vcharraut left a comment


Update the notebooks to include the new partner feature.

eugenevinitsky and others added 4 commits June 3, 2026 07:08
The smoke golden is only bit-reproducible inside the QEMU/Haswell smoke
image, so it cannot be regenerated on an arbitrary dev box. This adds a
push-triggered (marker-gated) CI job that builds the image, runs the
train smoke test with SMOKE_UPDATE_GOLDEN=1, uploads the result as an
artifact, and commits it back to the branch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- Regenerate smoke golden in the pinned QEMU image (via CI workflow); it
  now reflects the 9th partner feature plus the obs/reward_components
  metrics the current pipeline logs.
- Mark partner seconds_stopped obs as a temporary hack in drive.h.
- 05_inference.ipynb: add seconds_stopped to partner_labels, drive the
  per-feature loop off len(partner_labels) instead of a literal 8, and
  fix stale shape comments + the markdown obs spec. Also corrects a
  pre-existing length/width label swap to match the C write order.
- Workflow: git add -f the golden (tests/smoke_tests/data is gitignored).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The partner heatmap sets xticks to range(env.partner_features) (now 9) but
xticklabels to partner_labels; without the 9th label the ticks/labels
mismatch. Append seconds_stopped to match.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>