WIP: add partner seconds_stopped observation feature #471
Open
eugenevinitsky wants to merge 6 commits into
…FEATURES 8->9) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pull request overview
Adds a new per-partner observation feature to expose “how long this partner agent has been stopped” (normalized and capped), aligning partner observations with the existing ego seconds_stopped signal.
Changes:
- Bump PARTNER_FEATURES from 8 → 9 to reflect the expanded partner observation vector.
- Extend write_partner_obs to append fminf(1.0f, other->seconds_stopped / MAX_STOPPED_SECONDS) as the 9th partner feature.
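The normalization described above can be sketched in Python for readers working from the notebooks. This is only an illustration of the capped ratio, not the C implementation: the value of MAX_STOPPED_SECONDS is not shown in this PR, so the 10.0 below is a placeholder, and both helper names are hypothetical.

```python
from __future__ import annotations

# Placeholder cap (assumption): the real MAX_STOPPED_SECONDS is defined in the C source.
MAX_STOPPED_SECONDS = 10.0


def normalized_seconds_stopped(seconds_stopped: float,
                               cap: float = MAX_STOPPED_SECONDS) -> float:
    """Python analogue of fminf(1.0f, seconds_stopped / MAX_STOPPED_SECONDS)."""
    return min(1.0, seconds_stopped / cap)


def append_partner_feature(partner_row: list[float],
                           seconds_stopped: float) -> list[float]:
    """Return a new partner row with the normalized value appended as the last feature."""
    return partner_row + [normalized_seconds_stopped(seconds_stopped)]


# An 8-feature partner row becomes a 9-feature row.
row = append_partner_feature([0.0] * 8, seconds_stopped=5.0)
```

With the placeholder cap, a partner stopped for 5 seconds maps to 0.5, and anything at or beyond the cap saturates at 1.0.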
The interactive obs viewer hardcoded the partner block stride as 8, so with PARTNER_FEATURES=9 it mis-parsed partners and shifted every subsequent obs block (lanes/boundaries/traffic). Add partner_features to the replay header and use H.partner_features (mirroring how target_features is already handled). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
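To illustrate why a hardcoded stride shifts every later block, here is a small Python analogue of the parsing step. It is not the viewer's code: the layout sizes, the dictionary-style header, and the split_obs helper are all assumptions chosen to keep the example tiny.

```python
def split_obs(flat, ego_features, num_partners, partner_features):
    """Split a flat observation into (ego, partners, rest) using the given partner stride."""
    ego = flat[:ego_features]
    start = ego_features
    partners = [
        flat[start + i * partner_features: start + (i + 1) * partner_features]
        for i in range(num_partners)
    ]
    rest = flat[start + num_partners * partner_features:]
    return ego, partners, rest


# Toy layout (all sizes hypothetical): 2 ego values, 2 partners x 9 features, 3 trailing values.
header = {"ego_features": 2, "num_partners": 2, "partner_features": 9}
flat = [0.0, 0.1] + [1.0] * 9 + [2.0] * 9 + [9.0, 9.1, 9.2]

# Stride read from the header: blocks line up.
_, partners_ok, rest_ok = split_obs(
    flat, header["ego_features"], header["num_partners"], header["partner_features"]
)

# Stale hardcoded stride of 8: the second partner starts one slot early and
# the trailing block absorbs two leftover partner values.
_, partners_bad, rest_bad = split_obs(
    flat, header["ego_features"], header["num_partners"], 8
)
```

Reading the stride from the header keeps the parser correct if the partner block grows again, which is the same reasoning the commit gives for target_features.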
Yvonne511
approved these changes
Jun 2, 2026
vcharraut
requested changes
Jun 2, 2026
Collaborator
vcharraut
left a comment
Update the notebooks with the new partner features.
The smoke golden is only bit-reproducible inside the QEMU/Haswell smoke image, so it cannot be regenerated on an arbitrary dev box. This adds a push-triggered (marker-gated) CI job that builds the image, runs the train smoke test with SMOKE_UPDATE_GOLDEN=1, uploads the result as an artifact, and commits it back to the branch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
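The update-or-compare pattern behind SMOKE_UPDATE_GOLDEN=1 might look roughly like the sketch below. Only the environment variable name comes from this PR; the JSON file format, the check_or_update_golden helper, and the metric name are assumptions made for illustration, and the real smoke test may store and compare its golden differently.

```python
import json
import os
import tempfile
from pathlib import Path


def check_or_update_golden(metrics: dict, golden_path: Path) -> str:
    """Write the golden when SMOKE_UPDATE_GOLDEN=1, otherwise compare against it exactly."""
    if os.environ.get("SMOKE_UPDATE_GOLDEN") == "1":
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(json.dumps(metrics, sort_keys=True, indent=2))
        return "updated"
    expected = json.loads(golden_path.read_text())
    if expected != metrics:
        raise AssertionError(f"smoke metrics differ from golden at {golden_path}")
    return "matched"


# Demo in a temporary directory: regenerate once, then verify against the result.
with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "data" / "golden.json"
    os.environ["SMOKE_UPDATE_GOLDEN"] = "1"
    first = check_or_update_golden({"loss": 0.5}, golden)
    del os.environ["SMOKE_UPDATE_GOLDEN"]
    second = check_or_update_golden({"loss": 0.5}, golden)
```

Because the comparison is exact, the regeneration step has to run in the same pinned image as the check, which is why the commit moves it into CI.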
- Regenerate smoke golden in the pinned QEMU image (via CI workflow); it now reflects the 9th partner feature plus the obs/reward_components metrics the current pipeline logs.
- Mark partner seconds_stopped obs as a temporary hack in drive.h.
- 05_inference.ipynb: add seconds_stopped to partner_labels, drive the per-feature loop off len(partner_labels) instead of a literal 8, and fix stale shape comments + the markdown obs spec. Also corrects a pre-existing length/width label swap to match the C write order.
- Workflow: git add -f the golden (tests/smoke_tests/data is gitignored).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The partner heatmap sets xticks to range(env.partner_features) (now 9) but xticklabels to partner_labels; without the 9th label the ticks/labels mismatch. Append seconds_stopped to match. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
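A small guard like the following would surface this kind of tick/label mismatch early. It is a sketch only: the first eight label names are placeholders (the PR names only seconds_stopped), and tick_positions_and_labels is a hypothetical helper, not something in the notebook.

```python
PARTNER_FEATURES = 9  # matches the bumped constant in this PR

# Placeholder names for the first eight features (assumption); only
# "seconds_stopped" is taken from the PR.
partner_labels = [f"feature_{i}" for i in range(8)] + ["seconds_stopped"]


def tick_positions_and_labels(num_features, labels):
    """Return matching tick positions and labels, or fail loudly if the counts differ."""
    if len(labels) != num_features:
        raise ValueError(
            f"{num_features} partner features but {len(labels)} labels"
        )
    return list(range(num_features)), list(labels)


ticks, labels = tick_positions_and_labels(PARTNER_FEATURES, partner_labels)
```

The returned lists could then be passed to the heatmap's tick and tick-label setters, so a future feature bump without a new label raises instead of silently mislabeling columns.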
WIP. Adds a per-partner "how long has this agent been stopped" signal to the
partner observation block.