[WIP - No Merge] Refactor DrivePolicy architecture and configuration#470
Open
vcharraut wants to merge 6 commits into
vcharraut (Collaborator) commented on Jun 2, 2026
- Updated DrivePolicy to use shared network architecture instead of split network.
- Changed input size parameters to specific sizes for ego, partner, lane, boundary, traffic control, and conditioning inputs.
- Modified encoder configuration to include activation functions and layer normalization options.
- Removed gigaflow architecture in favor of a more flexible encoder design.
- Adjusted observation size calculations to include counts of various features.
- Updated environment bindings and configuration files to reflect new parameter names and structures.
- Enhanced the DriveBackbone class to support new encoder configurations and pooling mechanisms.
- Updated the Drive class to accommodate changes in the backbone initialization and observation encoding.
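The encoder configuration described above (a selectable activation plus optional layer normalization) could look roughly like the following. This is a minimal sketch, not the PR's code: `make_encoder`, the `ACTIVATIONS` table, the two-layer depth, and the sizes used in the example are all assumptions for illustration.

```python
import torch
from torch import nn

# Hypothetical name-to-class table; the activation names the PR accepts are not shown.
ACTIVATIONS = {"relu": nn.ReLU, "gelu": nn.GELU, "tanh": nn.Tanh}


def make_encoder(input_size, hidden_size, activation="relu", layer_norm=False):
    """Build a small MLP encoder with a configurable activation and optional LayerNorm."""
    layers = [nn.Linear(input_size, hidden_size)]
    if layer_norm:
        # Normalize over the feature dimension before the nonlinearity.
        layers.append(nn.LayerNorm(hidden_size))
    layers.append(ACTIVATIONS[activation]())
    layers.append(nn.Linear(hidden_size, hidden_size))
    return nn.Sequential(*layers)


# Example: an ego encoder with assumed sizes (7 input features, 64 hidden units).
ego_encoder = make_encoder(input_size=7, hidden_size=64, activation="gelu", layer_norm=True)
out = ego_encoder(torch.zeros(2, 7))  # batch of 2 ego observations
```

One builder per input type (ego, partner, lane, boundary, traffic control, conditioning) keeps each input size independent while sharing the same activation and normalization options.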
Pull request overview
Refactors the DrivePolicy/DriveBackbone architecture to support per-input encoders with configurable activations/layer-norm, introduces explicit masking of padded slots via appended per-layer “valid counts”, and updates environment/config/notebook wiring to match the new observation layout and policy kwargs.
Changes:
- Replace the prior (gigaflow vs standard) encoder logic with a unified encoder design supporting configurable activation + optional LayerNorm, and switch to a shared-network actor/critic option.
- Extend the Drive observation vector with 4 appended count features (lane/boundary/partner/traffic) and use these counts for optional padding-masking during pooling.
- Update drive.ini defaults and notebooks/utilities to use the new policy parameter names and updated observation semantics.
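As an illustration of the padding-masking idea in the list above, the sketch below pools over slot embeddings while ignoring slots beyond each sample's valid count. The function name `masked_max_pool`, the choice of max pooling, and the assumption that counts arrive as raw slot counts are not confirmed by the PR.

```python
import torch


def masked_max_pool(embeddings, valid_counts):
    """Max-pool over the slot dimension, ignoring padded slots.

    embeddings:   (batch, slots, hidden) per-slot encoder outputs
    valid_counts: (batch,) number of real (non-padded) slots per sample
    """
    slots = embeddings.shape[1]
    slot_idx = torch.arange(slots, device=embeddings.device).unsqueeze(0)  # (1, slots)
    valid = slot_idx < valid_counts.unsqueeze(1)  # (batch, slots), True for real slots
    # Padded slots are set to -inf so they can never win the max.
    masked = embeddings.masked_fill(~valid.unsqueeze(-1), float("-inf"))
    pooled = masked.max(dim=1).values  # (batch, hidden)
    # Samples with no valid slots would be all -inf; return zeros for them instead.
    has_valid = valid.any(dim=1, keepdim=True)  # (batch, 1)
    return torch.where(has_valid, pooled, torch.zeros_like(pooled))


emb = torch.tensor([
    [[1.0, 5.0], [9.0, 2.0], [100.0, 100.0]],  # third slot is padding
    [[3.0, 3.0], [4.0, 4.0], [5.0, 5.0]],      # all slots are padding
])
counts = torch.tensor([2, 0])
pooled = masked_max_pool(emb, counts)
```

Without the mask, the padded `[100.0, 100.0]` slot would dominate the first sample's pooled vector, which is the failure mode the appended counts are meant to avoid.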
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pufferlib/ocean/torch.py | Implements the new DriveBackbone/Drive policy wiring: per-encoder widths, activation/LN config, masking-aware pooling, and shared-network actor/critic behavior. |
| pufferlib/ocean/env_binding.h | Exposes OBS_COUNT_FEATURES to Python so Python-side slicing/masking can match the C observation layout. |
| pufferlib/ocean/drive/drive.py | Updates Python Drive env observation sizing and config surface to include appended count features and shared-network flag. |
| pufferlib/ocean/drive/drive.h | Updates C-side observation layout (partner features + appended counts) and writes the per-layer slot counts into the observation buffer. |
| pufferlib/config/ocean/drive.ini | Renames/remaps policy parameters to the new per-encoder sizes + activation/LN options; switches to shared_network. |
| notebooks/notebook_utils.py | Updates notebook policy defaults to the new kwargs. |
| notebooks/06_architecture.ipynb | Refreshes architecture visualization/benchmarking code to the new encoder/backbone configuration surface. |
| notebooks/05_inference.ipynb | Updates observation documentation/visuals for the new partner feature set. |
| notebooks/01_observations.ipynb | Updates manual slicing checks to account for the 4 appended features at the end of the observation vector. |
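For the manual slicing checks mentioned in the table, a sketch of separating the appended counts from the rest of the observation might look like this. `OBS_COUNT_FEATURES` is the constant named in the PR; the value 4 comes from the overview, while `split_counts` and the toy observation are hypothetical, and the order of the four counts within the tail is not specified here.

```python
import numpy as np

# The PR appends 4 count features at the end of the observation vector.
OBS_COUNT_FEATURES = 4


def split_counts(obs):
    """Split an observation into (features, counts) along the last axis."""
    features = obs[..., :-OBS_COUNT_FEATURES]
    counts = obs[..., -OBS_COUNT_FEATURES:]
    return features, counts


# Toy observation of length 10: 6 regular features followed by 4 counts.
obs = np.arange(10, dtype=np.float32).reshape(1, 10)
features, counts = split_counts(obs)
```

Slicing from the end rather than at a fixed offset keeps the check valid if the sizes of the earlier feature groups change.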
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…rchitecture, and utility files
- Updated references from `conditioning_dim` to `target_dim` in training and inference notebooks.
- Changed `conditioning_input_size` to `target_input_size` in configuration files and utility scripts.
- Adjusted encoder creation and usage in the DriveBackbone class to reflect the new target terminology.
- Ensured consistency across all relevant files to improve clarity and maintainability.
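Callers that still pass the old names would need to be remapped. The sketch below shows one way to do that; the helper `migrate_policy_kwargs` and the `hidden_size` key are hypothetical and not part of this PR, and only the two renames named in the commit message are covered.

```python
# Old name -> new name, taken from the commit message above.
RENAMED_KEYS = {
    "conditioning_input_size": "target_input_size",
    "conditioning_dim": "target_dim",
}


def migrate_policy_kwargs(kwargs):
    """Return a copy of kwargs with renamed keys mapped to their new names."""
    return {RENAMED_KEYS.get(key, key): value for key, value in kwargs.items()}


old_kwargs = {"conditioning_input_size": 3, "hidden_size": 64}
new_kwargs = migrate_policy_kwargs(old_kwargs)
```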