[WIP - No Merge] Refactor DrivePolicy architecture and configuration #470

Open
vcharraut wants to merge 6 commits into emerge/temp_training from vcha/encoders

@vcharraut
Collaborator

  • Updated DrivePolicy to use shared network architecture instead of split network.
  • Changed input size parameters to specific sizes for ego, partner, lane, boundary, traffic control, and conditioning inputs.
  • Modified encoder configuration to include activation functions and layer normalization options.
  • Removed gigaflow architecture in favor of a more flexible encoder design.
  • Adjusted observation size calculations to include counts of various features.
  • Updated environment bindings and configuration files to reflect new parameter names and structures.
  • Enhanced the DriveBackbone class to support new encoder configurations and pooling mechanisms.
  • Updated the Drive class to accommodate changes in the backbone initialization and observation encoding.
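The encoder options listed above (per-input sizes, a configurable activation, optional layer normalization) could be wired up roughly as follows. This is a minimal PyTorch sketch, not the code in this PR: `make_encoder`, the `ACTIVATIONS` table, the single-layer depth, and the example sizes are all assumptions made for illustration.

```python
import torch
from torch import nn

# Hypothetical lookup from a config string to an activation class.
ACTIVATIONS = {"relu": nn.ReLU, "gelu": nn.GELU, "tanh": nn.Tanh}


def make_encoder(input_size, hidden_size, activation="relu", layer_norm=False):
    """Build one per-input encoder: Linear -> (optional LayerNorm) -> activation.

    Hypothetical helper; the real DriveBackbone may use a different depth
    or a different ordering of normalization and activation.
    """
    layers = [nn.Linear(input_size, hidden_size)]
    if layer_norm:
        layers.append(nn.LayerNorm(hidden_size))
    layers.append(ACTIVATIONS[activation]())
    return nn.Sequential(*layers)


# Example: an encoder for a 7-feature input with LayerNorm enabled.
encoder = make_encoder(7, 32, activation="gelu", layer_norm=True)
embedding = encoder(torch.zeros(5, 7))  # shape: (5, 32)
```

Keeping the activation and LayerNorm choices as plain arguments lets each input type (ego, partner, lane, and so on) get its own encoder from the same factory with only the input size changing.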

Copilot AI review requested due to automatic review settings June 2, 2026 20:29

Copilot AI left a comment

Pull request overview

Refactors the DrivePolicy/DriveBackbone architecture to support per-input encoders with configurable activations/layer-norm, introduces explicit masking of padded slots via appended per-layer “valid counts”, and updates environment/config/notebook wiring to match the new observation layout and policy kwargs.

Changes:

  • Replace the prior (gigaflow vs standard) encoder logic with a unified encoder design supporting configurable activation + optional LayerNorm, and switch to a shared-network actor/critic option.
  • Extend the Drive observation vector with 4 appended count features (lane/boundary/partner/traffic) and use these counts for optional padding-masking during pooling.
  • Update drive.ini defaults and notebooks/utilities to use the new policy parameter names and updated observation semantics.
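The count-based masking described in the changes above can be illustrated with a small NumPy sketch. It is not the PR's implementation: it assumes the four counts sit at the very end of each observation row, that pooling is a max over slots, and that rows with no valid slots pool to zeros. `split_counts` and `masked_max_pool` are hypothetical helper names; only `OBS_COUNT_FEATURES` comes from the PR, and its value here is inferred from the "4 appended count features".

```python
import numpy as np

# Name taken from the PR; the value 4 is inferred from the review summary.
OBS_COUNT_FEATURES = 4


def split_counts(obs):
    """Split (batch, obs_dim) observations into features and trailing counts."""
    features = obs[:, :-OBS_COUNT_FEATURES]
    counts = obs[:, -OBS_COUNT_FEATURES:].astype(np.int64)
    return features, counts


def masked_max_pool(embeddings, valid_count):
    """Max-pool (batch, slots, dim) over slots, ignoring padded slots.

    Slot i of a row is treated as valid when i < valid_count[row].
    Rows with zero valid slots return a zero vector (an assumption).
    """
    num_slots = embeddings.shape[1]
    # (batch, slots) boolean mask, True where the slot holds real data.
    mask = np.arange(num_slots)[None, :] < valid_count[:, None]
    # Padded slots become -inf so they can never win the max.
    masked = np.where(mask[:, :, None], embeddings, -np.inf)
    pooled = masked.max(axis=1)
    has_valid = valid_count > 0
    return np.where(has_valid[:, None], pooled, 0.0)


# Two rows, three slots each; row 0 has one valid slot, row 1 has none.
emb = np.array([[[1.0, 2.0], [9.0, 9.0], [3.0, 0.0]],
                [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]])
pooled = masked_max_pool(emb, np.array([1, 0]))
```

Without the mask, the padded slot `[9.0, 9.0]` in the first row would dominate the max, which is the failure mode the appended counts are meant to prevent.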

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `pufferlib/ocean/torch.py` | Implements the new DriveBackbone/Drive policy wiring: per-encoder widths, activation/LN config, masking-aware pooling, and shared-network actor/critic behavior. |
| `pufferlib/ocean/env_binding.h` | Exposes `OBS_COUNT_FEATURES` to Python so Python-side slicing/masking can match the C observation layout. |
| `pufferlib/ocean/drive/drive.py` | Updates Python Drive env observation sizing and config surface to include appended count features and shared-network flag. |
| `pufferlib/ocean/drive/drive.h` | Updates C-side observation layout (partner features + appended counts) and writes the per-layer slot counts into the observation buffer. |
| `pufferlib/config/ocean/drive.ini` | Renames/remaps policy parameters to the new per-encoder sizes + activation/LN options; switches to `shared_network`. |
| `notebooks/notebook_utils.py` | Updates notebook policy defaults to the new kwargs. |
| `notebooks/06_architecture.ipynb` | Refreshes architecture visualization/benchmarking code to the new encoder/backbone configuration surface. |
| `notebooks/05_inference.ipynb` | Updates observation documentation/visuals for the new partner feature set. |
| `notebooks/01_observations.ipynb` | Updates manual slicing checks to account for the 4 appended features at the end of the observation vector. |

Comment thread pufferlib/ocean/torch.py
Comment thread pufferlib/ocean/torch.py
Comment thread pufferlib/ocean/drive/drive.h
Comment thread notebooks/01_observations.ipynb Outdated
vcharraut and others added 5 commits June 2, 2026 23:32
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…rchitecture, and utility files

- Updated references from `conditioning_dim` to `target_dim` in training and inference notebooks.
- Changed `conditioning_input_size` to `target_input_size` in configuration files and utility scripts.
- Adjusted encoder creation and usage in the DriveBackbone class to reflect the new target terminology.
- Ensured consistency across all relevant files to improve clarity and maintainability.
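For configs or scripts written against the old names, the rename in this commit could be bridged with a small shim like the one below. This is a hedged sketch only: `migrate_policy_kwargs` and the `RENAMED_KEYS` table are hypothetical and not part of the PR; the two key pairs are the ones named in the commit message.

```python
# Old name -> new name, as listed in the commit message above.
RENAMED_KEYS = {
    "conditioning_input_size": "target_input_size",
    "conditioning_dim": "target_dim",
}


def migrate_policy_kwargs(kwargs):
    """Return a copy of kwargs with the old conditioning_* keys renamed.

    Hypothetical helper. Raises ValueError if an old key and its new
    name are both present, since it is unclear which value should win.
    """
    migrated = {}
    for key, value in kwargs.items():
        new_key = RENAMED_KEYS.get(key, key)
        if new_key in migrated:
            raise ValueError(f"both old and new names given for {new_key!r}")
        migrated[new_key] = value
    return migrated


old_style = {"conditioning_input_size": 3, "hidden_size": 64}
new_style = migrate_policy_kwargs(old_style)
```

Here `hidden_size` is only a placeholder key to show that unrelated entries pass through unchanged.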
@vcharraut changed the title from "Refactor DrivePolicy architecture and configuration" to "[WIP - No Merge] Refactor DrivePolicy architecture and configuration" on Jun 2, 2026