Propagate suite-level CLI overrides to constituent task specs #180

Open
finbarrtimbers wants to merge 2 commits into main from finbarr/suite-cli-override-propagation

finbarrtimbers (Contributor) commented May 15, 2026

Summary

  • When a -o CLI override targets a suite name, expand the suite and apply the override to every constituent task spec instead of dropping it.
  • Non-suite task specs continue to receive overrides directly.

Example of the bug

olmo-eval run -m <model> -t aime_2022_to_2025 -o sampling_params.max_tokens=16384

Before this fix, the override is keyed by aime_2022_to_2025 but the runner only sees the expanded constituent specs (aime_2022:pass_at_32, aime_2023:pass_at_32, aime_2024:pass_at_32, aime_2025:pass_at_32), so the max_tokens=16384 override is silently dropped and each task runs with its default max_tokens. After the fix, the override is propagated to each child spec.
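
The propagation described above can be sketched roughly as follows. This is an illustration, not the code in this PR: the `SUITES` registry, the `resolve_cli_overrides` helper, and the dict-of-lists override shape are all assumptions. Only the suite name, the child spec names, and the override string come from the example.

```python
# Hypothetical suite registry; the real project resolves suites through its own lookup.
SUITES = {
    "aime_2022_to_2025": [
        "aime_2022:pass_at_32",
        "aime_2023:pass_at_32",
        "aime_2024:pass_at_32",
        "aime_2025:pass_at_32",
    ],
}


def resolve_cli_overrides(cli_overrides: dict[str, list[str]]) -> dict[str, list[str]]:
    """Re-key suite-level overrides onto each constituent task spec."""
    resolved: dict[str, list[str]] = {}
    for spec, overrides in cli_overrides.items():
        # A suite fans out to its children; a plain task spec maps to itself.
        for target in SUITES.get(spec, [spec]):
            resolved.setdefault(target, []).extend(overrides)
    return resolved


resolved = resolve_cli_overrides(
    {"aime_2022_to_2025": ["sampling_params.max_tokens=16384"]}
)
```

With this shape, a lookup by any expanded child spec finds the override, whereas a lookup keyed only by the suite name would have missed.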

Test plan

  • Invoke a suite with a CLI override (e.g. aime_2022_to_2025 -o sampling_params.max_tokens=16384) and confirm each constituent task receives the override.

🤖 Generated with Claude Code

…ored-By: Claude Opus 4.7 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2663b2e296


Comment thread src/olmo_eval/cli/run/config.py Outdated
Comment on lines +344 to +345
for child_spec in get_suite(base_spec).expand():
    resolved_cli_overrides.setdefault(child_spec, []).extend(cli_overrides)

P2: Preserve priority suffix when propagating suite overrides

When a suite is invoked with a priority suffix, e.g. -t suite@high -o max_tokens=..., expand_tasks() later runs the children as child@high and _build_task_overrides() looks up overrides by that exact spec. This propagation stores the override under the bare child_spec, so the suite-level override is still silently dropped for prioritized suites; append the original priority suffix to each child override key to match the expanded run specs.
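
One way the suggestion above could look is sketched here. It assumes the priority is written as a trailing `@priority` suffix and that expanded children carry the same suffix appended to the end of their spec, as in the `child@high` example. The `SUITES` registry and both helper functions are hypothetical and are not taken from the repository.

```python
# Hypothetical suite registry standing in for the project's suite expansion.
SUITES = {"suite": ["child_a", "child_b"]}


def split_priority(spec: str) -> tuple[str, str]:
    """Split 'name@priority' into ('name', '@priority'); the suffix is '' if absent."""
    base, sep, priority = spec.partition("@")
    return base, sep + priority


def propagate_with_priority(
    spec: str, overrides: list[str], resolved: dict[str, list[str]]
) -> None:
    """Store overrides under each child key, carrying over the suite's priority suffix."""
    base, suffix = split_priority(spec)
    for child in SUITES.get(base, [base]):
        # Key by child + suffix so the lookup matches the expanded run spec.
        resolved.setdefault(child + suffix, []).extend(overrides)


resolved: dict[str, list[str]] = {}
propagate_with_priority("suite@high", ["sampling_params.max_tokens=16384"], resolved)
```

Keying by the suffixed child spec keeps the override discoverable when the lookup uses the exact expanded spec. A suite passed without a suffix falls back to the bare child names.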

…-By: Claude Opus 4.7 <noreply@anthropic.com>