feat: support generated-output flow in run_experiment (no prompt_temp…#486

Open
BipinShetty wants to merge 3 commits into main from feat/generated-output-flow-sdk

BipinShetty commented Feb 24, 2026

User description

…late)

  • Remove the hard ValueError raised for prompt_template=None in run_experiment(). The API is the authority on flow determination: if the dataset has a generated_output column, the API accepts prompt_template_version_id=None and runs metrics without LLM generation. If the column is missing, the API rejects the request with error 3512.
  • Experiments.run(): only set default prompt_settings when a template is provided; pass prompt_template_id=None for generated-output flow.
  • Jobs.create(): make prompt_template_id and prompt_settings Optional; only include them in CreateJobRequest when non-None so the API contract is respected for both flows.
  • Add 3 tests: generated-output flow, prompt-takes-precedence, and no-prompt-no-dataset raises ValueError.
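
The Jobs.create() change described above can be sketched as follows. This is a minimal illustration, not the SDK's code: `build_job_request` and its plain-dict return value are hypothetical stand-ins for however the SDK assembles CreateJobRequest, and the `project_id`, `run_id`, and `dataset_id` fields are assumed for the example. The only behavior taken from the PR is that prompt_template_id and prompt_settings are included only when they are not None.

```python
from typing import Any, Dict, Optional


def build_job_request(
    project_id: str,
    run_id: str,
    dataset_id: str,
    prompt_template_id: Optional[str] = None,
    prompt_settings: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for a job request.

    Optional prompt fields are left out entirely when they are None, so the
    same builder serves both the prompt flow and the generated-output flow.
    """
    request: Dict[str, Any] = {
        "project_id": project_id,
        "run_id": run_id,
        "dataset_id": dataset_id,
    }
    # Only include prompt-related fields when the caller supplied them.
    if prompt_template_id is not None:
        request["prompt_template_id"] = prompt_template_id
    if prompt_settings is not None:
        request["prompt_settings"] = prompt_settings
    return request


# Generated-output flow: no prompt fields appear in the request.
generated_output_request = build_job_request("proj", "run", "ds")

# Prompt flow: both prompt fields are passed through unchanged.
prompt_request = build_job_request(
    "proj", "run", "ds",
    prompt_template_id="tmpl-version",
    prompt_settings={"temperature": 0.0},
)
```

Omitting the keys, rather than sending explicit nulls, keeps the request shape identical to what the prompt flow already sent whenever a template is present.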

Shortcut:

Description:

Tests:

  • Unit Tests Added
  • E2E Test Added (if it's a user-facing feature, or fixing a bug)

Generated description

Below is a concise technical summary of the changes proposed in this PR:
Enables the generated-output flow in experiment runs by allowing prompt_template to be optional, shifting flow determination to the API based on dataset content. Updates the experiment and job creation logic to conditionally include prompt-related parameters only when a template is provided.
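
A rough sketch of the client-side decision this summary describes, under stated assumptions: `resolve_run_inputs` and `DEFAULT_PROMPT_SETTINGS` are hypothetical names, the default settings value is invented for illustration, and the ValueError for a missing dataset and missing template is inferred from the test name in the PR description rather than from the actual implementation. The client no longer checks for a generated_output column; that decision is left to the API.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical defaults; the SDK's real default prompt settings are not shown here.
DEFAULT_PROMPT_SETTINGS: Dict[str, Any] = {"model_alias": "example-model"}


def resolve_run_inputs(
    dataset_id: Optional[str],
    prompt_template_id: Optional[str],
    prompt_settings: Optional[Dict[str, Any]] = None,
) -> Tuple[Optional[str], Optional[Dict[str, Any]]]:
    """Hypothetical helper: decide which prompt arguments to forward."""
    # Assumption: with neither a template nor a dataset there is nothing to run.
    if prompt_template_id is None and dataset_id is None:
        raise ValueError("Provide a dataset, a prompt template, or both.")
    # Defaults are applied only when a template is provided.
    if prompt_template_id is not None and prompt_settings is None:
        prompt_settings = dict(DEFAULT_PROMPT_SETTINGS)
    # With no template, both values stay None and the API determines the flow.
    return prompt_template_id, prompt_settings


generated_output_inputs = resolve_run_inputs("ds", None)
prompt_inputs = resolve_run_inputs("ds", "tmpl-version")
```

The three cases mirror the tests listed in the PR description: a dataset without a template forwards None, a template triggers default settings, and supplying neither raises ValueError.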

Topic: Flow Validation
Details: Add unit tests to verify the generated-output flow, ensure prompt templates take precedence when provided, and validate error handling for missing datasets.
Modified files (1)
  • tests/test_experiments.py
Latest Contributors (2)
  User                 Commit                      Date
  jweiler@galileo.ai   feat-Add-GalileoMetric...   December 19, 2025
  david@rungalileo.io  feat-Add-batch-get-dat...   December 12, 2025

Topic: Generated Output
Details: Modify run_experiment, Experiments.run, and Jobs.create to support optional prompt templates and settings, ensuring the API receives None values for generated-output flows.
Modified files (2)
  • src/galileo/experiments.py
  • src/galileo/jobs.py
Latest Contributors (2)
  User                       Commit                      Date
  vamaq@users.noreply.gi...  fix-Define-explicit-er...   February 03, 2026
  jweiler@galileo.ai         feat-Add-GalileoMetric...   December 19, 2025

…late)

- Remove hard ValueError for prompt_template=None in run_experiment().
  The API is the authority on flow determination: if the dataset has a
  generated_output column, the API accepts prompt_template_version_id=None
  and runs metrics without LLM generation. If not, the API returns 3512.
- Experiments.run(): only set default prompt_settings when a template is
  provided; pass prompt_template_id=None for generated-output flow.
- Jobs.create(): make prompt_template_id and prompt_settings Optional;
  only include them in CreateJobRequest when non-None so the API contract
  is respected for both flows.
- Add 3 tests: generated-output flow, prompt-takes-precedence, and
  no-prompt-no-dataset raises ValueError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@BipinShetty BipinShetty requested a review from a team as a code owner February 24, 2026 09:50
@BipinShetty BipinShetty requested a review from csurfer February 24, 2026 09:50
…dataset_raises

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

codecov bot commented Feb 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.87%. Comparing base (0e364eb) to head (38924d6).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #486      +/-   ##
==========================================
+ Coverage   81.85%   81.87%   +0.02%     
==========================================
  Files          96       96              
  Lines        9164     9166       +2     
==========================================
+ Hits         7501     7505       +4     
+ Misses       1663     1661       -2     

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>