Update pinval.model_run with 2026 model runs #985

Open
jeancochrane wants to merge 10 commits into master from jeancochrane/update-pinval-model-run-with-2026-wip-runs

Member

@jeancochrane jeancochrane commented Feb 6, 2026

This PR updates the seed that powers https://github.com/ccao-data/homeval/ so that we can perform HomeVal deployments with 2026 model runs.

I've already run two different HomeVal deployments off of this PR, using different model runs in the pinval.model_run seed. See the latest one here: https://github.com/ccao-data/homeval/actions/runs/22119406898

@jeancochrane jeancochrane force-pushed the jeancochrane/update-pinval-model-run-with-2026-wip-runs branch from f62cfbe to 21c489d Compare February 17, 2026 22:31
@jeancochrane jeancochrane changed the base branch from master to jeancochrane/add-2026-models-to-final-model-table February 17, 2026 22:32
Comment on lines +14 to +18
# Get model metadata for every final model. We do this by inner joining
# the `metadata` table to the `final_model` table instead of filtering
# the metadata table by `run_type == 'final'` to make it easier to run
# tests on this table, since we can control the contents of `final_model`
# via a dbt seed
Member Author

I came to realize this quirk of the model while testing a staging HomeVal deployment, so I figured I'd persist the change to make future HomeVal staging deployments easier.
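
For illustration, the difference between the two approaches can be sketched with an in-memory SQLite database. This is a minimal sketch rather than the actual dbt model: the simplified table layouts, the `'test'` run type, and the `example-wip-run` ID are hypothetical, and only the `run_id` and `run_type` column names and the `2025-02-11-charming-eric` run ID come from this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified stand-ins for the real tables
    CREATE TABLE metadata (run_id TEXT, run_type TEXT);
    CREATE TABLE final_model (run_id TEXT);

    INSERT INTO metadata VALUES
        ('2025-02-11-charming-eric', 'final'),
        ('example-wip-run', 'test');

    -- A seed lets a test or staging deployment choose which runs count
    INSERT INTO final_model VALUES ('example-wip-run');
    """
)

# Approach 1: filter on run_type, which a seed cannot influence
filtered = conn.execute(
    "SELECT run_id FROM metadata WHERE run_type = 'final' ORDER BY run_id"
).fetchall()

# Approach 2: inner join to final_model, so the seed controls the result
joined = conn.execute(
    """
    SELECT m.run_id
    FROM metadata AS m
    INNER JOIN final_model AS f ON m.run_id = f.run_id
    ORDER BY m.run_id
    """
).fetchall()

filtered_ids = [row[0] for row in filtered]
joined_ids = [row[0] for row in joined]
```

With the join, swapping the contents of `final_model` is enough to point the downstream model at a work-in-progress run, without touching `metadata`.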

Comment on lines +20 to +22
dbt.source("model", "metadata")
    .join(
        dbt.ref("model.final_model").select("run_id"),
Member Author

Switching to dbt.source() and dbt.ref() for our queries here has the effect of including these upstream models in the directed graph that dbt builds for this model. That means that when we merge any changes to the model.final_model_raw seed in the future, our workflow will also rebuild model.training_data when it builds all children of modified resources.
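
To make the rebuild behavior concrete, here is a rough sketch of how "children of modified resources" could be resolved from a dependency graph, similar in spirit to dbt's `state:modified+` selector. The resource names come from this PR, but the edges between them are assumptions made for illustration, and dbt's real graph resolution is more involved.

```python
from collections import deque

# Hypothetical slice of the dbt graph: each key lists its direct children.
# Only the resource names come from this PR; the edges are assumed.
CHILDREN = {
    "model.final_model_raw": ["model.final_model"],
    "model.final_model": ["model.training_data"],
    "model.metadata": ["model.training_data"],
    "model.training_data": [],
}


def descendants(modified, children):
    """Return every resource downstream of `modified`, breadth-first."""
    seen = []
    queue = deque(children.get(modified, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.append(node)
            queue.extend(children.get(node, []))
    return seen


to_rebuild = descendants("model.final_model_raw", CHILDREN)
```

If the model queried the tables by name instead of through `dbt.source()` and `dbt.ref()`, the edges into `model.training_data` would be missing from this graph and it would not appear in `to_rebuild`.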

  pin_cd.class_code IS NULL -- Class is not in our class dict
  OR NOT pin_cd.regression_class
- OR (pin_cd.modeling_group NOT IN ('SF', 'MF'))
+ OR (pin_cd.modeling_group NOT IN ('SF', 'MF', 'BB'))
Member Author

We do include B&Bs in the training and assessment sets for the model, even though they usually get modeled by hand. That means that a B&B will wind up with is_report_eligible == TRUE and reason_report_ineligible == 'non_regression_class' unless we allow the 'BB' modeling group here. This is not really a huge deal, since it doesn't affect the HomeVal reports that we generate for these PINs, but it means that all B&Bs fail the pinval_assessment_card_reason_report_ineligible_is_null_when_is_report_eligible data test, which adds unnecessary noise to that test.
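
The interaction between the condition and the data test can be sketched as follows. This is a simplified Python rendering of the SQL condition, not the actual query: the example card, its class code, and the assumption that `is_report_eligible` is computed separately are all hypothetical, and SQL null handling is ignored.

```python
def reason_report_ineligible(class_code, regression_class, modeling_group, allowed_groups):
    """Mirror the three conditions in the diff; return None when none apply."""
    if (
        class_code is None  # Class is not in our class dict
        or not regression_class
        or modeling_group not in allowed_groups
    ):
        return "non_regression_class"
    return None


def passes_data_test(is_report_eligible, reason):
    """The reason must be null whenever the card is report-eligible."""
    return (not is_report_eligible) or reason is None


# Hypothetical B&B card; the class code and flag values are assumed, and
# eligibility is assumed to be computed separately from the reason.
bb_card = {
    "class_code": "example-class",
    "regression_class": True,
    "modeling_group": "BB",
}
is_report_eligible = True

reason_before = reason_report_ineligible(**bb_card, allowed_groups=("SF", "MF"))
reason_after = reason_report_ineligible(**bb_card, allowed_groups=("SF", "MF", "BB"))
```

Under these assumptions, `reason_before` is non-null for an eligible card, which is the combination the data test rejects, while `reason_after` is null.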

@jeancochrane jeancochrane marked this pull request as ready for review February 17, 2026 23:24
@jeancochrane jeancochrane requested a review from a team as a code owner February 17, 2026 23:24
@jeancochrane
Member Author

Tagging in @wagnerlmichael as a domain expert, and @wrridgeway as a codeowner.

Member

@wrridgeway wrridgeway left a comment

This looks great, thanks.

Member

@wagnerlmichael wagnerlmichael left a comment

Nice!

  '2024-03-17-stupefied-maya',
- '2025-02-11-charming-eric'
+ '2025-02-11-charming-eric',
+ '2026-02-11-recursing-rob'
Member

[Thought, non-blocking]: I wonder if it would make sense to add another column to pinval.model_run.csv so that these run IDs could be picked algorithmically from that seed. It irks me that we can't use it for this filter.

Member Author

Yup, I've been thinking about that too! I added a note to myself to discuss during our 2026 modeling retrospective.
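
As a starting point for that discussion, one possible shape for the idea is sketched below. Everything about the seed layout here is an assumption: the `use_in_filter` column does not exist, the real columns of pinval.model_run.csv are not shown in this PR, and `example-wip-run` is a made-up run ID. Only the three dated run IDs are taken from the filter above.

```python
import csv
import io

# Hypothetical seed contents. The `use_in_filter` column is invented for
# this sketch, and the real seed's other columns are omitted.
SEED_CSV = """run_id,use_in_filter
2024-03-17-stupefied-maya,true
2025-02-11-charming-eric,true
2026-02-11-recursing-rob,true
example-wip-run,false
"""


def filter_run_ids(seed_text):
    """Return the run IDs flagged for inclusion, in seed order."""
    reader = csv.DictReader(io.StringIO(seed_text))
    return [row["run_id"] for row in reader if row["use_in_filter"] == "true"]


run_ids = filter_run_ids(SEED_CSV)

# Render the IDs as the body of a SQL IN (...) list
in_list = ", ".join(f"'{run_id}'" for run_id in run_ids)
```

In a dbt project, the same selection could presumably be done with a subquery against the seed instead of generating a literal list; the sketch only shows the selection logic.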

Base automatically changed from jeancochrane/add-2026-models-to-final-model-table to master February 18, 2026 19:27