Update pinval.model_run with 2026 model runs #985
jeancochrane wants to merge 10 commits into master
…` to resolve errors
…neligible_is_null_when_is_report_eligible` to match current failure count
f62cfbe to 21c489d
…ible_not_non_tri_for_tri`
| # Get model metadata for every final model. We do this by inner joining | ||
| # The `metadata` table to the `final_model` table instead of filtering | ||
| # the metadata table by `run_type == 'final'` to make it easier to run | ||
| # tests on this table, since we can control the contents of `final_model` | ||
| # via a dbt seed |
I came to realize this quirk of the model while testing a staging HomeVal deployment, so I figured I'd persist the change to make future HomeVal staging deployments easier.
| dbt.source("model", "metadata") | ||
| .join( | ||
| dbt.ref("model.final_model").select("run_id"), |
Switching to dbt.source() and dbt.ref() for our queries here has the effect of including these upstream models in the directed graph that dbt builds for this model. That means that when we merge any changes to the model.final_model_raw seed in the future, our workflow will also rebuild model.training_data when it builds all children of modified resources.
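To illustrate the rebuild behavior described above, here is a minimal sketch of how "build all children of modified resources" plays out once the upstream references are part of the graph. The edge list is a simplified assumption based on the names in this comment (in particular, that model.final_model is built from the model.final_model_raw seed), and `descendants` is a hypothetical helper, not dbt's own implementation.

```python
from collections import deque

# Assumed, simplified dependency edges (parent -> children).
# Only the resource names come from the PR; the exact graph is a guess.
EDGES = {
    "model.final_model_raw": ["model.final_model"],  # seed -> model (assumed)
    "model.metadata": ["model.training_data"],       # via dbt.source()
    "model.final_model": ["model.training_data"],    # via dbt.ref()
}


def descendants(modified, edges):
    """Return every resource downstream of the modified ones (breadth-first)."""
    seen = set()
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# A change to the seed now reaches model.training_data through model.final_model.
to_rebuild = descendants({"model.final_model_raw"}, EDGES)
print(sorted(to_rebuild))
```

Without the dbt.ref() edge from model.final_model, the same traversal would stop before model.training_data, which is the difference this change introduces.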
| pin_cd.class_code IS NULL -- Class is not in our class dict | ||
| OR NOT pin_cd.regression_class | ||
| OR (pin_cd.modeling_group NOT IN ('SF', 'MF')) | ||
| OR (pin_cd.modeling_group NOT IN ('SF', 'MF', 'BB')) |
We do include B&Bs in the training and assessment sets for the model, even though they usually get modeled by hand. That means that a B&B will wind up with is_report_eligible == TRUE and reason_report_ineligible == 'non_regression_class' unless we allow the 'BB' modeling group here. This is not really a huge deal, since it doesn't affect the HomeVal reports that we generate for these PINs, but it means that all B&Bs fail the pinval_assessment_card_reason_report_ineligible_is_null_when_is_report_eligible data test, which adds unnecessary noise to that test.

Tagging in @wagnerlmichael as a domain expert, and @wrridgeway as a codeowner.
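As a rough illustration of the effect on B&Bs, the three conditions in the diff can be rendered in Python and evaluated with the old and new modeling group lists. This is only a sketch: `is_non_regression_class` is a hypothetical helper, the example card uses a placeholder class code, it assumes a B&B has a truthy regression_class, and it ignores SQL's NULL semantics for NOT IN.

```python
def is_non_regression_class(class_code, regression_class, modeling_group, allowed_groups):
    """Python rendering of the three OR'd conditions shown in the diff."""
    return (
        class_code is None  # class is not in the class dict
        or not regression_class
        or modeling_group not in allowed_groups
    )


OLD_GROUPS = ("SF", "MF")
NEW_GROUPS = ("SF", "MF", "BB")

# Hypothetical B&B card; the class code is a placeholder, not a real value.
bnb = {"class_code": "<bnb-class>", "regression_class": True, "modeling_group": "BB"}

flagged_before = is_non_regression_class(**bnb, allowed_groups=OLD_GROUPS)
flagged_after = is_non_regression_class(**bnb, allowed_groups=NEW_GROUPS)
print(flagged_before, flagged_after)
```

Under these assumptions, the B&B matches the condition only with the old group list, which is the mismatch that made the data test noisy.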
| '2024-03-17-stupefied-maya', | ||
| '2025-02-11-charming-eric' | ||
| '2025-02-11-charming-eric', | ||
| '2026-02-11-recursing-rob' |
[Thought, non-blocking]: I wonder if it would make sense to add another column in pinval.model_run.csv such that these could be algorithmically picked from that seed. It irks me that we can't use it for this filter here.
Yup, I've been thinking about that too! I added a note to myself to discuss during our 2026 modeling retrospective.
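One way the idea in this thread might look, purely as a sketch: a boolean column in the seed marks which runs belong in the filter, and the list is derived from it instead of being hard-coded. The column names `run_id` and `use_in_filter`, the helper `select_run_ids`, and the CSV contents below are all hypothetical; the real pinval.model_run.csv layout is not shown in this PR.

```python
import csv
import io

# Illustrative seed contents only; the column names are assumptions and the
# last row is a made-up placeholder to show a run that is left out.
SEED_CSV = """run_id,use_in_filter
2025-02-11-charming-eric,true
2026-02-11-recursing-rob,true
example-run-left-out,false
"""


def select_run_ids(seed_text, flag_column="use_in_filter"):
    """Return the run IDs whose flag column is 'true', in file order."""
    reader = csv.DictReader(io.StringIO(seed_text))
    return [
        row["run_id"]
        for row in reader
        if row[flag_column].strip().lower() == "true"
    ]


run_ids = select_run_ids(SEED_CSV)
print(run_ids)
```

In a dbt project the same selection would more likely live in SQL or a macro that reads the seed; the Python here only shows the shape of the lookup.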
This PR updates the seed that powers https://github.com/ccao-data/homeval/ so that we can perform HomeVal deployments with 2026 model runs.
I've already run two different HomeVal deployments off of this PR, using different model runs in the pinval.model_run seed. See the latest one here: https://github.com/ccao-data/homeval/actions/runs/22119406898