#288: Fix incorrect grade for buildings missing reporting years #292
Open
jacobdiaz wants to merge 3 commits into
Buildings that stopped reporting had no row at all for missing years (as opposed to a "Not Submitted" row), so they weren't counted as missing. Now we calculate expected years from the building's first appearance through the latest year in the dataset. Fixes vkoves#288
✅ Deploy Preview for radiant-cucurucho-d09bae ready!
Owner
@jacobdiaz - two things. First, take a look at the run all file: if you make a data pipeline change, you have to run it to regenerate the output files (see https://github.com/vkoves/electrify-chicago#run-data-processing). Second, can you add a test that would catch this issue and validate your fix? Thanks for the great work!

Problem
Buildings that stopped reporting were still getting a perfect submission rate (and thus an inflated overall grade). The issue was in `calculate_building_submission_rate`: it used `len(group)` as `total_years`, which only counts rows actually present in the dataframe. If a building simply had no row for a year (the city just doesn't include them rather than marking them "Not Submitted"), that gap was invisible.

For example, Digital Lakeside (ID 101185) has data for 2019-2022 but nothing for 2023. Since all 4 rows had a submitted status, it got 4/4 = 100% submission rate and an A for reporting.
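The problem can be reproduced with a minimal sketch. This is not the project's actual code: the real function works on a dataframe group, while this stand-in uses a plain list of dicts, and the field names `year` and `status`, the `"Submitted"` value, and the helper name `row_count_submission_rate` are all assumptions made for illustration.

```python
# Hypothetical rows for one building (2019-2022 present, 2023 absent).
# Field names and the "Submitted" value are assumptions, not the real schema.
rows = [
    {"year": 2019, "status": "Submitted"},
    {"year": 2020, "status": "Submitted"},
    {"year": 2021, "status": "Submitted"},
    {"year": 2022, "status": "Submitted"},
]


def row_count_submission_rate(group):
    """Denominator is the number of rows present, as described above."""
    total_years = len(group)  # a year with no row never enters the count
    submitted = sum(1 for r in group if r["status"] != "Not Submitted")
    return submitted / total_years


rate = row_count_submission_rate(rows)
print(rate)  # 1.0, even though 2023 is missing entirely
```

Because the missing year never appears as a row, nothing in the denominator can account for it.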
Fix
Changed `total_years` to be calculated as the span from the building's first appearance through the latest year in the dataset (`max_year - first_year + 1`). Years where the building has no row at all now count as missing alongside explicit "Not Submitted" rows.

Digital Lakeside now correctly gets 4/5 = 80% instead of 4/4 = 100%.
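The span-based calculation might look roughly like the sketch below. It is a simplified stand-in rather than the code in this PR: it uses a list of dicts instead of a dataframe, and the field names `year` and `status`, the `"Submitted"` value, and the helper name `span_submission_rate` are assumptions made for illustration.

```python
def span_submission_rate(group, max_year):
    """Submission rate over the span from first appearance to max_year.

    `group` holds one building's rows; `max_year` is the latest year
    anywhere in the dataset. Both shapes are assumptions for this sketch.
    """
    first_year = min(r["year"] for r in group)
    total_years = max_year - first_year + 1  # includes years with no row
    submitted = sum(1 for r in group if r["status"] != "Not Submitted")
    return submitted / total_years


# Hypothetical data shaped like the Digital Lakeside example:
# rows for 2019-2022, dataset runs through 2023.
lakeside = [{"year": y, "status": "Submitted"} for y in range(2019, 2023)]
lakeside_rate = span_submission_rate(lakeside, max_year=2023)
print(lakeside_rate)  # 0.8

# An explicit "Not Submitted" row and an absent year both count as missing.
mixed = [
    {"year": 2019, "status": "Submitted"},
    {"year": 2020, "status": "Not Submitted"},
]
mixed_rate = span_submission_rate(mixed, max_year=2021)  # 1 of 3 years
```

One consequence of this choice is that a building is never penalized for years before its first appearance, only for gaps afterwards.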
Testing
Fixes #288