
Conversation


@nm3224 nm3224 commented Nov 24, 2025

changes & context

Small tweaks/improvements

  • feat: log oddities around missing "M" course grades
  • feat: add cohort and cohort-term breakdowns, plus additional logging statements for the inference pipeline
  • fix: cap gateway features at 10 instead of 25
  • fix: the "<0.1%%" message printed a doubled percent sign; changed to "<0.1 percent" as a workaround (couldn't get it to print just "0.1%", not sure why)
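A minimal sketch of what the cohort / cohort-term breakdown logging could look like, assuming a pandas frame; the `cohort` and `cohort_term` column names here are illustrative, not necessarily the project's actual schema:

```python
import pandas as pd

# Hypothetical student records; values are made up for illustration.
df = pd.DataFrame(
    {
        "cohort": ["2021", "2021", "2022", "2022", "2022"],
        "cohort_term": ["FALL", "SPRING", "FALL", "FALL", "SPRING"],
    }
)

# Per-cohort breakdown
print(df["cohort"].value_counts().to_string())

# Cohort x term breakdown (one count per cohort/term pair)
print(df.groupby(["cohort", "cohort_term"]).size().to_string())
```

Piping these `to_string()` outputs through the pipeline's logger would give a compact audit of how students distribute across cohorts and terms.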

questions

  • does anyone know how to get it to print "0.1%" instead of "0.1%%"?
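On the `%%` question: assuming the project uses Python's stdlib `logging` with %-style messages, one common cause is that `%%` only collapses to a literal `%` when the message is actually %-formatted, and that formatting only runs when arguments are passed to the logging call. With no arguments, the string passes through verbatim, `%%` included. A sketch:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("audit")

# Arguments supplied -> the message is %-formatted, so "%%" becomes "%":
logger.info("<%.1f%% of rows affected", 0.1)  # logs: <0.1% of rows affected

# No arguments -> the message passes through untouched, "%%" stays "%%":
logger.info("<0.1%% of rows affected")  # logs: <0.1%% of rows affected

# An f-string sidesteps %-escaping entirely:
pct = 0.1
logger.info(f"<{pct}% of rows affected")  # logs: <0.1% of rows affected
```

So either pass the number as an argument with a `%.1f%%` placeholder, or build the string with an f-string and a single `%`.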

@nm3224 nm3224 changed the base branch from main to develop November 24, 2025 17:57
@nm3224 nm3224 changed the title feats & fixes: additional data audit loggers and reducing gateway course limit from 25 to 10 feats: additional data loggers and enhancements, reducing gateway course limit from 25 to 10 Nov 24, 2025
@nm3224 nm3224 changed the title feats: additional data loggers and enhancements, reducing gateway course limit from 25 to 10 feat: additional data loggers and enhancements, reducing gateway course limit from 25 to 10 Nov 24, 2025
…is triggered

currently, the logs are not being updated when the same model run ID is reused for a new inference run. this should fix that (changed mode = "a" (append) to mode = "w" (write)). Still needs testing.
except Exception:
    pass

# --- IMPORTANT PART 2: open in write mode so we overwrite each run ---
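The append-vs-write behavior being changed here can be sketched as follows, assuming Python's `logging.FileHandler`; the handler setup, logger name, and log path are hypothetical, not the project's actual code:

```python
import logging
import os
import tempfile

# Hypothetical log location for illustration only.
log_path = os.path.join(tempfile.mkdtemp(), "inference.log")

def log_inference_run(message: str) -> None:
    # mode="w" truncates the file on open, so each inference run replaces
    # the previous log. mode="a" would keep appending under the same model
    # run ID, and the file would never reflect just the latest run.
    handler = logging.FileHandler(log_path, mode="w")
    logger = logging.getLogger("inference_audit")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.info(message)
    logger.removeHandler(handler)
    handler.close()

log_inference_run("first inference run")
log_inference_run("second inference run")
with open(log_path) as f:
    print(f.read())  # only "second inference run" remains
```

This also illustrates the trade-off raised below: write mode guarantees the file matches the latest run, at the cost of discarding earlier runs unless they are archived elsewhere (e.g. per-run subfolders).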
Collaborator


What are we overwriting here? When we re-run inference, shouldn't that just produce a new run ID subfolder?

Collaborator Author


when we re-run inference if it's the same training model (so if we didn't need to retrain) the logs for some reason aren't getting updated. I still need to test this to ensure it works but we need the logs to be updated to the most recent inference run.

Collaborator Author

@nm3224 nm3224 Dec 1, 2025


this makes me think that under the inference folder, we should have additional subfolders for old & new inference runs

Collaborator


I see, so we are only keeping the latest inference run then? I think I remember discussing this. Ideally, we want to keep all historical inference run logs as well.

But yeah, for now, I understand we need to prioritize capturing the latest and ensuring that is done correctly.

Collaborator Author


agreed, we do need to keep old runs

@vishpillai123 vishpillai123 merged commit 1673e58 into develop Dec 1, 2025
6 checks passed
@vishpillai123 vishpillai123 deleted the data_audit_enhancement branch December 1, 2025 20:24