Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the package version to 1.9.12 and replaces hardcoded 'Duration' values with 'NA_real_' in the PRISM and PSet importers. While unit tests were added for the PRISM importer, feedback suggests adding similar tests for the PSet importer to ensure consistency. Additionally, it is recommended to retain or update the documentation comments in the PSet importer to explain the initialization of the 'Duration' field.
| if (!duration %in% names(dt)) { | ||
| dt[, (duration) := 72] | ||
| dt[, (duration) := NA_real_] | ||
| } |
There was a problem hiding this comment.
While replacing the hardcoded value with NA_real_ is correct, removing the comment entirely makes the code less self-documenting. It would be beneficial to keep a comment explaining that Duration is a required field in the gDR pipeline and is being initialized with NA_real_ when missing from the source data to maintain semantic honesty.
| expect_true(all(is.na(df_prism$result$Duration))) | ||
| expect_type(df_prism$result$Duration, "double") |
There was a problem hiding this comment.
The added tests correctly verify that Duration is NA and of type double for Level 6 PRISM data. However, similar assertions are missing for the PSet importer tests in tests/testthat/test-pset_to_gdrDF.R. To ensure consistency and prevent regressions, consider adding equivalent checks to the PSet unit tests where the default NA_real_ assignment is expected to occur.
Description
What changed?
Related JIRA issue: GDR-3366
Replaced hardcoded
Durationvalues withNA_real_in PRISM and PSet importers:prism_to_gdrDF.R: removed hardcoded120for control and treatment data inconvert_LEVEL6_prism_to_gDR_input()pset_to_gdrDF.R: removed hardcoded72fallback when Duration is missing from PSet dataWhy was it changed?
The hardcoded values (
120for PRISM level6,72for PSets) were arbitrary and factually incorrect — they do not reflect the actual treatment duration from the source data. UsingNA_real_is semantically honest: the duration is unknown and should not be fabricated.Checklist for sustainable code base
Logistic checklist
Screenshots (optional)