Conversation
There was a problem hiding this comment.
Code Review
This pull request updates gDRimport to version 1.9.10 and introduces support for generating a 'Day 0' template for D300 data via a new day0 parameter in the import_D300 function. The changes include logic for creating a trt_day0.xlsx file, handling plate dimensions from XML tags, and various test improvements. Feedback focuses on improving performance and robustness by vectorizing matrix operations, using more efficient data table manipulations, and ensuring consistent fallback logic for plate dimensions.
| for (m in seq_len(nrow_plate)) { | ||
| for (n in seq_len(ncol_plate)) { | ||
| if (has_meta) { | ||
| # Legacy logic: fill full bounding box (Unit Tests expect this) | ||
| drug_mat[m, n] <- idfs$untreated_tags[[1]] | ||
| conc_mat[m, n] <- 0.0 | ||
| } else { | ||
| # New logic: explicitly respect the outer edge blanking rule | ||
| is_edge <- (m == 1 || m == nrow_plate || n == 1 || n == ncol_plate) | ||
| if (is_edge) { | ||
| drug_mat[m, n] <- "" | ||
| conc_mat[m, n] <- "" | ||
| } else { | ||
| drug_mat[m, n] <- idfs$untreated_tags[[1]] | ||
| conc_mat[m, n] <- 0.0 | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The nested loops for filling drug_mat and conc_mat can be replaced with vectorized operations. This is more idiomatic in R, improves readability, and is more efficient for larger plate sizes.
drug_mat[] <- idfs$untreated_tags[[1]]
conc_mat[] <- 0.0
if (!has_meta) {
# New logic: explicitly respect the outer edge blanking rule
drug_mat[c(1, nrow_plate), ] <- ""
drug_mat[, c(1, ncol_plate)] <- ""
conc_mat[c(1, nrow_plate), ] <- ""
conc_mat[, c(1, ncol_plate)] <- ""
}| # Sort only the plate list numerically to ensure trt_1, trt_2 files generate in chronological order | ||
| uplates <- sort(as.numeric(unique(treatment$D300_Plate_N))) | ||
| # create a list with Gnumber and Concentration | ||
| trt_filt$gn_conc <- apply(trt_filt, 1, function(x) list(x[idfs$drug_identifier], x[idfs$conc_identifier])) |
There was a problem hiding this comment.
Using apply() with MARGIN = 1 on a data.table is inefficient as it coerces the data to a matrix/vector first, which also forces all columns to a common type (usually character). Since trt_filt is a data.table, it is better to use mapply() which is vectorized and preserves data types.
trt_filt$gn_conc <- mapply(list, trt_filt[[idfs$drug_identifier]], trt_filt[[idfs$conc_identifier]], SIMPLIFY = FALSE)| # Extract actual plate dimensions from the XML Dimension tag (e.g. "(8,12)") | ||
| dim_str <- trt_filt$Dimension[1] | ||
| dims <- as.integer(strsplit(gsub("\\(|\\)", "", dim_str), ",")[[1]]) |
There was a problem hiding this comment.
For consistency and robustness, consider adding a fallback for plate dimensions similar to the one implemented in the day0 block (lines 77-80). If the Dimension tag is missing or malformed, the current code might fail when calculating nwells in save_drug_info_per_well.
# Extract actual plate dimensions from the XML Dimension tag (e.g. "(8,12)")
dim_str <- trt_filt$Dimension[1]
dims <- as.integer(strsplit(gsub("\\(|\\)", "", dim_str), ",")[[1]])
# Fallback to absolute max if dimension string is somehow corrupt
if (is.na(dims[1]) || is.na(dims[2])) {
dims <- c(max(as.numeric(treatment$Row), na.rm = TRUE),
max(as.numeric(treatment$Col), na.rm = TRUE))
}
Description
What changed?
Related JIRA issue: GDR-3355
Why was it changed?
Generate day0 template
Checklist for sustainable code base
Logistic checklist
Screenshots (optional)