Allow aggregating expression by multiple columns by rfriedman22 · Pull Request #42 · cole-trapnell-lab/hooke

rfriedman22 · 2025-08-04T21:23:45Z

I extended hooke:::aggregated_expr_data() so that the user can specify multiple columns to aggregate by (e.g. both cell type and perturbation). I tested the updated function with default parameters and confirmed that it still returns the same result: mean_expression, fraction_expressing, and specificity are equal for all values, and the first column of the result is still cell_group if there is only one column. However, if multiple columns are specified, the result now puts all of those columns first, for example:

agg_expr <- hooke:::aggregated_expr_data(cds, c("perturbation", "cell_type_broad_abbrev")) %>%
  head()

returns:

  perturbation cell_type_broad_abbrev            gene_id gene_short_name
1     ctrl-inj                    RPC ENSDARG00000000001         slc35a5
2     ctrl-inj                    RPC ENSDARG00000000002          ccdc80
3     ctrl-inj                    RPC ENSDARG00000000018            nrf1
4     ctrl-inj                    RPC ENSDARG00000000019           ube2h
5     ctrl-inj                    RPC ENSDARG00000000068       slc9a3r1a
6     ctrl-inj                    RPC ENSDARG00000000069             dap
  fraction_expressing mean_expression specificity
1         0.002456332     0.002757203 0.051766341
2         0.002183406     0.001340389 0.075315424
3         0.086244541     0.083973716 0.077649102
4         0.021834061     0.019834487 0.016199531
5         0.003548035     0.002764963 0.008075803
6         0.036026201     0.034195728 0.027027247
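For readers who want to see the grouping idea in isolation, here is a minimal base-R sketch of aggregating by more than one metadata column. It is not the hooke implementation: the toy matrix, the `cell_meta` table, and the `aggregate_by_columns()` helper are all made up for illustration, and it averages raw counts without any normalization or specificity scoring.

```r
# Toy stand-in for a cell_data_set: a genes x cells matrix plus per-cell metadata.
# Every name and value below is hypothetical.
counts <- matrix(
  c(0, 2, 0, 1,
    3, 0, 0, 4),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4))
)
cell_meta <- data.frame(
  perturbation = c("ctrl", "ctrl", "ko", "ko"),
  cell_type    = c("RPC", "RPC", "RPC", "RPC"),
  row.names    = colnames(counts)
)

# Hypothetical helper: summarize expression for every observed combination
# of the columns named in group_cols.
aggregate_by_columns <- function(counts, cell_meta, group_cols) {
  groups <- unique(cell_meta[, group_cols, drop = FALSE])
  per_group <- lapply(seq_len(nrow(groups)), function(i) {
    # Cells that match this combination in every grouping column
    in_group <- Reduce(`&`, lapply(group_cols, function(col) {
      cell_meta[[col]] == groups[[col]][i]
    }))
    sub <- counts[, in_group, drop = FALSE]
    data.frame(
      groups[rep(i, nrow(sub)), , drop = FALSE],  # grouping columns come first
      gene_id = rownames(sub),
      fraction_expressing = rowMeans(sub > 0),
      mean_expression = rowMeans(sub)
    )
  })
  out <- do.call(rbind, per_group)
  rownames(out) <- NULL
  out
}

agg_toy <- aggregate_by_columns(counts, cell_meta, c("perturbation", "cell_type"))
print(agg_toy)
```

Passing a single column name to the same helper gives one grouping column followed by the per-gene summaries, which mirrors the layout described above.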

rfriedman22 · 2025-08-18T21:17:57Z

Per discussion with Maddy, the specificity scoring is now optional. This is the slowest step, and if we are aggregating by e.g. cell type and perturbation, we don't want specificity metrics to be stratified by perturbation status.
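As a rough illustration of what an optional specificity step can look like, here is a small base-R sketch. The argument name `compute_specificity`, the `summarize_groups()` helper, and the toy values are assumptions for this example only, and the score used here (each group's share of a gene's summed mean expression) is a simple stand-in rather than the metric hooke computes.

```r
# Hypothetical helper; `compute_specificity` is an illustrative argument name,
# not necessarily the one used in hooke.
summarize_groups <- function(mean_expr, compute_specificity = TRUE) {
  # mean_expr: genes x groups matrix of mean expression
  out <- data.frame(
    gene_id = rep(rownames(mean_expr), times = ncol(mean_expr)),
    group = rep(colnames(mean_expr), each = nrow(mean_expr)),
    mean_expression = as.vector(mean_expr)
  )
  if (compute_specificity) {
    # Stand-in score: share of a gene's total mean expression found in each group
    totals <- rowSums(mean_expr)
    spec <- mean_expr / totals
    spec[totals == 0, ] <- 0  # genes with no expression anywhere get a score of 0
    out$specificity <- as.vector(spec)
  }
  out
}

# Toy input: two genes, two groups defined by perturbation and cell type
mean_expr <- matrix(
  c(1, 1.5, 0.5, 2),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("ctrl|RPC", "ko|RPC"))
)

with_spec    <- summarize_groups(mean_expr)
without_spec <- summarize_groups(mean_expr, compute_specificity = FALSE)
```

In this toy case the groups differ only by perturbation, so any score computed across them is split by perturbation status. Skipping the step avoids both that stratification and the extra computation.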

rfriedman22 added 2 commits August 4, 2025 14:18

Allow aggregating expression by multiple columns

e97cb77

Small edit to keep original behavior of function regardless of what the single column is named

fa1b0d1

rfriedman22 requested a review from maddyduran August 4, 2025 21:23

Let specificity be optional

9bcbf34

maddyduran merged commit c685628 into develop Aug 19, 2025
1 check failed

maddyduran deleted the agg_multiple_cols branch August 19, 2025 16:04

maddyduran merged 3 commits into develop from agg_multiple_cols