…ment for MCView. Replace direct variable access with mcv_get and mcv_set functions across the codebase to improve maintainability and testing. Update related functions and documentation accordingly.
Replace ~20 instances of loading full gene×metacell matrices (28K×2.4K) just for metadata (counts, names, existence checks) with O(1) DAF axis queries:
- get_mc_sum(): use axis_entries instead of loading mc_mat for names
- has_corrected/projection/network/cell_metadata/samples: use DAF has_matrix/has_vector/has_axis instead of loading full data
- qc_value_box, projection_qc: use axis_length instead of ncol(mc_mat)
- common_genes box: use axis_entries for both datasets
- metacell_selector, metacell_names_reactive: use axis_entries
- initial_proj_point_size, initial_scatters_point_size: use axis_length
- app_server tab checks: use has_matrix for inner_fold/stdev
- calc_top_cors: push metacell filter down to DAF query
- calc_obs_exp_mc_df: query single-metacell fractions via DAF instead of loading two full corrected/projected matrices
- Fix S4 dgeMatrix incompatibility with tgs_cor in calc_top_cors
- Replace c() growth in loop with list accumulation + unlist in daf_query_mc_mat sparse matrix construction (daf_data.R:276-286)
- Add comprehensive vignette documenting exactly what the input DAF needs: required axes/vectors/matrices, optional components per tab, configuration scalars, precomputable data recommendations
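The list-accumulation pattern named in the first bullet can be sketched as follows. This is a minimal illustration, not the code in daf_data.R: `collect_triplets` and its input layout (a list of dense per-gene columns) are hypothetical stand-ins.

```r
# Hypothetical helper: gather the non-zero entries of each column into
# (i, j, x) triplets without growing vectors via c() inside the loop.
collect_triplets <- function(columns) {
  n <- length(columns)
  # Preallocate one list slot per column; each slot is filled exactly once.
  i_parts <- vector("list", n)
  j_parts <- vector("list", n)
  x_parts <- vector("list", n)
  for (k in seq_len(n)) {
    nz <- which(columns[[k]] != 0)
    i_parts[[k]] <- nz
    j_parts[[k]] <- rep.int(k, length(nz))
    x_parts[[k]] <- columns[[k]][nz]
  }
  # A single unlist() per vector replaces repeated reallocation by c().
  list(i = unlist(i_parts), j = unlist(j_parts), x = unlist(x_parts))
}

cols <- list(c(0, 2, 0), c(1, 0, 3), c(0, 0, 0))
trip <- collect_triplets(cols)
```

The resulting triplets have the shape that `Matrix::sparseMatrix(i = , j = , x = , dims = )` accepts, which is presumably how a sparse matrix would then be assembled.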
- Convert mc_egc to dense once in calc_marker_genes instead of calling as.matrix() twice (for rowMaxs and rowMedians)
- Combine sparse row filter + marker intersection into single subset before as.matrix() conversion in get_marker_matrix (4 mode branches)
…licate and improve quality
7 refactoring batches across 51 files (917 insertions, 1116 deletions):
Batch 1 - Bug fixes (14): parenthesis bug in mod_mc_mc groupB selection, cache key typo in mod_samples, undefined variable refs in plot_mc2d_proj and plot_metadata, parameter shadowing in mod_atlas, strict passthrough in daf_core, Manhattan-to-Euclidean distance fix in utils_mc_mc, typo fixes, epsilon guards on log2 calls, split monolithic observe in app_server.
Batch 2 - Dead code removal (~350 lines): 7 legacy cache functions from daf_cache, if(0) block in utils_network, dead functions in utils_gene_modules and utils_dt, unreachable return in utils_markers, commented code in mod_flow.
Batch 3 - DAF layer deduplication: shared helpers (daf_query_named_vector, daf_query_gene_agg, convert_daf_fraction_to_umi, coerce_vec_for_daf, set_daf_vectors_from_df), MCVIEW_TAB_NAMES constant replacing 3 copies.
Batch 4 - Module deduplication: move_cell_type helper in mod_annotate (~100 lines dedup), observe_group_selection in mod_query, purrr::walk for observers, removed dead variables, simplified tagList patterns, fixed duplicate HTML id.
Batch 5 - Utility consolidation: egc_to_fp helper, filter_metadata_field_names helper, .env$ pronoun cleanup, is_gene_color_mode helper, removed redundant assignments and fetches.
Batch 6 - Plot deduplication: mc2d_add_graph_edges helper (3x dedup), O(n²) append elimination in plot_vein, config access standardization, removed wasteful metadata_colors fetches.
Batch 7 - Quality: vertical_gridlines typo fix, deprecated dplyr replacements (mutate_at/summarise_at/summarise_all -> across), removed self-assignments and duplicate sanitize_plotly_download call.
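Two of the Batch 1 fix types, the epsilon guard on log2 calls and the Manhattan-to-Euclidean change, can be illustrated in a few lines. This is a sketch under assumptions: `safe_log2_fold` and the epsilon value of 1e-5 are hypothetical and are not taken from the MCView sources.

```r
# Hypothetical guarded fold change: adding epsilon to both sides keeps the
# ratio finite when both expression fractions are zero.
safe_log2_fold <- function(a, b, epsilon = 1e-5) {
  log2((a + epsilon) / (b + epsilon))
}

a <- c(0, 1e-3, 2e-3)
b <- c(0, 1e-3, 1e-3)
unguarded <- log2(a / b)        # first entry is NaN because 0 / 0 is NaN
guarded <- safe_log2_fold(a, b) # every entry is finite

# Manhattan versus Euclidean distance between the same two points.
p <- c(0, 0)
q <- c(3, 4)
manhattan <- sum(abs(p - q))
euclidean <- sqrt(sum((p - q)^2))
```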
- Scatter plots: extract 4 shared helpers (apply_gene_axis_scale, apply_scatter_color_layer, resolve_gene_color, resolve_numeric_md_color) from 3 near-identical functions in plot_metadata.R
- render_2d_plotly: decompose 280-line monolith into 45-line orchestrator + 16 focused handler functions with clean switch() dispatch
- utils_heatmap: split into 3 files (server logic, UI builders, helpers); extract heatmap_tooltip_handler and heatmap_download_handlers
- Group management: unify duplicated DT/observer code from mod_mc_mc.R and mod_query.R into shared utils_group_box.R (4 functions)
849 insertions, 1109 deletions across 6 files; 2 new files (313 lines). All tests pass, NAMESPACE unchanged.
Prior DAF fork maintenance: update NAMESPACE exports, regenerate documentation, add smoke and integration test scaffolding, and clean up miscellaneous module imports.
… options
- Add plot type selector (boxplot/violin/sina) with ggforce dependency
- Add facet-by metadata variable support
- Add categorical x-axis variable with category selection
- Add coord_flip and log_scale toggles in boxSidebar (#4)
- Add dynamic renderUI for gene, metadata, and gene module selectors
- Add shinyjs::toggle observer to show/hide selectors based on color_proj
- Fix gene_modules selector clobbering gene selector in master
- Fix undefined project reference in atlas tab (#8)
- Add mod_gene_correlation.R (771 lines): gene list input, correlation modes (individual/module/gene-gene), heatmap+barplot+table output
- Add 5 helper functions to utils_gene_mc.R: calc_individual_correlations, calc_module_correlations, calc_gene_gene_correlations, plot_correlation_heatmap, plot_correlation_barplot
- Wire into app_config.R, daf_contracts.R, daf_core.R
- Replace rclipboard clipboard with downloadHandler for gene lists
- Wrap sparse DAF matrices with as.matrix() for compatibility (#3)
…s_cor
- Port clipboard_copy_button_ui/server from master to utils_clipboard.R
- Add rclipboard to DESCRIPTION Imports
- Add "Copy Genes" clipboard button alongside download in gene correlation
- Shorten radio button labels ("Find correlated" / "Gene-gene cor.") to
prevent text overflow in justified radioGroupButtons
- Replace cor() with BLAS-accelerated tgs_cor in calc_gene_gene_correlations
and plot_correlation_heatmap for consistent performance
Port the "Copy genes to clipboard" button from master to the markers heatmap sidebar. Adds rclipboard::rclipboardSetup() to the heatmap box UI and wires clipboard_copy_button_ui/server for the markers gene list.
- Pass ns to heatmap_download_handlers to fix "object 'ns' not found" error when rendering the copy genes clipboard button
- Add mod_gene_correlation_server smoke tests to test-module-smoke.R
- Create test-clipboard.R with unit and integration tests for clipboard copy button (UI function, server observer, heatmap integration, gene correlation integration)
Replace anonymous testServer call with a function signature check, since testServer with inline functions doesn't properly initialize the module input context. All 9 non-DAF tests pass.
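A function signature check of the kind described above might look like the sketch below. `example_module_server` and its argument names are hypothetical placeholders, not the actual signature of any MCView module, and base R assertions stand in for whatever testthat expectations the real test uses.

```r
# Hypothetical module server used only to illustrate the check.
example_module_server <- function(id, dataset, globals) {
  invisible(NULL)
}

# Inspect the formal arguments instead of running the module under testServer.
signature_names <- names(formals(example_module_server))
has_expected_signature <- is.function(example_module_server) &&
  identical(signature_names, c("id", "dataset", "globals"))
```

Such a check only confirms that the function exists with the expected arguments; it does not exercise any reactive logic.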
…clean imports
- Refactor DAF contracts for clarity and add missing contract coverage
- Add app_config helpers, update daf_cache and daf_data
- Clean up imports and add gene_mc utility functions
- Consolidate duplicate skip_if_no_daf() from test files into helper-daf.R
- Add roxygen docs for gene correlation module and clipboard exports
…ment
Julia offloading infrastructure:
- Add julia_helpers.R with R wrappers for Julia-accelerated computations
- Add mcview_helpers.jl with EGC cache, correlation, top-gene, and marker gene functions using BLAS and partialsortperm for top-k extraction
- Add calc_marker_genes Julia path (daf_obj param, ~6.7s R bottleneck)
- Add calc_gg_mc_top_cor Julia path (572s R → 21s Julia, 27x speedup)
Test Julia environment fix:
- Add tests/run_tests.sh to activate conda env and set Julia env vars
- Set dafr.JULIA_HOME from CONDA_PREFIX in helper-daf.R before setup_daf()
- Call init_julia_helpers() after successful DAF setup in test helper
- Add julia_helpers_ready() test (skips gracefully without conda env)
Benchmarking:
- Add benchmark suite (benchmark_daf.R, run_benchmarks.sh, compare tool)
- Add OPTIMIZATION_REPORT.md documenting profiling results and decisions
- Fix EGC normalization to use fractions (t(t(mc_mat)/mc_sum)) instead of median-scaled values
- Fix convert_daf_gene_modules to return tibble(gene, module) with NA filtering
- Add session-level EGC matrix caching in get_mc_egc for repeated access
- Add 50-gene threshold for per-gene DAF queries in daf_query_mc_mat
- Batch metadata loading via get_frame in convert_daf_metadata
- Remove unused cache parameter from 6 function signatures and call sites
- Remove redundant axis_entries call in daf_query_named_vector
- Regenerate stale Rd documentation
Introduce two-DAF architecture that chains metacells and cells DAFs via dafr::chain_reader(), enabling direct cell-level pseudobulk analysis.
- Add R/daf_cells.R with 12 functions: composition, DE, QC, pseudobulk
- Update Samples tab with interactive grouping field selector
- Add Group Comparison UI (multi-select Group A/B with aggregate DE)
- Add QC Metrics panel (cells/group, UMIs distribution, Wilson CIs)
- Auto-detect cells DAF at startup (sibling directory pattern)
- Wire cleanup for cells_daf references in mcview_env
- All 333 tests pass, backward compatible via NULL defaults
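For reference, the Wilson score interval mentioned in the QC Metrics bullet can be computed as in the sketch below. `wilson_ci` is a hypothetical helper written for illustration; the source does not show how the QC panel computes its intervals.

```r
# Hypothetical helper: Wilson score interval for a binomial proportion,
# e.g. the fraction of a group's cells that belong to one cell type.
wilson_ci <- function(x, n, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  p <- x / n
  denom <- 1 + z^2 / n
  center <- (p + z^2 / (2 * n)) / denom
  half_width <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  data.frame(estimate = p, lower = center - half_width, upper = center + half_width)
}

ci <- wilson_ci(x = 10, n = 50)
# Base R reports the same interval from prop.test() without continuity correction.
reference <- as.numeric(prop.test(10, 50, correct = FALSE)$conf.int)
```

Unlike the normal-approximation interval, the Wilson interval stays inside [0, 1] for small groups, which is a plausible reason to prefer it for per-group cell counts.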
Replace 28K per-gene DAF round-trips with a single Julia call that does sparse matrix-vector multiplication (mask' * UMIs) for each group. Falls back to per-gene queries when Julia helpers are unavailable.
- Add mcview_compute_pseudobulk() Julia function
- Add julia_compute_pseudobulk() R wrapper
- Update get_group_pseudobulk_mat() and calc_group_diff_expr() to use Julia path
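The mask' * UMIs idea can be shown in plain R with a small dense example. This only illustrates the algebra; the commit performs the computation in Julia on sparse data, and all object names below are made up.

```r
# Toy cells x genes UMI matrix (4 cells, 3 genes).
umis <- matrix(
  c(1, 0, 2,
    0, 3, 1,
    4, 1, 0,
    2, 2, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("g1", "g2", "g3"))
)
group <- c("A", "B", "A", "B")
group_levels <- sort(unique(group))

# cells x groups indicator matrix: 1 where a cell belongs to the group.
mask <- outer(group, group_levels, "==") * 1
colnames(mask) <- group_levels

# t(mask) %*% umis sums the UMIs of every gene within each group at once.
pseudobulk <- crossprod(mask, umis)
```

Here `pseudobulk` is a groups × genes matrix, so a single matrix product replaces one query per gene.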
DAF already has `@ group_field %> Sum` for matrix GroupBy+Reduce, making our custom mcview_compute_pseudobulk() redundant. The single DAF query computes grouped sums entirely in Julia with no custom code. Removes 150 lines across 3 files, replaces with 35 lines using DAF's built-in API. Per-gene R fallback retained only for the cell_types filtered case.
Replace manual R aggregation loops and masks with DAF's built-in GroupBy+Reduce queries and viewer() for cell type filtering:
- get_group_pseudobulk_mat: single DAF query replaces per-gene loop for both filtered and unfiltered cases via make_cell_type_view()
- calc_group_diff_expr: same viewer+query pattern, no per-gene fallback
- get_group_gene_expression: DAF GroupBy query for both paths
- get_group_qc_stats: 3 DAF queries (Count, Sum, Median) replace manual lapply/tapply aggregation
Net reduction: 118 lines (-220/+102). All computation stays in Julia via DAF's query engine.
End-to-end tests using chromote that launch the app in a background process, connect headless Chrome, navigate every tab, and capture screenshots. Covers 21 test blocks with ~90 assertions including DOM element presence, plotly rendering, and interactive controls.
New test file with 16 test blocks exercising UI interactions: Markers heatmap (force_cell_type, lateral/noisy, legends), Diff Expression (MCs/Types mode, hide genes, table toggle), Genes (axis type switch, correlation toggle), QC (ECDF/Density, table toggle), Cell Types (boxplot/violin/sina, coord flip, select/clear all). Also extends helper-browser.R with 9 new interaction utilities (click_radio_button, click_checkbox, etc.).
- Replace N² matrix queries in detect_available_tabs() with 2 targeted checks, cutting tab detection from ~5s to <1ms
- Defer future::plan(multisession) to first use, saving 5.7s at startup
- Add Julia sysimage auto-detection for ~5.5s faster Julia init
- Add R-level memoization for 22 static DAF data types in get_mc_data()
- Eliminate double scatter layer in all 2D projection plots, halving plotly JSON payload
- Remove incorrect bindCache from Annotate tab (incompatible with annotation workflow)
- Add tab guards to defer QC, Genes, Annotate, Projection QC computation
- Add defensive req() guards for missing max_expr column
- Remove redundant rm_plotly_grid() and scalars_set() calls
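R-level memoization of static data, as in the get_mc_data() bullet, can be sketched with an environment-backed cache. `make_memoized_getter` and `slow_loader` are hypothetical names; the real cache keys and loaders are not shown in the source.

```r
# Hypothetical factory: wrap a loader so each key is computed at most once.
make_memoized_getter <- function(loader) {
  cache <- new.env(parent = emptyenv())
  function(key) {
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, loader(key), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

# Stand-in for an expensive DAF fetch; counts how often it actually runs.
calls <- 0L
slow_loader <- function(key) {
  calls <<- calls + 1L
  paste0("data:", key)
}

get_static <- make_memoized_getter(slow_loader)
first <- get_static("mc_sum")
second <- get_static("mc_sum") # served from the cache, loader not called again
```

This only suits data that does not change during a session, which matches the "static" qualifier in the bullet.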
Migrate all DAF query strings to v0.2.0 syntax (@ axis, :: matrix, [ mask ], >> reduce, -/ group >- reduce). Replace deprecated dafr::And() with BeginMask()/EndMask(). Update dafr dependency to >= 0.1.0. Add module-level metadata caching to reduce redundant DAF fetches.
Guard post-hoc rownames/colnames/names assignments with is.null() checks so dafr's atomic name attachment is preserved. Replace double-transpose EGC computation with sweep(). Saves ~260MB per get_matrix() call (99.9% memory reduction).
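Both changes can be illustrated on a toy matrix. The sketch assumes mc_mat is genes × metacells and mc_sum holds per-metacell totals; `attach_names_if_missing` is a hypothetical helper, and the example makes no claim about the memory figures quoted above.

```r
# Toy genes x metacells matrix and per-metacell totals.
mc_mat <- matrix(
  c(2, 6, 2, 8, 4, 8), nrow = 2,
  dimnames = list(c("g1", "g2"), c("mc1", "mc2", "mc3"))
)
mc_sum <- colSums(mc_mat)

# Double-transpose form versus sweep(): both divide each column by its total.
egc_double_transpose <- t(t(mc_mat) / mc_sum)
egc_sweep <- sweep(mc_mat, 2, mc_sum, "/")

# Hypothetical guard: only attach names when the object arrives without them,
# so names already set upstream are left untouched.
attach_names_if_missing <- function(x, nms) {
  if (is.null(names(x))) {
    names(x) <- nms
  }
  x
}

already_named <- attach_names_if_missing(c(a = 1, b = 2), c("x", "y"))
newly_named <- attach_names_if_missing(c(1, 2), c("x", "y"))
```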
jlview integration:
- Use jlview_sweep for EGC normalization (zero-copy division)
- Use jlview_log2p for log2+epsilon transforms across 4 modules
- Use jlview_t for matrix transpose in cache precomputation
- Add jlview to DESCRIPTION Imports
General optimizations:
- get_cell_grouping_fields: read FilesDaf metadata JSON directly instead of loading every cell vector (7.5s → 0.15s, 31x speedup)
- convert_daf_metadata: use individual daf_vec() calls instead of get_frame() for zero-copy columns
- Precompute metacell top genes during init to avoid 3.8s cold computation on first access
- Vectorize chi-squared test in calc_diff_expr (120ms → 5ms)
- Add fast-path cache lookup in get_gene_egc for single genes
Remove 11 as.matrix() calls on jlview ALTREP objects — C code (tgs_cor, matrixStats, TGL_kmeans) reads via REAL() which works directly with ALTREP. Use jlview_fp() for egc_to_fp to avoid 521MB materialization for rowMedians. Saves ~1.5GB peak memory.
Single-gene EGC: use session-cached mc_mat instead of uncached Julia round-trip (23ms → 1ms per gene selection). Metadata: replace left_join with direct column assignment to preserve jlview ALTREP views (4/8 columns stay zero-copy). Remove 3 unnecessary as.numeric() in QC metadata aggregations.
Replace mutate-to-find-top2 pattern with jlview_top2_per_col/row across 3 sites, avoiding as.matrix + transpose + mutation. Use cached mc_mat for 2-metacell DE queries instead of DAF query + sparse→dense conversion.
Pass epsilon directly to jlview_fp in egc_to_fp, avoiding intermediate x+eps materialization. Remove 6 static_vars cache entries that duplicate dafr's new built-in version-counter cache.
Samples:
- Add tab guard to defer computation until first visit
- Share group_composition reactive with bindCache
- Batch 3 QC DAF queries into single Julia call
Markers:
- Cache per-cell-type clustering results for force_cell_type toggle
- Disable slow Julia marker path (45s) in favor of cached R path (5s)
- Optimize mat() and heatmap bindCache keys to avoid unnecessary recomputation on legend/metadata-only changes
- Fix deterministic tiebreaker for stable Shiny cache
Annotate:
- Add bindCache to render_2d_plotly with smart Selected-mode key
- Replace full DataTable re-render with DT proxy replaceData
- inst/scripts/convert_mcview_app.R: standalone CLI to convert old-format MCView apps to DAF (detects layout, transforms config, generates app.R)
- R/convert_project_to_daf.R: fix inner_fold/stdev matrix conversion when gene subset differs from mc_mat (reindex_matrix helper fills missing with 0)
- R/julia_helpers.R: add prewarm_dafr_dispatch() that warms JuliaCall bridge JIT at startup via a tiny FilesDaf, and prewarm_julia_cors() for correlations
- R/app_config.R: integrate both warmups into init_defs() startup path
- dev/create-sysimage.jl: add RCall+Suppressor to sysimage packages and exercise sexp/rcopy for DAF return types (NamedVector, NamedMatrix, etc.)
- R/plot_metadata.R: fix zero-variance correlation crash
- R/utils_selectors.R: remove redundant req() checks
- R/daf_contracts.R: add optional scalars to core contract (mcview_title, mcview_tabs, mcview_excluded_tabs, mcview_light_version, mcview_about_markdown, mcview_cache_in_daf, mcview_cache_daf_root, mcview_available_tabs). These were implicitly used but not documented.
- inst/scripts/convert_mcview_app.R: call store_available_tabs() after conversion so detect_available_tabs() is instant at runtime (0.00s vs 1.3s cold start without pre-stored scalar)
- Fix markers-only checkbox in correlation panel: preserve checkbox state across renderUI re-renders instead of resetting to TRUE (was preventing the toggle from working on Genes, Atlas tabs)
- Fix Samples tab not appearing in sidebar: include optimistically in detect_available_tabs when cell axis exists, and don't override when config$tabs already includes it
- Fix get_cell_grouping_fields: use raw cells DAF instead of chained DAF for vector listing and Julia cardinality checks (returns all 34 fields including embryo instead of just batch_set_id)
- Update contracts and pre-computation stubs for DAF-native caching
- Add per-type marker genes and marker correlation support
Strip all fold diagnostic features except Projected-fold: delete mod_inner_fold.R and mod_stdev_fold.R modules, remove tab definitions, contracts, data conversion functions, QC plots/tables, heatmap modes (Inner/Stdev/Outliers), marker gene selection modes, cache keys, and related tests. Keeps Projected-fold tab and its full infrastructure.