Is your feature request related to a problem? Please describe.
The littles_mcar_test() function in multivariate_analysis.py currently returns a placeholder {"Hello": "World"}. Missing value mechanisms (MCAR, MAR, MNAR) are critical for choosing the correct imputation strategy, but BioProfileKit provides no actionable guidance on this.
Describe the solution you'd like
Implement the full missing value mechanism pipeline:
- Little's MCAR test: Implement littles_mcar_test() using a scipy chi-squared test on the missing-data pattern matrix. Return chi2, df, p_value, and a mechanism field ("MCAR" if p > 0.05).
- MAR detection: For each column with missing values, test correlation with missingness indicators of other columns using point-biserial correlation (scipy.stats.pointbiserialr). Flag pairs with |r| > 0.3.
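- MNAR heuristic: Apply a two-sample KS test comparing the distribution of an observed column when another column is missing vs. when it is present. A high KS statistic is flagged as possible MNAR; this is only a heuristic, since MNAR cannot be confirmed from observed data alone.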
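As a rough sketch of the first bullet, the function below computes a Little-style statistic per missing-data pattern and looks up the p-value with scipy.stats.chi2. It is a simplification and rests on several assumptions: the input is a pandas DataFrame, only numeric columns are tested, available-case means and pairwise covariances stand in for the EM estimates used by the full test, and the "not MCAR" and "undetermined" labels and the alpha parameter are hypothetical additions (the issue only specifies "MCAR" if p > 0.05).

```python
import numpy as np
import pandas as pd
from scipy import stats


def littles_mcar_test(data: pd.DataFrame, alpha: float = 0.05) -> dict:
    """Simplified Little's MCAR test (sketch, not a reference implementation)."""
    numeric = data.select_dtypes(include="number")
    values = numeric.to_numpy(dtype=float)
    mask = np.isnan(values)                    # missing-data pattern matrix
    n_cols = values.shape[1]

    # Assumption: available-case estimates instead of EM estimates.
    mu = np.nanmean(values, axis=0)
    sigma = numeric.cov().to_numpy()           # pairwise-complete covariance

    # Assign each row to its missing-data pattern.
    _, pattern_ids = np.unique(mask, axis=0, return_inverse=True)
    pattern_ids = np.ravel(pattern_ids)

    chi2, dof = 0.0, 0
    for pid in np.unique(pattern_ids):
        in_pattern = pattern_ids == pid
        observed = ~mask[in_pattern][0]        # columns observed in this pattern
        if not observed.any():
            continue                           # fully missing rows carry no information
        rows = values[in_pattern][:, observed]
        diff = rows.mean(axis=0) - mu[observed]
        sigma_oo = sigma[np.ix_(observed, observed)]
        chi2 += rows.shape[0] * diff @ np.linalg.pinv(sigma_oo) @ diff
        dof += int(observed.sum())
    dof -= n_cols

    if dof <= 0:
        # Only one pattern (e.g. no missing values): the test is not defined.
        return {"chi2": float(chi2), "df": dof,
                "p_value": float("nan"), "mechanism": "undetermined"}

    p_value = float(stats.chi2.sf(chi2, dof))
    mechanism = "MCAR" if p_value > alpha else "not MCAR"
    return {"chi2": float(chi2), "df": dof, "p_value": p_value, "mechanism": mechanism}
```

The returned keys match the ones requested above (chi2, df, p_value, mechanism), so the dict could replace the current placeholder directly.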
Populate mcar_result in MultivariateAnalysis with the full results and display them in the existing MCAR tab in general_statistics.jinja.
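The pairwise MAR and MNAR heuristics from the list above might be sketched as follows. This is only an illustration under stated assumptions: the function name missingness_pair_checks, the min_group guard, and the keys of the returned dicts are hypothetical and not taken from the existing missing_mechanisms structure; only scipy.stats.pointbiserialr, scipy.stats.ks_2samp, and the |r| > 0.3 threshold come from the description.

```python
import numpy as np
import pandas as pd
from scipy import stats


def missingness_pair_checks(data: pd.DataFrame, r_threshold: float = 0.3,
                            min_group: int = 5) -> list:
    """Relate the missingness of one column to the observed values of another."""
    numeric = data.select_dtypes(include="number")
    findings = []
    for target in numeric.columns:
        indicator = numeric[target].isna()     # True where `target` is missing
        if not indicator.any() or indicator.all():
            continue                           # no contrast between groups
        for other in numeric.columns:
            if other == target:
                continue
            observed = numeric[other].notna()
            x = numeric.loc[observed, other]   # observed values of `other`
            m = indicator[observed]            # missingness of `target` on those rows
            # Hypothetical guard: skip tiny groups and constant columns.
            if m.sum() < min_group or (~m).sum() < min_group or x.nunique() < 2:
                continue
            r, r_p = stats.pointbiserialr(m.astype(int), x)
            ks = stats.ks_2samp(x[m], x[~m])   # missing group vs. present group
            findings.append({
                "missing_in": target,
                "compared_with": other,
                "r": float(r),
                "r_p_value": float(r_p),
                "mar_flag": bool(abs(r) > r_threshold),
                "ks_statistic": float(ks.statistic),
                "ks_p_value": float(ks.pvalue),
            })
    return findings
```

Each entry describes one (missing column, compared column) pair. No KS cutoff is applied here because the description does not specify one; the statistic and p-value are returned so that the threshold can be chosen when the template is wired up.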
Describe alternatives you've considered
Using the missingno or pyampute libraries. Both add dependencies; the scipy-based implementation covers the core use cases without additional requirements.
Additional context
The MAR and MNAR heuristics are already partially documented in the existing missing_mechanisms field structure used in the template. This issue closes the gap between the frontend rendering and the missing backend implementation.