Fix: Fixed sampling bias in cell_boundaries by excluding endpoint #21

Merged
pcamara merged 1 commit into CamaraLab:main from mrjholt:fix-2d-sampling-bias
Dec 10, 2025

mrjholt commented Dec 10, 2025

Summary

This PR fixes a logic error in src/cajal/sample_seg.py: the start and end points of a closed cell boundary were both sampled, which produced a duplicate point.

Issue

skimage.measure.find_contours returns closed loops where the first and last points are identical. The previous logic used np.linspace(..., endpoint=True), which sampled this shared vertex twice.
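
The duplication can be reproduced without any image data. The sketch below is illustrative only: the circular contour, the `n_sample` value, and the index-based use of `np.linspace` are assumptions, since the exact call in sample_seg.py is not quoted in this PR.

```python
import numpy as np

# Hypothetical closed contour: 100 distinct points on a circle, with the first
# point repeated at the end, mimicking the closed loops described above.
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
distinct = np.column_stack([np.cos(theta), np.sin(theta)])
contour = np.vstack([distinct, distinct[:1]])  # shape (101, 2)

n_sample = 10  # assumed number of boundary points to sample

# Assumed shape of the previous logic: evenly spaced indices, endpoint included.
idx_old = np.linspace(0, len(contour) - 1, n_sample, endpoint=True).round().astype(int)
sampled_old = contour[idx_old]

# The first and last sampled rows are the same geometric vertex.
has_duplicate = bool(np.array_equal(sampled_old[0], sampled_old[-1]))
n_distinct = len(np.unique(sampled_old, axis=0))
print(has_duplicate, n_distinct)
```

Under these assumptions, ten samples are requested but only nine distinct points come back, because index 0 and the final index refer to the same vertex.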

Impact

  1. Sampling bias: the duplicate creates a non-uniform distribution of points, with an artificial cluster at the "seam" of the polygon, which skews GW distances.
  2. Runtime crash: because both endpoints are included, the output distance matrices contain a pairwise distance of 0.0. This breaks cajal.utilities.avg_shape: the step_size function returns 0, and a ZeroDivisionError is raised when the code attempts to normalize the matrix (medoid_matrix / step_size).
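
As a rough illustration of the second point, the sketch below shows how a single repeated point puts an off-diagonal zero into a pairwise distance matrix. The helper names `pairwise_distances` and `min_offdiagonal` are hypothetical and are not taken from cajal; how step_size actually derives its value is not shown in this PR.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance between every pair of rows in `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def min_offdiagonal(dist):
    """Smallest distance between two different sample indices."""
    mask = ~np.eye(dist.shape[0], dtype=bool)
    return float(dist[mask].min())

# Eight distinct points on a circle (hypothetical boundary samples).
theta = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
clean = np.column_stack([np.cos(theta), np.sin(theta)])

# Same samples with the first point repeated at the end, as at the "seam".
with_seam = np.vstack([clean, clean[:1]])

min_clean = min_offdiagonal(pairwise_distances(clean))
min_seam = min_offdiagonal(pairwise_distances(with_seam))
print(min_clean, min_seam)
```

Any normalization that divides by a step derived from the smallest inter-point distance would hit a zero denominator for the `with_seam` input, while the `clean` input has a strictly positive minimum.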

The Fix

Updated the np.linspace call to use endpoint=False, treating the boundary as a periodic cycle.
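
A minimal sketch of the effect of endpoint=False, assuming the boundary is sampled by evenly spaced indices into the closed contour. The contour, `n_sample`, and the index arithmetic are illustrative assumptions, not the code from sample_seg.py.

```python
import numpy as np

# Hypothetical closed contour: 100 distinct circle points plus a repeated
# first point at the end.
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
distinct = np.column_stack([np.cos(theta), np.sin(theta)])
contour = np.vstack([distinct, distinct[:1]])

n_sample = 10  # assumed number of boundary points to sample

# With endpoint=False the final index is never selected, so the repeated
# closing vertex is sampled only once (as index 0).
idx_new = np.linspace(0, len(contour) - 1, n_sample, endpoint=False).round().astype(int)
sampled_new = contour[idx_new]

n_distinct = len(np.unique(sampled_new, axis=0))

# Gap between each sample and the next, wrapping from the last back to the first.
gaps = np.linalg.norm(sampled_new - np.roll(sampled_new, -1, axis=0), axis=1)
gaps_uniform = bool(np.allclose(gaps, gaps[0]))
print(n_distinct, gaps_uniform)
```

In this toy case all ten samples are distinct and the wrap-around gap matches the others, which is the periodic behavior the fix aims for.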

Verification

I verified this locally with the dataset that was previously crashing avg_shape_spt. The fix resolves the crash and prevents the division-by-zero error.

Closes #20

codecov bot commented Dec 10, 2025

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

pcamara merged commit 4707d10 into CamaraLab:main on Dec 10, 2025
11 of 12 checks passed

Development

Successfully merging this pull request may close these issues.

Sampling bias in cell_boundaries causing division by zero in avg_shape