[BUG] Fix zero-distance instability in Hidalgo (#3068) #3115
What does this implement/fix? Explain your changes.
This PR resolves a numerical instability in the Hidalgo segmenter that occurs when two or more rows in the input data are identical or extremely close. In such cases, the nearest-neighbor search returns r1 = 0 for some points, and the original implementation computed mu = r2 / r1. This produced infinite values for mu, which then propagated into downstream parameters (such as b1), eventually causing a crash in the Gibbs sampler (sample_d) due to invalid likelihood calculations.
To fix this, a small numerical epsilon (1e-12) is introduced when computing mu, ensuring that the denominator is never zero. This preserves normal behavior for valid datasets while preventing the zero-division crash that triggered the issue. This approach follows @TonyBagnall's guidance from the issue discussion. A local regression test confirms that the real dataset from issue #3068 now runs without errors.
Does your contribution introduce a new dependency?
No new dependencies.
Any other comments?
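For reviewers who want to see the failure mode in isolation, here is a minimal NumPy sketch of the zero-distance problem and an epsilon guard. It is not the code from this PR: the helper names (`nearest_two_distances`, `mu_unguarded`, `mu_guarded`) are hypothetical, the brute-force neighbor search is only for illustration, and clamping the denominator with `np.maximum` is an assumption about how the 1e-12 epsilon might be applied.

```python
import numpy as np

EPS = 1e-12  # epsilon value quoted in the PR description


def nearest_two_distances(X):
    """Return (r1, r2): distances to each point's two nearest neighbours.

    Brute-force pairwise search, used here only for illustration.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself
    two_smallest = np.sort(dist, axis=1)[:, :2]
    return two_smallest[:, 0], two_smallest[:, 1]


def mu_unguarded(r1, r2):
    """Plain ratio r2 / r1; yields inf (or nan) wherever r1 == 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return r2 / r1


def mu_guarded(r1, r2, eps=EPS):
    """Ratio with the denominator clamped to at least eps (assumed form of the guard)."""
    return r2 / np.maximum(r1, eps)


# Rows 0 and 1 are identical, so r1 == 0 for both of them.
X = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
r1, r2 = nearest_two_distances(X)

print(mu_unguarded(r1, r2))  # first two entries are inf
print(mu_guarded(r1, r2))    # all entries are finite
```

For rows whose nearest neighbour is at a non-negligible distance, the guarded and unguarded ratios coincide, which matches the claim that valid datasets behave as before.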