Skip to content

Conversation

@aclerc
Copy link
Contributor

@aclerc aclerc commented Oct 15, 2025

I noticed that the auto northing process can be a bit funny when the north correction is near 180deg. I realized the root cause might be that ruptures is not using circular math. In this PR I added a custom cost function for ruptures. Unfortunately I believe this slows down autonorthing a bit, but since overall it's gone from extremely slow to still extremely slow I think that's acceptable for now.

While I was in this code I took the opportunity to improve a few more things:

  • refined the auto northing scoring a bit
  • fixed up missing circular math in a few places, including the northing code itself, which means some test case answers change

I have re-run the smarteole example notebook, adding a print of the wind-up version to the top so it's easy to see if it needs a refresh

@aclerc aclerc marked this pull request as ready for review October 16, 2025 10:43
@aclerc aclerc requested a review from samuelwnaylor October 16, 2025 10:43
@aclerc aclerc marked this pull request as draft October 16, 2025 19:42
@aclerc aclerc marked this pull request as ready for review October 17, 2025 12:23
samuelwnaylor
samuelwnaylor previously approved these changes Oct 20, 2025
Copy link
Collaborator

@samuelwnaylor samuelwnaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one thinking of the performance of the functions 👍
Thanks for adding useful comments 🙂

I've left some optional comments.

return np.mod(angle1 - angle2 + 180, 360) - 180


def circ_median(angles: npt.NDArray, axis: int | None = None, *, range_360: bool = True) -> float | npt.NDArray:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very much optional comment, but a naming suggestion to perhaps use unsigned vs signed,

i.e.

the existing...

def circ_median(angles: npt.NDArray, axis: int | None = None, *, range_360: bool = True) -> float | npt.NDArray:
    ...

could become...

def circ_median(angles: npt.NDArray, axis: int | None = None, *, return_range_format: Literal["unsigned", "signed"] = "unsigned") -> float | npt.NDArray:
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to me it's much less obvious what "signed" and "unsigned" would mean compared to "range_360". I think "range_360" isn't great though so open to changing it to something better

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense, it's a tricky one to name well 😆



@pytest.mark.parametrize(("angles", "expected", "range_360"), test_circ_median_data)
def test_circ_median(angles: list, *, expected: float | None, range_360: bool) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional comment...

These tests relating to circ_median could be grouped into a class for readability. I could also be tempted to separate out the tests for those that have range_360=False into it's own method for readability too.

e.g.

class TestCircMedian:
    
    @staticmethod
    @pytest.mark.parametrize(
        ("angles", "expected"), 
        [
            ([0], 0),
            ([0, 20], 10),
            ...
        ]
    )
    def test_range360(angles: list, *, expected: float | None) -> None:
        ...


    @staticmethod
    @pytest.mark.parametrize(
        ("angles", "expected"), 
        [
            ([170, 180, 190], 180),
            ([350, 0, 10], 0),
            ([-10, 0, 10], 0),
        ]
    )
    def test_not_range360(angles: list, *, expected: float | None) -> None:
        ...

    @staticmethod
    def test_with_series() -> None:
        """Test that pandas Series work correctly."""
        angles = pd.Series([350, 0, 10, 5])
        result = circ_median(angles)
        assert isinstance(result, (float, np.floating))
        assert result == pytest.approx(2.5)

    @staticmethod
    def test_with_some_nan() -> None:
        ...

    @staticmethod
    def test_with_all_nan() -> None:
        ...

    @staticmethod
    def test_groupby() -> None:
        ...

    ...

    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the only advantage is readability then I'd leave them as they are, putting them in a class doesn't seem more readable to me. Granted as static methods you don't need to check for class context (I guess) but you do get extra indent by doing this, minor but it does matter for readability

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main advantages in this scenario are...

  • organization and grouping of tests logically, making the test suite easier to navigate and understand (here different aspects of the same feature are being tested, so grouping is convenient to a reader)
  • simple to run all tests in the class, pytest -k TestCircMedian (or in your IDE)

raise rpt.exceptions.NotEnoughPoints

if self.signal is None:
msg = "CostCircularL1 not fitted with signal data"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could inform the user to call .fit() method?

@aclerc aclerc merged commit f73198b into main Oct 20, 2025
4 checks passed
@aclerc aclerc deleted the improve-auto-northing branch October 20, 2025 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants