Skip to content

Alternative Trend suggested#60

Draft
KarlDeck wants to merge 3 commits into
mainfrom
KarlDeck/trends_rework
Draft

Alternative Trend suggested#60
KarlDeck wants to merge 3 commits into
mainfrom
KarlDeck/trends_rework

Conversation

@KarlDeck

Copy link
Copy Markdown
Collaborator

Add alternative monotonic trend detector and explorer overlay (Just a draft, please look through to evaluate)

♻️ Current situation & Problem

This PR adds an alternative trend detection approach for evaluating monotonic trends against the existing regression-based TrendDetector.

The goal is to make trend detection easier to inspect and compare in cases where the current detector may miss or over-generalize short structured rises/falls. In particular, this introduces a detector that:

  • finds non-strict monotonic regions
  • merges adjacent regions across a bounded number of controlled outliers
  • trims misleading edge plateaus/zero runs
  • enforces a minimum final trend length after merging

No linked issue yet.

⚙️ Release Notes

  • Add MonotonicTrendDetector as an alternative trend detector for structural event detection experiments.
  • Detect monotonic increasing/decreasing regions while allowing flat steps inside the trend.
  • Merge adjacent monotonic regions when the separating gap contains only a limited number of outliers and those outliers do not exceed the starting trend value in the trend direction.
  • Trim leading/trailing plateau runs so trends include at most one boundary value from those flat regions.
  • Require a minimum final trend length of 10 after merging.
  • Add explorer support to visualize the new detector alongside the existing trend detector with separate overlay colors.

Example behavior:

MonotonicTrendDetector().detect(signal)

The explorer now shows:

  • regression trend overlay
  • monotonic trend overlay
  • separate hit-target navigation for the monotonic detector

📚 Documentation

This change is currently documented through implementation structure and detector-specific tests.

The solution adds a second detector in detectors/trend.py rather than replacing the existing TrendDetector, which keeps the current pipeline stable while enabling side-by-side validation in the explorer.

The explorer was updated to append the monotonic detector locally for channels that already use TrendDetector, so visual comparison can happen without changing dataset-wide detector configuration or existing structural captions.

✅ Testing

Added focused unit tests for the new detector covering:

  • basic non-strict monotonic detection
  • merging across allowed outliers
  • rejection when outliers violate the merge constraint
  • trimming of leading/trailing zero runs
  • trimming of leading/trailing plateaus
  • rejection of flat-only regions
  • enforcement of the default minimum merged trend length
  • acceptance of merged trends reaching the minimum length threshold

Validation run:

conda run -n AML python -m pytest tests/test_monotonic_trend.py

Result:

  • 11 passed

Code of Conduct & Contributing Guidelines

By creating and submitting this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:

If you want, I can also turn this into a shorter, more maintainer-style version.

@coderabbitai

coderabbitai Bot commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 884d9052-ee79-4b88-82c3-fe700cac93de

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch KarlDeck/trends_rework

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (5)
detectors/trend.py (3)

362-392: Simplify _can_merge_gap by tracking positions instead of boolean-indexing into indices.

Each invocation does three signal[indices == X][0] scans (previous end, segment start, previous start) plus a gap_mask comparison over the full indices array. For long signals with many candidate merges this becomes O(n) per merge attempt; it's also easy to misread. If _TrendSegment carried its start_pos/end_pos (or if this method received positions directly), the three scalar lookups collapse to direct indexing and gap_mask can be computed from position ranges.

Also, the branch

gap_mask = (indices > previous.end_minute) & (indices < segment.start_minute)
if not gap_mask.any() and violates_boundary:
    gap_mask = indices == segment.start_minute

treats segment.start_minute as the outlier when no gap exists — but that minute is also the first point of the next segment (and contributes to its delta/score). Worth a brief comment explaining why the outlier check reuses the segment's own start value here; it took a careful read to confirm this is intentional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/trend.py` around lines 362 - 392, The _can_merge_gap function is
doing repeated boolean-indexing into signal via indices (signal[indices ==
X][0]) and full-array masks (gap_mask) which is O(n) per call; change the
_TrendSegment model to carry start_pos and end_pos (or change _can_merge_gap to
accept integer positions) and replace those three boolean lookups with direct
scalar indexing signal[start_pos], signal[end_pos], signal[segment_start_pos],
and compute the gap range with simple integer comparisons (e.g., gap_start =
previous.end_pos+1; gap_end = segment.start_pos-1) to build a contiguous slice
or count length without scanning indices; keep the existing logic around empty
gap handling (the special-case where gap_mask falls back to
segment.start_minute) but add a short comment near that branch explaining why
the segment's start is considered an outlier when no internal gap exists; update
references to indices, gap_mask, previous.start_minute/end_minute, and
segment.start_minute to use the new start_pos/end_pos names and ensure
max_outliers logic remains the same.

407-430: Overlap check uses inclusive boundaries — touching segments are discarded.

The predicate segment.start_minute <= accepted_segment.end_minute and accepted_segment.start_minute <= segment.end_minute treats two segments that meet at a single minute (e.g. increasing (0, 5) and decreasing (5, 10)) as overlapping, so the lower-scored one is dropped. If a shared boundary minute should be allowed on both directions (mirroring what TrendDetector._split_opposite_overlaps does with the split point), use strict inequalities (< / >). Otherwise, a short note on the boundary behavior would help future readers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/trend.py` around lines 407 - 430, The overlap check in
_resolve_overlapping_directions currently treats touching segments as
overlapping due to inclusive comparisons; update the overlap predicate inside
_resolve_overlapping_directions (the any(...) lambda that compares
segment.start_minute and segment.end_minute against
accepted_segment.start_minute/end_minute) to use strict inequalities (< and >)
so segments that share a boundary minute are allowed to coexist (consistent with
TrendDetector._split_opposite_overlaps behavior), and keep the rest of the
sorting/accept logic unchanged.

259-282: Outlier values become the start of the next monotonic run (and inflate its score).

When is_ordered(signal[pos], signal[pos-1]) fails, the previous segment is closed at pos - 1 and the next segment starts at pos — i.e., the breaking value itself becomes the first point of the subsequent segment. Combined with _append_segment computing delta = signal[end_pos] - signal[start_pos], the "outlier" contributes to the next segment's score, which for inputs like [1, 2, 3, 2, 4, 5] produces a second segment (3, 5) over values [2, 4, 5] with score=3 rather than score=2. After merging via _merge_with_outliers (which sums scores), the final score is 2 + 3 = 5, versus net delta 5 - 1 = 4.

This is likely intentional ("total monotonic movement"), but it's worth either documenting the score semantic explicitly or switching to net delta so score matches the magnitude a reader would infer from (start_minute, end_minute, direction). The same accumulation also makes per-direction score comparisons in _resolve_overlapping_directions favor merged runs with many steps over a compact opposite-direction run of equal net magnitude.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@detectors/trend.py` around lines 259 - 282, The reported inflation happens
because outlier endpoints become starts of subsequent segments and scores are
summed in _merge_with_outliers, so change the merge logic to compute merged
segment score as the net delta between the merged start and end values (e.g.,
value_at(end_index) - value_at(start_index)) instead of summing child segment
scores; update _merge_with_outliers to rebuild the merged _TrendSegment.score
from the merged start/end (using the same signal/indices access used in
_append_segment), and adjust any comparisons in _resolve_overlapping_directions
to use that net-delta score so merged runs reflect actual net movement rather
than accumulated step counts.
tests/test_monotonic_trend.py (1)

1-118: Solid coverage of the documented behaviors — a few scenarios worth adding.

The suite cleanly validates non-strict monotonic detection, leading/trailing plateau/zero trimming, single-outlier merging in both directions, outlier rejection, flat input, and min_length gating. A few gaps that would be worth covering while this is still fresh:

  • filter_zeros=True: a sequence like [5, 4, 0, 0, 4, 3, 2] would exercise the gap-handling path in _can_merge_gap when the filter removes samples between segments — currently the merge logic treats only surviving samples as outliers, which deserves an explicit test.
  • max_outliers=0: confirms that no merging happens at all when no outliers are tolerated.
  • Multi-segment merge: three runs separated by two allowed outliers, to ensure the loop in _merge_with_outliers chains correctly and score accumulates as intended.
  • Opposite-direction overlap resolution: an input where an increasing and a decreasing segment overlap and the lower-scoring one must be dropped, exercising _resolve_overlapping_directions directly.
  • Score assertions on at least one merged case, to pin down the score semantic (sum of per-segment deltas vs. net delta — see comment in detectors/trend.py).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_monotonic_trend.py` around lines 1 - 118, Add new unit tests to
cover missing merge and scoring behaviors: (1) a test using
MonotonicTrendDetector(filter_zeros=True) with input like [5,4,0,0,4,3,2] to
exercise _can_merge_gap when zeros are removed; (2) a test with max_outliers=0
to assert no merging occurs; (3) a multi-segment merge test (three runs
separated by two allowed outliers) to validate _merge_with_outliers chains
correctly and score accumulation across segments; (4) a case where increasing
and decreasing segments overlap to force _resolve_overlapping_directions to drop
the lower-scoring segment; and (5) at least one assertion that verifies the
detector's score semantics (e.g., comparing per-segment delta sum vs net delta)
for a merged region so the scoring behavior in
detectors.trend.MonotonicTrendDetector is pinned down.
explorer.py (1)

133-144: Minor: unused loop variable channel (Ruff B007).

Line 135 only uses channel_detectors. Rename or iterate detectors.values() to avoid the warning.

🛠️ Proposed fix
 def _explorer_detectors(channel_config: ChannelConfig) -> dict[str, list[object]]:
     detectors = {channel: list(channel_detectors) for channel, channel_detectors in channel_config.detectors.items()}
-    for channel, channel_detectors in detectors.items():
+    for channel_detectors in detectors.values():
         if any(isinstance(detector, MonotonicTrendDetector) for detector in channel_detectors):
             continue

         trend_detector = next((detector for detector in channel_detectors if isinstance(detector, TrendDetector)), None)
         if trend_detector is None:
             continue

         channel_detectors.append(MonotonicTrendDetector(filter_zeros=trend_detector.filter_zeros))
     return detectors

Also worth noting: self.annotator still uses channel_config (and therefore the original channel_config.detectors), so the auto-appended MonotonicTrendDetector is visible only in the explorer UI overlays/hit-targets — not in annotation output. This matches the PR description ("without changing dataset-wide configuration"), but a one-line docstring on _explorer_detectors making that explicit would help future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@explorer.py` around lines 133 - 144, The loop in _explorer_detectors
currently iterates with two variables (channel, channel_detectors) but only uses
channel_detectors, causing Ruff B007; change the loop to iterate over
detectors.values() (or rename the unused variable to _) so the warning is
removed, and update the function docstring to state that this function augments
a copy of channel_config.detectors for explorer overlays only (the original
channel_config and self.annotator remain unchanged), referencing
_explorer_detectors, channel_detectors, MonotonicTrendDetector, TrendDetector,
channel_config, and self.annotator to make intent clear for future maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@detectors/trend.py`:
- Around line 362-392: The _can_merge_gap function is doing repeated
boolean-indexing into signal via indices (signal[indices == X][0]) and
full-array masks (gap_mask) which is O(n) per call; change the _TrendSegment
model to carry start_pos and end_pos (or change _can_merge_gap to accept integer
positions) and replace those three boolean lookups with direct scalar indexing
signal[start_pos], signal[end_pos], signal[segment_start_pos], and compute the
gap range with simple integer comparisons (e.g., gap_start = previous.end_pos+1;
gap_end = segment.start_pos-1) to build a contiguous slice or count length
without scanning indices; keep the existing logic around empty gap handling (the
special-case where gap_mask falls back to segment.start_minute) but add a short
comment near that branch explaining why the segment's start is considered an
outlier when no internal gap exists; update references to indices, gap_mask,
previous.start_minute/end_minute, and segment.start_minute to use the new
start_pos/end_pos names and ensure max_outliers logic remains the same.
- Around line 407-430: The overlap check in _resolve_overlapping_directions
currently treats touching segments as overlapping due to inclusive comparisons;
update the overlap predicate inside _resolve_overlapping_directions (the
any(...) lambda that compares segment.start_minute and segment.end_minute
against accepted_segment.start_minute/end_minute) to use strict inequalities (<
and >) so segments that share a boundary minute are allowed to coexist
(consistent with TrendDetector._split_opposite_overlaps behavior), and keep the
rest of the sorting/accept logic unchanged.
- Around line 259-282: The reported inflation happens because outlier endpoints
become starts of subsequent segments and scores are summed in
_merge_with_outliers, so change the merge logic to compute merged segment score
as the net delta between the merged start and end values (e.g.,
value_at(end_index) - value_at(start_index)) instead of summing child segment
scores; update _merge_with_outliers to rebuild the merged _TrendSegment.score
from the merged start/end (using the same signal/indices access used in
_append_segment), and adjust any comparisons in _resolve_overlapping_directions
to use that net-delta score so merged runs reflect actual net movement rather
than accumulated step counts.

In `@explorer.py`:
- Around line 133-144: The loop in _explorer_detectors currently iterates with
two variables (channel, channel_detectors) but only uses channel_detectors,
causing Ruff B007; change the loop to iterate over detectors.values() (or rename
the unused variable to _) so the warning is removed, and update the function
docstring to state that this function augments a copy of
channel_config.detectors for explorer overlays only (the original channel_config
and self.annotator remain unchanged), referencing _explorer_detectors,
channel_detectors, MonotonicTrendDetector, TrendDetector, channel_config, and
self.annotator to make intent clear for future maintainers.

In `@tests/test_monotonic_trend.py`:
- Around line 1-118: Add new unit tests to cover missing merge and scoring
behaviors: (1) a test using MonotonicTrendDetector(filter_zeros=True) with input
like [5,4,0,0,4,3,2] to exercise _can_merge_gap when zeros are removed; (2) a
test with max_outliers=0 to assert no merging occurs; (3) a multi-segment merge
test (three runs separated by two allowed outliers) to validate
_merge_with_outliers chains correctly and score accumulation across segments;
(4) a case where increasing and decreasing segments overlap to force
_resolve_overlapping_directions to drop the lower-scoring segment; and (5) at
least one assertion that verifies the detector's score semantics (e.g.,
comparing per-segment delta sum vs net delta) for a merged region so the scoring
behavior in detectors.trend.MonotonicTrendDetector is pinned down.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2e5b26be-5236-4226-abc9-19005675a5a7

📥 Commits

Reviewing files that changed from the base of the PR and between 4e856b2 and 20527eb.

📒 Files selected for processing (3)
  • detectors/trend.py
  • explorer.py
  • tests/test_monotonic_trend.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant