Alternative Trend suggested#60
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (5)
detectors/trend.py (3)
362-392: Simplify_can_merge_gapby tracking positions instead of boolean-indexing intoindices.Each invocation does three
signal[indices == X][0]scans (previous end, segment start, previous start) plus agap_maskcomparison over the fullindicesarray. For long signals with many candidate merges this becomes O(n) per merge attempt; it's also easy to misread. If_TrendSegmentcarried itsstart_pos/end_pos(or if this method received positions directly), the three scalar lookups collapse to direct indexing andgap_maskcan be computed from position ranges.Also, the branch
gap_mask = (indices > previous.end_minute) & (indices < segment.start_minute) if not gap_mask.any() and violates_boundary: gap_mask = indices == segment.start_minutetreats
segment.start_minuteas the outlier when no gap exists — but that minute is also the first point of the next segment (and contributes to its delta/score). Worth a brief comment explaining why the outlier check reuses the segment's own start value here; it took a careful read to confirm this is intentional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@detectors/trend.py` around lines 362 - 392, The _can_merge_gap function is doing repeated boolean-indexing into signal via indices (signal[indices == X][0]) and full-array masks (gap_mask) which is O(n) per call; change the _TrendSegment model to carry start_pos and end_pos (or change _can_merge_gap to accept integer positions) and replace those three boolean lookups with direct scalar indexing signal[start_pos], signal[end_pos], signal[segment_start_pos], and compute the gap range with simple integer comparisons (e.g., gap_start = previous.end_pos+1; gap_end = segment.start_pos-1) to build a contiguous slice or count length without scanning indices; keep the existing logic around empty gap handling (the special-case where gap_mask falls back to segment.start_minute) but add a short comment near that branch explaining why the segment's start is considered an outlier when no internal gap exists; update references to indices, gap_mask, previous.start_minute/end_minute, and segment.start_minute to use the new start_pos/end_pos names and ensure max_outliers logic remains the same.
407-430: Overlap check uses inclusive boundaries — touching segments are discarded.The predicate
segment.start_minute <= accepted_segment.end_minute and accepted_segment.start_minute <= segment.end_minutetreats two segments that meet at a single minute (e.g. increasing(0, 5)and decreasing(5, 10)) as overlapping, so the lower-scored one is dropped. If a shared boundary minute should be allowed on both directions (mirroring whatTrendDetector._split_opposite_overlapsdoes with the split point), use strict inequalities (</>). Otherwise, a short note on the boundary behavior would help future readers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@detectors/trend.py` around lines 407 - 430, The overlap check in _resolve_overlapping_directions currently treats touching segments as overlapping due to inclusive comparisons; update the overlap predicate inside _resolve_overlapping_directions (the any(...) lambda that compares segment.start_minute and segment.end_minute against accepted_segment.start_minute/end_minute) to use strict inequalities (< and >) so segments that share a boundary minute are allowed to coexist (consistent with TrendDetector._split_opposite_overlaps behavior), and keep the rest of the sorting/accept logic unchanged.
259-282: Outlier values become the start of the next monotonic run (and inflate its score).When
is_ordered(signal[pos], signal[pos-1])fails, the previous segment is closed atpos - 1and the next segment starts atpos— i.e., the breaking value itself becomes the first point of the subsequent segment. Combined with_append_segmentcomputingdelta = signal[end_pos] - signal[start_pos], the "outlier" contributes to the next segment's score, which for inputs like[1, 2, 3, 2, 4, 5]produces a second segment(3, 5)over values[2, 4, 5]withscore=3rather thanscore=2. After merging via_merge_with_outliers(which sums scores), the final score is2 + 3 = 5, versus net delta5 - 1 = 4.This is likely intentional ("total monotonic movement"), but it's worth either documenting the score semantic explicitly or switching to net delta so
scorematches the magnitude a reader would infer from(start_minute, end_minute, direction). The same accumulation also makes per-direction score comparisons in_resolve_overlapping_directionsfavor merged runs with many steps over a compact opposite-direction run of equal net magnitude.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@detectors/trend.py` around lines 259 - 282, The reported inflation happens because outlier endpoints become starts of subsequent segments and scores are summed in _merge_with_outliers, so change the merge logic to compute merged segment score as the net delta between the merged start and end values (e.g., value_at(end_index) - value_at(start_index)) instead of summing child segment scores; update _merge_with_outliers to rebuild the merged _TrendSegment.score from the merged start/end (using the same signal/indices access used in _append_segment), and adjust any comparisons in _resolve_overlapping_directions to use that net-delta score so merged runs reflect actual net movement rather than accumulated step counts.tests/test_monotonic_trend.py (1)
1-118: Solid coverage of the documented behaviors — a few scenarios worth adding.The suite cleanly validates non-strict monotonic detection, leading/trailing plateau/zero trimming, single-outlier merging in both directions, outlier rejection, flat input, and
min_lengthgating. A few gaps that would be worth covering while this is still fresh:
filter_zeros=True: a sequence like[5, 4, 0, 0, 4, 3, 2]would exercise the gap-handling path in_can_merge_gapwhen the filter removes samples between segments — currently the merge logic treats only surviving samples as outliers, which deserves an explicit test.max_outliers=0: confirms that no merging happens at all when no outliers are tolerated.- Multi-segment merge: three runs separated by two allowed outliers, to ensure the loop in
_merge_with_outlierschains correctly andscoreaccumulates as intended.- Opposite-direction overlap resolution: an input where an increasing and a decreasing segment overlap and the lower-scoring one must be dropped, exercising
_resolve_overlapping_directionsdirectly.- Score assertions on at least one merged case, to pin down the score semantic (sum of per-segment deltas vs. net delta — see comment in
detectors/trend.py).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_monotonic_trend.py` around lines 1 - 118, Add new unit tests to cover missing merge and scoring behaviors: (1) a test using MonotonicTrendDetector(filter_zeros=True) with input like [5,4,0,0,4,3,2] to exercise _can_merge_gap when zeros are removed; (2) a test with max_outliers=0 to assert no merging occurs; (3) a multi-segment merge test (three runs separated by two allowed outliers) to validate _merge_with_outliers chains correctly and score accumulation across segments; (4) a case where increasing and decreasing segments overlap to force _resolve_overlapping_directions to drop the lower-scoring segment; and (5) at least one assertion that verifies the detector's score semantics (e.g., comparing per-segment delta sum vs net delta) for a merged region so the scoring behavior in detectors.trend.MonotonicTrendDetector is pinned down.explorer.py (1)
133-144: Minor: unused loop variablechannel(Ruff B007).Line 135 only uses
channel_detectors. Rename or iteratedetectors.values()to avoid the warning.🛠️ Proposed fix
def _explorer_detectors(channel_config: ChannelConfig) -> dict[str, list[object]]: detectors = {channel: list(channel_detectors) for channel, channel_detectors in channel_config.detectors.items()} - for channel, channel_detectors in detectors.items(): + for channel_detectors in detectors.values(): if any(isinstance(detector, MonotonicTrendDetector) for detector in channel_detectors): continue trend_detector = next((detector for detector in channel_detectors if isinstance(detector, TrendDetector)), None) if trend_detector is None: continue channel_detectors.append(MonotonicTrendDetector(filter_zeros=trend_detector.filter_zeros)) return detectorsAlso worth noting:
self.annotatorstill useschannel_config(and therefore the originalchannel_config.detectors), so the auto-appendedMonotonicTrendDetectoris visible only in the explorer UI overlays/hit-targets — not in annotation output. This matches the PR description ("without changing dataset-wide configuration"), but a one-line docstring on_explorer_detectorsmaking that explicit would help future maintainers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@explorer.py` around lines 133 - 144, The loop in _explorer_detectors currently iterates with two variables (channel, channel_detectors) but only uses channel_detectors, causing Ruff B007; change the loop to iterate over detectors.values() (or rename the unused variable to _) so the warning is removed, and update the function docstring to state that this function augments a copy of channel_config.detectors for explorer overlays only (the original channel_config and self.annotator remain unchanged), referencing _explorer_detectors, channel_detectors, MonotonicTrendDetector, TrendDetector, channel_config, and self.annotator to make intent clear for future maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@detectors/trend.py`:
- Around line 362-392: The _can_merge_gap function is doing repeated
boolean-indexing into signal via indices (signal[indices == X][0]) and
full-array masks (gap_mask) which is O(n) per call; change the _TrendSegment
model to carry start_pos and end_pos (or change _can_merge_gap to accept integer
positions) and replace those three boolean lookups with direct scalar indexing
signal[start_pos], signal[end_pos], signal[segment_start_pos], and compute the
gap range with simple integer comparisons (e.g., gap_start = previous.end_pos+1;
gap_end = segment.start_pos-1) to build a contiguous slice or count length
without scanning indices; keep the existing logic around empty gap handling (the
special-case where gap_mask falls back to segment.start_minute) but add a short
comment near that branch explaining why the segment's start is considered an
outlier when no internal gap exists; update references to indices, gap_mask,
previous.start_minute/end_minute, and segment.start_minute to use the new
start_pos/end_pos names and ensure max_outliers logic remains the same.
- Around line 407-430: The overlap check in _resolve_overlapping_directions
currently treats touching segments as overlapping due to inclusive comparisons;
update the overlap predicate inside _resolve_overlapping_directions (the
any(...) lambda that compares segment.start_minute and segment.end_minute
against accepted_segment.start_minute/end_minute) to use strict inequalities (<
and >) so segments that share a boundary minute are allowed to coexist
(consistent with TrendDetector._split_opposite_overlaps behavior), and keep the
rest of the sorting/accept logic unchanged.
- Around line 259-282: The reported inflation happens because outlier endpoints
become starts of subsequent segments and scores are summed in
_merge_with_outliers, so change the merge logic to compute merged segment score
as the net delta between the merged start and end values (e.g.,
value_at(end_index) - value_at(start_index)) instead of summing child segment
scores; update _merge_with_outliers to rebuild the merged _TrendSegment.score
from the merged start/end (using the same signal/indices access used in
_append_segment), and adjust any comparisons in _resolve_overlapping_directions
to use that net-delta score so merged runs reflect actual net movement rather
than accumulated step counts.
In `@explorer.py`:
- Around line 133-144: The loop in _explorer_detectors currently iterates with
two variables (channel, channel_detectors) but only uses channel_detectors,
causing Ruff B007; change the loop to iterate over detectors.values() (or rename
the unused variable to _) so the warning is removed, and update the function
docstring to state that this function augments a copy of
channel_config.detectors for explorer overlays only (the original channel_config
and self.annotator remain unchanged), referencing _explorer_detectors,
channel_detectors, MonotonicTrendDetector, TrendDetector, channel_config, and
self.annotator to make intent clear for future maintainers.
In `@tests/test_monotonic_trend.py`:
- Around line 1-118: Add new unit tests to cover missing merge and scoring
behaviors: (1) a test using MonotonicTrendDetector(filter_zeros=True) with input
like [5,4,0,0,4,3,2] to exercise _can_merge_gap when zeros are removed; (2) a
test with max_outliers=0 to assert no merging occurs; (3) a multi-segment merge
test (three runs separated by two allowed outliers) to validate
_merge_with_outliers chains correctly and score accumulation across segments;
(4) a case where increasing and decreasing segments overlap to force
_resolve_overlapping_directions to drop the lower-scoring segment; and (5) at
least one assertion that verifies the detector's score semantics (e.g.,
comparing per-segment delta sum vs net delta) for a merged region so the scoring
behavior in detectors.trend.MonotonicTrendDetector is pinned down.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2e5b26be-5236-4226-abc9-19005675a5a7
📒 Files selected for processing (3)
detectors/trend.pyexplorer.pytests/test_monotonic_trend.py
Add alternative monotonic trend detector and explorer overlay (Just a draft, please look through to evaluate)
♻️ Current situation & Problem
This PR adds an alternative trend detection approach for evaluating monotonic trends against the existing regression-based
TrendDetector.The goal is to make trend detection easier to inspect and compare in cases where the current detector may miss or over-generalize short structured rises/falls. In particular, this introduces a detector that:
No linked issue yet.
⚙️ Release Notes
MonotonicTrendDetectoras an alternative trend detector for structural event detection experiments.10after merging.Example behavior:
The explorer now shows:
📚 Documentation
This change is currently documented through implementation structure and detector-specific tests.
The solution adds a second detector in
detectors/trend.pyrather than replacing the existingTrendDetector, which keeps the current pipeline stable while enabling side-by-side validation in the explorer.The explorer was updated to append the monotonic detector locally for channels that already use
TrendDetector, so visual comparison can happen without changing dataset-wide detector configuration or existing structural captions.✅ Testing
Added focused unit tests for the new detector covering:
Validation run:
Result:
11 passedCode of Conduct & Contributing Guidelines
By creating and submitting this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:
If you want, I can also turn this into a shorter, more maintainer-style version.