Currently, each video frame is independently evaluated against the TensorFlow model, which can lead to unstable detections due to motion blur, lighting changes, or transient model errors.
This issue proposes temporal smoothing: maintain a rolling buffer of detection history and use it to stabilize dice recognition.
What needs to be done:
- Maintain a rolling frame history buffer (10-15 frames at 2 FPS capture rate ≈ 5-7.5 seconds of history)
- Track dice across frames using spatial proximity (Euclidean distance between bounding-box centers)
- Implement confidence voting logic: if a die at position (x, y) is detected as "4" in 10+ out of 15 frames, but shows as "5" in 1-2 frames due to noise, lock in the "4" result
- Reduce false positives by requiring temporal consistency rather than single-frame confidence
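
A minimal sketch of the voting step, assuming a 15-frame window and a 10-vote threshold as in the example above. The `FaceHistory` class, the `DieFace` type, and the parameter names are hypothetical and not part of the existing codebase. One instance would be kept per tracked die.

```typescript
// Hypothetical sketch: class, type, and parameter names are illustrative assumptions.
type DieFace = 1 | 2 | 3 | 4 | 5 | 6;

const ALL_FACES: DieFace[] = [1, 2, 3, 4, 5, 6];

class FaceHistory {
  private readonly faces: DieFace[] = [];

  constructor(
    private readonly capacity: number = 15, // rolling window size, in frames
    private readonly minVotes: number = 10, // votes needed to lock in a face
  ) {}

  // Record the face seen in the newest frame, dropping the oldest when full.
  push(face: DieFace): void {
    this.faces.push(face);
    if (this.faces.length > this.capacity) {
      this.faces.shift();
    }
  }

  // Return the most frequent face if it has at least minVotes, else null.
  stableFace(): DieFace | null {
    const counts = [0, 0, 0, 0, 0, 0, 0]; // indices 1..6 hold the vote counts
    for (const f of this.faces) {
      counts[f] += 1;
    }
    let best: DieFace = 1;
    for (const face of ALL_FACES) {
      if (counts[face] > counts[best]) {
        best = face;
      }
    }
    return counts[best] >= this.minVotes ? best : null;
  }
}
```

Returning `null` until the threshold is met means a die that has just entered the frame reports no value rather than an unconfirmed one. Whether that is the right behavior for the UI is a design choice left open here.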
Technical approach:
- Leverage the existing DiceDetection interface (which already tracks x, y, width, height, confidence)
- Add a time-series buffer to the frame processor state
- Implement spatial matching using Euclidean distance, sqrt((x2-x1)² + (y2-y1)²), to track die identity across frames
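
The matching step could be sketched roughly as follows. The `Box` interface is a stand-in for the existing DiceDetection interface and carries only the fields named above. This sketch assumes `x` and `y` are the top-left corner of the box, which may not match the real interface. `centerDistance`, `matchDetections`, and `maxDistance` are hypothetical names. Greedy nearest-neighbor matching is used for brevity and is not guaranteed to find the globally best assignment.

```typescript
// Stand-in for the existing DiceDetection interface (fields taken from the issue text).
interface Box {
  x: number; // assumed: left edge of the bounding box
  y: number; // assumed: top edge of the bounding box
  width: number;
  height: number;
  confidence: number;
}

// Euclidean distance between the centers of two boxes.
function centerDistance(a: Box, b: Box): number {
  const dx = (b.x + b.width / 2) - (a.x + a.width / 2);
  const dy = (b.y + b.height / 2) - (a.y + a.height / 2);
  return Math.sqrt(dx * dx + dy * dy);
}

// For each box in `previous`, return the index of the closest unclaimed box
// in `current` within `maxDistance`, or null if none qualifies.
function matchDetections(
  previous: Box[],
  current: Box[],
  maxDistance: number,
): (number | null)[] {
  const claimed: boolean[] = current.map(() => false);
  const result: (number | null)[] = [];
  for (const prev of previous) {
    let bestIndex: number | null = null;
    let bestDist = maxDistance;
    for (let i = 0; i < current.length; i++) {
      if (claimed[i]) continue;
      const d = centerDistance(prev, current[i]);
      if (d <= bestDist) {
        bestDist = d;
        bestIndex = i;
      }
    }
    if (bestIndex !== null) {
      claimed[bestIndex] = true;
    }
    result.push(bestIndex);
  }
  return result;
}
```

A `null` entry means the die was not seen in the new frame. Detections in `current` that no previous die claimed can be treated as newly appeared dice.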
Acceptance Criteria