fix(calls): harden video join without pairwise WebRTC churn#2310
Conversation
Cherry-pick safe call improvements from #2307 onto production baseline: query camera permission without a pre-enter getUserMedia probe, reuse applyConstraints for voice preset instead of opening a second mic stream, preserve camera tracks still in `new` state after enter(), add ICE transport diagnostics, and only show voice waves when a live audio track exists. Excludes pairwise hangup/republish loops that broke preview video while production calls already work with server TURN fixed. Co-authored-by: Cursor <cursoragent@cursor.com>
WalkthroughAdds camera permission detection, richer ICE transport telemetry and diagnostics, media track readiness fixes, screenshare capture-root adjustments, and a screenshare takeover flow with UI, tests, and i18n updates. ChangesWebRTC Diagnostics and Call Quality Improvements
Sequence Diagram(s)sequenceDiagram
participant Hook as use-space-group-call
participant PermCheck as isLocalCameraPermissionDenied
participant Nav as navigator.permissions
participant GroupCall as GroupCall
Hook->>PermCheck: Check camera permission
PermCheck->>Nav: Query camera permission state
Nav-->>PermCheck: Return permission state
alt Permission Denied
PermCheck-->>Hook: true
Hook->>Hook: Set cameraAccessBlocked
Hook->>Hook: Force initWithVideoMuted
else Permission Granted or Prompt
PermCheck-->>Hook: false
Hook->>GroupCall: Enter with normal state
end
sequenceDiagram
participant Scheduler as attachGroupCallWebRtcDiagnostics
participant Logger as logIceTransportForPeerConnection
participant Reader as readIceTransportSummary
participant Stats as RTCPeerConnection.getStats
participant Telemetry as logSpaceGroupCallEvent
Scheduler->>Logger: Per interval, each peer connection
Logger->>Reader: Read ICE transport state
Reader->>Stats: Get peer connection stats
Stats-->>Reader: RTCStatsReport
Reader-->>Logger: IceTransportSummary
Logger->>Telemetry: Emit hypha.group_call.ice_transport
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
🔗 Custom preview URL: https://pr-2310.preview-app.hypha.earth |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/core/src/matrix/client/hooks/__tests__/call-camera-access.test.ts`:
- Around line 8-32: The tests for isLocalCameraPermissionDenied mutate the
global navigator.permissions and never restore it; capture the original
navigator.permissions reference before mutating (e.g., in the describe block or
beforeAll) and add an afterEach (or afterAll) that restores it using
Object.defineProperty or deletes the property if it was originally undefined so
other tests don't see the modified value; apply this to the tests that mock
navigator.permissions in the describe('isLocalCameraPermissionDenied') block so
each test resets global state.
In `@packages/core/src/matrix/client/hooks/group-call-webrtc-diagnostics.ts`:
- Around line 386-390: The interval currently calls logFrameSizes() and
logIceTransport() separately causing two getStats() calls per peer per tick;
change the implementation to collect a single RTCPeerConnection.getStats()
snapshot per peer/tick and pass that snapshot into both logFrameSizes and
logIceTransport (or refactor those helpers to accept a stats parameter) so each
peer uses the same time-aligned stats object for both diagnostics and avoids
duplicate getStats() calls.
- Around line 239-240: peerConnection.getStats() can reject during teardown and
currently runs as fire-and-forget, so wrap the call in a local try/catch inside
the function that calls it (the block invoking peerConnection.getStats() and
readIceTransportSummary) to prevent unhandled rejections; specifically, surround
the await peerConnection.getStats() call with try/catch, log or silently ignore
the error (using the existing diagnostics logger) and return early so
readIceTransportSummary is only called on success.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0bf5905d-b741-4d7a-a184-98d0178c0d0a
📒 Files selected for processing (6)
packages/core/src/matrix/client/hooks/__tests__/call-camera-access.test.tspackages/core/src/matrix/client/hooks/call-camera-access.tspackages/core/src/matrix/client/hooks/group-call-webrtc-diagnostics.tspackages/core/src/matrix/client/hooks/space-group-call-telemetry.tspackages/core/src/matrix/client/hooks/use-space-group-call.tspackages/epics/src/common/human-chat-panel/human-chat-panel-call-stage.tsx
| describe('isLocalCameraPermissionDenied', () => { | ||
| it('returns true when the Permissions API reports denied', async () => { | ||
| Object.defineProperty(navigator, 'permissions', { | ||
| configurable: true, | ||
| value: { | ||
| query: vi.fn().mockResolvedValue({ state: 'denied' }), | ||
| }, | ||
| }); | ||
| await expect(isLocalCameraPermissionDenied()).resolves.toBe(true); | ||
| }); | ||
|
|
||
| it('returns false when permission is granted or prompt', async () => { | ||
| Object.defineProperty(navigator, 'permissions', { | ||
| configurable: true, | ||
| value: { | ||
| query: vi | ||
| .fn() | ||
| .mockResolvedValueOnce({ state: 'granted' }) | ||
| .mockResolvedValueOnce({ state: 'prompt' }), | ||
| }, | ||
| }); | ||
| await expect(isLocalCameraPermissionDenied()).resolves.toBe(false); | ||
| await expect(isLocalCameraPermissionDenied()).resolves.toBe(false); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Restore navigator.permissions after each test to prevent global-state bleed.
These tests overwrite a global but never reset it, which can cause order-dependent failures as this file grows.
Suggested fix
-import { describe, expect, it, vi } from 'vitest';
+import { afterEach, describe, expect, it, vi } from 'vitest';
...
+const originalPermissions = navigator.permissions;
+
+afterEach(() => {
+ Object.defineProperty(navigator, 'permissions', {
+ configurable: true,
+ value: originalPermissions,
+ });
+});
+
describe('isLocalCameraPermissionDenied', () => {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/matrix/client/hooks/__tests__/call-camera-access.test.ts`
around lines 8 - 32, The tests for isLocalCameraPermissionDenied mutate the
global navigator.permissions and never restore it; capture the original
navigator.permissions reference before mutating (e.g., in the describe block or
beforeAll) and add an afterEach (or afterAll) that restores it using
Object.defineProperty or deletes the property if it was originally undefined so
other tests don't see the modified value; apply this to the tests that mock
navigator.permissions in the describe('isLocalCameraPermissionDenied') block so
each test resets global state.
| const stats = await peerConnection.getStats(); | ||
| const transport = readIceTransportSummary(peerConnection, stats); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Can RTCPeerConnection.getStats() reject after a connection is closed or during teardown, and what error behavior is documented across major browsers?
💡 Result:
The behavior of RTCPeerConnection.getStats when called on a closed PeerConnection varies by browser implementation, despite the W3C WebRTC Statistics API specification intending for stats to remain available even after a connection is closed [1][2]. According to the W3C specification, monitored objects should generally persist for the duration of the associated RTCPeerConnection, allowing their statistics to be accessed via getStats even after the connection has been closed [2]. However, implementation has historically been inconsistent across major browsers [1][3][4][5]. In Firefox, historically, calling getStats on a closed RTCPeerConnection resulted in a promise rejection with an InvalidStateError or similar DOMException [3][4]. Recent updates (circa April 2026) have moved toward allowing getStats on closed connections, mirroring the intended specification behavior and avoiding the previous rejection/hanging behavior [1][6][5]. In Chromium-based browsers, the behavior generally allows calling getStats on a closed connection, which typically returns a resolution with the final state of the statistics, rather than an error [5]. Due to these historical inconsistencies, many WebRTC applications use polyfills (such as those found in the adapter.js library) to normalize this behavior across browsers, often by guarding the call or manually resolving a promise with an empty map if the connection state is 'closed' [4]. Summary of error behavior: - Specification: Intends for stats to be accessible after closing; getStats should not reject solely because the connection is closed [2]. - Chromium: Generally allows the call and returns the final available statistics [5]. - Firefox: Historically rejected with an InvalidStateError, but recent versions have been updated to support calling getStats after closure without throwing [1][4][6]. If your application relies on consistent behavior, check the connectionState of the RTCPeerConnection before invoking getStats and handle the state accordingly [4].
Citations:
- 1: https://bugzilla.mozilla.org/show_bug.cgi?id=1056433
- 2: https://www.w3.org/TR/webrtc-stats/
- 3: https://bugzilla.mozilla.org/show_bug.cgi?id=1773067
- 4: Firefox guard getStats() shim against closed PeerConnection webrtcHacks/adapter#1179
- 5: https://phabricator.services.mozilla.com/D293467
- 6: mozilla-firefox/firefox@3878b06
Handle RTCPeerConnection.getStats() rejections in fire-and-forget logging
RTCPeerConnection.getStats() has historically been inconsistent across browsers when the connection is closed/tearing down (e.g., Firefox has rejected with InvalidStateError in past versions). Since callers fire-and-forget with void, a rejection can surface as unhandled/noisy async errors—wrap the getStats() call with a local try/catch.
Proposed fix
async function logIceTransportForPeerConnection(options: {
roomId: string;
groupCallId: string;
userId: string | null;
peerConnection: RTCPeerConnection;
}): Promise<void> {
const { roomId, groupCallId, userId, peerConnection } = options;
- const stats = await peerConnection.getStats();
- const transport = readIceTransportSummary(peerConnection, stats);
- logSpaceGroupCallEvent({
- name: 'hypha.group_call.ice_transport',
- roomId,
- groupCallId,
- remoteUserId: userId ?? undefined,
- iceConnectionState: transport.iceConnectionState,
- iceGatherState: transport.iceGatheringState,
- connectionState: transport.connectionState,
- localCandidateType: transport.localCandidateType,
- remoteCandidateType: transport.remoteCandidateType,
- pairState: transport.pairState,
- inboundAudioBytes: transport.inboundAudioBytes,
- inboundVideoBytes: transport.inboundVideoBytes,
- });
+ try {
+ const stats = await peerConnection.getStats();
+ const transport = readIceTransportSummary(peerConnection, stats);
+ logSpaceGroupCallEvent({
+ name: 'hypha.group_call.ice_transport',
+ roomId,
+ groupCallId,
+ remoteUserId: userId ?? undefined,
+ iceConnectionState: transport.iceConnectionState,
+ iceGatherState: transport.iceGatheringState,
+ connectionState: transport.connectionState,
+ localCandidateType: transport.localCandidateType,
+ remoteCandidateType: transport.remoteCandidateType,
+ pairState: transport.pairState,
+ inboundAudioBytes: transport.inboundAudioBytes,
+ inboundVideoBytes: transport.inboundVideoBytes,
+ });
+ } catch {
+ // Peer connection may close between interval ticks; ignore transient diagnostics failure.
+ }
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/matrix/client/hooks/group-call-webrtc-diagnostics.ts`
around lines 239 - 240, peerConnection.getStats() can reject during teardown and
currently runs as fire-and-forget, so wrap the call in a local try/catch inside
the function that calls it (the block invoking peerConnection.getStats() and
readIceTransportSummary) to prevent unhandled rejections; specifically, surround
the await peerConnection.getStats() call with try/catch, log or silently ignore
the error (using the existing diagnostics logger) and return early so
readIceTransportSummary is only called on success.
| logIceTransport(); | ||
| frameLogInterval = setInterval(() => { | ||
| logFrameSizes(); | ||
| logIceTransport(); | ||
| }, inboundRtpFrameLogIntervalMs); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Avoid double getStats() calls per peer per interval tick.
In the support-debug interval path, logFrameSizes() and logIceTransport() each call getStats() for the same peer. Reusing one stats snapshot per peer/tick would cut diagnostics overhead and keep measurements time-aligned.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/matrix/client/hooks/group-call-webrtc-diagnostics.ts`
around lines 386 - 390, The interval currently calls logFrameSizes() and
logIceTransport() separately causing two getStats() calls per peer per tick;
change the implementation to collect a single RTCPeerConnection.getStats()
snapshot per peer/tick and pass that snapshot into both logFrameSizes and
logIceTransport (or refactor those helpers to accept a stats parameter) so each
peer uses the same time-aligned stats object for both diagnostics and avoids
duplicate getStats() calls.
Crop tab capture to the main content column so call UI is not mirrored, and wire the takeover request/approve flow so a second participant can share. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@packages/epics/src/common/human-chat-panel/__tests__/call-regression-manual-gates.test.ts`:
- Around line 128-130: The current assertions check for exact spacing in the
string "sendScreenshareTakeoverEvent(\n 'request'" which is
brittle; update the two expectations that reference that exact formatting (the
assertions against the variable source in the test file) to be
whitespace-tolerant by either using a RegExp that ignores whitespace (e.g.,
/\bsendScreenshareTakeoverEvent\(\s*'request'/) or normalizing source before
asserting (e.g., collapse whitespace with .replace(/\s+/g,' ') and assert the
normalized string contains "sendScreenshareTakeoverEvent('request'"). Apply the
same change to both occurrences of the assertion in the file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 195df230-33d2-4a39-a2b7-e47c6df36d9e
📒 Files selected for processing (18)
packages/core/src/matrix/client/hooks/__tests__/screenshare-capture-exclusion.test.tspackages/core/src/matrix/client/hooks/__tests__/screenshare-capture.test.tspackages/core/src/matrix/client/hooks/index.tspackages/core/src/matrix/client/hooks/screenshare-capture-exclusion.tspackages/core/src/matrix/client/hooks/screenshare-capture.tspackages/core/src/matrix/client/hooks/use-space-group-call.tspackages/epics/src/common/global-call-dock-overlay.tsxpackages/epics/src/common/human-chat-panel/__tests__/call-regression-manual-gates.test.tspackages/epics/src/common/human-chat-panel/human-chat-panel-call-screenshare-menu.tsxpackages/epics/src/common/human-chat-panel/use-screenshare-tab-audio-prompt.tsxpackages/epics/src/common/human-right-panel.tsxpackages/epics/src/common/panel-main-column-scroll-bridge.tsxpackages/epics/src/common/panel-scroll-inset.tsxpackages/i18n/src/messages/de.jsonpackages/i18n/src/messages/en.jsonpackages/i18n/src/messages/es.jsonpackages/i18n/src/messages/fr.jsonpackages/i18n/src/messages/pt.json
| expect(source).toContain( | ||
| "sendScreenshareTakeoverEvent(\n 'request'", | ||
| ); |
There was a problem hiding this comment.
Make source-scan assertions whitespace-tolerant to avoid false CI failures.
Line 129 and Line 146 assert exact formatting/spacing of source text. Non-functional reformatting can break this test even when takeover behavior is unchanged.
Suggested update
- expect(source).toContain(
- "sendScreenshareTakeoverEvent(\n 'request'",
- );
+ expect(source).toMatch(/sendScreenshareTakeoverEvent\(\s*'request'/);
- expect(menu).toContain('const shareStartDisabled = disabled;');
+ expect(menu).toMatch(/\bshareStartDisabled\s*=\s*disabled\b/);Also applies to: 146-146
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@packages/epics/src/common/human-chat-panel/__tests__/call-regression-manual-gates.test.ts`
around lines 128 - 130, The current assertions check for exact spacing in the
string "sendScreenshareTakeoverEvent(\n 'request'" which is
brittle; update the two expectations that reference that exact formatting (the
assertions against the variable source in the test file) to be
whitespace-tolerant by either using a RegExp that ignores whitespace (e.g.,
/\bsendScreenshareTakeoverEvent\(\s*'request'/) or normalizing source before
asserting (e.g., collapse whitespace with .replace(/\s+/g,' ') and assert the
normalized string contains "sendScreenshareTakeoverEvent('request'"). Apply the
same change to both occurrences of the assertion in the file.
fix(calls): apply CodeRabbit diagnostics and test hygiene from #2310
Summary
mainafter server TURN fix).navigator.permissionsbeforeenter()instead of pre-openinggetUserMedia.applyConstraintson the existing mic track (no second capture stream) and preserves camera tracks still innewstate afterenter().hypha.group_call.ice_transportdiagnostics (ICE state, candidate types, inbound RTP bytes) for support sessions.Explicitly excluded from #2307: pairwise hangup/republish loops (
restartAllPairwiseCallsForVideo,schedulePairwiseVideoResync,CallsChangedrepublish) — those broke preview video while production worked.Context
Production calls work; PR #2307 added ~2,200 lines of pairwise WebRTC repair logic that destroyed ICE sessions mid-negotiation. This PR keeps production’s SDK-managed pairwise path and lands only the join-quality fixes.
Test plan
pairwise_restartspam; optionalice_transportshowsinboundVideoBytesgrowingrequestLocalCameraAccessoncepnpm --filter @hypha-platform/core exec vitest run src/matrix/client/hooks/__tests__/call-camera-access.test.tsMade with Cursor
Summary by CodeRabbit
Bug Fixes
New Features
Tests