Skip to content

Conversation

@DTCurrie
Copy link
Member

@DTCurrie DTCurrie commented Oct 31, 2025

The dial timeout is not stored or cleared. We also set the successful flag asynchronously. This creates a potential race condition where the timeout could be evoked before successful is updated. This change stores the dial timeout and ensures it is cleared after a connection attempt is made. This should avoid the potential termination of already successful connections.

This PR also introduces some unit tests around dialWebRTC. The tests rely heavily on mocks, but they at least add some layer of assertion that the code behaves as expected. Ideally, as more tests are added, additional fixtures can be added to harden our tests.

Finally, just a little cleanup to avoid ignoring eslint rules.

Note: This is not a permanent fix, but more of a band-aid. Ultimately, this logic is frail. Mixing timeouts with the results of promises can lead to unexpected behavior. We should consider refactoring this code path to avoid that.

@DTCurrie DTCurrie self-assigned this Oct 31, 2025
@DTCurrie DTCurrie requested a review from a team as a code owner October 31, 2025 21:03
@DTCurrie DTCurrie requested review from lia-viam and njooma October 31, 2025 21:03
@DTCurrie DTCurrie changed the title Clear dial timeout to terminate signal exchange timeout on error/success Clear dial timeout to terminate signal exchange on error/success Oct 31, 2025
@DTCurrie DTCurrie changed the title Clear dial timeout to terminate signal exchange on error/success RSDK-12206 Clear dial timeout to terminate signal exchange on error/success Nov 5, 2025
@DTCurrie DTCurrie merged commit a571f70 into main Nov 6, 2025
3 checks passed
@DTCurrie DTCurrie deleted the fix-connection-timeout-race-condition branch November 6, 2025 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants