Skip to content

Conversation

@LittorWired
Copy link
Contributor

@LittorWired LittorWired commented Sep 17, 2025

Description

Creates an e2e test utility, the TestContext to help with e2e tests debugging by collecting and categorizing WS events.

Type of change

  • Internal refactoring
  • Bug fix (bugfix - non-breaking)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Code snippets

In case of new feature or breaking changes, please include code snippets.

@changeset-bot
Copy link

changeset-bot bot commented Sep 17, 2025

⚠️ No Changeset found

Latest commit: 1be5820

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@claude
Copy link
Contributor

claude bot commented Sep 17, 2025

Claude finished @LittorWired's task —— View job


Code Review Complete ✅

I've reviewed the TestContext for E2E Debugging implementation. Overall, this is a well-structured addition that provides valuable debugging capabilities for e2e tests.

Todo List:

  • Read repository guidelines (CLAUDE.md - not present)
  • Analyze TestContext.ts (new file, 282 lines)
  • Review fixtures.ts changes (40 additions, 2 deletions)
  • Review test-reporter.ts changes (58 additions, 18 deletions)
  • Review utils.ts changes (174 additions, 4 deletions)
  • Provide comprehensive feedback on code quality, bugs, performance, security, and test coverage

Code Quality & Best Practices ⭐

Strengths:

  • Clean separation of concerns with TestContext class handling event tracking
  • Good TypeScript typing throughout with proper interfaces
  • Consistent error handling and logging patterns
  • Well-structured fixture integration with Playwright

Areas for Improvement:

  1. TestContext.ts:260 - Consider making categorizeMethod more robust:

    private categorizeMethod(method: JSONRPCMethod | '') {
      return (method.split('.').at(0) as EventCategory) || 'other'
    }

    This could fail if method is undefined. Consider:

    return ((method || '').split('.').at(0) as EventCategory) || 'other'
  2. fixtures.ts:66 - The startTestContext() function is called immediately but could be confusing. Consider renaming for clarity.

Potential Bugs & Issues 🐛

  1. Memory Leak Risk - TestContext.sdkEvents array grows unbounded. Consider adding a max size limit or cleanup mechanism for long-running tests.

  2. WebSocket Monitoring - In utils.ts:376-414, WebSocket event handlers don't check if testContext is still valid. Could cause issues if context is destroyed before WebSocket cleanup.

  3. JSON Parsing Safety - Multiple places parse JSON without error handling:

    • utils.ts:384, 397: WebSocket frame parsing
    • test-reporter.ts:119: Context attachment parsing

    Consider adding try-catch blocks around these.

Performance Considerations ⚡

  1. Event Filtering Performance - TestContext.getStats() method filters the entire events array multiple times. Consider caching stats or using counters that increment on event add.

  2. Large Event Arrays - The getAllEvents() method returns a copy of the entire events array. For large test suites, this could be memory-intensive.

  3. String Operations - getEventTimeline() performs string operations on all events. Consider lazy evaluation or pagination for large event sets.

Security Assessment 🔒

Low Risk Areas:

  • No external network calls or user input handling
  • Proper use of environment variables for tokens
  • No credential exposure in logged events

Recommendations:

  • Ensure sensitive data in WebSocket payloads doesn't get logged to console
  • Consider sanitizing event payloads before attachment to test reports

Test Coverage Analysis 🧪

Missing Test Coverage:

  1. No unit tests for TestContext class methods
  2. No integration tests for WebSocket monitoring setup
  3. Edge cases not covered (e.g., malformed WebSocket frames, missing test context)

Suggested Tests:

// Example test structure
describe('TestContext', () => {
  it('should categorize SDK events correctly')
  it('should handle malformed payloads gracefully')  
  it('should prevent memory leaks with large event counts')
})

Specific Code Issues 📝

  1. utils.ts:119: createContextDumpText formatting could be improved - uses basic string concatenation instead of template literals
  2. TestContext.ts:198: UUID generation on every event could be expensive - consider simpler ID generation for high-frequency events
  3. test-reporter.ts:34-98: Long if-else chain could be refactored into a map-based approach for better maintainability

Recommendations Summary 📋

High Priority:

  • Add memory management for sdkEvents array
  • Improve error handling around JSON parsing
  • Add null checks for WebSocket monitoring

Medium Priority:

  • Optimize stats calculation performance
  • Add unit tests for core functionality
  • Refactor error detection logic in test reporter

Low Priority:

  • Improve string formatting efficiency
  • Consider event payload sanitization
  • Add configuration options for context behavior

Changeset Notice 📦

The bot correctly identified that this PR needs a changeset. Since this adds new e2e testing utilities, consider adding:

"@sw-internal/e2e-client": patch

Overall Assessment: This is a solid implementation that will significantly improve e2e test debugging capabilities. The main concerns are around memory management and error handling robustness, but these are addressable in follow-up iterations.

@LittorWired
Copy link
Contributor Author

@jpsantosbh , @RafaelPChequer can you please take a look at this branch and let me know your thoughts on how we can better leverage this to improve the e2e debugging experience for the SDK/server events?

This is just a preliminary idea and still needs work to improve the debugging process but I'd like to get early feedback to see if things are on track.

image image

Comment on lines 4 to 12
type EventCategory =
| 'connection'
| 'call'
| 'room'
| 'member'
| 'chat'
| 'verto'
| 'voice'
| 'other'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we should have fewer categories: transport(WS), connection(signalwire.), callSession(others !conversations), conversations(conversations.), tests(artificial events describing the tests)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduced the categories to the one's mentioned transport , connection, callSession , conversations , and tests

}
}

private getCategory(payload: Payload) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could have a concrete implementation for the SDKEvent interface with these utilities

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the feedback, I implemented the SDKEvent as its own class.

await use(maker)
} finally {
// attach test context if failed test
if (testInfo.status !== 'passed' && testInfo.status !== 'skipped') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any drawbacks to always attaching the context, but only rendering the summary in the failure reports? Will the attachments be available to download in GHA?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any drawbacks to always attaching the context, but only rendering the summary in the failure reports?

I am not aware of anything other than potentially having some minor perf issues?

Will the attachments be available to download in GHA?

That should be possible if the trace artifact is uploaded to GH see: https://playwright.dev/docs/ci-intro#setting-up-github-actions

Example

 - uses: actions/upload-artifact@v4
      if: ${{ !cancelled() }}
      with:
        name: playwright-report
        path: playwright-report/
        retention-days: 30

Copy link
Collaborator

@jpsantosbh jpsantosbh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm like this so far...
Is it possible to add an utility to create manual events inside the test implementations.
Also, is it possible ass some context expectation, not to be asserted but just as a contrast in the final report, like: events: received 10 - expected 11

@claude
Copy link
Contributor

claude bot commented Sep 23, 2025

Claude finished @LittorWired's task —— View job


I'm reviewing the TestContext for E2E Debugging implementation. Let me analyze the code changes and provide comprehensive feedback.

Review Tasks

  • Read repository guidelines and project structure
  • Analyze TestContext.ts implementation (new file, 282 lines)
  • Review fixtures.ts changes (40 additions, 2 deletions)
  • Review test-reporter.ts changes (58 additions, 18 deletions)
  • Review utils.ts changes (174 additions, 4 deletions)
  • Provide feedback on code quality, bugs, performance, security, and test coverage

Code Review Complete ✅

I've reviewed the TestContext for E2E Debugging implementation. Overall, this is a well-structured addition that provides valuable debugging capabilities for e2e tests. The implementation follows SignalWire's event-driven architecture patterns and integrates cleanly with the existing Playwright setup.

Code Quality & Best Practices ⭐

Strengths:

  • Clean separation of concerns with TestContext class handling event tracking
  • Good TypeScript typing throughout with proper interfaces
  • Consistent error handling and logging patterns
  • Well-structured fixture integration with Playwright
  • Follows existing codebase patterns (event-driven communication, Redux saga style)

Areas for Improvement:

  1. TestContext.ts:260 - The categorizeMethod function could be more robust:

    private categorizeMethod(method: JSONRPCMethod | '') {
      return (method.split('.').at(0) as EventCategory) || 'other'
    }

    Consider adding null safety:

    return ((method || '').split('.').at(0) as EventCategory) || 'other'
  2. fixtures.ts:66 - The startTestContext() function call could be clearer about when it's invoked.

  3. utils.ts:119 - createContextDumpText uses basic string concatenation; template literals would be more readable.

Potential Bugs & Issues 🐛

High Priority:

  1. Memory Leak Risk - TestContext.sdkEvents array grows unbounded. For long-running tests, this could cause memory issues. Consider adding:

    • Maximum event limit with FIFO cleanup
    • Optional event filtering by importance
  2. WebSocket Monitoring Safety - In utils.ts:384,397, JSON parsing lacks error boundaries:

    const payload = JSON.parse(payloadStr) // Could throw

    While there are try-catch blocks, malformed WebSocket frames could still cause issues.

  3. TestContext.ts:198 - UUID generation on every event could be expensive for high-frequency scenarios.

Medium Priority:

  1. Race Conditions - WebSocket event handlers don't verify if testContext is still valid, could cause issues during cleanup.

  2. Event Categorization - getEventsByType method has potential inconsistencies between boolean-based context checks and category-based filtering.

Performance Considerations ⚡

  1. Stats Calculation - getStats() filters the entire events array 12+ times. Consider:

    // Instead of multiple filters, use a single pass
    private updateStatsCounters(event: SDKEvent) {
      this.stats.totalEvents++
      if (event.direction === 'send') this.stats.sentEvents++
      // ... etc
    }
  2. Large Event Arrays - getAllEvents() returns a deep copy. For large test suites with thousands of events, this could be memory-intensive.

  3. Timeline Generation - getEventTimeline() processes all events each time. Consider caching or lazy evaluation.

Security Assessment 🔒

Low Risk Overall:

  • No external network calls or user input handling in core logic
  • Proper use of environment variables for API tokens
  • No credential exposure in logged events

Recommendations:

  • Consider sanitizing WebSocket payloads before logging to prevent accidental credential leakage
  • Ensure test context attachments don't expose sensitive data in CI environments

Test Coverage Analysis 🧪

Missing Coverage:

  1. Unit Tests - No tests for TestContext class methods
  2. Error Scenarios - Edge cases like malformed payloads, WebSocket disconnections
  3. Integration Tests - End-to-end validation of context capture and reporting

Suggested Test Structure:

describe('TestContext', () => {
  it('should categorize SDK events correctly')
  it('should handle malformed WebSocket payloads gracefully')  
  it('should prevent memory leaks with event limits')
  it('should maintain stats accuracy across different event types')
})

Addressing Previous Review Comments 📝

@jpsantosbh's feedback alignment:

  1. Event Categories - The current implementation has 8 categories. Consider the suggested simplification:

    • transport (WebSocket events)
    • connection (signalwire.* methods)
    • callSession (call., calling. methods)
    • conversations (conversation.* methods)
    • tests (artificial test events)
  2. Utility Functions - Consider creating a concrete SDKEvent implementation with utility methods as suggested.

  3. Always Attach Context - Current implementation only attaches on failure. Consider always attaching but only showing summary on failure.

  4. Manual Events & Expectations - Great suggestions for future enhancements:

    • testContext.addManualEvent(description)
    • testContext.setExpectedEvents({ received: 10, expected: 11 })

Specific Code Issues 📍

  1. test-reporter.ts:34-98 - Long if-else chain for error detection could use a map-based approach
  2. TestContext.ts:101-143 - Multiple array filters in getStats() - performance concern for large datasets
  3. utils.ts:2190 - JSON formatting with 6-space indentation is unusual (consider 2 spaces)

Recommendations Summary 🎯

Immediate (Before Merge):

  • Add memory management for sdkEvents array (max size limit)
  • Improve null safety in categorizeMethod
  • Add try-catch around individual JSON parsing operations

Short Term:

  • Optimize getStats() performance with incremental counters
  • Add unit tests for core TestContext functionality
  • Consider event category simplification per @jpsantosbh's feedback

Future Enhancements:

  • Manual event injection capability
  • Event expectation tracking
  • Configurable event filtering/sampling
  • Integration with existing Redux saga monitoring

Changeset Requirement 📦

The changeset bot correctly identified this needs a changeset. Recommend:

"@sw-internal/e2e-client": patch

Overall Assessment: This is a solid foundation for e2e debugging that will significantly improve developer experience. The main concerns around memory management and performance optimization are addressable in follow-up iterations. The implementation demonstrates good understanding of the SignalWire architecture and testing patterns.


Base automatically changed from tl/cp-15995-refactor-e2e-client-tests-new-util-pattern to main September 25, 2025 23:27
@claude
Copy link
Contributor

claude bot commented Sep 26, 2025

Claude finished @LittorWired's task —— View job


I'll review this TestContext for E2E Debugging implementation and provide comprehensive feedback.

Review Tasks

  • Read repository guidelines and project structure
  • Analyze TestContext.ts implementation (new file, 282 lines)
  • Review fixtures.ts changes (40 additions, 2 deletions)
  • Review test-reporter.ts changes (58 additions, 18 deletions)
  • Review utils.ts changes (174 additions, 4 deletions)
  • Provide feedback on code quality, bugs, performance, security, and test coverage

Code Review Complete ✅

I've reviewed the TestContext for E2E Debugging implementation. Overall, this is a well-structured addition that provides valuable debugging capabilities for e2e tests. The implementation follows SignalWire's event-driven architecture patterns and integrates cleanly with the existing Playwright setup.

Code Quality & Best Practices ⭐

Strengths:

  • Clean separation of concerns with TestContext class handling event tracking
  • Good TypeScript typing throughout with proper interfaces
  • Consistent error handling and logging patterns
  • Well-structured fixture integration with Playwright
  • Follows existing codebase patterns (event-driven communication, Redux saga style)

Areas for Improvement:

  1. TestContext.ts:254 - The categorizeMethod function could be more robust:

    private categorizeMethod(payload: Payload): EventCategory {
      const method = payload.method || ''
      // This could fail if method is undefined despite the || ''
      return (method.split('.').at(0) as EventCategory) || 'other'
    }

    Consider adding null safety:

    return ((method || '').split('.').at(0) as EventCategory) || 'callSession'
  2. fixtures.ts:66 - The startTestContext() function call could be clearer about when it's invoked.

  3. utils.ts:2120 - createContextDumpText uses basic string concatenation; template literals would be more readable.

Potential Bugs & Issues 🐛

High Priority:

  1. Memory Leak Risk - TestContext.sdkEvents array grows unbounded. For long-running tests or test suites, this could cause memory issues. Consider adding:

    • Maximum event limit with FIFO cleanup
    • Optional event filtering by importance
    • Periodic cleanup based on time/size thresholds
  2. WebSocket Monitoring Safety - In utils.ts:384,397, JSON parsing lacks comprehensive error boundaries:

    const payload = JSON.parse(payloadStr) // Could throw for malformed JSON

    While there are try-catch blocks, malformed WebSocket frames could still cause issues.

  3. TestContext.ts:193 - UUID generation on every event could be expensive for high-frequency scenarios. Consider simpler ID generation:

    const id = `${timestamp}-${++this.eventCounter}` // Simple counter instead of UUID

Medium Priority:

  1. Race Conditions - WebSocket event handlers don't verify if testContext is still valid, could cause issues during cleanup.

  2. Event Categorization Inconsistency - The getStats() method has potential inconsistencies between boolean-based context checks and category-based filtering that could lead to confusing statistics.

Performance Considerations ⚡

  1. Stats Calculation Inefficiency - getStats() filters the entire events array 12+ times on each call:

    // Current approach - multiple array iterations
    sentEvents: this.sdkEvents.filter((event) => event.isSent()).length,
    receivedEvents: this.sdkEvents.filter((event) => event.isReceived()).length,

    Consider maintaining counters that increment on event addition:

    private stats = { totalEvents: 0, sentEvents: 0, /* ... */ }
    
    addSDKEvent(payload: Payload, direction: 'send' | 'recv') {
      // ... create event
      this.stats.totalEvents++
      if (direction === 'send') this.stats.sentEvents++
      // ... etc
    }
  2. Large Event Arrays - getAllEvents() returns a deep copy via spread operator. For large test suites with thousands of events, this could be memory-intensive.

  3. Timeline Generation - getEventTimeline() processes all events each time it's called. Consider caching or lazy evaluation for frequently accessed data.

Security Assessment 🔒

Low Risk Overall:

  • No external network calls or user input handling in core logic
  • Proper use of environment variables for API tokens
  • No credential exposure in logged events

Recommendations:

  • Consider sanitizing WebSocket payloads before logging to prevent accidental credential leakage
  • Ensure test context attachments don't expose sensitive data in CI environments
  • Add payload filtering options to exclude sensitive method calls if needed

Test Coverage Analysis 🧪

Missing Coverage:

  1. Unit Tests - No tests for TestContext class methods, event categorization logic, or statistics calculations
  2. Error Scenarios - Edge cases like malformed payloads, WebSocket disconnections during monitoring
  3. Integration Tests - End-to-end validation of context capture and reporting workflow
  4. Performance Tests - Behavior under high event volume scenarios

Suggested Test Structure:

describe('TestContext', () => {
  it('should categorize SDK events correctly')
  it('should handle malformed WebSocket payloads gracefully')  
  it('should prevent memory leaks with event limits')
  it('should maintain accurate statistics across different event types')
  it('should handle concurrent WebSocket connections')
})

Addressing Previous Review Comments 📝

@jpsantosbh's feedback alignment:

  1. Event Categories - Current implementation has 5 categories vs. suggested simplification. The suggestion to reduce to transport, connection, callSession, conversations, tests is solid and would improve clarity.

  2. Concrete SDKEvent Implementation - ✅ Already implemented with utility methods like isCallRelated(), isSent(), etc.

  3. Always Attach Context - Currently only attaches on failure. Consider the suggestion to always attach but show summary only on failures for better debugging visibility.

  4. Manual Events & Expectations - Great suggestions for future enhancements:

    • testContext.addManualEvent(description) for test-specific annotations
    • testContext.setExpectedEvents({ received: 10, expected: 11 }) for validation

Specific Code Issues 📍

  1. test-reporter.ts:34-98 - Long if-else chain for error detection could use a map-based approach for better maintainability
  2. TestContext.ts:148-177 - Multiple array filters in getStats() - performance concern for large datasets
  3. utils.ts:2190 - JSON formatting with 6-space indentation is unusual (consider 2 spaces)
  4. fixtures.ts:91-94 - Test context attachment only on failure might miss valuable debugging info for passing tests

Recommendations Summary 🎯

Immediate (Before Merge):

  • Add memory management for sdkEvents array (max size limit)
  • Improve null safety in categorizeMethod
  • Add comprehensive try-catch around individual JSON parsing operations
  • Consider event category simplification per @jpsantosbh's feedback

Short Term:

  • Optimize getStats() performance with incremental counters
  • Add unit tests for core TestContext functionality
  • Implement event expectation tracking capability
  • Add configuration options for context behavior

Future Enhancements:

  • Manual event injection capability (addManualEvent)
  • Event expectation validation system
  • Configurable event filtering/sampling for performance
  • Integration with existing Redux saga monitoring patterns

Changeset Requirement 📦

The changeset bot correctly identified this needs a changeset. Recommend:

"@sw-internal/e2e-client": patch

Overall Assessment: This is a solid foundation for e2e debugging that will significantly improve developer experience when troubleshooting test failures. The main concerns around memory management and performance optimization are addressable in follow-up iterations. The implementation demonstrates good understanding of the SignalWire architecture and testing patterns.

The screenshots show a clean, informative debug interface that will be very helpful for developers. This feature addresses a real pain point in e2e test debugging and follows good software engineering practices.


@LittorWired LittorWired changed the title WIP: TestContext for E2E Debugging TestContext for E2E Debugging Sep 26, 2025
@LittorWired
Copy link
Contributor Author

I'm like this so far... Is it possible to add an utility to create manual events inside the test implementations. Also, is it possible ass some context expectation, not to be asserted but just as a contrast in the final report, like: events: received 10 - expected 11

Thank you. I think the additional comments will need to be handed over to someone else to complete at this point, as I am directed to transition my work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants