Skip to content

feat(core): integrate TAPI backoff v2 into upload flow (3/3)#1151

Closed
abueide wants to merge 3 commits intomasterfrom
feature/tapi-integration
Closed

feat(core): integrate TAPI backoff v2 into upload flow (3/3)#1151
abueide wants to merge 3 commits intomasterfrom
feature/tapi-integration

Conversation

@abueide
Copy link
Contributor

@abueide abueide commented Mar 6, 2026

Overview

This PR integrates the backoff infrastructure and error classification system into the SegmentDestination upload flow, completing the TAPI backoff v2 implementation.

Related PRs:

SegmentDestination Changes

Upload Gate

Before processing any batches, the upload flow now checks if uploads are allowed based on the rate limit state:

if (this.backoffInitialized && this.uploadStateMachine) {
  const canUpload = await this.uploadStateMachine.canUpload();
  if (!canUpload) {
    return; // Defer upload until rate limit period expires
  }
}

Batch Metadata Lifecycle

Batch metadata is now created lazily to avoid unnecessary persistence overhead:

  1. Before first upload attempt: No metadata exists
  2. On first transient error: Metadata created with createBatch()
  3. On subsequent errors: Metadata updated with handleRetry()
  4. On success: Metadata removed with removeBatch()

Error Handling by Type

Rate Limit (429):
When a 429 response is received, the upload state machine is updated with the retry-after duration and the upload loop halts immediately to prevent wasted API calls.

Transient Errors (5xx, network failures):
Batch metadata is created or updated with exponential backoff timing. The upload loop continues to the next batch, allowing other batches to be attempted.

Permanent Errors (4xx):
The batch is immediately dropped and removed from the queue. No retry is attempted.

Retry Count Header

The implementation now sends an X-Retry-Count header with each upload request, using either the per-batch retry count or the global retry count:

const retryCount = batchRetryCount || globalRetryCount;
await uploadEvents({ writeKey, url, events: batch, retryCount });

Dynamic Module Import

The backoff module is loaded dynamically to avoid circular dependencies:

import('../backoff')
  .then(({ UploadStateMachine, BatchUploadManager }) => {
    // Initialize components
    this.backoffInitialized = true;
    this.settingsResolve();
  });

This approach ensures the settings promise only resolves after backoff components are initialized, preventing race conditions.

API Changes

keepalive Flag

The fetch request now includes the keepalive flag to ensure uploads can complete even if the app is backgrounded or closed:

fetch(url, {
  method: 'POST',
  keepalive: true,
  body: ...
});

X-Retry-Count Header

A new header is included in upload requests to help with debugging and monitoring:

headers: {
  'Content-Type': 'application/json; charset=utf-8',
  'Authorization': authHeader,
  'X-Retry-Count': retryCount.toString()
}

Logger Changes

The logger constructor now defaults to disabled in production environments:

constructor(isDisabled: boolean = process.env.NODE_ENV === 'production')

This prevents unexpected logging output in production builds while maintaining debug capability in development.

Code Quality Improvements

This PR includes several improvements from code review:

  • Removed all debug console.log statements
  • Added error logging to all catch blocks
  • Used appropriate log levels (info for normal operations, error for failures)
  • Simplified upload gate logic for better readability
  • Simplified retry count logic using logical OR operator

Initialization Flow

  1. SegmentDestination receives settings from Settings CDN
  2. Backoff module is dynamically imported
  3. Config validation is applied to rate limit and backoff configs
  4. UploadStateMachine and BatchUploadManager are instantiated
  5. backoffInitialized flag is set to true
  6. Settings promise resolves, allowing uploads to proceed

If initialization fails at any step, uploads proceed without backoff (graceful degradation).

Testing

Manual testing has verified the following scenarios:

  • Successful uploads do not create batch metadata
  • 429 responses trigger rate limiting with proper wait times
  • 5xx responses create batch metadata with exponential backoff
  • App restart validates and drops corrupt metadata
  • All error paths include appropriate logging

Dependencies

This PR depends on #1149 and #1150 being merged first, as it integrates the components added in those PRs.

@abueide
Copy link
Contributor Author

abueide commented Mar 6, 2026

Note: Integration tests for the upload flow will be added in a separate PR to keep the review focused on the implementation.

@abueide abueide marked this pull request as draft March 6, 2026 20:22
@abueide abueide marked this pull request as draft March 6, 2026 20:22
@abueide abueide self-assigned this Mar 6, 2026
Integrates backoff infrastructure and error classification into the
SegmentDestination upload flow. This completes the TAPI backoff v2
implementation.

SegmentDestination changes:
- Upload gate: Check canUpload() before processing batches
- Batch lifecycle: Create metadata on first failure only
- Error handling:
  - 429 rate limit: Call handle429() and halt upload loop
  - Transient errors: Create/update batch metadata for retry
  - Permanent errors: Drop batch immediately
  - Network errors: Treat as transient
- Retry count header: Send X-Retry-Count (batch or global)
- Dynamic import: Lazy load backoff module to avoid circular deps
- Clean initialization: Info-level logs for normal operations
- Error logging: All catch blocks now log errors appropriately

API changes:
- Add keepalive: true to fetch for background completion
- Add retryCount parameter for X-Retry-Count header

Logger changes:
- Default to disabled in production (process.env.NODE_ENV check)

Key improvements from code review:
- No console.log statements
- Simplified upload gate logic
- Proper error logging throughout
- Appropriate log levels (info for normal, error for failures)

This PR depends on PRs #1 (infrastructure) and #2 (error classification).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@abueide abueide force-pushed the feature/tapi-integration branch from 8aab67f to a0f8449 Compare March 6, 2026 20:44
abueide added a commit that referenced this pull request Mar 6, 2026
Adds comprehensive extended test suites for edge cases and implementation
details that supplement the core tests in the feature PRs:

- UploadStateMachine.extended.test.ts: disabled config, multiple retries,
  getter tests, persistence tests
- BatchUploadManager.extended.test.ts: unique IDs, disabled config, getter
  tests, detailed exponential backoff algorithm tests, persistence tests
- SegmentDestination.extended.test.ts: state reset after success, legacy
  behavior, Retry-After header parsing

These tests provide thorough coverage of edge cases, backwards compatibility,
and implementation details without adding bulk to the core feature PRs.

Related to PRs: #1150, #1151, #1152, #1153

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
abueide and others added 2 commits March 6, 2026 15:07
Removed nice-to-have TAPI integration tests:
- resets state after successful upload (happy path detail)
- uses legacy behavior when httpConfig.enabled = false (backwards compat)
- parses Retry-After header correctly (implementation detail)
- uses default retry-after when header missing (edge case)

These tests now live in SegmentDestination.extended.test.ts on the
feature/tapi-extended-tests branch for separate review.

Reduces test file from 945 to 821 lines (-124 lines).

Related: feature/tapi-extended-tests

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Removed all inline comments from SegmentDestination.ts and errors.ts.
Documentation should live in docs, not in code comments.

Reduces SegmentDestination.ts from 409 to 372 lines (-37 lines)
Reduces errors.ts from 123 to 119 lines (-4 lines)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@abueide
Copy link
Contributor Author

abueide commented Mar 6, 2026

Closing this integration PR as we're keeping the implementation PRs separate for easier review. The integration will be done after all component PRs (#1153, #1152, #1150) have been reviewed and merged.

@abueide abueide closed this Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant