fix: BatchResult completion reason logic by yaythomas · Pull Request #286 · aws/aws-durable-execution-sdk-python

yaythomas · 2026-01-30T01:52:45Z

Description of changes:
Fix BatchResult.from_items() to match JavaScript SDK behavior. The previous implementation incorrectly checked conditions in the wrong order, causing incorrect completion reasons for concurrent operations with failure tolerance configurations.

The checkpointed completion reason is preserved during replay, so existing executions are unaffected. Code with conditional logic based on completion_reason might see different values after this update.

Example: With 3 items (1 success, 2 failures) and tolerated_failure_count=1:

Before: ALL_COMPLETED (incorrect - all items finished)
After: FAILURE_TOLERANCE_EXCEEDED (correct - tolerance breached)

Changes:

Extract completion reason logic to _get_completion_reason() method
Check failure tolerance BEFORE checking if all completed
Implement proper fail-fast when no completion config provided
Add comprehensive unit tests covering all edge cases

This ensures tolerance checks take precedence over success criteria, preventing operations from incorrectly reporting ALL_COMPLETED when failure tolerance has been exceeded.

new behaviour

Different completion_reason values for map/parallel operations in these scenarios:

No completion config with failures - Now returns FAILURE_TOLERANCE_EXCEEDED (previously ALL_COMPLETED)
Empty completion config with failures - Now returns FAILURE_TOLERANCE_EXCEEDED (previously ALL_COMPLETED)
Tolerance exceeded with all completed - Now returns FAILURE_TOLERANCE_EXCEEDED (previously ALL_COMPLETED)
Both min_successful and tolerance met - Now returns FAILURE_TOLERANCE_EXCEEDED (previously MIN_SUCCESSFUL_REACHED)

reference logic

https://github.com/aws/aws-durable-execution-sdk-js/blob/main/packages/aws-durable-execution-sdk-js/src/handlers/concurrent-execution-handler/concurrent-execution-handler.ts#L41-L75

  private getCompletionReason<T, R>(
    failureCount: number,
    successCount: number,
    completedCount: number,
    items: ConcurrentExecutionItem<T>[],
    config: ConcurrencyConfig<R>,
  ): "ALL_COMPLETED" | "MIN_SUCCESSFUL_REACHED" | "FAILURE_TOLERANCE_EXCEEDED" {
    // Check tolerance first, before checking if all completed
    const completion = config.completionConfig;

    // Handle fail-fast behavior (no completion config or empty completion config)
    if (!completion) {
      if (failureCount > 0) return "FAILURE_TOLERANCE_EXCEEDED";
    } else {
      const hasAnyCompletionCriteria = Object.values(completion).some(
        (value) => value !== undefined,
      );
      if (!hasAnyCompletionCriteria) {
        if (failureCount > 0) return "FAILURE_TOLERANCE_EXCEEDED";
      } else {
        // Check specific tolerance thresholds
        if (
          completion.toleratedFailureCount !== undefined &&
          failureCount > completion.toleratedFailureCount
        ) {
          return "FAILURE_TOLERANCE_EXCEEDED";
        }
        if (completion.toleratedFailurePercentage !== undefined) {
          const failurePercentage = (failureCount / items.length) * 100;
          if (failurePercentage > completion.toleratedFailurePercentage) {
            return "FAILURE_TOLERANCE_EXCEEDED";
          }
        }
      }
    }

Issue #, if available:
closes #280

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Fix BatchResult.from_items() to match JavaScript SDK behavior. The previous implementation incorrectly checked conditions in the wrong order, causing incorrect completion reasons for concurrent operations with failure tolerance configurations. The checkpointed completion reason is preserved during replay, so existing executions are unaffected. Code with conditional logic based on completion_reason might see different values after this update. Example: With 3 items (1 success, 2 failures) and tolerated_failure_count=1: - Before: ALL_COMPLETED (incorrect - all items finished) - After: FAILURE_TOLERANCE_EXCEEDED (correct - tolerance breached) Changes: - Extract completion reason logic to _get_completion_reason() method - Check failure tolerance BEFORE checking if all completed - Implement proper fail-fast when no completion config provided - Add comprehensive unit tests covering all edge cases This ensures tolerance checks take precedence over success criteria, preventing operations from incorrectly reporting ALL_COMPLETED when failure tolerance has been exceeded. closes #280

src/aws_durable_execution_sdk_python/concurrency/models.py

yaythomas requested a review from ParidelPooya January 30, 2026 02:07

yaythomas force-pushed the completion-reason branch from 068f6f3 to 497bee2 Compare February 3, 2026 22:50

ParidelPooya approved these changes Feb 4, 2026

View reviewed changes

dhegberg reviewed Feb 4, 2026

View reviewed changes

src/aws_durable_execution_sdk_python/concurrency/models.py Show resolved Hide resolved

yaythomas merged commit e5824f1 into main Feb 6, 2026
14 of 15 checks passed

yaythomas deleted the completion-reason branch February 6, 2026 17:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: BatchResult completion reason logic#286

fix: BatchResult completion reason logic#286
yaythomas merged 1 commit intomainfrom
completion-reason

yaythomas commented Jan 30, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

yaythomas commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

new behaviour

reference logic

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

yaythomas commented Jan 30, 2026 •

edited

Loading