Conversation

Copilot AI commented Oct 15, 2025

Plan to Fix DTensor isinstance Check Issue

  • Add use_dtensor_execute attribute to FusionDefinitionWrapper class
  • Determine use_dtensor_execute value at trace construction time in create_fusion_definition_wrapper
  • Update FusionDefinitionWrapper.__call__ to use use_dtensor_execute instead of runtime isinstance check
  • Fix potential NameError when torch.distributed is not available
  • Add tests for the changes
  • Fix test to only run for nvfuser executor

Summary

Successfully moved the DTensor type check from runtime to trace construction time in the nvFuser executor. The changes ensure that the dispatch to DTensor-based execution is determined by the proxy types in the trace rather than by checking runtime arguments.

Key Changes

  1. Added use_dtensor_execute field to FusionDefinitionWrapper (line 537):

    • Boolean field that determines at initialization whether DTensor execution path should be used
    • Defaults to False for backward compatibility
  2. Compute use_dtensor_execute at trace construction (line 623):

    • In create_fusion_definition_wrapper, check whether any of the sorted_unique_inputs is a DTensorProxy
    • This check happens once during fusion definition creation, not on every call
  3. Use pre-computed flag instead of runtime check (line 556):

    • Changed from: if dist.is_available() and any(isinstance(t, torch.distributed.tensor.DTensor) for t in args)
    • Changed to: if self.use_dtensor_execute
    • Eliminates the need to check every argument at runtime (see the sketch after this list)
  4. Fixed guard for distributed availability (line 548):

    • Changed from: if dist.is_available()
    • Changed to: if IS_TORCH_DISTRIBUTED_AVAILABLE
    • Prevents potential NameError when torch.distributed is not available
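
The following is a minimal, self-contained sketch of the shape of these changes, not the actual implementation in Thunder's nvFuser executor: the real FusionDefinitionWrapper and create_fusion_definition_wrapper carry many more fields and arguments, and the execute callables below are stand-ins.

    from dataclasses import dataclass
    from typing import Any, Callable, Sequence


    class DTensorProxy:
        """Stand-in for Thunder's DTensorProxy; only here to keep the sketch self-contained."""


    @dataclass
    class FusionDefinitionWrapper:
        # Greatly simplified: the real wrapper also holds the nvFuser fusion definition, caches, etc.
        execute: Callable[[Sequence[Any]], Any]
        execute_with_dtensors: Callable[[Sequence[Any]], Any]
        # Decided once at trace construction time; defaults to False for backward compatibility.
        use_dtensor_execute: bool = False

        def __call__(self, args: Sequence[Any]) -> Any:
            # Before: dist.is_available() and any(isinstance(t, DTensor) for t in args)
            # After: a single pre-computed flag, so no per-argument isinstance check at runtime
            if self.use_dtensor_execute:
                return self.execute_with_dtensors(args)
            return self.execute(args)


    def create_fusion_definition_wrapper(sorted_unique_inputs, execute, execute_with_dtensors):
        # The dispatch decision is made once, from the proxy types recorded in the trace
        use_dtensor_execute = any(isinstance(p, DTensorProxy) for p in sorted_unique_inputs)
        return FusionDefinitionWrapper(execute, execute_with_dtensors, use_dtensor_execute)

Computing the flag once keeps the hot path to a single attribute check, and because the decision comes from the trace's proxy types it matches what the trace was specialized for rather than whatever happens to be passed at call time.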

Tests Added

  1. test_use_dtensor_execute_flag in test_nvfuser.py:

    • Tests that use_dtensor_execute is False for regular tensor inputs
    • Verifies the flag is accessible from the FusionDefinitionWrapper
  2. test_use_dtensor_execute_flag_with_dtensor in test_dtensor.py:

    • Tests that use_dtensor_execute is True for DTensor inputs
    • Ensures the flag correctly reflects DTensor usage
    • Updated: now skips for non-nvFuser executors to avoid false assertion failures (see the simplified illustration after this list)
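
Continuing the simplified sketch after the Key Changes list above (an illustration of the flag's intended behavior, not the actual Thunder tests), the construction-time decision can be exercised like this:

    # Regular (non-DTensor) proxies leave the flag False, so the regular path runs.
    wrapper = create_fusion_definition_wrapper(
        sorted_unique_inputs=["t0", "t1"],  # plain placeholders, no DTensorProxy
        execute=lambda args: "regular path",
        execute_with_dtensors=lambda args: "dtensor path",
    )
    assert wrapper.use_dtensor_execute is False
    assert wrapper(["a", "b"]) == "regular path"

    # A DTensorProxy among the trace inputs flips the flag at construction time.
    wrapper = create_fusion_definition_wrapper(
        sorted_unique_inputs=[DTensorProxy()],
        execute=lambda args: "regular path",
        execute_with_dtensors=lambda args: "dtensor path",
    )
    assert wrapper.use_dtensor_execute is True
    assert wrapper([object()]) == "dtensor path"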

Benefits

  • Performance: Eliminates runtime isinstance checks on all fusion inputs
  • Correctness: Dispatch decision based on trace types, not runtime types
  • Maintainability: More explicit about when the dispatch decision is made
  • Robustness: Fixed potential NameError when distributed is not available
Original prompt

This section details the original issue you should resolve

<issue_title>Do not perform DTensor isinstance check for all nvFuser inputs at runtime</issue_title>
<issue_description>## 🐛 Bug

Current nvFuser executor in Thunder checks every argument at runtime if it's DTensor or not. This check should be performed at trace construction time because the dispatch to DTensor-based execution or not should be done solely based on the types recorded in the trace.

Place where DTensor check is performed:

if dist.is_available() and any(isinstance(t, torch.distributed.tensor.DTensor) for t in args):

This would involve updating the construction of FusionDefinitionWrapper to include a new attribute "use_dtensor_execute" (or some better name) to control the dispatch.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #2651



Copilot AI changed the title [WIP] Fix DTensor isinstance check for nvFuser inputs Move DTensor isinstance check from runtime to trace construction time in nvFuser executor Oct 15, 2025
Copilot AI requested a review from IvanYashchuk October 15, 2025 14:44
@IvanYashchuk

@copilot, the changes look good. Please add a test for this change.


Copilot AI commented Oct 15, 2025

@copilot, the changes look good. Please add a test for this change.

Added tests in commit 2c67caa:

  1. test_use_dtensor_execute_flag in test_nvfuser.py - verifies the flag is False for regular tensors
  2. test_use_dtensor_execute_flag_with_dtensor in test_dtensor.py - verifies the flag is True for DTensor inputs

Both tests check that the use_dtensor_execute attribute is correctly set at trace construction time based on the input proxy types.


@IvanYashchuk IvanYashchuk left a comment


That's exactly what I wanted.

@IvanYashchuk IvanYashchuk marked this pull request as ready for review October 15, 2025 15:18
@IvanYashchuk

@kshitij12345, do you agree with the change?



@common_utils.parametrize("executor", tuple(executors_map.keys()))
def test_use_dtensor_execute_flag_with_dtensor(self, executor):

@kshitij12345 kshitij12345 Oct 17, 2025


@copilot, this test should run only for the nvfuser executor, otherwise assert len(fusions) > 0 will be false

Collaborator


@copilot, apply the suggestion above.

Contributor Author


Applied the suggestion in commit 0ed6e23. The test now skips for non-nvfuser executors to avoid assertion failures when there are no fusions.

Copilot AI requested a review from IvanYashchuk October 21, 2025 14:58
@IvanYashchuk

@copilot test test are you active?
