Commit b451635

kausv authored and meta-codesync[bot] committed
Enable NCCL Debug info on tests (#3578)
Summary:
Pull Request resolved: #3578

Hoping to get sandcastle debug results for all tests with this change to debug flaky test failures.

Reviewed By: iamzainhuda

Differential Revision: D87943452

fbshipit-source-id: 1ad9cc9bbc9bde9a8c8ba6dafa9586c312dfd4df
1 parent f6dece7 commit b451635

1 file changed: +1 -0 lines changed
torchrec/distributed/test_utils/multi_process.py

Lines changed: 1 addition & 0 deletions
@@ -117,6 +117,7 @@ def setUp(self) -> None:
         os.environ["MASTER_PORT"] = str(get_free_port())
         os.environ["GLOO_DEVICE_TRANSPORT"] = "TCP"
         os.environ["NCCL_SOCKET_IFNAME"] = "lo"
+        os.environ["NCCL_DEBUG"] = "INFO"
 
         torch.use_deterministic_algorithms(True)
         if torch.cuda.is_available():
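
The hunk above assigns `NCCL_DEBUG` directly on `os.environ` in `setUp`. Whether the surrounding test class restores the environment afterwards is not visible in this diff. As a minimal sketch, assuming one wanted the variable scoped to each test and restored afterwards, the standard library's `unittest.mock.patch.dict` could be used. The class name `NcclDebugEnvTestCase` and the helper `run_suite` are hypothetical and are not part of torchrec.

```python
import os
import unittest
from unittest import mock


class NcclDebugEnvTestCase(unittest.TestCase):
    """Hypothetical test case; not the class touched by this commit."""

    def setUp(self) -> None:
        # patch.dict records the current os.environ and restores it on stop(),
        # so the NCCL settings do not leak into later tests in the same process.
        patcher = mock.patch.dict(
            os.environ,
            {"NCCL_SOCKET_IFNAME": "lo", "NCCL_DEBUG": "INFO"},
        )
        patcher.start()
        self.addCleanup(patcher.stop)

    def test_env_is_set(self) -> None:
        # Child processes spawned here would inherit these values.
        self.assertEqual(os.environ["NCCL_DEBUG"], "INFO")
        self.assertEqual(os.environ["NCCL_SOCKET_IFNAME"], "lo")


def run_suite() -> unittest.TestResult:
    """Run the hypothetical test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NcclDebugEnvTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With this pattern the variable is visible inside each test and reverts to its previous value (or absence) once the cleanup runs. The trade-off is that any process started outside the test body would not see `NCCL_DEBUG=INFO`.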