- `torch-distributed-gpu-test.py` - this is a `torch.distributed` diagnostics script that checks that all GPUs in the cluster (one or many nodes) can talk to each other and allocate GPU memory.
- `NicerTrace` - this is an improved `trace` Python module with multiple additional flags added to the constructor and more useful output.
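
For context, the standard-library `trace` module that NicerTrace improves on can be driven programmatically roughly as shown below. This is a minimal sketch using only the stdlib API (`trace.Trace`, `runfunc`, `results`); `fib` is a hypothetical workload invented for illustration, and NicerTrace's own additional constructor flags are not shown because they are not described here.

```python
import trace


def fib(n):
    """Hypothetical workload: return the n-th Fibonacci number."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# count=True records how many times each line runs;
# trace=False suppresses the line-by-line printout while running.
tracer = trace.Trace(count=True, trace=False)

# runfunc() calls the function under the tracer and returns its result.
result = tracer.runfunc(fib, 10)

# results().counts maps (filename, lineno) -> number of executions.
counts = tracer.results().counts

# Keep only the lines that belong to the file defining fib().
target_file = fib.__code__.co_filename
hits = {lineno: n for (fname, lineno), n in counts.items() if fname == target_file}

print("fib(10) =", result)
for lineno in sorted(hits):
    print(f"line {lineno}: executed {hits[lineno]} time(s)")
```

Passing `trace=True` instead prints each line as it executes, which is the output style that NicerTrace aims to make more useful.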