Heterogeneous Operations on CUDA and ROCm Nodes Using UCX/UCC #9985
Unanswered
RafalSiwek
asked this question in
Q&A
-
UCX supports both CUDA and ROCm, so in theory it should support such an environment. However, that scenario has never been tested or optimized.
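For anyone who wants to experiment with this, here is a minimal sketch of how a launcher script might choose a per-node `UCX_TLS` value depending on the GPU vendor. It assumes each node runs a UCX build with the matching GPU support compiled in (CUDA on the NVIDIA nodes, ROCm on the AMD nodes). The transport names (`cuda_copy`, `cuda_ipc`, `rocm_copy`, `rocm_ipc`, `sm`, `tcp`, `self`) are UCX's own. The helper functions, the vendor labels, and the choice of `tcp` as the common inter-node transport are hypothetical and only for illustration. As noted above, the mixed setup is untested, so treat this as a starting point rather than a known-good configuration.

```python
import os

# Assumption: host-memory transports that every node shares, whatever its GPU.
# On an InfiniBand/RoCE fabric you would likely swap "tcp" for an RDMA transport.
COMMON_TLS = ["self", "sm", "tcp"]

# GPU memory transports per vendor; the names are UCX transport names.
GPU_TLS = {
    "cuda": ["cuda_copy", "cuda_ipc"],
    "rocm": ["rocm_copy", "rocm_ipc"],
}


def ucx_tls_for(vendor: str) -> str:
    """Hypothetical helper: build a UCX_TLS string for a node's GPU vendor."""
    if vendor not in GPU_TLS:
        raise ValueError(f"unknown GPU vendor: {vendor!r}")
    return ",".join(COMMON_TLS + GPU_TLS[vendor])


def ucx_env_for(vendor: str, base_env=None) -> dict:
    """Hypothetical helper: copy an environment and set UCX_TLS for this node."""
    env = dict(os.environ if base_env is None else base_env)
    env["UCX_TLS"] = ucx_tls_for(vendor)
    return env


if __name__ == "__main__":
    # Print the value each kind of node would export before launching its rank.
    for vendor in ("cuda", "rocm"):
        print(vendor, "->", ucx_tls_for(vendor))
```

Running `ucx_info -d` on each node should show whether the listed transports are actually available in that node's UCX build.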
-
Hello UCX Team,
I'm working on a high-performance computing project involving nodes with different GPU setups—some nodes with NVIDIA GPUs running CUDA and others with AMD GPUs running ROCm. I am exploring ways to perform efficient MPI operations across these heterogeneous nodes.
Is it possible to use UCX and UCC to facilitate communication and collective operations between nodes with CUDA and ROCm environments? Specifically, can UCX and UCC act as a middleware to bridge the communication between RCCL (for ROCm) and NCCL (for CUDA)? If so, are there any specific configurations or build steps required to enable this interoperability?
Thank you for your guidance and support.