The test case 17. SM-modelparallelv2 uses a custom PyTorch binary, `pytorch="2.2.0=sm_py3.10_cuda12.1_cudnn8.9.5_nccl_pt_2.2_tsm_2.3_cuda12.1_0"`, which declares a dependency on `aws-ofi-nccl >=1.7.1,<2.0`. The expectation was that the `aws-ofi-nccl` package would be consumed from the AWS PyTorch conda channel (https://aws-pytorch-doc.com/). Installing the environment now fails with the following solver error:
The following package could not be installed
└─ pytorch ==2.2.0 sm_py3.10_cuda12.1_cudnn8.9.5_nccl_pt_2.2_tsm_2.3_cuda12.1_0 is not installable because it requires
└─ aws-ofi-nccl >=1.7.1,<2.0 , which does not exist (perhaps a missing channel).
The conda channel has been deprecated, as mentioned in the deprecation announcement. The team that built `pytorch="2.2.0=sm_py3.10_cuda12.1_cudnn8.9.5_nccl_pt_2.2_tsm_2.3_cuda12.1_0"` should rebuild this binary and remove the dependency on `aws-ofi-nccl >=1.7.1,<2.0`.
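To illustrate the failure mode described above, here is a minimal Python sketch of why the constraint `>=1.7.1,<2.0` becomes unsatisfiable once the channel that carried `aws-ofi-nccl` is gone. This is not conda's actual solver: `parse_version`, `satisfies`, and `installable` are hypothetical helpers, the channel names and version lists are made up for illustration, and only plain dotted-integer versions are handled.

```python
import operator
import re

# Comparison operators that may appear in a conda-style version constraint.
_OPS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}


def parse_version(text):
    """Turn '1.7.1' into (1, 7, 1); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|!=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](candidate, parse_version(bound)):
            return False
    return True


def installable(spec, channels):
    """List (channel, version) pairs satisfying `spec`; empty means unsolvable."""
    return [
        (name, version)
        for name, versions in channels.items()
        for version in versions
        if satisfies(version, spec)
    ]


SPEC = ">=1.7.1,<2.0"  # constraint declared by the custom pytorch build

# Hypothetical channel contents, for illustration only.
with_channel = {"deprecated-channel": ["1.7.0", "1.7.1", "1.9.1", "2.0.0"]}
without_channel = {"remaining-channel": []}

print(installable(SPEC, with_channel))     # candidates exist
print(installable(SPEC, without_channel))  # no candidates: "does not exist"
```

With the hypothetical deprecated channel configured, two versions fall inside the allowed range. Without it, the candidate list is empty, which corresponds to the "does not exist (perhaps a missing channel)" message in the solver output.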
junpuf changed the title from "17.SM-modelparallelv2 conda script doesn't work" to "17.SM-modelparallelv2 uses pytorch binary that depends on deprecated conda packages" on Oct 11, 2024