Fix torch.compile error for PyTorch v2.3 #5463

Merged: 1 commit, Apr 25, 2024

@tohtana (Contributor) commented Apr 25, 2024

PyTorch v2.3 throws an error when `torch.compile` tries to compile `iter_params`, which is used for ZeRO3.
This PR excludes that function from the compilation targets.

After this PR is merged, we can unpin the torch version for unit tests (#5459).
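
For readers unfamiliar with the approach, here is a minimal sketch of how a function can be kept out of `torch.compile` tracing. It assumes `torch.compiler.disable` (available in PyTorch 2.1 and later) and is not the actual diff of this PR; `exclude_from_compile` and `iter_params_sketch` are hypothetical names used only for illustration, and the function body is a stand-in rather than DeepSpeed's code.

```python
from typing import Callable, Iterator

try:
    import torch
    # torch.compiler.disable exists in PyTorch 2.1+; older builds fall back to None.
    _disable = getattr(getattr(torch, "compiler", None), "disable", None)
except ImportError:
    # Keep the sketch importable on machines without PyTorch.
    _disable = None


def exclude_from_compile(func: Callable) -> Callable:
    """Hypothetical helper: ask torch.compile to skip `func` and run it eagerly."""
    if _disable is None:
        return func  # nothing to disable, return the function unchanged
    return _disable(func)


@exclude_from_compile
def iter_params_sketch(module, recurse: bool = False) -> Iterator:
    """Stand-in for a parameter-iterating helper (assumption, not DeepSpeed's code)."""
    return (param for _, param in module.named_parameters(recurse=recurse))
```

When a compiled region calls a function wrapped this way, the call runs in eager mode instead of being traced, which avoids the compile-time error at the cost of a graph break around that call.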

@tohtana tohtana marked this pull request as ready for review April 25, 2024 00:08
@tjruwase tjruwase added this pull request to the merge queue Apr 25, 2024
Merged via the queue into microsoft:master with commit fcc731f Apr 25, 2024
14 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Apr 30, 2024
…compile_zero tests on v100 (#5459)

The torch update to 2.3.0 broke some test_compile_zero tests, so we pinned
the version. @tohtana pushed fixes in #5463; this change should un-pin torch
and move us back to the latest release.

A failing test indicating that the generated code cannot run bf16 on V100 is
[here](https://github.com/microsoft/DeepSpeed/actions/runs/8838672379/job/24270349996?pr=5459#step:8:5157).
umchand pushed a commit to umchand/DeepSpeed that referenced this pull request May 20, 2024
umchand pushed a commit to umchand/DeepSpeed that referenced this pull request May 20, 2024
…compile_zero tests on v100 (microsoft#5459)

dbyoung18 pushed a commit to dbyoung18/DeepSpeed that referenced this pull request Jun 11, 2024
dbyoung18 pushed a commit to dbyoung18/DeepSpeed that referenced this pull request Jun 11, 2024
…compile_zero tests on v100 (microsoft#5459)

2 participants