estimate_zero2_model_states_mem_needs: fixing memory estimation #5099

Merged
merged 6 commits into from
Jun 4, 2024

Conversation

nelyahu
Contributor

@nelyahu nelyahu commented Feb 8, 2024

The estimate previously assumed 4 bytes per model parameter and 4 bytes per gradient.
This change reduces both to 2 bytes, under the assumption of FP16/BF16 training.
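
For illustration, the effect on the parameter and gradient terms can be sketched as below. This is a minimal sketch, not the DeepSpeed implementation: the helper `params_and_grads_bytes` and the 1B-parameter model size are hypothetical, and optimizer states and buffer factors are deliberately left out.

```python
# Hypothetical helper, not part of DeepSpeed: counts only one copy of the
# parameters and one copy of the gradients, ignoring optimizer states.
def params_and_grads_bytes(total_params: int, bytes_per_element: int) -> int:
    """Bytes needed for the parameters plus their gradients."""
    return 2 * total_params * bytes_per_element


FP32_BYTES = 4  # per-element size assumed before this change
HALF_BYTES = 2  # per-element size for FP16/BF16, assumed after this change

total_params = 1_000_000_000  # assumed model size, for illustration only

before = params_and_grads_bytes(total_params, FP32_BYTES)
after = params_and_grads_bytes(total_params, HALF_BYTES)

print(f"before: {before / 2**30:.2f} GiB")
print(f"after:  {after / 2**30:.2f} GiB")
```

Under these assumptions the parameter and gradient portion of the estimate is halved; the rest of the estimate is not modeled here.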

nelyahu and others added 2 commits April 22, 2024 10:08
Apply code review suggestion for a closer estimate of memory consumption

Co-authored-by: Olatunji Ruwase <[email protected]>
update comment

Co-authored-by: Olatunji Ruwase <[email protected]>
@nelyahu
Contributor Author

nelyahu commented May 13, 2024

@tjruwase @stas00 - What do you say about the updated patch?

@tjruwase
Contributor

@tjruwase @stas00 - What do you say about the updated patch?

@nelyahu, LGTM. Thanks!

@nelyahu
Contributor Author

nelyahu commented May 27, 2024

@tjruwase @stas00 Can you please re-run validation? The failure in "cpu-torch-latest" does not seem related to this change.

@nelyahu
Contributor Author

nelyahu commented Jun 2, 2024

@tjruwase Can this be merged?

@tjruwase tjruwase added this pull request to the merge queue Jun 4, 2024
Merged via the queue into microsoft:master with commit f4cb866 Jun 4, 2024
15 checks passed
sfc-gh-reyazda pushed a commit to Snowflake-Labs/DeepSpeed that referenced this pull request Jun 10, 2024
…osoft#5099)

3 participants