Pull requests: microsoft/DeepSpeed
Pull requests list
Add the missing view operations from sequence parallel(async).
#6750
opened Nov 14, 2024 by
inkcherry
Training ops kernels: Speeding up the Llama-based MoE architectures
#6734
opened Nov 8, 2024 by
RezaYazdaniAminabadi
Draft
Allow launcher to include --include=node3, not just --include=node3:1,2,3,4,5,6,7,8
#6698
opened Nov 1, 2024 by
stephen-nju
Reduce the device bubble introduced by heavy loop synchronization in coalesced fetch/release(z3_leaf_module)
#6694
opened Oct 31, 2024 by
inkcherry
A faster and more memory-efficient implementation of zero_to_fp32
#6658
opened Oct 23, 2024 by
xu-song
Support the parallel conversion from ZeRO checkpoints to FP32/FP16/BF16 param weight
#6655
opened Oct 23, 2024 by
xylian86
5 tasks done
Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models
#6553
opened Sep 18, 2024 by
gyou2021
Change compile for pipeline module torch.compile
#6478
opened Sep 2, 2024 by
NirSonnenschein
Unpin tests that previously used a pinned version of transformers
#6387
opened Aug 20, 2024 by
loadams