Implement HooksMixin #917
base: main
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.
We briefly looked at the implications of using hooks with FSDP. Are we taking care of that already, or is that handled through this PR?
@dsikka I consider that to be out of scope for this PR. I consider FSDP to be unsupported as of now, although this PR makes it easier to support FSDP in the future. Modifying a module's parameter requires being in special FSDP contexts:

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel
# TrainingState / HandleTrainingState are FSDP-internal enums (import path varies by torch version)

@torch.no_grad()
def pre_hook(module, _args):
    # modifying both the training and handle training states is required
    with model._use_training_state(TrainingState.IDLE, HandleTrainingState.IDLE):
        with FullyShardedDataParallel.summon_full_params(model):
            # modify module weight; doing so outside of these contexts raises a non-contiguous tensor error
            module.weight *= 0
```

We can bake these contexts into the `HooksMixin`.
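As a rough illustration of what baking these contexts in could look like, the sketch below wraps a hook callback so that it enters the FSDP context before running. The helper name `wrap_hook_for_fsdp` and its structure are hypothetical, not part of this PR, and the training-state context from the snippet above would also belong inside it.

```python
import functools

import torch
from torch.distributed.fsdp import FullyShardedDataParallel


def wrap_hook_for_fsdp(hook_fn, model):
    """Hypothetical helper: run a hook inside the FSDP contexts needed to
    mutate parameters (the training-state context above would be added here)."""

    @functools.wraps(hook_fn)
    @torch.no_grad()
    def wrapped(module, *args, **kwargs):
        if isinstance(model, FullyShardedDataParallel):
            # gather full, contiguous parameters before the hook mutates them
            with FullyShardedDataParallel.summon_full_params(model):
                return hook_fn(module, *args, **kwargs)
        return hook_fn(module, *args, **kwargs)

    return wrapped
```

A `HooksMixin.register_hook` implementation could apply such a wrapper automatically whenever the parent model is FSDP-wrapped, so individual modifiers would not need FSDP-specific code.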
Purpose

Changes
- Implement `HooksMixin` (a usage sketch follows this list)
  - `_HOOKS_DISABLED` attribute is a global variable attached to the class which is used to disable hooks globally
  - `_hooks` attribute is a local variable attached to each modifier which lists all of the hooks created by that modifier
- `QuantizationModifier`: refactor calibration functions to reference the same function rather than generating hook functions
- `SmoothQuantModifier`
- `WandaPruningModifier` and `SparseGPTModifier`
- `MagnitudePruningModifier` and `ConstantPruningModifier`, via `LayerParamMasking`
- `LayerCompressor`: left unchanged, since this will be handled by future data pipelines and doing so would add the `BaseModel` inheritance to the `LayerCompressor` class, which adds unnecessary complexity to this PR
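As referenced in the list above, here is a minimal sketch of the mixin as described: `_HOOKS_DISABLED` lives on the class (global), while `_hooks` lives on each modifier instance. Method names such as `register_hook`, `remove_hooks`, and `disable_hooks` are assumptions for illustration; the authoritative version is in the PR diff.

```python
import contextlib
from typing import Any, Callable, ClassVar, List

import torch
from torch.utils.hooks import RemovableHandle


class HooksMixin:
    """Sketch of the described mixin; names and structure are illustrative."""

    _HOOKS_DISABLED: ClassVar[bool] = False  # disables hooks across all modifiers

    def __init__(self):
        self._hooks: List[RemovableHandle] = []  # hooks created by this modifier

    @classmethod
    @contextlib.contextmanager
    def disable_hooks(cls):
        # globally skip all registered hooks while inside this context
        try:
            HooksMixin._HOOKS_DISABLED = True
            yield
        finally:
            HooksMixin._HOOKS_DISABLED = False

    def register_hook(self, module: torch.nn.Module, func: Callable, hook_type: str):
        # wrap the callback so the global disable flag is respected
        def wrapped(*args, **kwargs) -> Any:
            if HooksMixin._HOOKS_DISABLED:
                return None
            return func(*args, **kwargs)

        # hook_type is e.g. "forward" or "forward_pre"
        handle = getattr(module, f"register_{hook_type}_hook")(wrapped)
        self._hooks.append(handle)  # track so this modifier can clean up later
        return handle

    def remove_hooks(self):
        # remove every hook created by this modifier
        for handle in self._hooks:
            handle.remove()
        self._hooks = []
```

A modifier inheriting the mixin would then call something like `self.register_hook(module, self._calibrate, "forward")` during initialization and `self.remove_hooks()` on finalization; routing every callback through one place is what makes the global disable flag, and potentially the FSDP contexts discussed above, easy to apply uniformly.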
Testing
- Added `tests/llmcompressor/modifiers/utils/test_hooks.py`
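For flavor only (the real tests are in the file above), a test against the hypothetical sketch from the previous section might exercise registration, the global disable flag, and cleanup:

```python
import torch


class DummyModifier(HooksMixin):  # HooksMixin sketch from the previous section
    def __init__(self):
        super().__init__()
        self.calls = 0

    def hook(self, module, args, output):
        self.calls += 1


def test_disable_and_remove_hooks():
    modifier, layer = DummyModifier(), torch.nn.Linear(4, 4)
    modifier.register_hook(layer, modifier.hook, "forward")

    layer(torch.randn(1, 4))
    assert modifier.calls == 1

    with HooksMixin.disable_hooks():
        layer(torch.randn(1, 4))
    assert modifier.calls == 1  # hook skipped while globally disabled

    modifier.remove_hooks()
    layer(torch.randn(1, 4))
    assert modifier.calls == 1  # hook no longer registered
```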