Implement HooksMixin #917

Open · kylesayrs wants to merge 12 commits into main
Conversation

@kylesayrs (Collaborator) commented Nov 14, 2024

Purpose

  • Precursor to Kylesayrs/gptq partition #914
  • Create a shared API for adding hooks to modules
  • Allow code that handles data pipelines to selectively disable hooks for certain passes. This is needed for modifiers with custom data pipelines (GPTQ/Wanda/SparseGPT) and when multiple modifiers are active at the same time.
    • This is needed for GPTQ-style sequential algorithms, which require one pass with hooks enabled to accumulate the Hessians and compress, and then a second pass with hooks disabled to compute the compressed (weight-quantized) outputs
    • This also gives research users a tool to control when hooks are enabled from within the data pipelines, for example:
for layer in model_layers:
    # pass with hooks enabled: calibration hooks accumulate the Hessians and compress
    unquantized_outputs = layer(*args, **kwargs)

    # pass with hooks disabled: compute the outputs of the compressed layer
    with HooksMixin.disable_hooks():
        quantized_outputs = layer(*args, **kwargs)

    print(f"Mean error from quantization: {get_loss(unquantized_outputs, quantized_outputs)}")

Changes

  • Implement HooksMixin
    • The _HOOKS_DISABLED attribute is a class-level flag shared by all modifiers, used to disable hooks globally
    • The _hooks attribute is an instance-level list on each modifier that tracks all of the hooks created by that modifier (see the sketch after this list)
  • Integrate with QuantizationModifier, refactoring the calibration functions to reference a single shared function rather than generating hook functions
  • Integrate with SmoothQuantModifier
  • Integrate with WandaPruningModifier and SparseGPTModifier
  • Integrate with MagnitudePruningModifier and ConstantPruningModifier via LayerParamMasking
  • Purposefully did not integrate with LayerCompressor, since this will be handled by future data pipelines and doing so would add the BaseModel inheritance to the LayerCompressor class, which would add unnecessary complexity to this PR
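For orientation, here is a minimal sketch of how such a mixin could look, assuming each hook is wrapped so that the class-level _HOOKS_DISABLED flag short-circuits it. This is an illustration of the idea, not the PR's exact implementation:

import contextlib
from functools import wraps

import torch


class HooksMixin:
    # class-level flag shared by all modifiers; toggled by disable_hooks()
    _HOOKS_DISABLED: bool = False

    def __init__(self):
        # handles for every hook registered by this modifier instance
        self._hooks = []

    @classmethod
    @contextlib.contextmanager
    def disable_hooks(cls):
        # temporarily turn every registered hook into a no-op
        try:
            cls._HOOKS_DISABLED = True
            yield
        finally:
            cls._HOOKS_DISABLED = False

    def register_hook(self, module: torch.nn.Module, func, hook_type: str = "forward"):
        # wrap the hook so it respects the global disable flag, then
        # register it and remember the handle for later removal
        @wraps(func)
        def wrapped(*args, **kwargs):
            if HooksMixin._HOOKS_DISABLED:
                return
            return func(*args, **kwargs)

        handle = getattr(module, f"register_{hook_type}_hook")(wrapped)
        self._hooks.append(handle)
        return handle

    def remove_hooks(self):
        # detach every hook this modifier registered
        for handle in self._hooks:
            handle.remove()
        self._hooks = []

A modifier integrating with the mixin would then presumably call register_hook for each calibration function during initialization and remove_hooks when it finalizes.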

Testing

  • Added tests in tests/llmcompressor/modifiers/utils/test_hooks.py
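As an illustration of what such a test could cover (hypothetical code written against the sketch above; the import path llmcompressor.modifiers.utils.hooks is inferred from the test file location and may differ):

import torch

from llmcompressor.modifiers.utils.hooks import HooksMixin  # assumed import path


class DummyModifier(HooksMixin):
    def __init__(self):
        super().__init__()
        self.hook_called = False


def test_disable_hooks():
    module = torch.nn.Linear(4, 4)
    modifier = DummyModifier()

    def forward_hook(module, args, output):
        modifier.hook_called = True

    modifier.register_hook(module, forward_hook, "forward")

    # hooks are skipped inside the disable context
    with HooksMixin.disable_hooks():
        module(torch.zeros(1, 4))
    assert not modifier.hook_called

    # and fire again once the context exits
    module(torch.zeros(1, 4))
    assert modifier.hook_called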


👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

@dsikka (Collaborator) left a comment

We briefly looked at the implications of using hooks with FSDP - are we taking care of that already or through this PR?

@kylesayrs (Collaborator, Author) replied

@dsikka I consider that out of scope for this PR. FSDP is unsupported as of now, although this PR makes it easier to support FSDP in the future.

Modifying a module's parameters requires entering special FSDP contexts, for example:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel
# TrainingState, HandleTrainingState, and `model` (the FSDP-wrapped model) are
# assumed to be in scope; the enums come from torch's private FSDP internals

@torch.no_grad()
def pre_hook(module, _args):
    # modifying both the training state and the handle training state is required
    with model._use_training_state(TrainingState.IDLE, HandleTrainingState.IDLE):
        with FullyShardedDataParallel.summon_full_params(model):
            # modify the module weight; doing this outside of the contexts
            # raises a non-contiguous tensor error
            module.weight *= 0

We can bake these contexts into the HooksMixin.register_hook function, although there are implementation details associated with that which I'd like to leave for a separate task/PR.
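Purely as a sketch of that idea (a hypothetical extension of the register_hook sketch above, reusing the same private FSDP contexts; fsdp_model is an assumed parameter name, and this is explicitly deferred by this PR):

def register_hook(self, module, func, hook_type="forward", fsdp_model=None):
    # hypothetical: wrap the hook so parameter mutation happens inside the
    # FSDP contexts shown above whenever an FSDP-wrapped model is provided;
    # assumes the HooksMixin sketch and FSDP imports from earlier in this thread
    def wrapped(*args, **kwargs):
        if HooksMixin._HOOKS_DISABLED:
            return
        if fsdp_model is None:
            return func(*args, **kwargs)
        with fsdp_model._use_training_state(TrainingState.IDLE, HandleTrainingState.IDLE):
            with FullyShardedDataParallel.summon_full_params(fsdp_model):
                return func(*args, **kwargs)

    handle = getattr(module, f"register_{hook_type}_hook")(wrapped)
    self._hooks.append(handle)
    return handle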
