[DRAFT] First version of fusion optimizations for transformers #1938
base: main
Conversation
❌ 10 Tests Failed (2 ❄️ flaky)
The last two axes of the key-embedding are then swapped (using a Reshape/Transpose/Reshape sequence).
The dot-product attention is then computed using SDPA.
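The two steps described in these review comments can be sketched in plain numpy. This is a minimal illustration of the computation being fused, not the PR's actual rewrite rule; the shapes and function name are assumptions:

```python
import numpy as np

def sdpa(query, key, value):
    """Scaled dot-product attention. The key's last two axes are
    swapped (the Transpose step of the Reshape/Transpose/Reshape
    sequence) before the matmul with the query."""
    d = query.shape[-1]
    # Swap the last two axes of the key-embedding.
    key_t = np.swapaxes(key, -1, -2)
    scores = (query @ key_t) / np.sqrt(d)
    # Numerically stable softmax over the last axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ value

# Assumed layout: (batch, heads, sequence, head_dim).
q = np.random.rand(2, 4, 8, 16)
k = np.random.rand(2, 4, 8, 16)
v = np.random.rand(2, 4, 8, 16)
out = sdpa(q, k, v)  # shape (2, 4, 8, 16)
```

A fusion pass would match this Transpose-plus-matmul-plus-softmax subgraph and replace it with a single fused attention op.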
Check warning — Code scanning / lintrunner: EDITORCONFIG-CHECKER/editorconfig
Check warning — Code scanning / lintrunner: RUFF/W293 (blank line contains whitespace). See https://docs.astral.sh/ruff/rules/blank-line-with-whitespace
def _skip_normalization(op, input, skip, gamma, epsilon, stash_type):
    normalized, mean, inv_std_var, skip_sum = op.SkipSimplifiedLayerNormalization(
Check warning — Code scanning / lintrunner: RUFF/F841 (local variable is assigned but never used). See https://docs.astral.sh/ruff/rules/unused-variable
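The F841 warning fires because `mean`, `inv_std_var`, and `skip_sum` are bound but never read. A common fix, sketched here with a hypothetical stand-in for the multi-output op call, is to give the unused outputs underscore-prefixed names so the linter knows they are intentionally discarded:

```python
def skip_layer_norm(x):
    # Hypothetical stand-in for op.SkipSimplifiedLayerNormalization,
    # which returns four outputs (normalized, mean, inv_std_var, skip_sum).
    return [v * 2 for v in x], 0.0, 1.0, x

# Underscore-prefixed names silence RUFF/F841 for the three
# outputs the pattern does not use.
normalized, _mean, _inv_std_var, _skip_sum = skip_layer_norm([1, 2, 3])
```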
if len(node.outputs) == 1:
    return output
else:
    true_tensor = onnx.helper.make_tensor("true", onnx.TensorProto.BOOL, [1], [True])
Is this IR? If so, suggested change:

- true_tensor = onnx.helper.make_tensor("true", onnx.TensorProto.BOOL, [1], [True])
+ true_tensor = ir.tensor([True])
Thanks. But when I look at the signature here, it is not clear this is supported. The example illustrates it, though. I see it eventually calls the np.array constructor if nothing else works, so I understand it now.
Good catch. We can update the signature
This is actually covered by npt.ArrayLike (the first)
Actually, I tried and it failed, rejecting a list. BTW, I have moved the independent parts of this PR into a separate PR: #1947
Hmm. I need to fix that then
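As a side note on the np.array fallback discussed in this thread: a plain Python list is valid input to that fallback path. A numpy-only sketch (the onnx-ir call itself is omitted, since the exact signature is what the thread is debating):

```python
import numpy as np

# A Python list of bools goes through np.array, which infers
# dtype=bool -- the dtype a BOOL constant tensor needs.
arr = np.array([True])
assert arr.dtype == np.bool_
assert arr.shape == (1,)
```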
@@ -0,0 +1,38 @@
# Copyright (c) Microsoft Corporation.

Check warning — Code scanning / lintrunner: RUFF-FORMAT/format
@@ -0,0 +1,152 @@
# Copyright (c) Microsoft Corporation.

Check warning — Code scanning / lintrunner: RUFF-FORMAT/format
Still TODO: