Add unique op #1547

a-gardner1 · 2024-05-15T22:03:21Z

Add support for exporting torch.unique following the conclusion of pytorch/pytorch#113118.

onnxscript/function_libs/torch_lib/ops/core.py

codecov · 2024-05-15T22:26:27Z

Codecov Report

Attention: Patch coverage is 57.14286% with 18 lines in your changes missing coverage. Please review.

Project coverage is 77.50%. Comparing base (69ae7f4) to head (f9885f1).
Report is 171 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| onnxscript/function_libs/torch_lib/ops/core.py | 57.14% | 16 Missing and 2 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1547      +/-   ##
==========================================
- Coverage   77.56%   77.50%   -0.07%     
==========================================
  Files         214      216       +2     
  Lines       23186    23381     +195     
  Branches     3975     4033      +58     
==========================================
+ Hits        17984    18121     +137     
- Misses       4433     4477      +44     
- Partials      769      783      +14


justinchuby

Thanks for your contribution! Could you follow the CLA bot's instruction to get that cleared?

justinchuby · 2024-05-15T22:26:59Z

onnxscript/function_libs/torch_lib/ops/core.py

+    except Exception as e:
+        # try to provide a more informative error message
+        if _NOT_IMPLEMENTED_UNIQUE.search(str(e)) is not None:
+            raise NotImplementedError(
+                f"'onnxruntime' does not yet support Unique(11) operator with dtype={self.dtype}"
+            ) from e


I would remove this try-catch: the function here is symbolic, so we don't expect it to raise any errors.

Addressed in b528a6a

a-gardner1 · 2024-05-15T22:29:42Z

> Thanks for your contribution! Could you follow the CLA bot's instruction to get that cleared?

Yeah, I may have jumped the gun a bit. Working on officially getting permission from my employer.

a-gardner1 · 2024-05-16T21:58:31Z

@a-gardner1 please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
@microsoft-github-policy-service agree [company="{your company}"]
Options:

(default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
(when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"
Contributor License Agreement

@microsoft-github-policy-service agree [company="Radiance Technologies"]

@microsoft-github-policy-service agree company="Radiance Technologies"

a-gardner1 · 2024-05-16T21:59:47Z

@microsoft-github-policy-service agree company="Radiance Technologies"

a-gardner1 · 2024-05-16T22:02:26Z

tests/function_libs/torch_lib/ops_test_data.py

@@ -438,6 +438,34 @@ def _where_input_wrangler(
    return args, kwargs


+def _unique_unsorted_xfail_matcher(


@justinchuby I'm not sure what the preferred behavior is here. Should we match torch.unique and ignore the sorted argument (i.e., always sort in aten_unique) or respect the argument and deviate in accordance with this matcher?

I wonder if the argument leads to different behavior on CUDA/CPU etc.? I assume sorted=False means the output may be sorted but doesn't need to be, and that there is some potential performance gain from turning it off. If that's the interpretation, I would keep the argument. Otherwise, ignoring the argument and matching the behavior would also be nice.

I am investigating differences in behavior between CUDA and CPU and have already found at least one (unique_dim on CPU ignores the return_inverse and return_counts arguments, whereas the CUDA implementation does not). How should these differences be handled? Can the op registration be conditioned on the device somehow, or should I favor CUDA over CPU?

Matching CUDA for now is preferable. Thanks!
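For reference on the sorted question above: as I read the ONNX Unique operator documentation, sorted=0 returns the unique values in order of first occurrence, while sorted=1 returns them in ascending order. The following is a minimal pure-Python sketch of those semantics for a flattened 1-D input. The helper name `onnx_unique_1d` is hypothetical and this is not the code from this PR; it also says nothing about what torch.unique does with sorted=False on a given device.

```python
def onnx_unique_1d(values, sorted_output=True):
    """Emulate the four outputs of ONNX Unique on a flat list (illustrative only)."""
    first_index = {}  # value -> position of its first occurrence
    counts = {}       # value -> number of occurrences
    for position, value in enumerate(values):
        if value not in first_index:
            first_index[value] = position
            counts[value] = 0
        counts[value] += 1

    # dicts preserve insertion order, so this is first-occurrence order
    uniques = list(first_index)
    if sorted_output:
        uniques.sort()

    # slot[value] is the index of value inside the unique output
    slot = {value: i for i, value in enumerate(uniques)}
    indices = [first_index[value] for value in uniques]
    inverse_indices = [slot[value] for value in values]
    unique_counts = [counts[value] for value in uniques]
    return uniques, indices, inverse_indices, unique_counts


if __name__ == "__main__":
    print(onnx_unique_1d([2, 1, 1, 3, 4, 3], sorted_output=False))
    print(onnx_unique_1d([2, 1, 1, 3, 4, 3], sorted_output=True))
```

With this reading, respecting sorted=False changes the order of every output, which is why a test matcher would be needed if the reference implementation always sorts.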

a-gardner1 · 2024-05-17T20:56:18Z

onnxscript/function_libs/torch_lib/ops/core.py

+    # HACK: force indices to be in the graph so that it gets a name during optimization
+    # Otherwise an error will be raised in `onnxscript.Scope.lookup_or_create`
+    indices_size = op.Shape(indices)
+    counts = op.Reshape(counts, indices_size)


I want to note that the way this function was written in 1d74d59 is functionally equivalent but yields an error in onnxscript.Scope.lookup_or_create. That version causes modified to be True in onnxscript.optimizer.optimize, which triggers a second loop of optimization that crashes in the first call to inline_simple_functions.

This seems indicative of a potential bug to me, but I am not knowledgeable enough about the codebase to suggest a cause or fix.

cc @gramalingam

justinchuby · 2024-05-18T00:52:43Z

Thanks for completing the CLA. I will take a look next week

justinchuby · 2024-05-20T06:20:55Z

onnxscript/function_libs/torch_lib/ops/core.py

+        result = unique_values, counts
+    else:
+        result = unique_values
+    return result


I think we need to always return the same number of values. Consider returning None when they are not available?

Doing so deviates from the behavior of torch.unique and causes this assertion in the unit tests to fail:

onnxscript/tests/function_libs/torch_lib/ops_test.py

Line 251 in 69ae7f4

assert len(flattened_torch_outputs) == len(flattened_function_outputs)

Please advise on how to address this.

Does torch.ops.aten.unique exhibit the same behavior? If it always returns three variables, consider creating a new OpInfo for torch.ops.aten.unique, similar to this one:

onnxscript/tests/function_libs/torch_lib/extra_opinfo.py

Lines 2105 to 2114 in a5ed079

    opinfo_core.OpInfo(
        "ops.aten._native_batch_norm_legit.no_stats",
        aten_name="_native_batch_norm_legit.no_stats",
        dtypes=common_dtype.floating_types_and(torch.bfloat16),
        dtypesIfCUDA=common_dtype.floating_types_and(torch.float16, torch.bfloat16),
        supports_forward_ad=True,
        supports_fwgrad_bwgrad=True,
        assert_jit_shape_analysis=True,
        sample_inputs_func=sample_inputs__native_batch_norm_legit_no_stats,
    ),

With the custom OpInfo, you may also remove the xfail, because the xfail cases can simply be dropped.

You may adapt the sample function from https://github.com/pytorch/pytorch/blob/b948b1ad7a9cf61c9692506c60c295fd40e00f43/torch/testing/_internal/common_methods_invocations.py#L3346-L3372

Thanks for the pointer to extra_opinfo. It turns out torch.ops.aten.unique does not exist, but torch.ops.aten._unique does. Added OpInfo for it, _unique2, and unique_dim in 14d03b5
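To make the arity mismatch in this thread concrete, here is a small pure-Python sketch contrasting a torch.unique-style function, whose number of outputs depends on the flags, with a fixed-arity variant that always returns three outputs. Both function names are hypothetical. Filling unrequested outputs with empty lists is an assumption made for illustration; it is not taken from this PR.

```python
def unique_variable_arity(values, return_inverse=False, return_counts=False):
    """Number of outputs depends on the flags (torch.unique-style, illustrative)."""
    uniques = sorted(set(values))
    slot = {value: i for i, value in enumerate(uniques)}
    outputs = [uniques]
    if return_inverse:
        outputs.append([slot[value] for value in values])
    if return_counts:
        outputs.append([values.count(value) for value in uniques])
    # A single output is returned bare, several as a tuple
    return outputs[0] if len(outputs) == 1 else tuple(outputs)


def unique_fixed_arity(values, return_inverse=False, return_counts=False):
    """Always returns three outputs; unrequested ones are empty (assumption)."""
    uniques = sorted(set(values))
    slot = {value: i for i, value in enumerate(uniques)}
    inverse = [slot[value] for value in values] if return_inverse else []
    counts = [values.count(value) for value in uniques] if return_counts else []
    return uniques, inverse, counts


if __name__ == "__main__":
    data = [3, 1, 3]
    print(unique_variable_arity(data))                     # one output
    print(unique_variable_arity(data, return_counts=True)) # two outputs
    print(unique_fixed_arity(data))                        # always three outputs
```

A test that compares flattened outputs by length can only pass when the reference and the exported function follow the same convention, which is the motivation for testing against a fixed-arity aten overload.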

justinchuby · 2024-05-20T06:25:23Z

onnxscript/function_libs/torch_lib/ops/core.py

+    """unique(Tensor self, bool sorted=True, bool return_inverse=False, bool return_counts=False) -> (Tensor, Tensor, Tensor)"""
+
+    unique_values, indices, inverse_indices, counts = op.Unique(self, axis=None, sorted=sorted)
+    # HACK: force indices to be in the graph so that it gets a name during optimization


~~Is this a bug we should fix elsewhere?~~ saw comment below

I think this could possibly be considered a different bug. The other one is a side-effect of onnxscript.optimizer.constant_folding.fold_constants, whereas this one is a side-effect of the function linked below, which converts the names of unused outputs to empty strings but only removes them if they are trailing. Since inverse_indices and counts are used, it leads to an error being raised in onnxscript.Scope.lookup_or_create due to the empty string name given to indices.

onnxscript/onnxscript/optimizer/remove_unused.py

Line 14 in 69ae7f4

def remove_unused_optional_outputs(
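The behavior described in this comment can be sketched in a few lines of plain Python. This is only an illustration of the description above (unused output names become empty strings, and only trailing empties are dropped); `blank_unused_outputs` is a hypothetical helper and not the actual onnxscript implementation.

```python
def blank_unused_outputs(output_names, used_names):
    """Rename unused outputs to "" and strip only the trailing empty names."""
    renamed = [name if name in used_names else "" for name in output_names]
    while renamed and renamed[-1] == "":
        renamed.pop()
    return renamed


if __name__ == "__main__":
    outputs = ["unique_values", "indices", "inverse_indices", "counts"]

    # Only the first output is used: the trailing empties disappear cleanly.
    print(blank_unused_outputs(outputs, {"unique_values"}))

    # indices is unused but later outputs are used: an empty name stays in the
    # middle of the list, which is the situation described in the comment.
    print(blank_unused_outputs(outputs, {"unique_values", "inverse_indices", "counts"}))
```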

justinchuby · 2024-05-21T22:17:02Z

onnxscript/function_libs/torch_lib/ops/core.py

@@ -8380,8 +8380,21 @@ def aten__unique(
 ) -> tuple[TensorType, TensorType]:
    """_unique(Tensor self, bool sorted=True, bool return_inverse=False) -> (Tensor, Tensor)"""

-    unique_values, _, inverse_indices, _ = op.Unique(self, axis=None, sorted=True)
+    unique_values, indices, inverse_indices, _ = op.Unique(self, axis=None, sorted=True)
+    # HACK: force indices to be in the graph so that it gets a name during optimization


I suggest removing all hacks. I will go fix what's necessary where the bug is. We are also moving to prefer trace_only=True for new functions so if you can include the flag in @torch_op that would be awesome.

That would be awesome. The hacks are definitely getting out of hand. I'll wait for that fix so that I can continue to test with this locally.

Do you have a short script handy that will reproduce the error?

    if __name__ == '__main__':
        import logging
        from typing import Any  # needed for the return annotation below

        import torch
        import numpy as np
        import onnx
        import onnxruntime as ort

        # Decode the four options from the bits of i to cover all 16 combinations
        for i in range(16):
            sorted = bool(i & 1)
            return_inverse = bool((i & 2) > 1)
            return_counts = bool((i & 4) > 1)
            dim = 0 if bool((i & 8) > 1) else None
            print(
                f"Testing sorted={sorted}, return_inverse={return_inverse}, return_counts={return_counts}, dim={dim}"
            )

            def test_function(
                    x: torch.Tensor,
                    s: bool = sorted,
                    ri: bool = return_inverse,
                    rc: bool = return_counts,
                    d: int | None = dim) -> Any:
                result = torch.unique(
                    x,
                    sorted=s,
                    return_inverse=ri,
                    return_counts=rc,
                    dim=d)
                return result

            onnx_program = torch.onnx.dynamo_export(
                test_function,
                torch.arange(10),
                export_options=torch.onnx.ExportOptions(
                    dynamic_shapes=True,
                    diagnostic_options=torch.onnx.DiagnosticOptions(
                        verbosity_level=logging.DEBUG)))
            onnx_program.save("torch_unique.onnx")
            onnx_inputs = onnx_program.adapt_torch_inputs_to_onnx(torch.arange(10))
            onnx_outputs = onnx_program(*onnx_inputs)
            loaded_onnx_program = onnx.load("torch_unique.onnx")
            onnx.checker.check_model(loaded_onnx_program)
            ort_session = ort.InferenceSession("torch_unique.onnx")
            inputs = np.random.randint(0, 10, 10)
            print(f"Inputs: {inputs}")
            outputs = ort_session.run(None, {"l_x_": inputs})
            print(f"Outputs: {outputs}")
        print("Success")

Oh, you should also test using the nightly release of PyTorch with the changes in pytorch/pytorch#126561.
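The reproduction script above decodes its four options from the bits of a loop counter. As a side note, the same 16 combinations can be enumerated with itertools.product, which may read more directly. This is only a sketch of the enumeration; the function names are hypothetical and the order of the combinations differs from the bit-decoded loop.

```python
import itertools


def flag_combinations():
    """All (sorted, return_inverse, return_counts, dim) combinations."""
    booleans = [False, True]
    return list(itertools.product(booleans, booleans, booleans, [None, 0]))


def bit_decoded_combinations():
    """The same combinations, decoded from the bits of a counter."""
    combos = []
    for i in range(16):
        combos.append((bool(i & 1), bool(i & 2), bool(i & 4), 0 if i & 8 else None))
    return combos


if __name__ == "__main__":
    for sorted_flag, return_inverse, return_counts, dim in flag_combinations():
        print(f"sorted={sorted_flag}, return_inverse={return_inverse}, "
              f"return_counts={return_counts}, dim={dim}")
```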

Is trace_only=True expected to require significant changes to the way one implements an op? It appears that enabling the flag breaks passing a value to op.ConstantOfShape and also breaks indexing a shape.

For example, op.ConstantOfShape([0], value=[0]) must become op.Cast(op.ConstantOfShape([0]), to=INT64.dtype), and output_size[dim] must become op.Slice(output_size, [dim], [dim+1]).

Your observation is correct. This is likely due to gaps in our current implementation. Bridging those gaps is on our roadmap but is not the highest priority for the team.

Follow-up to pytorch#113118 and pytorch#124306. Developed in coordination with the solution to microsoft/onnxscript#1547.

This PR adds the missing fake tensor implementation for `aten.unique_dim`, thus enabling tracing and compilation of `torch.unique` when `dim` is not None.

Local testing has proceeded with the following simple script (provided that one has checked out the changes in microsoft/onnxscript#1547):

```python
import logging

import numpy as np
import onnx
import onnxruntime as ort
import torch  # missing from the original snippet

onnx_program = torch.onnx.dynamo_export(
    lambda x: torch.unique(x, dim=0, return_inverse=True),
    torch.arange(10),
    export_options=torch.onnx.ExportOptions(
        dynamic_shapes=True,
        diagnostic_options=torch.onnx.DiagnosticOptions(
            verbosity_level=logging.DEBUG)))
onnx_program.save("torch_unique.onnx")
onnx_inputs = onnx_program.adapt_torch_inputs_to_onnx(torch.arange(10))
onnx_outputs = onnx_program(*onnx_inputs)
loaded_onnx_program = onnx.load("torch_unique.onnx")
onnx.checker.check_model(loaded_onnx_program)
ort_session = ort.InferenceSession("torch_unique.onnx")
inputs = np.random.randint(0, 10, 10)
print(f"Inputs: {inputs}")
outputs = ort_session.run(None, {
    "l_x_": inputs
})
print(f"Outputs: {outputs}")
print("Success")
```

Co-authored-by: Edward Z. Yang
Pull Request resolved: pytorch#126561
Approved by: https://github.com/ezyang

a-gardner1 marked this pull request as draft May 15, 2024 22:27

justinchuby added the "topic: torch_lib" label (Related to the torch/aten function lib in development) May 15, 2024

a-gardner1 mentioned this pull request May 17, 2024

Add fake impl for aten.unique_dim pytorch/pytorch#126561

Closed

a-gardner1 force-pushed the wip-113118-add-unique-ops branch from 453783f to b528a6a Compare May 17, 2024 20:35

a-gardner1 marked this pull request as ready for review May 17, 2024 20:35

justinchuby self-assigned this May 18, 2024

a-gardner1 added 3 commits May 20, 2024 12:24

Add unique op

14d03b5

Remove try-catch block and apply fixes to enable torch.onnx.dynamo_export to succeed

48467a2

Remove unnecessary traceable=True kwargs to torch_op

5f9f6b1

a-gardner1 and others added 4 commits May 20, 2024 12:24

Update onnxscript/function_libs/torch_lib/ops/core.py

4a39cb6

Co-authored-by: Justin Chu

Complete rename

bc26639

Use multiple return statements

e915a79

Tell linter to ignore unused args

7e6d906

a-gardner1 force-pushed the wip-113118-add-unique-ops branch from 56c06cf to 7e6d906 Compare May 20, 2024 17:24

Add extra op infos, adapt ops to more precise tests

29c4f78

Force dependent calculations to avoid empty names during optimization

1310262

a-gardner1 added 3 commits May 21, 2024 22:24

Fix linting errors and remove accidentally committed change

176cb57

Remove hacks

b8b4cb1

Switch to trace_only=True

f9885f1
