Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1787415 Fix dataframe ast to register udtf for replay execution using client created udtf in same client session #2620

Open
wants to merge 11 commits into
base: ls-SNOW-1491199-merge-phase0-server-side
Choose a base branch
from

Conversation

sfc-gh-evandenberg
Copy link
Collaborator

Add Udtf registration that reuses the client registered UDTF. Doesn't need all the previous parameters as they were only required for original creation of udtf.

Fix Full AST Validation to use same session for replay execution of unparser AST, thereby allowing reuse of client defined UDTFs.

@sfc-gh-evandenberg sfc-gh-evandenberg requested review from sfc-gh-yixie, sfc-gh-yuwang and sfc-gh-jrose and removed request for a team November 13, 2024 19:07
@sfc-gh-evandenberg sfc-gh-evandenberg force-pushed the evandenberg-SNOW-1787415-fix-functions-register-udtf-same-session branch from 34f09cd to 99bacfb Compare November 19, 2024 00:30
@@ -953,6 +953,7 @@ def build_proto_from_callable(
expr_builder: proto.SpCallable,
func: Union[Callable, Tuple[str, str]],
ast_batch: Optional[AstBatch] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could do _registered_object_name here as well.

@@ -1166,7 +1170,8 @@ def build_udtf(
comment: Optional[str] = None,
statement_params: Optional[Dict[str, str]] = None,
is_permanent: bool = False,
session=None,
session: "snowflake.snowpark.session.Session" = None,
udtf_name: Optional[Union[str, Iterable[str]]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename to not clash with name? Simply doing _registered_object_name should do.

@@ -100,6 +100,7 @@
"_from_pandas_udf_function",
"input_names", # for pandas_udtf
"max_batch_size", # for pandas_udtf
"_registered_object_name", # db object name if already registered
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove if already registered, replace with something like object name within Snowflake (post-registration) or so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's None I assume the object has not been registered, if not none then it should be a valid object name?

stmt = _ast_stmt
ast = None

# Note it's intentional the column expressions are AST serializerd earlier (ast_cols) to ensure any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: serializerd -> serialized

statement_params=statement_params,
is_permanent=is_permanent,
session=self._session,
udtf_name=udtf_name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename udtf_name?

Copy link
Contributor

@sfc-gh-lspiegelberg sfc-gh-lspiegelberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, some nits. Would advise to not use udtf_name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants