Replies: 7 comments
-
This is a very interesting question. I think it's right to focus just on the
I think we'd need @idanov or maybe even @tsanikgr to explain exactly why
The reason I only say "kind of" above is that it seems more questionable to me that we only return those outputs that are
Also, technically it looks to me like the code that finds
-
Adding this related SO question: "How to run a kedro pipeline interactively like a function". This issue only focuses on the
-
Notes for Tech Design
-
Notes from Technical Design session:
There was agreement that the "free outputs" returned from the session aren't very clear. It was suggested to simply return all node outputs that are not consumed, even if they are defined in the catalog. However, this could lead to very large amounts of data being returned. Instead we'll change it to return all free outputs and additionally any
The second point about adding an optional argument for
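To make the two options from the session concrete, here is a minimal sketch in plain Python. It is not Kedro's API: the node tuples, the dataset names and the helper `unconsumed_outputs` are all hypothetical, and it assumes that "free outputs" means unconsumed outputs that are not defined in the catalog.

```python
def unconsumed_outputs(nodes):
    """Return the outputs that no node in the pipeline takes as an input."""
    produced = {name for _, outputs in nodes for name in outputs}
    consumed = {name for inputs, _ in nodes for name in inputs}
    return produced - consumed


# Each node is (input names, output names); all names are made up.
nodes = [
    (["raw"], ["features"]),
    (["features"], ["model", "metrics"]),
]
catalog_names = {"raw", "model"}  # datasets defined in the catalog

# Suggested alternative: everything that is not consumed, catalog or not.
all_unconsumed = unconsumed_outputs(nodes)

# Assumed current rule: unconsumed outputs that are NOT in the catalog.
free_only = all_unconsumed - catalog_names

print(sorted(all_unconsumed), sorted(free_only))
```

In this toy case the first option would also hand back `model`, which is exactly the kind of potentially large, already-persisted dataset the size concern is about.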
-
A supplement to the above comments to address @AntonyMilneQB's question:
The answer to that is there is a
-
I just gave it a go to see what it would take to make the initial idea work, partly because I want to test how the
-
Adding this as inspiration on whether we should have some kind of argument or debug mode that can return output easily without editing configuration. At the moment, the proper way to inspect is
The complication is mainly due to the
The question is: how can we improve the user experience? It's hard to reason about what is a "free output" and what is not.
-
Background
What's the output of session.run()? Currently, this is not as clear as you might think and it isn't documented anywhere. The logic is defined in runner.py and can be counter-intuitive in some cases. Is there a good reason why we want to do this?
kedro/kedro/runner/runner.py
Lines 78 to 91 in f491420
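Since the referenced lines are not reproduced here, below is a toy model of the behaviour being questioned. It is only a sketch under assumptions: it uses plain dicts and tuples rather than Kedro's runner or catalog classes, and the function `run` and all dataset names are hypothetical. The assumed rule is that only outputs which are neither consumed by another node nor registered in the catalog are returned.

```python
def run(nodes, catalog):
    """Run nodes in order and return only the "free" outputs.

    nodes: list of (func, input_names, output_name), already in run order.
    catalog: dict mapping registered dataset names to their data.
    """
    registered = set(catalog)  # names known before the run
    produced = {out for _, _, out in nodes}
    consumed = {name for _, ins, _ in nodes for name in ins}
    # Free outputs: produced, consumed by no node, and not registered.
    free_outputs = produced - consumed - registered

    data = dict(catalog)  # shallow working copy of the catalog
    for func, ins, out in nodes:
        data[out] = func(*(data[name] for name in ins))
    return {name: data[name] for name in free_outputs}


nodes = [
    (lambda xs: [x * 2 for x in xs], ["raw"], "doubled"),
    (sum, ["doubled"], "total"),
    (len, ["doubled"], "count"),
]

# "count" is registered, so it is computed but left out of the return value.
result = run(nodes, {"raw": [1, 2, 3], "count": None})
print(result)  # {'total': 12}
```

Dropping `count` from the catalog dict makes it appear in the result, which is the counter-intuitive part: the return value depends on configuration rather than on the pipeline definition.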
kedro has improved a lot in terms of how to run a pipeline as a standalone application with packaging and KedroSession; #1423 documents different ways to do it. Personally, I think it is still not easy enough to integrate with kedro for someone who is inexperienced with it. #1423 mentions how a pipeline can be called programmatically. Even though running the pipeline is a function call, it doesn't behave like a function: you can't easily define an input as an argument (it has to be a Catalog entry), and the output of the pipeline is also very restricted.
Motivation
Kedro works really well within the kedro world, but it also means that kedro works very differently from the rest of the Python world.
This issue mainly focuses on the output side, which will improve the experience of integrating a kedro pipeline as an upstream step. In an over-simplified world, this should be straightforward to do. Currently I think we make a strong assumption that people work with a "Kedro Project", but if we are moving towards a kedro package, i.e. using from kedro_package import main, it should behave just like a Python function. I think this is a reasonable expectation.
Questions
session.run?
Things to consider
Related Issue: