Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JupyterChart section to Users Guide #3137

Merged
merged 9 commits into from
Aug 10, 2023
Merged

Conversation

jonmmease
Copy link
Contributor

Follow-on to #3119 that adds a "JupyterChart" page to the user's guide and updates the "Interactive Charts" section to include a reference to the "JupyterChart" page.

I included the video clips from the PR. All of the links to the videos and images are included in #3136.

Copy link
Contributor

@mattijn mattijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice docs @jonmmease! From mobile, so just a few short inline comments.

.. raw:: html

<video controls>
<source src="https://github.com/altair-viz/altair/assets/15064365/21d580ef-4dec-4687-a765-9b6d6d84c61f">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should place this in the static folder, so the docs are self-contained in a single folder, so we can facilitate scrapers for offline documentation software. I'm not sure on this one btw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in e097995.

Note that this does increase the repository size by ~14MB. Not a huge deal from my POV, but something to keep an eye one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could apply video compression to serve users who are residing in low-bandwidth locations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't want to degrade the resolution for all users. One option would be to upload these to the VegaFusion YouTube channel (or we could make an altair-viz YouTube channel) and embed the videos from YouTube. YouTube handles serving variable resolution based on bandwidth, and this would keep the video files out of the Altair repo.

I don't know how offline documentation systems work, and whether they are already set up to handle embedded youtube videos (which I imagine being pretty common). Do we know of any offline documentation systems that currently support scraping the Vega-Altair docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that @capac has been doing this for different Altair releases for Dash for MacOS in the past, #3028 (comment).

color=alt.condition(brush, 'Origin:N', alt.value('grey')),
).add_params(brush)

widget = alt.ChartWidget(chart)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should read JupyterChart.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in e097995

if sel is None or 'Horsepower' not in sel:
filtered = source.iloc[:0]
else:
filtered = source.query(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we build up this query as a separate step? It's a very interesting part of the new functionality and it deserves to be highlighted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 56c9fe0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for not being more clear. I meant to introduce this query also, or firstly, in the section prior 'Observe Selections'. It is interesting how one can (manually) extract data from the source dataframe using the interval selection.

Just wondering, would this still work with an interval selection on a chart including an aggregation in combination with alt.Chart().transformed_data()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, yeah, that makes sense. I'll look at adding sections for filtering input DataFrame based on the point and interval selections (The index selection example already shows how to do this with iloc).

Just wondering, would this still work with an interval selection on a chart including an aggregation in combination with alt.Chart().transformed_data()?

Are you wanting to filter that data after aggregation? If so, this should work with charts that include aggregation but no binning (e.g. an ordinal or nominal bar chart with aggregation). For charts that include timeunit or bin binning, this might not work because the column resulting from transformed_data will generally have a different name. e.g. column_start instead of column. Though in this case the selection might use the renamed columns as well (I'm not sure off hand).

In any case, I think the generally more useful thing is to apply the selection filter to the pre-aggregated DataFrame and then return it (rather than applying the aggregations). This way the result corresponds, row for row, to the input DataFrame.

I need to think through it more carefully, but I'm picturing adding a chart.selected_data method that would input a list of selections. It would need to intelligently crawl through the chart's transform pipeline and apply the early transforms that produce any columns needed by the selection (e.g. the binning outputs), then apply the selection as a filter, then return the result without applying the downstream aggregations. Or, maybe there is an option of whether or not to apply the downstream aggregations.

This would essentially automate the process of building the right pandas query string, handling cases where binned columns are renamed, and it would perform the filter in VegaFusion so it would be done by VegaFusion's current backend (e.g. DataFusion or DuckDB). I'll open an issue in the VegaFusion repo on this at some point. But in the near term, we should document how to do the filtering with pandas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added info on creating pandas query strings in 2ee9d76. Does that look sufficient for now @mattijn?

Your browser does not support the video tag.
</video>

Limitations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I miss setting setting value of selections here. I often use the init argument to start with a certain selection.

Can we start and change the value of a bound legend selection?

As far as a I understand this is not possible (yet?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added limitation in 94b7f06

Yeah, I don't know how to do this at the moment. There's not a straightforward way to convert selection signals into initial selection states. The legend case by itself isn't too hard, but the approach wouldn't extend to interval selection (where you need to draw the box as well as filter/condition the data).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yes, not so straightforward.. referencing vega/vega-lite#1830 for more info why this is still hard.

Copy link
Contributor

@joelostblom joelostblom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done! I think this looks great overall and just have one small comment/question.

@@ -627,6 +627,7 @@ data before usage in Altair using GeoPandas for example as such:
customization
configuration
saving_charts
jupyter_chart
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this section should be just under the section "Interactive Charts". It is definitely the closest to that section thematically even if it is not as central to Altair as some of the other section. I also wonder if we should call this section `JupyterChart Interactivity" or something similar to more clearly signal what it is about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to "JupyterChart Interactivity" in 5723c28

@mattijn
Copy link
Contributor

mattijn commented Aug 6, 2023

Before your comment here gets lost:

[..] in the future, this "JupyterChart" might be the primary recommendation for using Altair in Jupyter (we could connect it with the rendering system in the future so that alt.renderers.enable("jupyter") uses the "JupyterChart" behind the scenes.) What do you think?

I really like this approach. I can easily imagine myself (wrongly) trying to doing something as such: alt.JupyterChart(df).mark_line().encode(…)..

Maybe we should create a separate issue for this?

@jonmmease
Copy link
Contributor Author

@manzt, Altair has a rendering framework that allow us to build various rendering functions that convert charts into mimebundles. Is it possible to create a mimebundle that represents a Jupyter Widget / AnyWidget? For cases where the widget's traitlets aren't required, it would be nice to have the option to enable the JupyterChart rendering pipeline by calling alt.renderers.enable("jupyter").

An alternative might be to call IPython.display.display inside the function that builds the mimebundle, and return an empty text mimebundle.

@manzt
Copy link
Contributor

manzt commented Aug 6, 2023

Is it possible to create a mimebundle that represents a Jupyter Widget / AnyWidget? For cases where the widget's traitlets aren't required, it would be nice to have the option to enable the JupyterChart rendering pipeline by calling alt.renderers.enable("jupyter").

anywidget implements the _repr_mimebundle_ correctly, so I guess you could implement a simple renderer as:

def render_jupyter(spec):
    return alt.JupyterChart(chart=alt.Chart(**spec))._repr_mimebundle_()

alt.renderers.register("jupyter", render_jupyter)

In this case, access to the traitlets on the JupyterChart are ephemeral, so the widget will be rendered correctly but there is no way to access the widget state in Python. Is this what you are asking?

For some projects where I have a widget + other renderers, I've added something akin to alt.Chart(...).widget() for convenience.

@jonmmease
Copy link
Contributor Author

anywidget implements the repr_mimebundle correctly

Oh, perfect. Yeah, that render_jupyter function is just what I had in mind. A motivation of rendering this way in the future (as opposed to our usual mime renderers), would be to take advantage of binary serialization (JupyterChart doesn't do that yet, but I'm hoping to add that soon).

@manzt
Copy link
Contributor

manzt commented Aug 7, 2023

Oh right, makes total sense!

@jonmmease
Copy link
Contributor Author

This is ready for a final look. I think the last outstanding question is whether we want to include the video files in the Altair repo (as the PR currently does), or whether we want to host them remotely. I don't feel strongly about it, including them in the repo adds ~14MB (to the current size of ~7MB), which isn't a big deal in my mind, but let me know if anyone feels otherwise.

@mattijn
Copy link
Contributor

mattijn commented Aug 10, 2023

Thanks for adding clarifications on the manual usage on using the selections. I think it is very useful! Next to .value we could also include a .query to make it easier for the user. But I'm not sure if these queries can be standardized (and also not for this PR).

Regarding the videos, I prefer to have them hosted, at least for now, within the GitHub repository. So for me I'm fine with the situation as is.

Copy link
Contributor

@mattijn mattijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jonmmease jonmmease merged commit 935e4e8 into main Aug 10, 2023
20 checks passed
@jonmmease jonmmease deleted the jonmmease/jupyter_chart_docs branch August 10, 2023 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants