-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-46437][DOCS] Remove cruft from the built-in SQL functions documentation #44393
Conversation
@huaxingao, you may also be interested in reviewing this since you reviewed #28224. |
cc @HyukjinKwon |
Merged to master. |
@@ -17,202 +17,125 @@ license: | | |||
limitations under the License. | |||
--- | |||
|
|||
{% for static_file in site.static_files %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it was my bad. we still need this if-else because of SKIP_API=1 bundle exec jekyll build
. It fails as below
Configuration file: /.../workspace/forked/spark/docs/_config.yml
Source: /.../workspace/forked/spark/docs
Destination: /.../workspace/forked/spark/docs/_site
Incremental build: disabled. Enable with --incremental
Generating...
Liquid Exception: Could not locate the included file '_generated_function_html/agg-funcs-table.html' in any of ["/.../workspace/forked/spark/docs/"]. Ensure it exists in one of those directories and, if it is a symlink, does not point outside your site source. in sql-ref-functions-builtin.md
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me revert this for now (as it technically breaks the documentation build)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's strange. Does the PR build not build documentation? Because the build was passing as of a9ae743 (the latest commit in the PR), and the merge into master at e4b5977 also had a passing build.
Here's the documentation build after the merge: https://github.com/apache/spark/actions/runs/7294850973/job/19880408450
What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason is that the canonical bundle exec jekyll build
would pass because it will trigger the full build of API. If SKIP_API=1
is specified, it will skip the API build (which will skip SQL API generation, then the HTML files would not exist).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just revive the if
s there, and I think we can just go ahead and merge. Let me reopen this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, the branch was removed so I can't reopen the PR 😢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. I will reopen the PR.
I will also tweak things, if possible, so that SKIP_API=1
will work even if the SQL API docs have never been generated before, without requiring all of the conditional includes, because they really clutter things up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #44627 to address the build issue so we are not caught off guard like this again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just revive the
if
s there, and I think we can just go ahead and merge. Let me reopen this PR.
@HyukjinKwon - Actually, thinking about this a bit more , is this really the behavior we want to have?
It's technically a type of silent failure if Jekyll successfully builds our docs but can't find some file that was referenced via include
or include_relative
. The way docs/sql-ref-functions-builtin.md
is currently setup makes it so that Jekyll will never detect these missing includes.
Isn't that actually a bad thing? Using this pattern, how will developers know that the documentation built correctly without manually reviewing every rendered site where they wrote a conditional include?
It seems like we need a way to ignore missing includes only when SKIP_API=1
. Do you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah .. as long as those options work (e.g., SKIP_API=1.
), it should be fine.
### What changes were proposed in this pull request? Add [custom Jekyll tags][custom] to enable us to conditionally include files in our documentation build in a more user-friendly manner. [This example][example] demonstrates how a custom tag can build on one of Jekyll's built-in tags. [custom]: https://github.com/Shopify/liquid/wiki/Liquid-for-Programmers#create-your-own-tags [example]: Shopify/liquid#370 (comment) Without this change, files have to be included as follows: ```liquid {% for static_file in site.static_files %} {% if static_file.name == 'generated-agg-funcs-table.html' %} {% include_relative generated-agg-funcs-table.html %} {% break %} {% endif %} {% endfor %} ``` With this change, they can be included more intuitively in one of two ways: ```liquid {% include_relative_if_exists generated-agg-funcs-table.html %} {% include_api_gen generated-agg-funcs-table.html %} ``` `include_relative_if_exists` includes a file if it exists and substitutes an HTML comment if not. Use this tag when it's always OK for an include not to exist. `include_api_gen` includes a file if it exists. If it doesn't, it tolerates the missing file only if one of the `SKIP_` flags is set. Otherwise it raises an error. Use this tag for includes that are generated for the language APIs. These files are required to generate complete documentation, but we tolerate their absence during development---i.e. when a skip flag is set. `include_api_gen` will place a visible text placeholder in the document and post a warning to the console to indicate that missing API files are being tolerated. ```sh $ SKIP_API=1 bundle exec jekyll build Configuration file: /Users/nchammas/dev/nchammas/spark/docs/_config.yml Source: /Users/nchammas/dev/nchammas/spark/docs Destination: /Users/nchammas/dev/nchammas/spark/docs/_site Incremental build: disabled. Enable with --incremental Generating... Warning: Tolerating missing API files because the following skip flags are set: SKIP_API done in 1.703 seconds. Auto-regeneration: disabled. Use --watch to enable. ``` This PR supersedes #44393. ### Why are the changes needed? Jekyll does not have a succinct way to [check if a file exists][check], so the required directives to implement such functionality are very cumbersome. We need the ability to do this so that we can [build the docs successfully with `SKIP_API=1`][build], since many includes reference files that are only generated when `SKIP_API` is _not_ set. [check]: jekyll/jekyll#7528 [build]: #44627 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually building and reviewing the docs, both with and without `SKIP_API=1`. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #44630 from nchammas/SPARK-46437-conditional-jekyll-include. Authored-by: Nicholas Chammas <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]>
What changes were proposed in this pull request?
Why are the changes needed?
To reduce confusion for maintainers.
Does this PR introduce any user-facing change?
Yes. It adds a table of contents and change the heading style of the examples.
Otherwise, the generated docs are identical.
How was this patch tested?
I built Spark, ran
./sql/create-docs.sh
, and reviewed the generated docs in my browser.The page is too long to screenshot completely, but here are a couple of screenshots.
Was this patch authored or co-authored using generative AI tooling?
No.