[SPARK-46437][DOCS] Add custom tags for conditional Jekyll includes
### What changes were proposed in this pull request?

Add [custom Jekyll tags][custom] to enable us to conditionally include files in our documentation build in a more user-friendly manner. [This example][example] demonstrates how a custom tag can build on one of Jekyll's built-in tags.

[custom]: https://github.com/Shopify/liquid/wiki/Liquid-for-Programmers#create-your-own-tags
[example]: Shopify/liquid#370 (comment)

Without this change, files have to be included as follows:

```liquid
{% for static_file in site.static_files %}
    {% if static_file.name == 'generated-agg-funcs-table.html' %}
        {% include_relative generated-agg-funcs-table.html %}
        {% break %}
    {% endif %}
{% endfor %}
```

With this change, they can be included more intuitively in one of two ways:

```liquid
{% include_relative_if_exists generated-agg-funcs-table.html %}
{% include_api_gen generated-agg-funcs-table.html %}
```

`include_relative_if_exists` includes a file if it exists and substitutes an HTML comment if not. Use this tag when it's always OK for an include not to exist.
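
The fallback behaviour described above can be sketched outside of Jekyll in plain Ruby. This is only an illustration under stated assumptions: `include_if_exists` is a hypothetical helper, not part of Jekyll or of this PR, and it rescues `Errno::ENOENT` from `File.read`, whereas the plugin in this PR rescues the `IOError` raised by Jekyll's `include_relative` tag.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Jekyll or this PR): return the file's
# contents if it exists, otherwise an HTML comment naming the missing file.
def include_if_exists(path)
  File.read(path)
rescue Errno::ENOENT
  # File.read signals a missing file with Errno::ENOENT; the real tag
  # rescues IOError instead, because that is what Jekyll raises.
  "<!-- missing include: #{File.basename(path)} -->"
end

Dir.mktmpdir do |dir|
  present = File.join(dir, 'present.html')
  File.write(present, '<table></table>')

  puts include_if_exists(present)                        # file contents
  puts include_if_exists(File.join(dir, 'absent.html'))  # HTML comment
end
```

Because the substitute is an HTML comment, a missing include leaves no visible trace in the rendered page, which suits includes that are genuinely optional.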

`include_api_gen` includes a file if it exists. If it doesn't, it tolerates the missing file only if one of the `SKIP_` flags is set. Otherwise it raises an error. Use this tag for includes that are generated for the language APIs. These files are required to generate complete documentation, but we tolerate their absence during development, i.e. when a skip flag is set.

`include_api_gen` will place a visible text placeholder in the document and post a warning to the console to indicate that missing API files are being tolerated.

```sh
$ SKIP_API=1 bundle exec jekyll build
Configuration file: /Users/nchammas/dev/nchammas/spark/docs/_config.yml
            Source: /Users/nchammas/dev/nchammas/spark/docs
       Destination: /Users/nchammas/dev/nchammas/spark/docs/_site
 Incremental build: disabled. Enable with --incremental
      Generating...
Warning: Tolerating missing API files because the following skip flags are set: SKIP_API
                    done in 1.703 seconds.
 Auto-regeneration: disabled. Use --watch to enable.
```
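
The decision `include_api_gen` makes for a missing file can be sketched as a plain Ruby function. This is a simplified stand-in rather than the plugin itself: `resolve_api_include` and its parameters are hypothetical, the environment is passed in as a hash so the logic can be exercised without touching `ENV`, and the once-only console warning is left out.

```ruby
SKIP_FLAGS = %w[SKIP_API SKIP_SCALADOC SKIP_PYTHONDOC SKIP_RDOC SKIP_SQLDOC].freeze

# Hypothetical stand-in for the tag's logic.
#   name     - file name of the include
#   contents - the file's contents, or nil if the file is missing
#   env      - hash of environment variables (e.g. ENV.to_h)
def resolve_api_include(name, contents, env)
  return contents unless contents.nil?

  # A flag counts as set if the variable is present at all.
  set_flags = SKIP_FLAGS.select { |flag| env[flag] }
  raise IOError, "missing API include: #{name}" if set_flags.empty?

  "placeholder for missing API include: `#{name}`"
end

puts resolve_api_include('generated-agg-funcs-table.html', nil, 'SKIP_API' => '1')
```

Note that in Ruby only `nil` and `false` are falsy, so a check of this shape treats `SKIP_API=0` the same as `SKIP_API=1`; the flag is "set" whenever the variable is present.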

This PR supersedes #44393.

### Why are the changes needed?

Jekyll does not have a succinct way to [check if a file exists][check], so the required directives to implement such functionality are very cumbersome.

We need the ability to do this so that we can [build the docs successfully with `SKIP_API=1`][build], since many includes reference files that are only generated when `SKIP_API` is _not_ set.

[check]: jekyll/jekyll#7528
[build]: #44627

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually building and reviewing the docs, both with and without `SKIP_API=1`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44630 from nchammas/SPARK-46437-conditional-jekyll-include.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
nchammas authored and HyukjinKwon committed Jan 10, 2024
1 parent d203328 commit 6679419
Showing 3 changed files with 96 additions and 141 deletions.
55 changes: 55 additions & 0 deletions docs/_plugins/conditonal_includes.rb
@@ -0,0 +1,55 @@
+module Jekyll
+  # Tag for including a file if it exists.
+  class IncludeRelativeIfExistsTag < Tags::IncludeRelativeTag
+    def render(context)
+      super
+    rescue IOError
+      "<!-- missing include: #{@file} -->"
+    end
+  end
+
+  # Tag for including files generated as part of the various language APIs.
+  # If a SKIP_ flag is set, tolerate missing files. If not, raise an error.
+  class IncludeApiGenTag < Tags::IncludeRelativeTag
+    @@displayed_warning = false
+
+    def render(context)
+      super
+    rescue IOError => e
+      skip_flags = [
+        'SKIP_API',
+        'SKIP_SCALADOC',
+        'SKIP_PYTHONDOC',
+        'SKIP_RDOC',
+        'SKIP_SQLDOC',
+      ]
+      set_flags = skip_flags.select { |flag| ENV[flag] }
+      # A more sophisticated approach would be to accept a tag parameter
+      # specifying the relevant API so we tolerate missing files only for
+      # APIs that are explicitly skipped. But this is unnecessary for now.
+      # Instead, we simply tolerate missing files if _any_ skip flag is set.
+      if set_flags.any? then
+        set_flags_string = set_flags.join(', ')
+        if !@@displayed_warning then
+          STDERR.puts "Warning: Tolerating missing API files because the " \
+            "following skip flags are set: #{set_flags_string}"
+          @@displayed_warning = true
+        end
+        # "skip flags set: `#{set_flags_string}`; " \
+        "placeholder for missing API include: `#{@file}`"
+      else
+        raise e
+      end
+    end
+  end
+end
+
+Liquid::Template.register_tag(
+  'include_relative_if_exists',
+  Jekyll::IncludeRelativeIfExistsTag,
+)
+
+Liquid::Template.register_tag(
+  'include_api_gen',
+  Jekyll::IncludeApiGenTag,
+)
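
The code comment in `IncludeApiGenTag` mentions a more sophisticated approach in which the tag would tolerate a missing file only when the relevant API is explicitly skipped. A rough sketch of that check follows. Everything here is hypothetical and not part of this commit: the `API_SKIP_FLAGS` mapping, the API names, the `tolerate_missing?` helper, and the assumption that `SKIP_API` covers every API.

```ruby
# Hypothetical mapping from an API name to the flag that skips it.
API_SKIP_FLAGS = {
  'scala'  => 'SKIP_SCALADOC',
  'python' => 'SKIP_PYTHONDOC',
  'r'      => 'SKIP_RDOC',
  'sql'    => 'SKIP_SQLDOC',
}.freeze

# Returns true if a missing include for `api` should be tolerated, given a
# hash of environment variables. Assumes SKIP_API skips every API.
# Raises KeyError for an unknown API name.
def tolerate_missing?(api, env)
  flags = ['SKIP_API', API_SKIP_FLAGS.fetch(api)]
  flags.any? { |flag| env[flag] }
end

puts tolerate_missing?('sql', 'SKIP_SQLDOC' => '1')
puts tolerate_missing?('sql', 'SKIP_RDOC' => '1')
```

With a check like this, a missing SQL include would still fail the build when only an unrelated flag such as `SKIP_RDOC` is set, at the cost of every `include_api_gen` call having to name its API.
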
180 changes: 40 additions & 140 deletions docs/sql-ref-functions-builtin.md
@@ -17,202 +17,102 @@ license: |
limitations under the License.
---

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-agg-funcs-table.html' %}
### Aggregate Functions
-{% include_relative generated-agg-funcs-table.html %}
+{% include_api_gen generated-agg-funcs-table.html %}
#### Examples
-{% include_relative generated-agg-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-agg-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-window-funcs-table.html' %}
### Window Functions
-{% include_relative generated-window-funcs-table.html %}
+{% include_api_gen generated-window-funcs-table.html %}
#### Examples
-{% include_relative generated-window-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-window-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-array-funcs-table.html' %}
### Array Functions
-{% include_relative generated-array-funcs-table.html %}
+{% include_api_gen generated-array-funcs-table.html %}
#### Examples
-{% include_relative generated-array-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-array-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-collection-funcs-table.html' %}
### Collection Functions
-{% include_relative generated-collection-funcs-table.html %}
+{% include_api_gen generated-collection-funcs-table.html %}
#### Examples
-{% include_relative generated-collection-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-collection-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-struct-funcs-table.html' %}
### STRUCT Functions
-{% include_relative generated-struct-funcs-table.html %}
+{% include_api_gen generated-struct-funcs-table.html %}
#### Examples
-{% include_relative generated-struct-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-struct-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-map-funcs-table.html' %}
### Map Functions
-{% include_relative generated-map-funcs-table.html %}
+{% include_api_gen generated-map-funcs-table.html %}
#### Examples
-{% include_relative generated-map-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-map-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-datetime-funcs-table.html' %}
### Date and Timestamp Functions
-{% include_relative generated-datetime-funcs-table.html %}
+{% include_api_gen generated-datetime-funcs-table.html %}
#### Examples
-{% include_relative generated-datetime-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-datetime-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-math-funcs-table.html' %}
### Mathematical Functions
-{% include_relative generated-math-funcs-table.html %}
+{% include_api_gen generated-math-funcs-table.html %}
#### Examples
-{% include_relative generated-math-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-math-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-string-funcs-table.html' %}
### String Functions
-{% include_relative generated-string-funcs-table.html %}
+{% include_api_gen generated-string-funcs-table.html %}
#### Examples
-{% include_relative generated-string-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-string-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-conditional-funcs-table.html' %}
### Conditional Functions
-{% include_relative generated-conditional-funcs-table.html %}
+{% include_api_gen generated-conditional-funcs-table.html %}
#### Examples
-{% include_relative generated-conditional-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-conditional-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-hash-funcs-table.html' %}
### Hash Functions
-{% include_relative generated-hash-funcs-table.html %}
+{% include_api_gen generated-hash-funcs-table.html %}
#### Examples
-{% include_relative generated-hash-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-hash-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-csv-funcs-table.html' %}
### CSV Functions
-{% include_relative generated-csv-funcs-table.html %}
+{% include_api_gen generated-csv-funcs-table.html %}
#### Examples
-{% include_relative generated-csv-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-csv-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-json-funcs-table.html' %}
### JSON Functions
-{% include_relative generated-json-funcs-table.html %}
+{% include_api_gen generated-json-funcs-table.html %}
#### Examples
-{% include_relative generated-json-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-json-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-xml-funcs-table.html' %}
### XML Functions
-{% include_relative generated-xml-funcs-table.html %}
+{% include_api_gen generated-xml-funcs-table.html %}
#### Examples
-{% include_relative generated-xml-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-xml-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-url-funcs-table.html' %}
### URL Functions
-{% include_relative generated-url-funcs-table.html %}
+{% include_api_gen generated-url-funcs-table.html %}
#### Examples
-{% include_relative generated-url-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-url-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-bitwise-funcs-table.html' %}
### Bitwise Functions
-{% include_relative generated-bitwise-funcs-table.html %}
+{% include_api_gen generated-bitwise-funcs-table.html %}
#### Examples
-{% include_relative generated-bitwise-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-bitwise-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-conversion-funcs-table.html' %}
### Conversion Functions
-{% include_relative generated-conversion-funcs-table.html %}
+{% include_api_gen generated-conversion-funcs-table.html %}
#### Examples
-{% include_relative generated-conversion-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-conversion-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-predicate-funcs-table.html' %}
### Predicate Functions
-{% include_relative generated-predicate-funcs-table.html %}
+{% include_api_gen generated-predicate-funcs-table.html %}
#### Examples
-{% include_relative generated-predicate-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-predicate-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-misc-funcs-table.html' %}
### Misc Functions
-{% include_relative generated-misc-funcs-table.html %}
+{% include_api_gen generated-misc-funcs-table.html %}
#### Examples
-{% include_relative generated-misc-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-misc-funcs-examples.html %}

-{% for static_file in site.static_files %}
-{% if static_file.name == 'generated-generator-funcs-table.html' %}
### Generator Functions
-{% include_relative generated-generator-funcs-table.html %}
+{% include_api_gen generated-generator-funcs-table.html %}
#### Examples
-{% include_relative generated-generator-funcs-examples.html %}
-{% break %}
-{% endif %}
-{% endfor %}
+{% include_api_gen generated-generator-funcs-examples.html %}
2 changes: 1 addition & 1 deletion docs/sql-ref-functions.md
@@ -20,7 +20,7 @@ license: |
---

Spark SQL provides two function features to meet a wide range of user needs: built-in functions and user-defined functions (UDFs).
-Built-in functions are commonly used routines that Spark SQL predefines and a complete list of the functions can be found in the [Built-in Functions](api/sql/) API document. UDFs allow users to define their own functions when the system’s built-in functions are not enough to perform the desired task.
+Built-in functions are commonly used routines that Spark SQL predefines and a complete list of the functions can be found in the [Built-in Functions](api/sql/index.html) API document. UDFs allow users to define their own functions when the system’s built-in functions are not enough to perform the desired task.

### Built-in Functions
