A GitHub Action to collect and check URLs in a project (code and documentation). The action aims to detect and report broken links.
A set of examples is included in the examples folder, and a few detailed examples follow. Note that the examples always reference the master branch; you should change them to reference a release.
For most use cases, you will want to check the git repository that the GitHub Action runs in, and we do this by way of the actions/checkout action.
```yaml
name: Check URLs

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: urls-checker
        uses: urlstechie/urlchecker-action@master
        with:
          # A subfolder or path to navigate to in the present or cloned repository
          subfolder: docs

          # A comma-separated list of file types to cover in the URL checks
          file_types: .md,.py,.rst

          # Choose whether to include files with no URLs in the printed output
          print_all: false

          # The timeout in seconds to provide to requests (defaults to 5 seconds)
          timeout: 5

          # How many times to retry a failed request (each is logged, defaults to 1)
          retry_count: 3

          # A comma-separated list of links to exclude during URL checks
          exclude_urls: https://github.com/SuperKogito/URLs-checker/issues/1,https://github.com/SuperKogito/URLs-checker/issues/2

          # A comma-separated list of patterns to exclude during URL checks
          exclude_patterns: https://github.com/SuperKogito/Voice-based-gender-recognition/issues

          # Choose whether to force a pass or not
          force_pass: true
```
Note that as of version 0.2.2, references to `white_listed_*` have been changed to `exclude_*` to be consistent with the `include_*` variables.
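Because input names can change between releases, as they did at version 0.2.2, pinning the action to a release rather than master keeps a workflow from changing underneath you. The following is a minimal sketch of a pinned workflow that uses the `exclude_*` names. The tag `0.2.2` is an assumption taken from the version mentioned in the note; check the repository's releases for a tag that actually exists and substitute it.

```yaml
name: Check URLs

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: urls-checker
        # Assumption: a release tag named 0.2.2 exists; replace with a real tag.
        uses: urlstechie/urlchecker-action@0.2.2
        with:
          file_types: .md,.py,.rst
          # From 0.2.2 onward, use exclude_* rather than white_listed_*
          exclude_urls: https://github.com/SuperKogito/URLs-checker/issues/1
          exclude_patterns: https://github.com/SuperKogito/Voice-based-gender-recognition/issues
```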
It could, however, be the case that you've set up a repository with one or more uses of the URLChecker that must clone one or more repos (possibly with varying branches) before doing the check. In this case, you might want to define `git_path` and `branch` for each section. An example is below:
```yaml
name: Check URLs

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: URLs-checker
        uses: urlstechie/urlchecker-action@master
        with:
          # A project to clone. If not provided, assumes already cloned in the present working directory.
          git_path: https://github.com/urlstechie/URLs-checker-test-repo

          # If a git_path is defined to clone, clone this branch (defaults to master)
          branch: devel

          # A subfolder or path to navigate to in the present or cloned repository
          subfolder: docs

          # Delete the cloned repository after running URLchecker (default is false)
          cleanup: true

          # A comma-separated list of file types to cover in the URL checks
          file_types: .md,.py,.rst

          # Choose whether to include files with no URLs in the printed output
          print_all: false

          # Choose whether to print a more verbose end summary with files and broken URLs
          verbose: true

          # The timeout in seconds to provide to requests (defaults to 5 seconds)
          timeout: 5

          # How many times to retry a failed request (each is logged, defaults to 1)
          retry_count: 3

          # A comma-separated list of links to exclude during URL checks
          exclude_urls: https://github.com/SuperKogito/URLs-checker/issues/1,https://github.com/SuperKogito/URLs-checker/issues/2

          # A comma-separated list of patterns to exclude during URL checks
          exclude_patterns: https://github.com/SuperKogito/Voice-based-gender-recognition/issues

          # A comma-separated list of file patterns (direct paths work as well) to exclude
          exclude_files: README.md,/github/workspace/_config.yml

          # Number of workers to run in parallel (defaults to 9 if unset)
          workers: 4

          # Choose whether to force a pass or not
          force_pass: true
```
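To illustrate defining `git_path` and `branch` "for each section", here is a sketch of one job with two checker steps, each cloning a different branch. It reuses the test repository from the example above; the step names are made up for illustration, and setting `cleanup: true` on each step is an assumption meant to keep the two clones from colliding, not documented required behavior.

```yaml
name: Check URLs

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # First section: clone and check the devel branch
      - name: check-devel
        uses: urlstechie/urlchecker-action@master
        with:
          git_path: https://github.com/urlstechie/URLs-checker-test-repo
          branch: devel
          file_types: .md,.py,.rst
          # Assumption: remove the clone so the next step starts clean
          cleanup: true

      # Second section: clone and check the master branch (the default)
      - name: check-master
        uses: urlstechie/urlchecker-action@master
        with:
          git_path: https://github.com/urlstechie/URLs-checker-test-repo
          branch: master
          file_types: .md,.py,.rst
          cleanup: true
```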
variable name | variable type | variable description |
---|---|---|
git_path | optional | A git url to clone, if the repository isn't already in $PWD |
branch | optional | If we do a clone, clone this branch (defaults to master) |
cleanup | optional | If we do a clone, delete the cloned folder afterwards (defaults to false) |
serial | optional | Run in serial (good for debugging) |
subfolder | optional | A subfolder to navigate to in the repository to check |
file_types | optional | A comma-separated list of file types to cover in the URL checks |
include_files | optional | A comma-separated list of exact files to check |
print_all | optional | Choose whether to include files with no URLs in the printed output |
verbose | optional | Choose whether to print a more verbose end summary with files and broken URLs |
retry_count | optional | If a request fails, retry this number of times (defaults to 1) |
save | optional | A path to a csv file to save results to |
timeout | optional | The timeout in seconds to provide to requests when waiting for a response (defaults to 5) |
exclude_urls | optional | A comma-separated list of links to exclude |
exclude_patterns | optional | A comma-separated list of patterns to exclude |
exclude_files | optional | A comma-separated list of file patterns or full paths to exclude |
force_pass | optional | Choose whether to force a pass when checks are done |
workers | optional | The number of checks (one per file) to run in parallel (defaults to 9) |
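The inputs `include_files`, `serial`, and `save` appear in the table but not in the examples above. The sketch below shows how they might be combined for a debugging run. The file paths and the csv file name are hypothetical, and passing `true` for `serial` is an assumption based on how the other on/off inputs are written.

```yaml
name: Check URLs

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: urls-checker
        uses: urlstechie/urlchecker-action@master
        with:
          # Check only these exact files (hypothetical paths)
          include_files: README.md,docs/index.md

          # Run checks one at a time, which is easier to follow when debugging
          # (assumption: a boolean value, like print_all and force_pass)
          serial: true

          # Save results to a csv file (hypothetical file name)
          save: urlchecker-results.csv
```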
Hidden File Types
If you need to specify a `file_types` pattern to include hidden files, you'll need a `.*` pattern that is provided in the context of a comma-separated list, for example:

```yaml
file_types: '.*,'
```
If there is another pattern or variable specification that you'd like to see an example of here, please open an issue.
- Using version > 0.1.4
- Using version <= 0.1.4
Do you have a question or an issue? Please open an issue and we can help!

The following communities are using the URL checker! You can look here for examples or inspiration. If you want to add your community, please let us know with an issue.
Repository | Workflow (with permalink to YAML) | Example runs |
---|---|---|
awesome-rseng | Check URLs in PRs, exclude docs | Logs |
buildtest | Check URLs in all commits | Logs |
e4s | Check URLs in PR, exclude some URLs | Logs |
us-rse | Check URLs in PRs, exclude some URL patterns | Logs |
R-hub docs | Check URLs when on PR labelling | Logs |
Berlin Hack & Tell | Check URLs when on PR labelling | Logs |