Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: PR cache fallback to master/main branch's cache #573

Merged
merged 3 commits into from
Aug 20, 2024

Conversation

fredrikaverpil
Copy link
Member

@fredrikaverpil fredrikaverpil commented Aug 16, 2024

Make sure a PR always gets a cache hit

(unless there is no cache yet, or course)

Why this change?

In #571 we disabled actions/go-setup, and delegated all caching to actions/cache because we can reliably pass it a cache key.

However, it turns out that there are specific constraints/requirements around caches and branches:
https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

Right now, the first commit of a PR doesn't use the cache that was created by another PR. But:

Workflow runs can restore caches created in either the current branch or the default branch (usually main).

This means we could make sure to have a "master cache" that the PR falls back onto (using a restore-keys entry).

What was changed?

First try (commit "fix: cache from branch, fallback to target branch")

  • Added an if/elif/else dance, which generated the cache key and restore-keys values.

Second try (commit "refactor: simplify cache keys")

  • Feedback was it looked complex. I simplified a bit so now the key and restore-keys values are just populated inline. A caveat of this approach is that each branch will produce a new cache. Risk for disk space blowing up and that we exceed the included disk space shown in https://github.com/organizations/einride/settings/billing (I don't have access there).

Also...

  • Remove excessive fallback restore_keys values. This is rather unconventional but after discussions we decided we either wanted a 100% match from Go version + go.sum to get a cache hit. If not, we would rather see no cache hit.
  • Bonus: add ./sage/build folder to cache (forgot this in earlier PR).
  • Bonus: add issue link to why we disabled the actions/go-setup cache in fix: address unreliable go dependency caching #571.

Notes

Try it out without merging this PR

We should try this in some repo by setting this temporarily:

    steps:
      - name: Setup Sage
-        uses: einride/sage/actions/setup@master
+        uses: einride/sage/actions/setup@dynamic-cache-keys

Do not forget to trigger cache creation in master

...to build the cache in the master branch.

on:
  push:
    branches:
      - master
  pull_request:

or something like:

on: push

Further improvement

When the first commit of a PR is pushed, the cache from master will be used. But at the end of the run, a new cache (for the PR) is written which takes around 1 minute in our larger repos with ~1.7 GB of cache.

To shave off this minute in GHA and avoid exploding disk spage usage, we can opt in to only writing cache from master and only read cache from the PR.

jobs:
  job_name:
    steps:
      - name: Restore cache
        if: ${{ inputs.disableCache != 'true' }}
        uses: actions/cache/restore@v4
        with:
          path: |
            ./.sage/build
            ./.sage/tools
            ./.sage/bin
            /home/runner/.cache/go-build
            /home/runner/go/pkg/mod
            /home/runner/go/bin
          key: ${{ runner.os }}-${{ github.ref_name }}-${{ github.workflow }}-${{ github.job }}-${{ inputs.cacheKey }}-${{ inputs.go-version }}-${{ hashFiles('**/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-${{ github.base_ref }}-${{ github.workflow }}-${{ github.job }}-${{ inputs.cacheKey }}-${{ inputs.go-version }}-${{ hashFiles('**/go.sum') }}

      # Other steps in the job would go here, like running 'make'

      - name: Save cache
        if: ${{ inputs.disableCache != 'true' && (github.ref_name == 'master' || github.ref_name == 'main') }}
        uses: actions/cache/save@v4
        with:
          path: |
            ./.sage/build
            ./.sage/tools
            ./.sage/bin
            /home/runner/.cache/go-build
            /home/runner/go/pkg/mod
            /home/runner/go/bin
          key: ${{ runner.os }}-${{ github.ref_name }}-${{ github.workflow }}-${{ github.job }}-${{ inputs.cacheKey }}-${{ inputs.go-version }}-${{ hashFiles('**/go.sum') }}

See here:

However, this requires breaking up this workflow into two workflows. Or create a post job, which I haven't done before, but seems possible although mighty quirky.

Either way, any of those methods looks like non-trivial changes to me, and I believe this PR (as it stands right now) balances the low-effort change with major impact quite nicely. The only question I would like to discuss is this.

We could add the following, which would also further lower the amount of disk space used by PR cache: https://github.com/actions/cache/blob/main/tips-and-workarounds.md#force-deletion-of-caches-overriding-default-cache-eviction-policy

@fredrikaverpil fredrikaverpil self-assigned this Aug 16, 2024
@fredrikaverpil

This comment was marked as outdated.

@fredrikaverpil fredrikaverpil force-pushed the dynamic-cache-keys branch 3 times, most recently from 1ba5a26 to 68d6390 Compare August 16, 2024 13:42
@fredrikaverpil fredrikaverpil deleted the dynamic-cache-keys branch August 16, 2024 14:00
@fredrikaverpil fredrikaverpil restored the dynamic-cache-keys branch August 16, 2024 14:00
@fredrikaverpil fredrikaverpil force-pushed the dynamic-cache-keys branch 2 times, most recently from b766d55 to 9cf1f62 Compare August 16, 2024 14:05
@fredrikaverpil fredrikaverpil changed the title fix: use dynamic cache keys fix: add branch name / target branch name to cache key Aug 16, 2024
@fredrikaverpil fredrikaverpil marked this pull request as ready for review August 16, 2024 14:14
@fredrikaverpil fredrikaverpil requested a review from a team as a code owner August 16, 2024 14:14
@fredrikaverpil fredrikaverpil marked this pull request as draft August 16, 2024 14:15
@fredrikaverpil fredrikaverpil changed the title fix: add branch name / target branch name to cache key fix: make cache fall back to default branch (if one doesn't exist for the PR) Aug 16, 2024
@fredrikaverpil fredrikaverpil changed the title fix: make cache fall back to default branch (if one doesn't exist for the PR) fix: PR cache fallback to main branch's cache Aug 16, 2024
@fredrikaverpil fredrikaverpil changed the title fix: PR cache fallback to main branch's cache fix: PR cache fallback to master/main branch's cache Aug 16, 2024
@fredrikaverpil fredrikaverpil marked this pull request as ready for review August 19, 2024 10:30
@fredrikaverpil fredrikaverpil force-pushed the dynamic-cache-keys branch 2 times, most recently from f3eacc4 to 80b7393 Compare August 19, 2024 14:16
@fredrikaverpil fredrikaverpil merged commit 6674fd1 into master Aug 20, 2024
3 checks passed
@fredrikaverpil fredrikaverpil deleted the dynamic-cache-keys branch August 20, 2024 10:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants