WORKDIR learned to cache its potential output layer #3341
base: main
pkg/commands/workdir.go
Outdated
@@ -81,14 +84,116 @@ func (w *WorkdirCommand) ExecuteCommand(config *v1.Config, buildArgs *dockerfile

// FilesToSnapshot returns the workingdir, which should have been created if it didn't already exist
func (w *WorkdirCommand) FilesToSnapshot() []string {
	return w.snapshotFiles
	return nil
WORKDIR /, for example, would result in an empty list, which was not cacheable in my trials. Because it is not cacheable, it causes a cache miss every time, requiring the entire file system to be unrolled. This is copied from the RUN command, as a RUN command can also result in no files being created, e.g. RUN echo "Hello World". Somehow, if I return nil instead of an empty list, the caching mechanism handles it gracefully; I'm currently investigating why.
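The difference between returning nil and returning an empty list is observable in Go, which may be what the caching path reacts to. A minimal sketch follows; describeSnapshot is a hypothetical helper, not kaniko code, and whether kaniko actually branches on nil is an assumption here.

```go
package main

import "fmt"

// describeSnapshot is a hypothetical helper (not part of kaniko). It shows
// that a caller can tell a nil slice apart from an empty, non-nil slice,
// even though both have length zero.
func describeSnapshot(files []string) string {
	switch {
	case files == nil:
		return "nil"
	case len(files) == 0:
		return "empty"
	default:
		return "non-empty"
	}
}

func main() {
	var unset []string // nil slice: no list was provided at all
	fmt.Println(describeSnapshot(unset), len(unset))
	fmt.Println(describeSnapshot([]string{}), len([]string{}))
	fmt.Println(describeSnapshot([]string{"/app"}))
}
```

Any code that tests `files == nil` rather than `len(files) == 0` would therefore treat the two return values differently.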
I was able to resolve this by allowing the cache to contain empty images, just to indicate that the command did not change any files and we are aware of this. With this modification we are able to report back an empty list and no longer need to use the FS snapshot function from RUN command.
func (w *WorkdirCommand) MetadataOnly() bool {
	return false
}

func (r *WorkdirCommand) RequiresUnpackedFS() bool {
	return true
WORKDIR required me to unpack the filesystem, as otherwise the user is not known:
error building image: error building stage: failed to execute command: identifying uid and gid for user app: user app is not a uid and does not exist on the system
}

func (w *WorkdirCommand) ShouldCacheOutput() bool {
	return w.shdCache
I was wondering whether we could optimize this instruction to not cache the output if no files were created. I'm not sure how to tell, from the cache consumer's side, whether a cache entry is missing by accident or absent on purpose.
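One way to picture the distinction between "missing" and "present but empty" is a lookup that reports existence separately from content. This is only a sketch: layerCache and its keys are hypothetical, in-memory stand-ins and do not reflect how kaniko stores cached layers.

```go
package main

import "fmt"

// layerCache is a hypothetical in-memory stand-in for a layer cache. It maps
// a cache key to the list of files contained in the cached layer.
type layerCache map[string][]string

// lookup returns the cached file list and whether an entry exists at all.
// An existing entry with zero files means "the command ran and changed nothing".
func (c layerCache) lookup(key string) ([]string, bool) {
	files, ok := c[key]
	return files, ok
}

func main() {
	cache := layerCache{
		"workdir-root": {},       // stored on purpose: no files were created
		"workdir-app":  {"/app"}, // the directory was created
	}
	for _, key := range []string{"workdir-root", "workdir-app", "workdir-new"} {
		files, ok := cache.lookup(key)
		fmt.Printf("%s: hit=%t files=%d\n", key, ok, len(files))
	}
}
```

Storing an explicit empty entry keeps the consumer side simple: a hit with no files is still a hit, and only a truly absent key counts as a miss.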
func (wr *CachingWorkdirCommand) ExecuteCommand(config *v1.Config, buildArgs *dockerfile.BuildArgs) error {
	var err error
	logrus.Info("Cmd: workdir")
I copied that entire block of code here. It ensures that even if we hit the cache, we still perform the metadata operation and actually change the workdir. I could of course put that in a function, but I would prefer to pass the resolved directory into the cache, so that I can have a single line here: reusing not only the functionality, but also the result.
cfg.WorkingDir = wr.resolvedWorkingDir
Passing the result is not possible, as either the regular or the cached ExecuteCommand function is called, never both. So I opted for sharing the functionality: instead of extracting a nameless subroutine that fits the entire bill, I extracted the more meaningful ToAbsPath function. I think the remaining few lines are fine to duplicate. The remaining calls have too many dependencies to be neatly tucked away in a function.
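For illustration, a helper of this kind could look roughly like the following. This is a sketch only: toAbsPath here is a hypothetical stand-in whose signature is assumed, not the function from the PR, and it skips the environment-variable expansion a real WORKDIR implementation needs.

```go
package main

import (
	"fmt"
	"path"
)

// toAbsPath is a hypothetical stand-in for the ToAbsPath helper mentioned
// above. It resolves a WORKDIR argument against the current working
// directory, using slash-separated container paths.
func toAbsPath(current, target string) string {
	if path.IsAbs(target) {
		return path.Clean(target)
	}
	if current == "" {
		// No working directory set yet: resolve relative to the root.
		current = "/"
	}
	return path.Join(current, target)
}

func main() {
	fmt.Println(toAbsPath("/app", "src"))
	fmt.Println(toAbsPath("/app", "/opt/"))
	fmt.Println(toAbsPath("", "app"))
}
```

Both the regular and the cached ExecuteCommand could call such a helper and then assign the result to the config's working directory, which keeps the duplicated part down to a few lines.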
Hmmm... passing the result would be possible, but requires some creative thinking. We do store the layer as an image when we push to the cache, so we could change the WORKDIR on that image and thereby store that variable too. Come to think of it, labels on the cached image could be used to pass arbitrary data between the cache creation stage and the reuse stage.
But this would be a bigger redefinition of what the cache does and how it is used, which is not suitable for a small PR like this. Still interesting, though.
		extractFn: util.ExtractFile,
	}
}

func (w *WorkdirCommand) MetadataOnly() bool {
	return false
Sometimes calling WORKDIR is metadata-only. There might be some optimization potential here, depending on whether we can know this a priori. For example, calling WORKDIR / without any further context is guaranteed to be metadata-only in all images, to my understanding.
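A check for that one a priori case could be sketched as below. isRootWorkdir is a hypothetical name, and the sketch only covers the root directory; whether other directories exist in the base image cannot be decided without inspecting the filesystem.

```go
package main

import (
	"fmt"
	"path"
)

// isRootWorkdir is a hypothetical helper. It reports whether a WORKDIR
// argument, resolved against the current working directory, is the root
// directory, which always exists and therefore never needs to be created.
func isRootWorkdir(current, target string) bool {
	if !path.IsAbs(target) {
		target = path.Join("/", current, target)
	}
	return path.Clean(target) == "/"
}

func main() {
	fmt.Println(isRootWorkdir("", "/"))
	fmt.Println(isRootWorkdir("/app", ".."))
	fmt.Println(isRootWorkdir("/", "app"))
}
```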
@@ -78,7 +78,7 @@ func GetCommand(cmd instructions.Command, fileContext util.FileContext, useNewRu
	case *instructions.EnvCommand:
		return &EnvCommand{cmd: c}, nil
	case *instructions.WorkdirCommand:
		return &WorkdirCommand{cmd: c}, nil
		return &WorkdirCommand{cmd: c, shdCache: cacheRun}, nil
I piggy-backed on the flag for RUN instructions, as I don't think it's sensible to have a separate flag for each instruction.
Caching of COPY layers probably got its own flag because, depending on the context, the files might be huge and defeat the purpose of a cache, which is to speed things up by downloading a layer instead of executing commands. This should not be a problem in our simple case here, so piggy-backing should be fine.
Fixes #3340
Description
When WORKDIR is called on a non-existent directory, kaniko is kind enough to create that directory for you, resulting in a layer being added. However, kaniko does not cache that layer, which means that on every invocation a completely new image is emitted from that point onwards. Inside the same stage this is non-obvious, as the caching mechanism still pulls, so you get a 100% cache hit rate thereafter, but the image is completely new. In multistage builds, or builds that depend on the newly emitted image, this is catastrophic: they consider the entire image's SHA when determining whether the cache is hit, so this invalidates the entire cache.
So far the workaround was to ensure that the directory exists before calling WORKDIR to avoid creating it implicitly, as RUN statements can be cached.
With this change the layer potentially created by WORKDIR is cached too, in a similar vein to how RUN statements are cached.
There is some optimization potential left on the table here, as we sometimes know a priori whether a layer should be created at all, and we always know which directory. Currently I copied the code from RUN to make it work, but this is suboptimal, as that code assumes no a priori knowledge. I'm open to suggestions.
Submitter Checklist
These are the criteria that every PR should meet, please check them off as you
review them:
See the contribution guide for more details.
Reviewer Notes
Release Notes
WORKDIR