Adding tests for Liveness probes #12497

Closed
wants to merge 5 commits into from


@Shashankft9 (Member) commented Jan 12, 2022

Fixes #12480

Proposed Changes

  • changing the readiness code in test_images to accommodate a liveness handler
  • renaming a few things in the readiness tests
  • adding tests for the liveness probe

@knative-prow-robot added the do-not-merge/work-in-progress, size/L, and area/test-and-release labels Jan 12, 2022

codecov bot commented Jan 12, 2022

Codecov Report

Merging #12497 (44c79bd) into main (39af716) will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main   #12497      +/-   ##
==========================================
+ Coverage   87.46%   87.48%   +0.01%     
==========================================
  Files         195      195              
  Lines        9671     9718      +47     
==========================================
+ Hits         8459     8502      +43     
- Misses        928      931       +3     
- Partials      284      285       +1     
Impacted Files Coverage Δ
pkg/activator/net/revision_backends.go 92.60% <0.00%> (-0.87%) ⬇️
pkg/reconciler/revision/resources/queue.go 98.23% <0.00%> (-0.02%) ⬇️
cmd/queue/main.go 0.53% <0.00%> (ø)
pkg/apis/serving/fieldmask.go 95.13% <0.00%> (+0.06%) ⬆️
pkg/apis/serving/k8s_validation.go 93.61% <0.00%> (+0.14%) ⬆️
pkg/apis/config/features.go 95.83% <0.00%> (+0.37%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 39af716...44c79bd.

@Shashankft9 changed the title from "[WIP] Adding tests for Liveness probes" to "Adding tests for Liveness probes" Jan 17, 2022
@knative-prow-robot removed the do-not-merge/work-in-progress label Jan 17, 2022
@Shashankft9
Member Author

cc @julz @nader-ziada
I have only added tests that use HTTP GET probes. Let me know if other probe types need coverage as well. Any nits, improvements, or suggestions are appreciated!

@nader-ziada
Member

Looks good to me. @psschwei, can you take a look as well?

@nader-ziada
Member

/cc @psschwei @dprotaso

@psschwei (Contributor) left a comment

/lgtm

@knative-prow-robot added the lgtm label Jan 20, 2022

// If true, sleep until the first kubelet probe check.
if tc.sleep {
	time.Sleep(15 * time.Second)
}
Member

I think this is trying to surface the bug we hit in #12462 by making sure the pod stays up for at least 15 seconds even when a liveness probe is configured, but I wouldn't have guessed that without remembering that issue. I wonder if there's a way to do this that (a) is clearer about what it's testing, i.e. that a pod with a liveness probe is accessible while the probe succeeds (and the probe actually runs) and is then restarted if the probe fails, and (b) ideally avoids a hardcoded sleep.

What about if the test image counted how many times its liveness handler had been invoked and returned that number in the response? Then we could WaitForEndpointState until the liveness probe had been executed at least N times (avoiding the sleep). We could then - maybe as a follow up - have the test POST to an endpoint on the test image that would cause the liveness check to fail, and assert that the container is properly restarted (this is similar to how we test readiness probes).

Member Author

makes sense - will do these changes, thanks!

Member Author

@Shashankft9 Jan 31, 2022

@julz just to clarify what I am seeing: when the liveness probe fails and the kubelet notices it, the kubelet restarts the containers, but the readiness probe then fails, so the liveness probe check also hangs forever. I am not sure yet why the readiness probe is failing and am looking into it. (This is without any user-provided readiness probe.)

Member Author

This might just be a case of me adding a bad livenessProbe or readinessProbe and running into the usual probe pitfalls, but I have tried to capture more of it here: #12571

@Shashankft9
Member Author

/hold

@knative-prow-robot added the do-not-merge/hold label Jan 31, 2022
@knative-prow-robot removed the lgtm label Jan 31, 2022
@knative-prow-robot
Contributor

New changes are detected. LGTM label has been removed.

@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Shashankft9
To complete the pull request process, please ask for approval from psschwei after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dprotaso
Member

/retest

@dprotaso
Member

@Shashankft9 you still have a hold on the PR - is it ready for another review?

@Shashankft9
Member Author

@dprotaso this is blocked by a couple of issues:

  • #12571: I am trying to deliberately fail the liveness probe and then check that the container comes back up, but it never does. The test code for this is currently commented out here: https://github.com/knative/serving/pull/12497/files#diff-6ad46d4cc76b719ef7ce89ac3997f8d85b7b0afe616f3071d7906406d7a139a0R119
  • knative/pkg#2407

@knative-prow-robot
Contributor

@Shashankft9: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot added the needs-rebase label Mar 8, 2022

github-actions bot commented Sep 1, 2022

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@github-actions bot added the lifecycle/stale label Sep 1, 2022
@github-actions bot closed this Oct 1, 2022
Labels
area/test-and-release, do-not-merge/hold, lifecycle/stale, needs-rebase, size/L
Development

Successfully merging this pull request may close these issues.

No test coverage for liveness probes
6 participants