Skip to content
This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

ci.ocp: Improve the service-up detection #5807

Merged
merged 2 commits into from
Feb 22, 2024

Conversation

ldoktor
Copy link
Contributor

@ldoktor ldoktor commented Feb 7, 2024

waiting for the first response is not sufficient as OCP returns html page without error even when the route is not yet established describing the issue (why it doesn't reply with 500?). Waiting for the correct output should do better.

@katacontainersbot katacontainersbot added the size/small Small and simple task label Feb 7, 2024
@ldoktor
Copy link
Contributor Author

ldoktor commented Feb 7, 2024

I tested (similar) version here: openshift/release#48473 seems to work well.

@ldoktor ldoktor force-pushed the ocp-curl-fix branch 3 times, most recently from 3cfe330 to 6669db0 Compare February 7, 2024 12:51
@ldoktor
Copy link
Contributor Author

ldoktor commented Feb 15, 2024

@wainersm @gkurz could you please review this? It significantly decreases the stability of the CI pipeline.

@stevenhorsman
Copy link
Member

Hi Lukas - is there a plan to move this openshift testing into the kata-containers repo as the tests repo is deprecated?

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll need an issue linked in the commit body via Fixes: to satisfy the automation

.ci/openshift-ci/run_smoke_test.sh Outdated Show resolved Hide resolved
.ci/openshift-ci/run_smoke_test.sh Show resolved Hide resolved
.ci/openshift-ci/run_smoke_test.sh Outdated Show resolved Hide resolved
test_status=$?
if [ $test_status -eq 0 ]; then
if waitForProcess 60 1 "curl '${host}:${port}${hello_file}' 2>/dev/null | grep -q '$hello_msg'"; then
test_status=0
info "HTTP server is working"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave out "everything is working" messages from the log if possible.
You're going to check logs when things do not work and the "working" logs get in the way.
If you can't tell where things break in the code, you're typically missing an error test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this message was already there (and is quite useful when looking for things that do not work, at least you know this thing worked...). Anyway I'd rather move a discussion about this message to another PR, if needed.

waiting for the first response is not sufficient as OCP returns html
page without error even when the route is not yet established describing
the issue (why it doesn't reply with 500?). Waiting for the correct
output should do better.

Fixes: kata-containers#5808

Signed-off-by: Lukáš Doktor <[email protected]>
these logs are vital to analyze a setup failure.

Signed-off-by: Lukáš Doktor <[email protected]>
@ldoktor
Copy link
Contributor Author

ldoktor commented Feb 21, 2024

Changes:

  • Using "check_cmd" variable to emphasize the waitForProcess command
  • Split into 2 commits (fix and extra debug output)
  • Capture output of all curl attempts and log them in case of a failure
  • Added "Fixes" to the first commit

Not changed:

  • Kept the "HTTP server is working" message as it was there before this PR

@tbuskey
Copy link

tbuskey commented Feb 21, 2024

/lgtm

Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @ldoktor !

@gkurz
Copy link
Member

gkurz commented Feb 22, 2024

/test

@gkurz
Copy link
Member

gkurz commented Feb 22, 2024

Hi Lukas - is there a plan to move this openshift testing into the kata-containers repo as the tests repo is deprecated?

@stevenhorsman IIUC Lukas is using the test repo as a guinea pig. When everything works, the openshift testing will go the the kata-containers repo like the rest.

@ldoktor
Copy link
Contributor Author

ldoktor commented Feb 22, 2024

Hi Lukas - is there a plan to move this openshift testing into the kata-containers repo as the tests repo is deprecated?

@stevenhorsman IIUC Lukas is using the test repo as a guinea pig. When everything works, the openshift testing will go the the kata-containers repo like the rest.

Most of the work is done, now I'm waiting for the 4.13 pipeline issue to be resolved so the openshift/release#48435 can be merged (we actually think about merging it with the failure, depends on the reviewers). Anyway this patchset will have to be backported to kata-containers, I'll take care of it once it's ready. Now it's important to ensure the pipeline works again.

Copy link
Contributor

@wainersm wainersm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thanks @ldoktor !

@wainersm wainersm merged commit 8589ea7 into kata-containers:main Feb 22, 2024
7 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
size/small Small and simple task
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants