Upgrade resty v1 to v2. #424
Conversation
…v2". Signed-off-by: Abouzar Kamaee <[email protected]>
Force-pushed from cbeccc4 to 2097007
I noticed the resty upgrade needs a refactor in the code so we can keep using httpmock in our tests. I will add a new commit to resolve the problem in our tests.
…nsport so HTTP Mock can intercept requests. Signed-off-by: Abouzar Kamaee <[email protected]>
Force-pushed from 219444c to 7b3a86e
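For context, here is a minimal sketch (not the PR's actual test code; the test name, package, and endpoint are illustrative) of how building the resty/v2 client on http.DefaultTransport, as in the transport commit above, lets the jarcoal/httpmock library intercept requests:

package drainer_test

import (
	"net/http"
	"testing"

	"github.com/go-resty/resty/v2"
	"github.com/jarcoal/httpmock"
)

func TestRequestsAreInterceptedByHTTPMock(t *testing.T) {
	// Activate replaces http.DefaultTransport with httpmock's mock transport.
	httpmock.Activate()
	defer httpmock.DeactivateAndReset()

	// Register a fake Elasticsearch endpoint.
	httpmock.RegisterResponder("GET", "http://elasticsearch:9200/_cluster/health",
		httpmock.NewStringResponder(200, `{"status":"green"}`))

	// Because the client is built on top of http.DefaultTransport, every
	// request it sends now goes through the mock transport.
	client := resty.NewWithClient(&http.Client{Transport: http.DefaultTransport})

	resp, err := client.R().Get("http://elasticsearch:9200/_cluster/health")
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if resp.StatusCode() != 200 {
		t.Fatalf("expected status 200, got %d", resp.StatusCode())
	}
}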
Explanation for Commit
…tion should be called 6 times in total, while in resty/v1 it meant the function should have been called 5 times. Signed-off-by: Abouzar Kamaee <[email protected]>
Force-pushed from 709a57d to bbf2ead
Signed-off-by: Abouzar Kamaee <[email protected]>
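To illustrate the semantics change described in the commit above: in resty/v2, SetRetryCount(5) means 5 retries on top of the initial attempt, i.e. 6 calls in total, whereas in resty/v1 it meant 5 calls in total. A minimal sketch of a test for this, assuming jarcoal/httpmock is used to count attempts (names and endpoint are illustrative):

package drainer_test

import (
	"net/http"
	"testing"
	"time"

	"github.com/go-resty/resty/v2"
	"github.com/jarcoal/httpmock"
)

func TestRetryCountExcludesInitialAttempt(t *testing.T) {
	httpmock.Activate()
	defer httpmock.DeactivateAndReset()

	calls := 0
	httpmock.RegisterResponder("GET", "http://elasticsearch:9200/_nodes",
		func(req *http.Request) (*http.Response, error) {
			calls++
			// Always fail so every attempt triggers the retry condition.
			return httpmock.NewStringResponse(http.StatusServiceUnavailable, "busy"), nil
		})

	client := resty.NewWithClient(&http.Client{Transport: http.DefaultTransport}).
		SetRetryCount(5).
		SetRetryWaitTime(time.Millisecond). // keep the test fast
		SetRetryMaxWaitTime(time.Millisecond).
		AddRetryCondition(func(r *resty.Response, err error) bool {
			return err != nil || r.StatusCode() == http.StatusServiceUnavailable
		})

	_, _ = client.R().Get("http://elasticsearch:9200/_nodes")

	// resty/v2: 1 initial attempt + 5 retries = 6 calls.
	if calls != 6 {
		t.Fatalf("expected 6 calls, got %d", calls)
	}
}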
// Counter to track the number of retries
retryCount := 0

_, err := resty.NewWithClient(&http.Client{Transport: http.DefaultTransport}).
	SetRetryCount(c.DrainingConfig.MaxRetries).
The default max retries is 999, which is a bit too much. Let's reduce it to a more fail-fast strategy to avoid an elongated loop of retries. Even if the current loop fails, the operator should ideally pick up where it left off.
It is configurable, right? We can lower this for our own deployment.
👍
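For reference, a minimal sketch of the fail-fast setup being discussed, assuming a simplified DrainingConfig; only MaxRetries appears in the reviewed code, the other fields and the helper name are illustrative:

package operator

import (
	"net/http"
	"time"

	"github.com/go-resty/resty/v2"
)

// DrainingConfig is a simplified sketch of the draining settings.
type DrainingConfig struct {
	MaxRetries      int           // e.g. 5 instead of a very large default like 999
	MinimumWaitTime time.Duration // wait between retries
	MaximumWaitTime time.Duration // upper bound for the backoff
}

// newDrainClient builds a resty/v2 client with a bounded retry policy, so a
// failing drain gives up quickly and the operator can pick it up again on the
// next reconciliation instead of looping for a long time.
func newDrainClient(cfg DrainingConfig) *resty.Client {
	return resty.NewWithClient(&http.Client{Transport: http.DefaultTransport}).
		SetRetryCount(cfg.MaxRetries).
		SetRetryWaitTime(cfg.MinimumWaitTime).
		SetRetryMaxWaitTime(cfg.MaximumWaitTime)
}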
I would suggest having a unit test to set the expectations around the output of the
👍
One-line summary
This PR upgrades resty from v1 to v2 to fix a retry-related bug in es-operator.
Description
After implementing context-aware retries in es-operator, we noticed that the operator could get stuck, preventing cluster scaling. To diagnose the issue, we created an Elasticsearch cluster in our playground environment and successfully reproduced the scenario.
What is the problem?
When a scale-in operation is halted due to a new scale-out request, es-operator can get stuck, posing a significant risk to the system.
Why does es-operator get stuck?
We were using resty/v1.12.0, released on Feb 28, 2019. The project has since released a new version (resty/v2), which is not backward compatible, and we had not upgraded to it. There is a discrepancy between the documentation and the implementation of retry behavior in resty/v1.12.0.
According to the resty/v1.12.0 documentation for the retry condition function: Link to documentation
This means the retry should stop if the retry condition function returns false or if there is an error. However, upon reviewing the code, it turns out the documentation is incorrect. If a retry condition function returns an error, the retry continues: Link to Implementation
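To make the difference concrete: in resty/v1 a retry condition returns (bool, error), and per the linked implementation an error from the condition does not stop the loop; in resty/v2 the condition returns a single bool, so stopping is unambiguous. A hedged sketch of the v2 style (the condition logic and function name are illustrative, not the exact es-operator code):

package operator

import (
	"context"
	"net/http"

	"github.com/go-resty/resty/v2"
)

// drainWithRetries illustrates the resty/v2 retry-condition shape: the
// condition returns only a bool, so there is no error value that could
// silently keep the retry loop alive.
func drainWithRetries(ctx context.Context, url string, maxRetries int) error {
	client := resty.NewWithClient(&http.Client{Transport: http.DefaultTransport}).
		SetRetryCount(maxRetries).
		AddRetryCondition(func(r *resty.Response, err error) bool {
			// Stop retrying as soon as the surrounding context is cancelled,
			// e.g. when a scale-in is superseded by a scale-out request.
			if ctx.Err() != nil {
				return false
			}
			// Otherwise retry on transport errors and server-side failures.
			return err != nil || r.StatusCode() >= 500
		})

	_, err := client.R().SetContext(ctx).Get(url)
	return err
}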
It's surprising that such a significant discrepancy exists in an open-source project, but after thorough verification, we are confident this is the case.
How did we solve the problem?
We upgraded resty to the latest version and adjusted our implementation accordingly. Testing the same scenario with the new version resolved the issue.
Types of Changes