ensure we reconnect on failure #173

xlc · 2024-05-10T06:41:54Z

No description provided.

ermalkaleci · 2024-05-10T08:44:03Z

src/extensions/client/mod.rs

            };

-            let mut selected_endpoint = healthiest_endpoint(None).await;


This is important: it ensures at least one endpoint is connected. Selecting just the first one may result in an endpoint connection failure that never connects, so selected_endpoint.connected().await will never resolve.

In that case we need a test. The current behaviour makes the unit tests non-deterministic, as the client may connect to any of the dummy servers, so it is best to fix the wait-for-connect behaviour anyway.

ermalkaleci · 2024-05-10T08:44:57Z

src/extensions/client/endpoint.rs

@@ -38,19 +40,23 @@ impl Endpoint {
        health_config: HealthCheckConfig,
    ) -> Self {
        let (client_tx, client_rx) = tokio::sync::watch::channel(None);
+        let (reconnect_tx, mut reconnect_rx) = tokio::sync::mpsc::channel(1);


tokio::sync::Notify may be a better option

Notify is a one-off thing, but we may need to reconnect multiple times.
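For readers following the channel discussion, here is a minimal sketch of how a capacity-1 channel can carry repeated reconnect signals. It uses `std::sync::mpsc::sync_channel` as a stand-in for the `tokio::sync::mpsc::channel(1)` in the diff, and the helper names `request_reconnect` and `handle_pending` are hypothetical, not part of this PR.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical helper: asks for a reconnect without blocking.
/// Returns true if the signal was queued, false if one is already
/// pending (buffer full) or the receiving side has gone away.
fn request_reconnect(tx: &SyncSender<()>) -> bool {
    match tx.try_send(()) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) => false,
        Err(TrySendError::Disconnected(_)) => false,
    }
}

/// Hypothetical helper: drains pending signals and returns how many
/// reconnects the worker would perform for them.
fn handle_pending(rx: &Receiver<()>) -> usize {
    let mut handled = 0;
    while rx.try_recv().is_ok() {
        handled += 1;
    }
    handled
}

fn main() {
    // Capacity 1 mirrors the `channel(1)` in the diff (std used as a stand-in).
    let (tx, rx) = sync_channel::<()>(1);

    // Two requests issued before the worker runs collapse into one signal.
    assert!(request_reconnect(&tx));
    assert!(!request_reconnect(&tx));
    assert_eq!(handle_pending(&rx), 1);

    // The same channel can be signalled again for a later reconnect.
    assert!(request_reconnect(&tx));
    assert_eq!(handle_pending(&rx), 1);
}
```

With a buffer of one, bursts of requests are coalesced while the channel stays reusable for every later reconnect.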

ermalkaleci · 2024-05-10T08:48:18Z

src/extensions/client/mod.rs

@@ -422,6 +426,10 @@ impl Client {
                    _ = selected_endpoint.health().unhealthy() => {
                        // Current selected endpoint is unhealthy, try to rotate to another one.
                        // In case of all endpoints are unhealthy, we don't want to keep rotating but stick with the healthiest one.
+
+                        // The ws client maybe in a state that requires a reconnect
+                        selected_endpoint.reconnect().await;


This will execute the moment the endpoint becomes unhealthy, and when that happens it will try to reconnect. I don't think this extra reconnect will help.

There is no reconnect currently. We have to drop and re-create the client to actually reconnect. Currently it will always fail if the remote drops the connection, and it can never connect to it again.
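The "drop and re-create" idea can be sketched roughly as follows. This is an illustration only: `Client` and `Endpoint` here are simplified hypothetical stand-ins, not the types in `src/extensions/client`, and the `alive` flag merely simulates a remote that has dropped the connection.

```rust
/// Hypothetical stand-in for a ws client; `id` distinguishes instances.
struct Client {
    id: u32,
    alive: bool,
}

/// Hypothetical endpoint that owns at most one client at a time.
struct Endpoint {
    client: Option<Client>,
    next_id: u32,
}

impl Endpoint {
    fn new() -> Self {
        let mut endpoint = Endpoint { client: None, next_id: 0 };
        endpoint.reconnect();
        endpoint
    }

    /// Drops the old client (if any) and builds a fresh one.
    fn reconnect(&mut self) {
        // Taking the old client out drops it before the new one is created.
        drop(self.client.take());
        let id = self.next_id;
        self.next_id += 1;
        self.client = Some(Client { id, alive: true });
    }

    /// Simulates the remote side closing the connection.
    fn remote_closed(&mut self) {
        if let Some(client) = self.client.as_mut() {
            client.alive = false;
        }
    }

    fn is_usable(&self) -> bool {
        self.client.as_ref().map_or(false, |c| c.alive)
    }

    fn client_id(&self) -> Option<u32> {
        self.client.as_ref().map(|c| c.id)
    }
}

fn main() {
    let mut endpoint = Endpoint::new();
    endpoint.remote_closed();
    // The old client instance stays dead until it is replaced.
    assert!(!endpoint.is_usable());
    endpoint.reconnect();
    assert!(endpoint.is_usable());
    assert_eq!(endpoint.client_id(), Some(1));
}
```

The point of the sketch is that a dead client never recovers on its own; only replacing the instance makes the endpoint usable again.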

ermalkaleci · 2024-05-10T08:51:48Z

src/extensions/client/tests.rs

+
+    let h1 = tokio::spawn(async move {
+        let _req = rx1.recv().await.unwrap();
+        // no response, let it timeout


A request timeout will make the endpoint unhealthy, therefore it will try to reconnect itself.

* ensure we reconnect on failure
* refactor
* fix test

This reverts commit 5039cfa.

* Revert "Refactor endpoint (#178)" This reverts commit 7fa3132.
* Revert "ensure we reconnect on failure (#173)" This reverts commit 5039cfa.
* Revert "improve reconnect wait time (#168)" This reverts commit 7cb7c73.
* Revert "Await healthy endpoint (#158)" This reverts commit ef1c524.
* Revert "endpoint health (#152)" This reverts commit cdbdd9b.
* redo validate middleware
* fix

ensure we reconnect on failure

5bbc754

ermalkaleci reviewed May 10, 2024

xlc added 2 commits May 10, 2024 22:47

refactor

455c067

fix test

e867d9a

xlc merged commit e61fa69 into master May 10, 2024
1 check passed

xlc deleted the fix-reconnect branch May 10, 2024 11:10

xlc added a commit that referenced this pull request May 10, 2024

ensure we reconnect on failure (#173)

5039cfa

* ensure we reconnect on failure * refactor * fix test

xlc added a commit that referenced this pull request May 18, 2024

Revert "ensure we reconnect on failure (#173)"

49219f7

This reverts commit 5039cfa.
