Only check latencies once every 10 seconds with `routeByLatency` #2795

justinmir · 2023-11-10T18:54:05Z

With `routeByLatency` enabled, the client currently re-checks latencies any time a server returns a MOVED or READONLY reply. When a shard is down, the ClusterClient issues the request to a random server, which returns a MOVED reply. That triggers a state refresh and a latency update on all servers, which can put significant ping load on clusters with a large number of clients.

This PR limits pings to once every 10 seconds per node: the GC function performs a latency update on a node only if its latency was last set more than 10 seconds ago.

Fixes #2782

Figure: Ping behavior of a client running 21bd40a and a client running this PR. When shards fail, the current cluster client spams pings, while the fixed cluster client pings each server only once every 10 seconds.

This shows the impact in a running large production cluster. The cluster is handling ~4M pings per second due to this behavior.


ofekshenawa · 2024-02-18T12:56:18Z

LGTM!
WDYT about changing `Unix()` to `UnixNano()`? To be more precise and to avoid unnecessary loops.

justinmir · 2024-02-29T17:58:44Z

Sure I'll push that change shortly.

justinmir · 2024-04-11T22:18:22Z

@ofekshenawa PTAL when you get a chance!

justinmir · 2024-10-18T17:42:46Z

@vladvildanov hoping to get some eyes here, this will help us no longer have to maintain our own fork

justinmir marked this pull request as ready for review November 10, 2023 19:47

chayim requested a review from ofekshenawa February 18, 2024 07:06

justinmir and others added 2 commits February 29, 2024 12:26

use UnixNano instead of Unix for better precision

97cee82

Merge branch 'master' into only-update-latency-in-gc-if-stale

5dcad41

Merge branch 'master' into only-update-latency-in-gc-if-stale

2b4bdfc
