Consumer deletion can fail without throwing an error, causing spam creation of consumers #1331
Comments
Hey. Unfortunately, we cannot reuse the same consumer name: if the consumer still exists, we cannot reuse it, because it has advanced in sequence beyond the point where the client is. This is how the ordered consumer (used by KV watchers) works - with disposable consumers and Those consumers are ephemeral, so they will get cleaned up automatically. Could you describe how you are using those consumers/watchers and share some code snippets? Thanks!
Hey Jarema, I'll circle back with more details when I'm back at work tomorrow. As a side question: is there any reason we can't use a durable pull consumer on a KV subject to get our values? Thanks!
When you're getting values for a watcher, you want to create a consumer, get the values, and delete the consumer. There is no point in reusing it later. Btw, consumers are only used for watchers. To retrieve KV values via
I can confirm we're using a watcher to subscribe for updates; it's effectively getting a KV and then calling
I would like to understand what difference it would make in your opinion, as you need a new consumer when you create a new watcher, no matter whether it's durable or not. Watchers should not create 30k consumers, so I would rather focus on that.
Observed behavior
In our prod environment we have seen the consumer count tick up into the 20,000s, which OOMs one of the NATS servers and impacts our applications (some KVs are not replicated, which we will now resolve on our side). However, I noticed that the following code allows consumer deletion to fail, which can cause this death spiral of increasing consumers: https://github.com/nats-io/nats.rs/blob/main/async-nats/src/jetstream/consumer/pull.rs#L2265-L2270
If I am incorrect in my assumption of how this works, is there a way we can force KV consumers to use the same name so this death spiral can be avoided?
Expected behavior
If consumers fail to delete, or at least fail to delete a few times in a row, the error should be propagated upwards to prevent this death spiral, which can OOM hosts.
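The "fail a few times in a row, then surface the error" behavior requested above could look roughly like the sketch below. It is not async-nats code: `CleanupGuard`, `CleanupEscalated`, and the threshold of 3 are hypothetical names and values chosen for illustration, and the deletion outcomes are simulated rather than coming from a real server.

```rust
use std::fmt;

/// Hypothetical error surfaced once deletions have failed too many times in a row.
#[derive(Debug, PartialEq)]
pub struct CleanupEscalated {
    pub consecutive_failures: u32,
    pub last_error: String,
}

impl fmt::Display for CleanupEscalated {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "consumer deletion failed {} times in a row, last error: {}",
            self.consecutive_failures, self.last_error
        )
    }
}

impl std::error::Error for CleanupEscalated {}

/// Hypothetical helper: counts consecutive deletion failures and escalates at a threshold.
pub struct CleanupGuard {
    threshold: u32,
    consecutive_failures: u32,
}

impl CleanupGuard {
    pub fn new(threshold: u32) -> Self {
        Self { threshold, consecutive_failures: 0 }
    }

    /// Records the outcome of one deletion attempt.
    /// A success resets the counter; a failure below the threshold is tolerated;
    /// a failure that reaches the threshold is returned to the caller as an error.
    pub fn record<E: fmt::Display>(
        &mut self,
        outcome: Result<(), E>,
    ) -> Result<(), CleanupEscalated> {
        match outcome {
            Ok(()) => {
                self.consecutive_failures = 0;
                Ok(())
            }
            Err(e) => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.threshold {
                    Err(CleanupEscalated {
                        consecutive_failures: self.consecutive_failures,
                        last_error: e.to_string(),
                    })
                } else {
                    Ok(())
                }
            }
        }
    }
}

fn main() {
    let mut guard = CleanupGuard::new(3);
    // Simulated deletion outcomes; a real client would feed in the delete result instead.
    let outcomes: Vec<Result<(), &str>> =
        vec![Err("timeout"), Ok(()), Err("timeout"), Err("timeout"), Err("timeout")];
    for (i, outcome) in outcomes.into_iter().enumerate() {
        match guard.record(outcome) {
            Ok(()) => println!("attempt {i}: tolerated"),
            Err(e) => {
                println!("attempt {i}: {e}");
                break; // stop creating new consumers once cleanup is clearly broken
            }
        }
    }
}
```

With this shape a single transient failure stays silent, while a sustained failure stops the recreate loop instead of leaving consumers to pile up on the server.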
Server and client version
Server: v2.10.11
Async NATS (Rust): 0.36.0
Host environment
Rocky Linux 9.3
Steps to reproduce
It is hard to give concrete steps; you need to get into a state where:
Consumer deletion fails on cleanup.
Consumer creation succeeds.
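Since the state above is hard to reach against a real server, a toy model can at least show why it leads to unbounded growth. The sketch below is an assumption-laden simulation, not the NATS server or the async-nats API: `MockServer` and `run_recreate_cycles` are hypothetical, and the deletion error is deliberately ignored to mirror the behavior described in this report.

```rust
use std::collections::HashSet;

/// Toy stand-in for a server's consumer registry (hypothetical, not NATS).
struct MockServer {
    consumers: HashSet<String>,
    delete_fails: bool,
}

impl MockServer {
    /// Creation always succeeds in this model.
    fn create(&mut self, name: &str) {
        self.consumers.insert(name.to_string());
    }

    /// Deletion fails on every call when `delete_fails` is set.
    fn delete(&mut self, name: &str) -> Result<(), String> {
        if self.delete_fails {
            return Err(format!("delete of {name} failed"));
        }
        self.consumers.remove(name);
        Ok(())
    }
}

/// Runs `cycles` delete-then-recreate cycles with a fresh name each time
/// and returns how many consumers are left on the mock server.
fn run_recreate_cycles(delete_fails: bool, cycles: usize) -> usize {
    let mut server = MockServer { consumers: HashSet::new(), delete_fails };
    let mut current = String::from("consumer-0");
    server.create(&current);
    for i in 1..=cycles {
        // The deletion result is discarded on purpose: this is the failure mode under discussion.
        let _ = server.delete(&current);
        current = format!("consumer-{i}");
        server.create(&current);
    }
    server.consumers.len()
}

fn main() {
    println!("healthy deletes: {} consumer(s)", run_recreate_cycles(false, 1000));
    println!("failing deletes: {} consumer(s)", run_recreate_cycles(true, 1000));
}
```

In this model the consumer count stays at 1 while deletes succeed and grows by one per cycle once they fail, which matches the shape of the growth reported above.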