Helm chart for monolithic and read-write deployment mode #4832

rubenvw-ngdata · 2023-04-25T08:02:47Z

Is your feature request related to a problem? Please describe.

There is currently only a helm chart available for the full microservices deployment mode of Grafana Mimir. This is pretty exhaustive and results in a lot of pods. Ideally there would be an alternative to this.

Describe the solution you'd like

A separate helm chart, or a deployment mode setting in the existing chart to select the deployment mode (possibly similar to what is available for Loki).
Ideally the alternative deployment solution also supports multi-AZ (where we are running one instance in each AZ).

Describe alternatives you've considered

The only alternative now is to run a minimalistic version of the mimir-distributed helm chart.

Additional context

See also previous ticket on grafana/helm-charts: grafana/helm-charts#1189

rubenvw-ngdata · 2023-04-27T13:56:17Z

I had some time to work on this, so I tried to implement this functionality myself (but I failed to get it fully working).

See PR #4858 (I know it is not ready, but I'm sharing it so you can help me with it).

dimitarvdimitrov · 2023-05-03T16:49:28Z

Thank you for the proposal and the draft PR. I appreciate the time spent. We've been experimenting with the two deployment modes and would like to explore them further as alternatives to microservices mode (maybe even "at scale"). We're not quite there yet, but these deployment modes are also not being deprecated soon.

However, there are some considerations we have to take into account before adding different deployment modes to the helm chart. A couple that come to mind now:

How do we reduce code duplication in the helm chart? Helm isn't super friendly to functions and reducing code duplication. Making named templates more complex comes with a readability tradeoff. With time we will have to add features to multiple deployment modes, which will slow down contributions.
How do we test these deployment modes? Currently we commit some golden records, install a handful of configurations for smoke tests, and run OPA policies. We should probably invest in making sure these tests exist in some form for non-microservices deployments; this is unclear at present and needs some thought.
How do we document them? Do we need to change some of the existing docs for the helm chart to make sure they aren't outdated? Or maybe we need new docs dedicated to the different deployment modes?
Do we provide a zero-downtime migration path between the different deployment modes?
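To make the testing question more concrete, here is a minimal sketch of what a golden-record check per deployment mode could look like. It is not the chart's actual test tooling: the function names, the mode names used as keys, and the idea of passing rendered manifests in as strings are all assumptions made for illustration.

```python
import difflib


def diff_against_golden(rendered: str, golden: str, name: str = "manifest") -> list[str]:
    """Return unified-diff lines; an empty list means the render matches the golden record."""
    return list(
        difflib.unified_diff(
            golden.splitlines(),
            rendered.splitlines(),
            fromfile=f"golden/{name}",
            tofile=f"rendered/{name}",
            lineterm="",
        )
    )


def check_modes(rendered_by_mode: dict[str, str], golden_by_mode: dict[str, str]) -> dict[str, list[str]]:
    """Compare each deployment mode's render with its golden record.

    Only modes that drifted (or have no golden record) appear in the result.
    The mode names are hypothetical keys, not values defined by the chart.
    """
    failures: dict[str, list[str]] = {}
    for mode, rendered in rendered_by_mode.items():
        golden = golden_by_mode.get(mode)
        if golden is None:
            failures[mode] = ["missing golden record"]
            continue
        diff = diff_against_golden(rendered, golden, mode)
        if diff:
            failures[mode] = diff
    return failures
```

In a setup like this, every extra deployment mode adds one more set of golden records to review on each template change, which is part of the maintenance cost raised above.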

Most of these aren't trivial to answer and there will probably be divided opinions. At the same time we, at Grafana Labs, don't have much visibility into how much read-write or monolithic deployment modes will be used or how much they can scale.

As much as I hate to say it, keeping this functionality in a fork will be more pragmatic as it stands. You can publish the forked chart under a different name and we can track how much usage it gets. With time we can revisit and incorporate the changes in the mimir-distributed chart and share the maintenance efforts.

rubenvw-ngdata · 2023-05-04T11:28:25Z

Hi @dimitarvdimitrov ,

Thanks for your answer. I'm a bit disappointed though that you propose to leave it on a fork branch.

The most important reason for us to use Mimir (and I don't think we are alone) is to make Prometheus HA. With the microservices configuration this comes with a high maintenance burden and a very fine-grained configuration.

I understand that there are various things that you should think about when embedding it into the product; that's also why this is just a draft.

Have you been able to check the error message I was facing with the monolithic setup? I'm willing to continue, rename the chart and maintain the fork for the time being, but I could use a bit of help debugging the issues that I'm facing (I don't know much about the Mimir internals).

dimitarvdimitrov · 2023-05-04T11:56:15Z

> The most important reason for us to use Mimir (and I don't think we are alone) is to make Prometheus HA. With the microservices configuration this comes with a high maintenance burden and a very fine-grained configuration.

With the helm chart we are aiming to make this configuration less of a hassle. The defaults in the chart should work for most users. In addition, monolithic and read-write deployments have the same configuration options as microservices. However, I can see how scaling up/out a microservices deployment is more complicated than scaling a monolithic deployment.

I left a comment on the draft PR wrt the "connection refused" error. I'm happy to help with answers when I can.

WoodyWoodsta · 2023-08-21T13:45:45Z

To add my two cents: since Grafana Loki already has the "read-write" mode and a helm chart for it, I was sort of expecting to be able to deploy Mimir in the same way if it has the same component architecture (which it does). So I'm wondering whether the considerations listed above are the equivalents of what has already been solved for Grafana Loki?

davinkevin · 2023-10-12T20:17:40Z

Monolithic mode is a very important (strategic?) deployment model IMO, because it lets us start simple and then increase the complexity if the product fits our needs.

ATM, without the monolithic mode, I don't see myself deploying mimir or tempo in clusters I manage "just for evaluation purposes"… and so I'm starting to look at other tools, even though I already run loki & grafana.

As a user, I don't expect any SLA or validation from this chart flavour, just a parameter to deploy it in "target=all".
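As an illustration of how small such a parameter could be in principle, the sketch below maps a deployment mode to the `-target` values each workload would run with. Mimir's `-target` and `-config.file` flags exist, but the mode names, the `MODE_TARGETS` table, the `container_args` helper, and the config path are hypothetical and do not reflect how the mimir-distributed chart is structured.

```python
# Hypothetical mapping from a chart-level "deployment mode" to Mimir -target values.
# "all" runs every component in one process; read/write/backend are the
# targets used by the read-write deployment mode.
MODE_TARGETS = {
    "monolithic": ["all"],
    "read-write": ["read", "write", "backend"],
}


def container_args(mode: str, config_path: str = "/etc/mimir/mimir.yaml") -> dict[str, list[str]]:
    """Return the container arguments for each workload of the given mode.

    The config path is an assumed default, not a value taken from the chart.
    """
    try:
        targets = MODE_TARGETS[mode]
    except KeyError:
        raise ValueError(f"unknown deployment mode: {mode!r}") from None
    return {t: [f"-target={t}", f"-config.file={config_path}"] for t in targets}


if __name__ == "__main__":
    for workload, args in container_args("monolithic").items():
        print(workload, " ".join(args))
```

The argument list is the easy part; the considerations raised earlier in the thread (templates, tests, docs, migration) are where most of the work would be.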

rubenvw-ngdata · 2023-10-13T07:35:29Z

@davinkevin If you want to try out mimir in monolithic deployment mode, you can use our fork at https://github.com/NGDATA/mimir.
Currently we only do internal releases, so if you want to use it, you will have to take care of the release process yourself.

The more the fork is used, the more likely it is that this gets embedded in the product.

mhoyer · 2023-10-28T04:18:16Z

I like the idea of providing one or more less complex helm chart solutions for mimir. Why? Because we also tried to deploy the current mimir-distributed one and it was really tough to walk through the values.yaml. Sure, the chart probably would have run out of the box, but a) we had to apply some modifications and b) my inner nerd wants to know what I am deploying. And here I didn't even look into the templates.

The complex mimir-distributed helm chart definitely has its use case for larger production deployments. Still, the simpler rollout methods are valuable too: for beginners, but also for scenarios with lower performance requirements.

As the almost 4k-line values.yaml is already overwhelming, I suggest really splitting it up into separate helm charts before adding even more complexity to the existing one (with a deployment mode setting). This makes your lives as maintainers easier, and those of the consumers too, because they can decide upfront which level of chart sophistication to start with. In fact, they then only have to deal with a less complex values.yaml and may understand how the templates work (in case of an issue).

Regarding the sharing of common template functions, you could follow an approach similar to Bitnami's, with a mimir-common helm chart. See https://github.com/bitnami/charts/tree/main/bitnami/common

davinkevin · 2024-02-04T12:57:01Z

@rubenvw-ngdata is the fork still maintained?

rubenvw-ngdata · 2024-02-04T13:07:01Z

It is, we are using it without issues. We do not follow all changes that happen on main immediately though. If there is something that is not working for you, let me know.

Ca-moes · 2024-08-12T14:31:04Z

Having a monolithic deployment for the helm chart would be awesome for the meta-monitoring chart.

lieberlois · 2024-09-04T14:19:28Z

Is there any update on this? I really don't understand the decision to have the simplescalable variant for loki but not for mimir 😓

rorynickolls-skyral · 2024-09-13T08:39:37Z

This would be a useful feature where Mimir needs to be deployed for testing. We currently test our observability stack in CI and Mimir, even in a minimal distributed setup, consumes a lot of resources.

Loki can easily just run in SingleBinary mode for tests, and I had assumed the two would be configurable in the same way.

gclawes · 2024-09-25T00:06:07Z

> This would be a useful feature where Mimir needs to be deployed for testing. We currently test our observability stack in CI and Mimir, even in a minimal distributed setup, consumes a lot of resources.
>
> Loki can easily just run in SingleBinary mode for tests, and I had assumed the two would be configurable in the same way.

Seconded. This would also be useful in small-footprint/homelab deployments.

kmdlcp · 2024-10-03T14:12:13Z

Thirded. If that is even a word.

This is really a must have.

Robsta86 · 2024-10-09T20:28:07Z

Fourthed, if that’s even a word. ;-)

I am surprised to see Mimir still only has the microservices mode available to use via the helm chart, unlike the other deployment options that Loki has. Would like to run mimir simple scalable in my lab for testing.

ravenolf · 2024-10-21T20:48:56Z

It would definitely be a useful option, especially for smaller clusters or just for experimenting with Mimir before a migration from another Prometheus-like tool.

