Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus feature native-histograms breaks targets with info metric types from being scraped. #14

Open
defenestration opened this issue Nov 21, 2023 · 3 comments

Comments

@defenestration
Copy link

defenestration commented Nov 21, 2023

I was updating fluxv2 and using this repo as a template, but i noticed in prometheus, that the kube-state-metrics target was showing the error "invalid metric type "info". After some some local testing with a fork of the repo, I'm finding the native-histograms feature of prometheus seems to cause the problem. I have a branch here for testing this out.

https://github.com/defenestration/flux2-monitoring-example/tree/native-histograms

Heres a pr with the diffs from this repo from when i forked it if it helps https://github.com/defenestration/flux2-monitoring-example/pull/1/files

Realize this is probably an upstream issue, but wondering what the best work-around for this could be? Seems like the custom metric types may be best if set as Gauge instead perhaps?

I will continue to look on my own but figured it was worth filing an issue in case someone else had this problem too.

@defenestration
Copy link
Author

Seems like simply switching to Gauge type produces an error on kube-state-metrics:

E1121 19:38:35.721442       1 discovery.go:214] "failed to update custom resource stores" err="failed to create metrics factory for kustomize.toolkit.fluxcd.io_v1_Kustomization: resource_info: compiling metric: expected each.gauge to not be nil"

@defenestration
Copy link
Author

I was able to setup the gauges after looking at the kube-state-metrics docs

And have setup gauges based on the following.
https://github.com/defenestration/flux2-monitoring-example/blob/c7b36deb6cd53b0a857636e2f6db1d774823ca11/monitoring/controllers/kube-prometheus-stack/kube-state-metrics-config.yaml

The metric name is changed to gotk_resource_status since not sure it is 1:1 with the suggested alerts in the flux 2.1 docs.

@darkowlzz
Copy link
Contributor

darkowlzz commented Dec 18, 2023

@defenestration thanks for reporting this and doing all the investigation. I see that it's being worked on in upstream KSM and started some discussions in prometheus.
The reason we have info metric type instead of guage is that we wanted to only show one metric per resource that contains all the necessary information about the resource. There is nothing to measure, the value of the metric remains 1. I think the guage config for CRs status conditions shown in the KSM docs and used by you may result in multiple metric values for different status conditions and make the metric more about the different status conditions with same labels in each of them. At present, resource_info metric tries to capture multiple properties of a resource in a single metric.
The KSM configuration we have in this repository is more of an example for people to get started. If they need status condition specific metrics, they can easily do that. We don't recommend using these example config directly but modifying them as per their needs. So, it's okay to change them to what works for you.

I think we can wait for the on-going KSM work to conclude before making any change for prometheus native-histograms support. My guess is that we may not need to change anything and KSM will handle the issue internally, based on the KSM discussions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants