WIP: Optimize storage.LabelQuerier.LabelValues implementations #525
+188
−113
Try to make `storage.LabelQuerier.LabelValues` implementations faster by having `PostingsForMatchers`, when called from `labelValuesWithMatchers`, intersect the matched series with the series containing the label name (through a special new match type, `MatchSet`). This eliminates the need to find the series containing the label name after the fact and intersect them with the `PostingsForMatchers` result.

I dropped the `maxExpandedPostings` optimization, since I don't see it being relevant after my changes, but please point out if I'm wrong.

Benchmark results are promising: the geometric mean CPU time improvement looks to be ~59%, and the geometric mean memory usage reduction ~68%. Some of the benchmark cases do look much worse, though; I guess we'll have to look into those.
It doesn't look as if I've broken anything, considering the test suite passes, but please look carefully. I'm new to this code :)
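To illustrate the idea for reviewers, here is a minimal, self-contained sketch of folding the "series that have the label name" set into the postings intersection itself. It is not the code in this PR: `toyIndex`, `withLabel`, `labelValues`, and `exampleIndex` are hypothetical names invented for the example, and the real TSDB index, `PostingsForMatchers`, and `MatchSet` are not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// intersect returns the refs present in both sorted, duplicate-free lists.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// toyIndex is a hypothetical stand-in for an inverted index:
// label name -> label value -> sorted series refs.
type toyIndex struct {
	postings map[string]map[string][]uint64
}

// withLabel returns the sorted refs of every series that has the label name set.
func (ix toyIndex) withLabel(name string) []uint64 {
	seen := map[uint64]struct{}{}
	for _, refs := range ix.postings[name] {
		for _, r := range refs {
			seen[r] = struct{}{}
		}
	}
	out := make([]uint64, 0, len(seen))
	for r := range seen {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// labelValues treats "has label name" as one more term of the intersection,
// then collects the values of name that occur on the surviving series.
func (ix toyIndex) labelValues(name string, matcherPostings ...[]uint64) []string {
	p := ix.withLabel(name)
	for _, m := range matcherPostings {
		p = intersect(p, m)
	}
	var vals []string
	for v, refs := range ix.postings[name] {
		if len(intersect(refs, p)) > 0 {
			vals = append(vals, v)
		}
	}
	sort.Strings(vals)
	return vals
}

// exampleIndex holds four series:
// 1{job=api,env=prod} 2{job=api} 3{job=db,env=dev} 4{job=api,env=dev}.
func exampleIndex() toyIndex {
	return toyIndex{postings: map[string]map[string][]uint64{
		"job": {"api": {1, 2, 4}, "db": {3}},
		"env": {"prod": {1}, "dev": {3, 4}},
	}}
}

func main() {
	ix := exampleIndex()
	// Values of "env" among series matching job="api".
	fmt.Println(ix.labelValues("env", ix.postings["job"]["api"])) // prints [dev prod]
}
```

In this toy, series 2 matches `job="api"` but has no `env` label, so it is dropped during the intersection rather than in a separate pass afterwards.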
Benchmark results for blocks (`go test -bench=BenchmarkQuerier/Block/labelValuesWithMatchers -benchmem -run='^$' -timeout=30m -count 6 ./tsdb`): benchstat-blocks.txt

Benchmark results for heads (`go test -bench=BenchmarkQuerier/Head/labelValuesWithMatchers -benchmem -run='^$' -timeout=30m -count 6 ./tsdb`): [benchstat-head.txt](https://github.com/grafana/mimir-prometheus/files/12440956/benchstat-head.txt)