Replies: 1 comment
-
I tried a few variations of the configuration. At first sight, Spilo/Patroni's automatic configuration does not seem well suited to small clusters. I don't remember the exact values, but work_mem, for example, was extremely low, and changing it along with a few other settings gave some relief from the extremely poor performance I was seeing. It is still far from the other deployments I tested (bitnami, kubegres, crunchy), but it does look like the problem can be narrowed down to how Postgres is configured rather than to any Operator-specific configuration (I think). I still have to experiment with different images (from what I understand from the docs, only Spilo images will work?) and check the Spilo default parameters. Although this is not a proper solution, users with small clusters should be wary of automatically injected configuration that may not be ideal (I don't know at what point a cluster becomes "small" enough for Spilo's config to stop being optimal). This is what I have found so far through trial and error in a not very scientific experiment; any other feedback is more than welcome.
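For anyone who wants to try the same thing, here is a rough sketch of how Postgres parameters can be overridden in the cluster manifest. It assumes the operator is the Zalando postgres-operator (the thread does not name it explicitly) and uses its `spec.postgresql.parameters` map. The cluster name, team, Postgres version, and every parameter value below are hypothetical placeholders, not the values I actually used or a tuning recommendation; only the volume size and resource requests mirror the test setup described in the original post.

```yaml
# Sketch only: assumes the Zalando postgres-operator CRD (acid.zalan.do/v1).
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-small-cluster      # hypothetical name; must start with the teamId
spec:
  teamId: "acid"                # hypothetical team
  numberOfInstances: 2
  volume:
    size: 50Gi                  # matches the 50 GB volume mentioned in the post
  resources:
    requests:
      cpu: 250m                 # same requests as in the test setup
      memory: 256Mi
  postgresql:
    version: "15"               # assumption; use whatever your Spilo image supports
    parameters:                 # values are passed to Postgres as strings
      work_mem: "16MB"          # illustrative value, not a recommendation
      shared_buffers: "128MB"   # illustrative value, not a recommendation
```

After applying a change like this, it is worth confirming inside the pod (for example with `SHOW work_mem;` in psql) that the value Postgres actually runs with is the one from the manifest and not one injected by the automatic configuration.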
-
I've been experimenting with the Operator for a while and things are mostly working fine.
One thing that I haven't been able to understand is the very slow speeds I've been getting.
Bear in mind this is a minimal test cluster (3 nodes, 6 CPU, 12 GB RAM) hosted on DOKS, tested both with local-path (Rancher driver) and with DO's block storage on a 50 GB volume. Resource requests are set to 250m CPU and 256Mi memory; no limits are set. All tests were performed with pgbench on a totally empty cluster.
On my work computer (1 TB NVMe, 32 GB RAM, i5-11660k) I get 2400 TPS (yes, I know, I'm not comparing, just getting an idea of "good" and "bad"). On a minimal DO instance I get around 1400 TPS locally, and 500-600 TPS when accessing the instance through the cluster.
With this operator I've been consistently getting 80-120 TPS on pgbench. The initial pgbench vacuum takes ~10 seconds.
With bitnami's HA postgres chart I'm getting anywhere from 700 to 1000 TPS (local-path) and ~300 TPS with block storage, and the initial pgbench vacuum takes ~3 seconds.
PgBench:
What I tried:
We have a tiny 200-300 MB database during development, and queries are normally very simple (1 join, ~1k rows, nothing fancy at all). I imagine I'm doing something wrong with the configuration, but I can't find out what exactly. Any ideas are welcome.
Helm Values
WAL Secret
Cluster Manifest