Add benchmarks & stress tests #366

tallclair · 2022-06-23T22:43:58Z

I think this project is currently lacking benchmarks and stress tests. These tests can be useful for identifying major regressions and issues. Some ideas of tests I'd like to see:

- Single high-throughput connection
  - Agent → Proxy server
  - Agent ← Proxy server
  - Agent ⇆ Proxy server (bidirectional)
- Multiple high-throughput connections
- High proxied connection churn (normal behavior, e.g. multiple webhooks)
- High churn of agent connections (simulating flaky network or agent restarts)

Each case above should collect metrics on CPU, memory, and # of goroutines. Metrics should be measured before, during & after the tests, with quantiles.

- High-throughput cases should also benchmark throughput rates.
- Connection churn cases should also look at latency.
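
As a rough illustration of the single high-throughput case, here is a minimal sketch that pushes bytes one way over a connection and samples goroutine count and heap usage before and after. It assumes `net.Pipe` as an in-memory stand-in for the agent/proxy connection; `snapshot`, `takeSnapshot`, and `pushBytes` are hypothetical names, not part of this project, and a real test would drive an actual agent and proxy server.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"runtime"
	"time"
)

// snapshot holds the process-level metrics sampled around a test run.
type snapshot struct {
	Goroutines int
	HeapAlloc  uint64 // bytes of allocated heap objects
}

// takeSnapshot records the current goroutine count and heap usage.
func takeSnapshot() snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return snapshot{Goroutines: runtime.NumGoroutine(), HeapAlloc: m.HeapAlloc}
}

// pushBytes sends total bytes one way over an in-memory connection
// (a stand-in for a proxied connection) and returns the number of bytes
// received and the time the transfer took.
func pushBytes(total int64) (int64, time.Duration, error) {
	src, dst := net.Pipe()
	defer dst.Close()

	writeErr := make(chan error, 1)
	go func() {
		defer src.Close() // closing signals EOF to the reader
		buf := make([]byte, 32*1024)
		for sent := int64(0); sent < total; {
			chunk := int64(len(buf))
			if remaining := total - sent; remaining < chunk {
				chunk = remaining
			}
			n, err := src.Write(buf[:chunk])
			sent += int64(n)
			if err != nil {
				writeErr <- err
				return
			}
		}
		writeErr <- nil
	}()

	start := time.Now()
	received, err := io.Copy(io.Discard, dst)
	elapsed := time.Since(start)
	if err != nil {
		return received, elapsed, err
	}
	return received, elapsed, <-writeErr
}

func main() {
	before := takeSnapshot()
	n, elapsed, err := pushBytes(64 << 20)
	after := takeSnapshot()
	if err != nil {
		fmt.Println("transfer failed:", err)
		return
	}
	fmt.Printf("throughput: %.1f MiB/s\n", float64(n)/(1<<20)/elapsed.Seconds())
	fmt.Printf("goroutines: %d before, %d after\n", before.Goroutines, after.Goroutines)
	fmt.Printf("heap bytes: %d before, %d after\n", before.HeapAlloc, after.HeapAlloc)
}
```

Sampling only before and after is the simplest form; a "during" sample would need a background ticker that calls `takeSnapshot` while the transfer runs.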

justinsb · 2022-07-12T01:02:07Z

This is a great list. If I were adding one scenario, I would add the "unexpected close" scenario. I think this happens in the real world with a webhook if kube-apiserver closes the connection after a timeout. Also (in the other direction), if the webhook crashes, I think that should also yield a close at an unexpected place in the message flow.
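
One way the "unexpected close" scenario could be exercised, as a minimal sketch: a peer writes only part of a message and then closes, and the reader checks that it sees a truncated-read error instead of hanging. This assumes `net.Pipe` as a stand-in for the proxied connection; `readAfterAbruptClose` and `closedEarly` are hypothetical helpers, not project APIs.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// readAfterAbruptClose has a peer write only `sent` bytes of a
// msgLen-byte message before closing the connection, and returns what
// the reading side observed.
func readAfterAbruptClose(msgLen, sent int) (int, error) {
	client, server := net.Pipe()
	defer server.Close()

	go func() {
		defer client.Close() // the "unexpected close" in mid-message
		_, _ = client.Write(make([]byte, sent))
	}()

	buf := make([]byte, msgLen)
	return io.ReadFull(server, buf)
}

// closedEarly reports whether err indicates the stream ended before
// the full message arrived.
func closedEarly(err error) bool {
	return errors.Is(err, io.ErrUnexpectedEOF)
}

func main() {
	n, err := readAfterAbruptClose(100, 40)
	fmt.Printf("read %d bytes, closed early: %v\n", n, closedEarly(err))
}
```

A real stress test would inject the close at the kube-apiserver or webhook end of a proxied connection and also check that neither side leaks goroutines afterward.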

tallclair · 2022-07-28T20:57:26Z

/assign

k8s-triage-robot · 2022-10-26T21:55:13Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

tallclair · 2022-11-01T16:34:56Z

/remove-lifecycle stale

k8s-triage-robot · 2023-01-30T17:05:27Z

/lifecycle stale

tallclair · 2023-02-09T18:58:28Z

/remove-lifecycle stale
/triage accepted

k8s-triage-robot · 2024-02-09T19:33:37Z

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

Confirm that this issue is still relevant with /triage accepted (org members only)
Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

tallclair · 2024-02-16T19:59:56Z

/triage accepted

We have much better test coverage, but I think we're still lacking the ability to detect a performance regression.

Some basic benchmarks were added in https://github.com/kubernetes-sigs/apiserver-network-proxy/blob/master/tests/benchmarks_test.go, but we're still missing benchmarks for concurrent requests & throughput, as well as the tooling to run these regularly to detect regressions.
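
For the missing concurrent-request case, a minimal sketch of what such a measurement might look like: several workers issue requests in parallel and the latencies are summarized as quantiles. It assumes a local `httptest` server as a stand-in for a backend reached through the proxy; `runConcurrent`, `benchLocal`, and `quantile` are hypothetical names and are not taken from `benchmarks_test.go`.

```go
package main

import (
	"fmt"
	"io"
	"math"
	"net/http"
	"net/http/httptest"
	"sort"
	"sync"
	"time"
)

// quantile returns the q-th quantile (0 <= q <= 1) of an ascending
// slice using the nearest-rank method.
func quantile(sorted []time.Duration, q float64) time.Duration {
	if len(sorted) == 0 {
		return 0
	}
	idx := int(math.Ceil(q*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

// runConcurrent issues perWorker GET requests from each of `workers`
// goroutines and returns the sorted latencies of successful requests,
// plus the first error seen, if any.
func runConcurrent(url string, workers, perWorker int) ([]time.Duration, error) {
	var (
		mu        sync.Mutex
		latencies []time.Duration
		firstErr  error
		wg        sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				start := time.Now()
				resp, err := http.Get(url)
				if err == nil {
					_, err = io.Copy(io.Discard, resp.Body)
					resp.Body.Close()
				}
				elapsed := time.Since(start)

				mu.Lock()
				if err != nil {
					if firstErr == nil {
						firstErr = err
					}
				} else {
					latencies = append(latencies, elapsed)
				}
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	return latencies, firstErr
}

// benchLocal runs the workload against a throwaway local server.
func benchLocal(workers, perWorker int) ([]time.Duration, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	}))
	defer srv.Close()
	return runConcurrent(srv.URL, workers, perWorker)
}

func main() {
	lat, err := benchLocal(8, 50)
	if err != nil {
		fmt.Println("benchmark failed:", err)
		return
	}
	fmt.Printf("requests: %d, p50: %v, p99: %v\n", len(lat), quantile(lat, 0.50), quantile(lat, 0.99))
}
```

Detecting regressions would additionally require storing these numbers per run and comparing them over time, which this sketch does not attempt.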

k8s-ci-robot assigned tallclair Jul 28, 2022

k8s-ci-robot added the lifecycle/stale label Oct 26, 2022

k8s-ci-robot removed the lifecycle/stale label Nov 1, 2022

k8s-ci-robot added the lifecycle/stale label Jan 30, 2023

k8s-ci-robot added the triage/accepted label and removed the lifecycle/stale label Feb 9, 2023

k8s-ci-robot added the needs-triage label and removed the triage/accepted label Feb 9, 2024

k8s-ci-robot added the triage/accepted label Feb 16, 2024
