Customer-facing Benchmarks #281

stanbrub · 2024-04-05T23:19:30Z

The benchmarks that currently run nightly are developer-facing, single-operation benchmarks. They are good for detecting regressions and narrowing them down to a specific operation. What we are missing are more real-world benchmarks that test scenarios users are likely to be running.

The current where benchmarks, for example, run in isolation. All CPU threads are available when those benchmark tests run, so they are extremely fast. But what happens if a where operation must compete with other chained operations, or even with other where operations running simultaneously on different tables?
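
As a rough illustration of the isolated-versus-contended comparison, here is a minimal Python sketch. It does not use the Deephaven API: `where_like`, `run_isolated`, and `run_contended` are hypothetical names, and a list comprehension stands in for a real where operation. Pure-Python threads also share the interpreter lock, so this only shows the shape of such a harness, not realistic timings.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def where_like(rows, threshold):
    """Hypothetical stand-in for a table `where` filter (not a real engine call)."""
    return [r for r in rows if r > threshold]


def timed(fn, *args):
    """Return (elapsed_seconds, result) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return time.perf_counter() - start, result


def run_isolated(rows, threshold):
    """Run one filter with nothing else competing for the CPU."""
    elapsed, result = timed(where_like, rows, threshold)
    return elapsed, len(result)


def run_contended(tables, threshold):
    """Run one filter per table at the same time; return (elapsed, row_count) per filter."""
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        futures = [pool.submit(timed, where_like, t, threshold) for t in tables]
        outcomes = [f.result() for f in futures]
    return [(elapsed, len(result)) for elapsed, result in outcomes]
```

Comparing the `run_isolated` time against the per-filter times from `run_contended` gives a first-order view of how much a single operation slows down when it no longer has the machine to itself.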

Higher level benchmark possibilities:

User Scenario: Kafka to Rolling Group to UDFs or built-ins on the vectors, then on to joins, sorts, etc.
Wide Boundary Test: A large tree of chained tables, shallow and wide
Deep Boundary Test: A large tree of chained tables, narrow and deep
Multi Source Test: Run many chains of operations from different sources
Cardinality: Run the same cardinality with widely different numbers of keys or vice versa
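
To make the wide, deep, and cardinality shapes concrete, here is a small Python sketch that only generates test descriptions; it runs no engine operations. All names (`deep_chain`, `wide_tree`, `depth`, `keyed_rows`) are hypothetical, and `keyed_rows` reflects just one reading of the Cardinality item: hold the row count fixed while varying the number of distinct keys.

```python
def deep_chain(n):
    """Narrow and deep: table i+1 is derived from table i. Edges are (parent, child)."""
    return [(i, i + 1) for i in range(n)]


def wide_tree(n):
    """Shallow and wide: every derived table hangs directly off source table 0."""
    return [(0, i + 1) for i in range(n)]


def depth(edges):
    """Longest parent chain, in edges, from any derived table back to the source."""
    parent = {child: par for par, child in edges}

    def steps_to_root(node):
        steps = 0
        while node in parent:
            node = parent[node]
            steps += 1
        return steps

    return max((steps_to_root(child) for child in parent), default=0)


def keyed_rows(row_count, key_count):
    """Assumed reading of 'Cardinality': same row count, configurable distinct keys."""
    return [(i % key_count, i) for i in range(row_count)]
```

For example, `deep_chain(100)` and `wide_tree(100)` both describe 100 derived tables, but with depths of 100 and 1 respectively, which is the contrast the two boundary tests are after.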

stanbrub added the enhancement New feature or request label Apr 5, 2024
