Memory limit (total) exceeded in Dataset preview #135

handgunman · 2024-05-03T07:57:05Z

When adding computed fields with aggregation to the dataset preview, a query is built that consumes excessive memory in Clickhouse.
A query of the form:
default
query: SELECT
sum(t1.sumfield) AS res_0,
...
t1.id AS res_4,
t1.datecommit AS res_5,
t1.ndssum AS res_6,
t1.numberfield AS res_7
...
FROM "default".cheks_v AS t1
GROUP BY
res_1,
res_2,
res_3,
res_4,
res_5,
res_6,
res_7,
...

In fact, grouping is performed on all fields of the sample.
And only then the limitation is done.
If the table size is comparable to the RAM size, an error occurs.
Since grouping by all fields is performed, you should change the query to this structure:

select sum(a), a,b,c
from (select a,b,c from t limit 10)
group by a,b,c

limit before group by

Otherwise, it is almost impossible to use the preview on large amounts of data.

KonstantAnxiety · 2024-05-22T12:04:19Z

Hey @handgunman! Your suggestion totally makes sense, but if we were to just put the LIMIT inside the original subquery (the one that select fields from selected tables) instead of limiting the result after all calculations, the resulting values would change and in a lot of cases be different from what one expects, all in all, this is an expected behaviour.

We are currently working out ways to make dataset previews more user-friendly when they become too heavy to calculate, we may as well consider limiting the portion of the original table(s) when querying the source for a preview

handgunman · 2024-05-22T12:31:30Z

Hi!
Since we are talking about a preview where basically GROUP BY ALL is done, there will be no difference in the result.

https://fiddle.clickhouse.com/acfbd276-d3eb-4899-bba5-824ec9690da4

Of course you can't do that in charts that don't have GROUP BY ALL. But in preview mode, need to do that.

Because the problem is not some heavy calculations that the user does. The problem is grouping by all fields, which the DBMS tries to perform on heavy tables.
In charts, there are usually several fields involved in the grouping and no such problem there.

Cheers!
Sergey

resure added the enhancement New feature or request label May 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory limit (total) exceeded in Dataset preview #135

Memory limit (total) exceeded in Dataset preview #135

handgunman commented May 3, 2024

KonstantAnxiety commented May 22, 2024

handgunman commented May 22, 2024

Memory limit (total) exceeded in Dataset preview #135

Memory limit (total) exceeded in Dataset preview #135

Comments

handgunman commented May 3, 2024

KonstantAnxiety commented May 22, 2024

handgunman commented May 22, 2024