You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When adding computed fields with aggregation to the dataset preview, a query is built that consumes excessive memory in Clickhouse.
A query of the form:
default
query: SELECT
sum(t1.sumfield) AS res_0,
...
t1.id AS res_4,
t1.datecommit AS res_5,
t1.ndssum AS res_6,
t1.numberfield AS res_7
...
FROM "default".cheks_v AS t1
GROUP BY
res_1,
res_2,
res_3,
res_4,
res_5,
res_6,
res_7,
...
In fact, grouping is performed on all fields of the sample.
And only then the limitation is done.
If the table size is comparable to the RAM size, an error occurs.
Since grouping by all fields is performed, you should change the query to this structure:
select sum(a), a,b,c
from (select a,b,c from t limit 10)
group by a,b,c
limit before group by
Otherwise, it is almost impossible to use the preview on large amounts of data.
The text was updated successfully, but these errors were encountered:
Hey @handgunman! Your suggestion totally makes sense, but if we were to just put the LIMIT inside the original subquery (the one that select fields from selected tables) instead of limiting the result after all calculations, the resulting values would change and in a lot of cases be different from what one expects, all in all, this is an expected behaviour.
We are currently working out ways to make dataset previews more user-friendly when they become too heavy to calculate, we may as well consider limiting the portion of the original table(s) when querying the source for a preview
Of course you can't do that in charts that don't have GROUP BY ALL. But in preview mode, need to do that.
Because the problem is not some heavy calculations that the user does. The problem is grouping by all fields, which the DBMS tries to perform on heavy tables.
In charts, there are usually several fields involved in the grouping and no such problem there.
When adding computed fields with aggregation to the dataset preview, a query is built that consumes excessive memory in Clickhouse.
A query of the form:
default
query: SELECT
sum(t1.sumfield) AS res_0,
...
t1.id AS res_4,
t1.datecommit AS res_5,
t1.ndssum AS res_6,
t1.numberfield AS res_7
...
FROM "default".cheks_v AS t1
GROUP BY
res_1,
res_2,
res_3,
res_4,
res_5,
res_6,
res_7,
...
In fact, grouping is performed on all fields of the sample.
And only then the limitation is done.
If the table size is comparable to the RAM size, an error occurs.
Since grouping by all fields is performed, you should change the query to this structure:
select sum(a), a,b,c
from (select a,b,c from t limit 10)
group by a,b,c
limit before group by
Otherwise, it is almost impossible to use the preview on large amounts of data.
The text was updated successfully, but these errors were encountered: