I'm running a query similar to this:
SELECT date, group, COUNT(a.user_id) as count
FROM (
  SELECT user_id, DATE(timestamp) as date
  FROM [db.log_2015_08]
  GROUP EACH BY user_id, date) as a
JOIN EACH (
  SELECT user_id, group
  FROM [db.users]) as b
ON a.user_id = b.user_id
GROUP BY date, group
and I get the following error:
Error: Shuffle reached broadcast limit for table __I0 (broadcasted at least 137619698 bytes). Consider using partitioned joins instead of broadcast joins.
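I've been using BigQuery for a while, and this is a new one for me! It's a fairly large join, but I feel like I've done similar ones before without hitting this error.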
Try changing GROUP EACH BY to GROUP BY. The EACH keyword is no longer needed for aggregation, and in fact the query planner can be a bit smarter if you remove it.
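
Applied to the query from the question, that change would look roughly like the sketch below. It is untested against the actual tables, and it only touches the inner aggregation: JOIN EACH is left as written, since the suggestion above concerns GROUP EACH BY only. Whether this alone gets the query under the broadcast limit will depend on the data.

```sql
-- Same query as in the question (BigQuery legacy SQL),
-- with GROUP EACH BY replaced by plain GROUP BY in the subquery.
SELECT date, group, COUNT(a.user_id) as count
FROM (
  SELECT user_id, DATE(timestamp) as date
  FROM [db.log_2015_08]
  GROUP BY user_id, date) as a   -- was: GROUP EACH BY user_id, date
JOIN EACH (
  SELECT user_id, group
  FROM [db.users]) as b
ON a.user_id = b.user_id
GROUP BY date, group
```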