"Table too large for JOIN" error on a query that does not use JOIN


select 
CASE 
    WHEN .....
    ELSE .....
END AS carrier,
count(vehicle_id) as cnt
from test.vehicle_info 
WHERE vehicle_id NOT IN (select hardware_id
                         from TABLE_DATE_RANGE(test.gps32_, DATE_ADD(CURRENT_TIMESTAMP(), -6, 'DAY'), DATE_ADD(CURRENT_TIMESTAMP(), -1, 'DAY')))
group by carrier
order by cnt

I get this error:

Query Failed
Error: Table too large for JOIN. Consider using JOIN EACH. For more details, please see https://developers.google.com/bigquery/docs/query-reference#joins
Job ID: red-road-574:job_e2o6sBjO9Dt5QrU_cRM2VHSRTso

What is the cause, and how can I fix it?

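@Hobbs's guess above is correct. Both SEMIJOIN (written as WHERE ... IN (...)) and ANTIJOIN (written as WHERE ... NOT IN (...)) are implemented as JOIN operations, so they hit the same table-size limit as an explicit JOIN. The workaround is to rewrite the query yourself as a join using JOIN EACH. That is: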

select 
CASE 
    WHEN .....
    ELSE .....
END AS carrier,
count(vi.vehicle_id) as cnt
from test.vehicle_info vi
LEFT OUTER JOIN EACH (select hardware_id FROM TABLE_DATE_RANGE(...)) hi
ON vi.vehicle_id = hi.hardware_id
WHERE hi.hardware_id is NULL
group by carrier
order by cnt
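
To check on a small scale that the LEFT OUTER JOIN ... IS NULL rewrite returns the same rows as NOT IN, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under stated assumptions: SQLite is not BigQuery, it has no JOIN EACH (a plain LEFT OUTER JOIN stands in for it), and the `gps` table and the sample rows are made up for the example.

```python
import sqlite3


def anti_join_demo():
    """Run the NOT IN query and its LEFT JOIN rewrite on toy data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical toy tables standing in for test.vehicle_info and the
    # TABLE_DATE_RANGE(...) subquery from the question.
    conn.executescript("""
        CREATE TABLE vehicle_info (vehicle_id INTEGER);
        CREATE TABLE gps (hardware_id INTEGER);
        INSERT INTO vehicle_info VALUES (1), (2), (3), (4);
        INSERT INTO gps VALUES (2), (4), (4);
    """)

    # Original form: anti-join expressed with NOT IN.
    not_in = conn.execute("""
        SELECT vehicle_id
        FROM vehicle_info
        WHERE vehicle_id NOT IN (SELECT hardware_id FROM gps)
        ORDER BY vehicle_id
    """).fetchall()

    # Rewritten form: keep only the rows with no match on the right side.
    left_join = conn.execute("""
        SELECT vi.vehicle_id
        FROM vehicle_info vi
        LEFT OUTER JOIN (SELECT hardware_id FROM gps) hi
        ON vi.vehicle_id = hi.hardware_id
        WHERE hi.hardware_id IS NULL
        ORDER BY vi.vehicle_id
    """).fetchall()

    conn.close()
    return not_in, left_join


if __name__ == "__main__":
    not_in, left_join = anti_join_demo()
    print(not_in)
    print(left_join)
```

Both queries return the vehicles that have no matching hardware_id, and duplicate hardware_id values on the right side do not change the result because every matched row is filtered out by the IS NULL condition.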
