在bigquery中对列表应用分区



我使用FARM_FINGERPRINT函数(ABS(MOD(FARM_FINGERPRINT((country)),4000))在事实表中对国家列应用了分区。现在,基于包含每个区域的国家列表的另一个表(table1)(例如,在"Europe"中我们有"France"、"Germany"、"Espagne"),我想运行一个查询来检测给定区域内的国家列表,并基于应用分区的列表运行where子句(以避免完全扫描)。但是当我运行这个查询时,没有应用分区:

WITH
step1 AS (
SELECT
ARRAY_AGG(ABS(MOD(FARM_FINGERPRINT((MULTIDIVISION_CLUSTER_CODE)),4000))) AS list
FROM (
SELECT
DISTINCT(MULTIDIVISION_CLUSTER_CODE) AS MULTIDIVISION_CLUSTER_CODE
FROM
`project.dataset.table1` table1
WHERE
table1.MULTIDIVISION_ZONE = "Europe" )),
step2 AS(
SELECT
*
FROM
`project.dataset.table2`
WHERE
_hash_partition IN UNNEST((select list from step1))
)
SELECT
*
FROM
step2

如果我替换了"_hash_partition IN UNNEST((select list from step1)"with "_hash_partition IN (2591,287,3623,1537)"or "_hash_partition IN UNNEST(([2591,287,3623,1537]))"它工作(查询不做一个完整的扫描)

表:(地区,国家)

表:(日期,区域,国家,_hash_partition,度量)

您可以尝试下面的动态sql。FORMAT函数将生成与您所说的相同的查询。

-- simplified query for *step1* CTE.
CREATE TEMP TABLE step1 AS SELECT [2591,287,3623,1537] AS list;
EXECUTE IMMEDIATE FORMAT("""
WITH step2 AS (
SELECT
*
FROM
`project.dataset.table2`
WHERE
_hash_partition IN UNNEST(%s)
)
SELECT * FROM step2;
""", (SELECT FORMAT('%t', list) FROM step1));

最新更新