Pivot in BigQuery



当我在BigQuery中使用pivot函数时,下面的代码片段

CALL fhoffa.x.pivot(
'bigquery-public-data.ghcn_d.ghcnd_2019' # source table
, 'fh-bigquery.temp.test_pivotted' # destination table
, ['id', 'date'] # row_ids
, 'element' # pivot_col_name
, 'value' # pivot_col_value
, 30 # max_columns
, 'AVG' # aggregation
, 'LIMIT 10' # optional_limit
);

中心列名有一个前缀e_

<表类>id日期e_PRCPe_TMINe_TMAXe_SNOWtbody><<tr>127-01-20211258228-01-20213593329-01-20215741

更新5/12/2021Bigquery现在有原生的PIVOT支持,请查看这里的文档。


这是该过程的完整定义,您可以在自己的数据集中创建一个,并根据需要进行修改。您可以看到e_被添加到第一个EXECUTE IMMEDIATE中。

CREATE OR REPLACE PROCEDURE `fhoffa.x.pivot`(table_name STRING, destination_table STRING, row_ids ARRAY<STRING>, pivot_col_name STRING, pivot_col_value STRING, max_columns INT64, aggregation STRING, optional_limit STRING)
BEGIN
DECLARE pivotter STRING;
EXECUTE IMMEDIATE (
"SELECT STRING_AGG(' "||aggregation
||"""(IF('||@pivot_col_name||'="'||x.value||'", '||@pivot_col_value||', null)) e_'||fhoffa.x.normalize_col_name(x.value))
FROM UNNEST((
SELECT APPROX_TOP_COUNT("""||pivot_col_name||", @max_columns) FROM `"||table_name||"`)) x"
) INTO pivotter 
USING pivot_col_name AS pivot_col_name, pivot_col_value AS pivot_col_value, max_columns AS max_columns;
EXECUTE IMMEDIATE (
'CREATE OR REPLACE TABLE `'||destination_table
||'` AS SELECT '
||(SELECT STRING_AGG(x) FROM UNNEST(row_ids) x)
||', '||pivotter
||' FROM `'||table_name||'` GROUP BY '
|| (SELECT STRING_AGG(''||(i+1)) FROM UNNEST(row_ids) WITH OFFSET i)||' ORDER BY '
|| (SELECT STRING_AGG(''||(i+1)) FROM UNNEST(row_ids) WITH OFFSET i)
||' '||optional_limit
);
END;

最新更新