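Create a table with two columns of type RECORD

I'm using BigQuery, and I want to create a job that populates a table with columns of type RECORD. The data will be populated by a query. Is it possible to create a table with two columns of type RECORD?

Something like the table [bigquery-public-data:samples.trigrams] in the BigQuery public datasets.

Thanks!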

The easiest way to make a query emit its output as records is to use a JavaScript UDF.

For example:

SELECT *
FROM js(
(
  SELECT item
  FROM [fh-bigquery:wikidata.latest_raw] 
),
item,
"[{name: 'id', type:'string'},
  {name: 'sitelinks', type:'record', mode:'repeated', fields: [{name: 'site', type: 'string'},{name: 'title', type: 'string'},{name: 'encoded', type: 'string'}]},
  ]",
  "function(r, emit) {
    [...]
emit({
    id: obj.id,
    sitelinks: sitelinks,
    });  
  }")

See https://github.com/fhoffa/code_snippets/blob/master/wikidata/create_wiki_en_table.sql for the full example.

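With the introduction of BigQuery Standard SQL, there is an easy way to work with records. Try the query below, and don't forget to uncheck the Use Legacy SQL checkbox under Show Options.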

WITH YourTable AS (
  SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 2 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 2 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 3 AS a, 2 AS b, 3 AS c, 13 AS x, 12 AS y, 13 AS z
)
SELECT 
  a, ARRAY_AGG(STRUCT(b, c)) AS aa, 
  x, ARRAY_AGG(STRUCT(y, z)) AS xx
FROM YourTable
GROUP BY a, x
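To make the shape of that result easier to picture, here is a minimal local sketch in plain JavaScript that mimics what GROUP BY a, x combined with ARRAY_AGG(STRUCT(...)) does to the sample rows. It is an illustration only, not BigQuery code; the helper name groupIntoRecords is hypothetical, and BigQuery itself does not guarantee the order of the output rows or of the elements inside each array.

```javascript
// Local illustration only: mimics GROUP BY a, x with ARRAY_AGG(STRUCT(...)).
// groupIntoRecords is a hypothetical helper, not part of any BigQuery API.
const rows = [
  { a: 1, b: 2, c: 3, x: 11, y: 12, z: 13 },
  { a: 1, b: 2, c: 3, x: 11, y: 12, z: 13 },
  { a: 2, b: 2, c: 3, x: 11, y: 12, z: 13 },
  { a: 2, b: 2, c: 3, x: 12, y: 12, z: 13 },
  { a: 3, b: 2, c: 3, x: 12, y: 12, z: 13 },
  { a: 3, b: 2, c: 3, x: 12, y: 12, z: 13 },
  { a: 3, b: 2, c: 3, x: 13, y: 12, z: 13 }
];

function groupIntoRecords(inputRows) {
  const groups = new Map();
  for (const r of inputRows) {
    // One output row per distinct (a, x) pair.
    const key = `${r.a}|${r.x}`;
    if (!groups.has(key)) {
      groups.set(key, { a: r.a, aa: [], x: r.x, xx: [] });
    }
    const g = groups.get(key);
    g.aa.push({ b: r.b, c: r.c }); // plays the role of ARRAY_AGG(STRUCT(b, c))
    g.xx.push({ y: r.y, z: r.z }); // plays the role of ARRAY_AGG(STRUCT(y, z))
  }
  return Array.from(groups.values());
}

const result = groupIntoRecords(rows);
console.log(JSON.stringify(result, null, 2));
```

Each output row carries a, x, and two repeated records (aa and xx), which is the "two columns of type RECORD" layout asked about in the question.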

A similar result can be achieved in BigQuery Legacy SQL with the following code:

SELECT *
FROM JS( 
  ( // input table 
  SELECT 
    a, GROUP_CONCAT(CONCAT(STRING(b), ';', STRING(c))) AS aa, 
    x, GROUP_CONCAT(CONCAT(STRING(y), ';', STRING(z))) AS xx
  FROM 
    (SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
    (SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
    (SELECT 2 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
    (SELECT 2 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
    (SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
    (SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
    (SELECT 3 AS a, 2 AS b, 3 AS c, 13 AS x, 12 AS y, 13 AS z)
  GROUP BY a,x
  ), 
  a, aa, x, xx, // input columns 
  "[ // output schema 
  {name: 'a', type:'integer'},
  {name: 'aa', type:'record', mode:'repeated', 
  fields: [
    {name: 'b', type: 'integer'},
    {name: 'c', type: 'integer'}
    ]},
  {name: 'x', type:'integer'},
  {name: 'xx', type:'record', mode:'repeated', 
  fields: [
    {name: 'y', type: 'integer'},
    {name: 'z', type: 'integer'}
    ]}
   ]", 
  "function(row, emit) { // function 
    var aa = []; 
    var aa1 = row.aa.split(',');
    for (var i = 0; i < aa1.length; i++) { 
      var aa2 = aa1[i].split(';');
      aa.push({b:parseInt(aa2[0]), c:parseInt(aa2[1])}); 
    }; 
    var xx = []; 
    var xx1 = row.xx.split(',');
    for (var i = 0; i < xx1.length; i++) { 
      var xx2 = xx1[i].split(';');
      xx.push({y:parseInt(xx2[0]), z:parseInt(xx2[1])}); 
    };
    emit({
      a: row.a, 
      aa: aa, 
      x: row.x,
      xx: xx
      }); 
  }"
)  

For this to work (in Legacy SQL), you need to set a destination table, check the Allow Large Results checkbox, and uncheck the Flatten Results checkbox (all under Show Options).
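Because the row-processing function is ordinary JavaScript, it can be tried locally before it is pasted into the query string. The sketch below assumes Node.js; it restates the same split-and-parse logic with a hypothetical helper named splitPairs and a hypothetical wrapper named toRecords, and feeds it one sample row shaped like the output of the inner GROUP_CONCAT query.

```javascript
// Local check of the UDF logic, run outside BigQuery (assumes Node.js).
// splitPairs and toRecords are hypothetical names used only for this sketch.
function splitPairs(text, firstName, secondName) {
  // "2;3,2;3" -> [{b: 2, c: 3}, {b: 2, c: 3}] when called with 'b' and 'c'.
  return text.split(',').map(function (pair) {
    var parts = pair.split(';');
    var record = {};
    record[firstName] = parseInt(parts[0], 10);
    record[secondName] = parseInt(parts[1], 10);
    return record;
  });
}

function toRecords(row, emit) {
  emit({
    a: row.a,
    aa: splitPairs(row.aa, 'b', 'c'),
    x: row.x,
    xx: splitPairs(row.xx, 'y', 'z')
  });
}

// Sample input shaped like one row produced by the inner GROUP_CONCAT query.
var emitted = [];
toRecords({ a: 1, aa: '2;3,2;3', x: 11, xx: '12;13,12;13' }, function (r) {
  emitted.push(r);
});
console.log(JSON.stringify(emitted[0]));
```

Looping over each list by its own length, as splitPairs does, keeps the two repeated fields independent of each other.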
