,你好我在bigquery中使用这个数据集/查询:
数据集/查询
select * from fh-bigquery.reddit.subreddits limit 10;
我被要求编写一个查询,将使用上述表插入一个模式,这将在新表中具有以下JSON模式结构:
schema: {
fields: [
{
mode: NULLABLE,
name: dt,
type: DATE
},
{
mode: NULLABLE,
name: num_comments,
type: INTEGER
},
{
mode: NULLABLE,
name: posts,
type: INTEGER
},
{
mode: NULLABLE,
name: ups,
type: INTEGER
},
{
mode: NULLABLE,
name: downs,
type: INTEGER
},
{
fields: [
{
mode: NULLABLE,
name: ups,
type: INTEGER
},
{
mode: NULLABLE,
name: downs,
type: INTEGER
}
],
mode: REPEATED,
name: subreddit_metrics,
type: RECORD
}
]
},
subreddit_metrics
字段按照上面的JSON嵌套。
这个查询来自BigQuery文档,它向我展示了如何为表创建嵌套字段:
CREATE TABLE IF NOT EXISTS mydataset.mytable(
id STRING,
first_name STRING,
last_name STRING,
dob DATE,
addresses
ARRAY<
STRUCT<
status STRING,
address STRING,
city STRING,
state STRING,
zip STRING,
numberOfYears STRING>>)
OPTIONS (description = 'Example name and addresses table')
根据原始请求,要编写一个基于上面原始数据集/查询插入模式的查询,我不能在SELECT语句中创建一个嵌套字段来创建一个带有嵌套字段的新表。像这样:
CREATE TABLE
mydataset.test AS
SELECT
subreddit ARRAY< STRUCT< ups STRING,
downs STRING,
FROM
fh-bigquery.reddit.subreddits;
Error: Syntax error: Expected end of input but got keyword ARRAY at [12:15]
问题:
1. Am I understanding the question correctly, in regards to writing a query that will INSERT a schema from another table in question?
2. If my understanding of #1 is correct, how can I INSERT a schema from another table,with the right nesting, I would think using a CREATE statement with the help from a SELECT statement, right?
提前谢谢你。
以下是我的最佳猜测。subreddit_metrics
模式可由ARRAY_AGG(STRUCT(ups, downs))
制备。
SELECT subr AS subreddit,
created_utc AS dt,
num_comments,
c_posts AS posts,
ups,
downs,
ARRAY_AGG(STRUCT(ups, downs)) OVER(PARTITION BY subr) AS subreddit_metrics
FROM `fh-bigquery.reddit.subreddits`
;