BigQuery-选择多个列并希望排除两个双嵌套列



Hi我正在处理lt-pcf-analytics-exp.90676036.ga_sessions_*表,我需要提取不同的变量,包括嵌套命中列中除列hits.customDimensions.valuehits.customDimensions.index之外的所有变量。我认为命中率和hits.customDimensions都是排列。如何在标准SQL中做到这一点?

我已经发现了一个类似问题(BigQuery除了双嵌套列(,但在我的情况下,我有一个双嵌套数组列,我无法调整代码。

基本上,这就是我想要提取的。我如何修改它以排除hits.customDimensions.valuehits.customDimensions.index?非常感谢。

SELECT fullVisitorId,
visitId,
visitNumber,
cd.value as PCF_CUST_ID,
date,
TIMESTAMP_SECONDS(visitStartTime) as visitStartTime,
totals.visits as visits,
totals.hits as total_hits,
hits.* (EXCEPT hits.customDimensions.value and hits.customDimensions.index)
FROM `lt-pcf-analytics-exp.90676036.ga_sessions_*` as t
left join unnest(customDimensions) as cd
left join unnest(hits) as hits
WHERE _TABLE_SUFFIX between '20210101' and '20210131' 
and cd.index = 4 and cd.value is not null
ORDER BY PCF_CUST_ID, visitStartTime, hitNumber

就像@martinus注意到的那样,您的except语法不正确。如果您查看BigQuery文档,您会发现使用except运行查询的正确方法是:

SELECT 
field.* EXCEPT (nested_field1, nested_field2)
FROM `my_table`

但是,不能直接在嵌套字段上使用EXCEPT。作为一种变通方法,您可以从hits.*中排除所有hits.customDimensions值,然后仅针对hits.customDimensions.*排除SELECT,然后排除需要删除的嵌套元素,如indexvalue

下面这样的查询应该有效:

SELECT fullVisitorId,
visitId,
visitNumber,
cd.value as PCF_CUST_ID,
date,
TIMESTAMP_SECONDS(visitStartTime) as visitStartTime,
totals.visits as visits,
totals.hits as total_hits,
hits.* EXCEPT (hits.customDimensions),
hits.customDimensions.* EXCEPT (index, value)
FROM `lt-pcf-analytics-exp.90676036.ga_sessions_*` as t
left join unnest(customDimensions) as cd
left join unnest(hits) as hits
WHERE _TABLE_SUFFIX between '20210101' and '20210131' 
and cd.index = 4 and cd.value is not null
ORDER BY PCF_CUST_ID, visitStartTime, hitNumber

如果您检查BigQuery文档中的except,这不是一个好的语法:

SELECT [ AS { typename | STRUCT | VALUE } ] [{ ALL | DISTINCT }]
{ [ expression. ]* [ EXCEPT ( column_name [, ...] ) ]
[ REPLACE ( expression [ AS ] column_name [, ...] ) ]
| expression [ [ AS ] alias ] } [, ...]

所以,像这样使用它:

SELECT hits.* EXCEPT (value, index)

最新更新