我目前有两个查询,我在Bigquery中使用视图拼接在一起。我想知道是否有一种方法可以将它们合并到一个查询中,并消除对多个视图的需求。
第一个查询连接几个字符串来创建一个字段"id">
SELECT *, CONCAT(CAST(ga_sourcemedium AS string), CAST(ga_campaign AS string), CAST(ga_adcontent AS string), CAST(ga_country AS string), CAST(ga_region AS string), CAST(ga_devicecategory AS string), CAST(ga_date AS string)) AS id,FROM `table_name`
第二个查询根据id字段对记录进行重复数据删除。
SELECT DISTINCT orders.*
FROM `table_name` orders
INNER JOIN (
SELECT id,
MAX(_sdc_sequence) AS sequence
FROM `table_name`
GROUP BY id
) latest_orders
ON orders.id = latest_orders.id
AND orders._sdc_sequence = latest_orders.sequence
提前谢谢你
对于BigQuery,您可以使用" UNION ALL "语句,但是您需要在两个查询中遵循一些规则才能使用它。
规则如下:
- 两个查询必须具有相同的列数。
- 两个查询必须具有相同的数据类型。
可以使用" UNION ALL: "
查看示例代码select '1','11','3'
union all
select '3','4','5'
具有相同的列数和相同的数据类型。
在这种情况下,您可以看到错误的示例代码:Select
TY_Sales ,
LY_Sales ,
Comp
From
[project.dataset.table]
UNION ALL
TY_Visits ,
LY_Visits ,
Comp
From
[project.dataset.table]
UNION ALL
Select
TY_Orders ,
LY_Orders ,
Comp
From
[project.dataset.table]
查询无效,因为所有查询包含相同数量的列,但列的数据类型不同(前两个为整数,comp为字符串)。因此我们必须将其转换为相同的数据类型。
你可以在BigQuery中看到更多关于" UNION ALL "的信息。
您可以使用公共表表达式,这样结果是自上而下的,易于理解:
WITH orders AS (
SELECT
*,
CONCAT(CAST(ga_sourcemedium AS STRING), CAST(ga_campaign AS STRING),
CAST(ga_adcontent AS STRING), CAST(ga_country AS STRING),
CAST(ga_region AS STRING), CAST(ga_devicecategory AS STRING),
CAST(ga_date AS STRING)) AS id
FROM
`table_name`
),
latest_orders AS (
SELECT
id,
MAX(_sdc_sequence) AS sequence
FROM
orders
GROUP BY
1
),
final AS (
SELECT DISTINCT
o.*
FROM
orders AS o
INNER JOIN
latest_orders AS lo ON o.id = lo.id AND o._sdc_sequence = lo.sequence
)
SELECT * FROM final