GCP Bigquery,我想在一列中按分类值聚合一些值



我有这些要分析的数据,模式如下

timestamp           price volume Qualifiers TradeCatogary
------------------- ----- ------ ---------- -------------
2016-07-11 01:00:00     3   2323          U           OML
2016-07-11 01:02:03   2.5    434          K           KCL
2016-07-11 01:03:34     4   3244          U           KCL
2016-07-11 01:23:00   2.3     45          K           OML
...

我想要计算的是每5分钟的总份额、交易价值、交易计数和平均价格(VWAP = dollarTraded/totalVolume)。期望的模式如下

timestamp tradeCount totalVolume dollarTraded VWAP
--------- ---------- ----------- ------------ ----

这并不难,我用这样的SQL来管理它

SELECT
TIMESTAMP_SECONDS(CAST((ROUND(UNIX_SECONDS(Date_Time) / 300) * 300) AS int64)) AS interval_alias1,
count(*) AS cnt,
SUM(Volume) AS ShareVolumeTraded,
SUM(Price * Volume) AS DollarVolumeTraded,
(SUM(Price * Volume) / SUM(Volume)) AS VWAP
FROM
`dbd-sdlc-prod.HKG_NORMALISED.HKG_NORMALISED`
WHERE
RIC = '1606.HK'
AND Type="Trade"
AND (Date_Time BETWEEN TIMESTAMP('2016-07-11 00:00:00.000000') AND
TIMESTAMP('2016-07-11 23:59:59.999999'))
AND Volume >0
AND Price >0
GROUP BY 
ROUND(UNIX_SECONDS(Date_Time) / 300), interval_alias1
ORDER BY interval_alias1

然而,我想进一步推动我的分析,我想要的不仅仅是交易的总摘要,我想有更具体的分析基于不同组合的限定符和tradecatary,如:

count_U volume_U dollar_U VWAP_U count_U_OML volume_U_OML dollar_U_OML VWAP_U_OML ....
there are two qualifiers and two catogaries in the example so there will be:
3(basic) + 2 * 3(only qualifier) + 3 * 2 * 2(qualifier and catogary combination) = 21 columns
If there are no such trade in these time slot, the value will be left 0 as default

我希望用一个SQL执行来管理这些查询,我不确定如何解决它,请给我一个提示或一些,提前感谢

正如评论中提到的,Group By Rollup适合您的情况。Group By Rollup对上述列的所有可能组合返回Group By的结果。由于您还希望输出为列而不是行,因此可以使用数据透视表将行更改为列。此外,对于您的其他要求,即连接两列的输出并使其成为单列,CONCAT函数可以使用。

最新更新