我有这些要分析的数据,模式如下
timestamp price volume Qualifiers TradeCatogary
------------------- ----- ------ ---------- -------------
2016-07-11 01:00:00 3 2323 U OML
2016-07-11 01:02:03 2.5 434 K KCL
2016-07-11 01:03:34 4 3244 U KCL
2016-07-11 01:23:00 2.3 45 K OML
...
我想要计算的是每5分钟的总份额、交易价值、交易计数和平均价格(VWAP = dollarTraded/totalVolume)。期望的模式如下
timestamp tradeCount totalVolume dollarTraded VWAP
--------- ---------- ----------- ------------ ----
这并不难,我用这样的SQL来管理它
SELECT
TIMESTAMP_SECONDS(CAST((ROUND(UNIX_SECONDS(Date_Time) / 300) * 300) AS int64)) AS interval_alias1,
count(*) AS cnt,
SUM(Volume) AS ShareVolumeTraded,
SUM(Price * Volume) AS DollarVolumeTraded,
(SUM(Price * Volume) / SUM(Volume)) AS VWAP
FROM
`dbd-sdlc-prod.HKG_NORMALISED.HKG_NORMALISED`
WHERE
RIC = '1606.HK'
AND Type="Trade"
AND (Date_Time BETWEEN TIMESTAMP('2016-07-11 00:00:00.000000') AND
TIMESTAMP('2016-07-11 23:59:59.999999'))
AND Volume >0
AND Price >0
GROUP BY
ROUND(UNIX_SECONDS(Date_Time) / 300), interval_alias1
ORDER BY interval_alias1
然而,我想进一步推动我的分析,我想要的不仅仅是交易的总摘要,我想有更具体的分析基于不同组合的限定符和tradecatary,如:
count_U volume_U dollar_U VWAP_U count_U_OML volume_U_OML dollar_U_OML VWAP_U_OML ....
there are two qualifiers and two catogaries in the example so there will be:
3(basic) + 2 * 3(only qualifier) + 3 * 2 * 2(qualifier and catogary combination) = 21 columns
If there are no such trade in these time slot, the value will be left 0 as default
我希望用一个SQL执行来管理这些查询,我不确定如何解决它,请给我一个提示或一些,提前感谢
正如评论中提到的,Group By Rollup适合您的情况。Group By Rollup对上述列的所有可能组合返回Group By的结果。由于您还希望输出为列而不是行,因此可以使用数据透视表将行更改为列。此外,对于您的其他要求,即连接两列的输出并使其成为单列,CONCAT函数可以使用。