How to efficiently GROUP BY every combination of columns in HiveQL



For example, I have a table named Roll:

ID    Name    Address
---------------------
01    Lily    NewYork
02    Lucy    NewYork
03    Lucy    NewYork

I want COUNT(1) for every GROUP BY combination of the Name and Address columns:

SELECT Name, Address, COUNT(1) FROM Roll GROUP BY Name, Address
+
SELECT Name, COUNT(1) FROM Roll GROUP BY Name
+
SELECT Address, COUNT(1) FROM Roll GROUP BY Address
+
SELECT COUNT(1) FROM Roll

The following SQL does what I want; "##" stands for "GROUP BY NONE", meaning the column is not part of the grouping:

SELECT Name, Address, COUNT(1) FROM (
SELECT Name, Address FROM Roll
UNION ALL
SELECT '##', Address FROM Roll
UNION ALL
SELECT Name, '##' FROM Roll
UNION ALL
SELECT '##', '##' FROM Roll) t
GROUP by Name, Address;

Result:

+------+---------+----------+
| Name | Address | COUNT(1) |
+------+---------+----------+
| ##   | ##      |        3 |
| ##   | NewYork |        3 |
| Lily | ##      |        1 |
| Lily | NewYork |        1 |
| Lucy | ##      |        2 |
| Lucy | NewYork |        2 |
+------+---------+----------+

Is there a more efficient way than the approach above?

Thanks.

Are you looking for subtotals? If so, you can get them with GROUPING SETS and CUBE/ROLLUP. See this wiki page on grouping.

SELECT coalesce(Name,"##"), coalesce(Address,"##"), count(1)
FROM ROLL
GROUP BY Name, Address with cube;

I think this is what you want :)
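
For two columns, WITH CUBE should be shorthand for listing all four grouping sets explicitly. A minimal sketch of that spelled-out form, assuming the same Roll table; the aliases name_key, address_key and cnt are only illustrative and do not come from the original post:

```sql
-- Explicit equivalent of WITH CUBE for two columns.
-- Each grouping set corresponds to one of the four queries in the question.
SELECT coalesce(Name, '##')    AS name_key,     -- '##' marks "not grouped by Name"
       coalesce(Address, '##') AS address_key,  -- '##' marks "not grouped by Address"
       count(1)                AS cnt
FROM Roll
GROUP BY Name, Address
GROUPING SETS (
    (Name, Address),  -- GROUP BY Name, Address
    (Name),           -- GROUP BY Name
    (Address),        -- GROUP BY Address
    ()                -- no grouping: total row count
);
```

Listing the sets explicitly lets you drop combinations you do not need. Note that coalesce also relabels genuine NULL values in Name or Address as '##', so the output only matches the UNION ALL version when those columns contain no NULLs.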
