Impala - Convert columns to rows without using UNION ALL



I have a table of users and their weekly activity. For example, this is the user_activity table:

userid | wk1 | wk2 | wk3
u1     |  1  |  0  |  1
u2     |  0  |  1  |  0
u3     |  1  |  0  |  1

I would like to convert it to:

week   | active
wk1    |   2
wk2    |   1
wk3    |   2

I can achieve this with UNION ALL, like this:

SELECT 'wk1' as week,
       SUM( wk1 ) AS active
FROM user_activity
UNION ALL
SELECT 'wk2' as week,
       SUM( wk2 ) AS active
FROM user_activity
UNION ALL
SELECT 'wk3' as week,
       SUM( wk3 ) AS active
FROM user_activity;

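Is there a way to achieve the same result without using UNION ALL?

Thanks!

Edit: Impala version: 2.6.0

Reason for avoiding UNION ALL: for each SELECT, the whole table is scanned from HDFS. If the table is huge, this leads to OOM errors.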

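This should solve your performance issue.
The table is scanned only once.
The records are not duplicated x3.
The UNION ALL here is used only to build three single-row constants.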

select  concat('wk',cast(c.i as string))    as week
       ,case c.i 
            when 1 then wk1 
            when 2 then wk2
            when 3 then wk3
        end                 as active
from   (select  sum(wk1) AS wk1
               ,sum(wk2) AS wk2
               ,sum(wk3) AS wk3 
        from    user_activity
        ) t
        cross join (            select 1 as i
                    union all   select 2 
                    union all   select 3
                    ) c
;                    

+------+--------+
| week | active |
+------+--------+
| wk1  | 2      |
| wk2  | 1      |
| wk3  | 2      |
+------+--------+
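
To sanity-check the "aggregate once, then cross join a tiny generator" idea outside a cluster, here is a minimal sketch that runs an equivalent query against an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption, not Impala): it uses `||` instead of concat/cast, and the ORDER BY is added just to make the output deterministic. It demonstrates the result, not Impala's scan behavior.

```python
import sqlite3

# In-memory stand-in for the user_activity table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_activity (userid TEXT, wk1 INTEGER, wk2 INTEGER, wk3 INTEGER);
    INSERT INTO user_activity VALUES ('u1', 1, 0, 1), ('u2', 0, 1, 0), ('u3', 1, 0, 1);
""")

# The inner query aggregates the table once into a single row;
# the three-row generator c then picks one summed column per output row.
query = """
SELECT 'wk' || c.i AS week,
       CASE c.i WHEN 1 THEN t.wk1
                WHEN 2 THEN t.wk2
                WHEN 3 THEN t.wk3
       END AS active
FROM (SELECT SUM(wk1) AS wk1, SUM(wk2) AS wk2, SUM(wk3) AS wk3
      FROM user_activity) t
CROSS JOIN (SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3) c
ORDER BY week
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('wk1', 2), ('wk2', 1), ('wk3', 2)]
```

Because the cross join happens after the aggregation, the generator only multiplies one already-summed row, not the base table.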

You can try to unpivot and aggregate. This way, the user_activity table is read only once.

select
    w.week,
    sum(case w.week 
        when 'wk1' then wk1
        when 'wk2' then wk2
        when 'wk3' then wk3
    end) active
from user_activity u
cross join (
    select 'wk1' week union all
    select 'wk2' week union all
    select 'wk3' week 
) w group by w.week;

This produces:

+------+--------+
| week | active |
+------+--------+
| wk1  | 2      |
| wk2  | 1      |
| wk3  | 2      |
+------+--------+

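It needs only one aggregation instead of three. I use UNION ALL only to build the small list of week labels for the unpivot, not on the user table itself.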
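
As a quick check that the unpivot-and-aggregate query returns the same rows as the original UNION ALL version, the following sketch runs both against an in-memory SQLite database from Python. SQLite is an assumed stand-in for Impala, so this only compares results on the sample data and says nothing about how many times Impala scans the table.

```python
import sqlite3

# In-memory stand-in for the user_activity table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_activity (userid TEXT, wk1 INTEGER, wk2 INTEGER, wk3 INTEGER);
    INSERT INTO user_activity VALUES ('u1', 1, 0, 1), ('u2', 0, 1, 0), ('u3', 1, 0, 1);
""")

# Baseline from the question: one SELECT (and one table read) per week column.
union_all = """
SELECT 'wk1' AS week, SUM(wk1) AS active FROM user_activity
UNION ALL
SELECT 'wk2', SUM(wk2) FROM user_activity
UNION ALL
SELECT 'wk3', SUM(wk3) FROM user_activity
"""

# Unpivot: pair every user row with each week label, then aggregate per label.
cross_join = """
SELECT w.week,
       SUM(CASE w.week WHEN 'wk1' THEN u.wk1
                       WHEN 'wk2' THEN u.wk2
                       WHEN 'wk3' THEN u.wk3
           END) AS active
FROM user_activity u
CROSS JOIN (SELECT 'wk1' AS week UNION ALL
            SELECT 'wk2' UNION ALL
            SELECT 'wk3') w
GROUP BY w.week
"""

# Sort both result sets, since neither query guarantees row order.
baseline = sorted(conn.execute(union_all).fetchall())
unpivoted = sorted(conn.execute(cross_join).fetchall())
print(unpivoted == baseline, unpivoted)
```

Note the trade-off: the table is referenced once, but each user row is expanded into three rows before the GROUP BY.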

Ignoring the trivial answer of "use UNION", this question seems a bit absurd. But here is one way:

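with nounionall as (
      -- build the three week labels from the first three rows of the table;
      -- filtering on rn (rather than LIMIT without ORDER BY) guarantees rows 1-3
      select (case rn
                  when 1 then 'wk1'
                  when 2 then 'wk2'
                  when 3 then 'wk3'
              end) as week
      from (select row_number() over (order by userid) as rn
            from user_activity
           ) ua
      where rn <= 3
     )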
select nounionall.week,
       sum(case when nounionall.week = 'wk1' then wk1
                when nounionall.week = 'wk2' then wk2
                when nounionall.week = 'wk3' then wk3
           end) as actives
from nounionall cross join
     user_activity ua
group by nounionall.week

How about this very simple solution, without any union (note that it uses SQL Server's UNPIVOT syntax):

SELECT [week],active
FROM (
    SELECT SUM(wk1)wk1,SUM(wk2)wk2,SUM(wk3)wk3
    FROM user_activity)pvt
UNPIVOT ([active] FOR [Week] IN  (wk1,wk2,wk3)) unpvt
