SQL Impala - Aggregation with an analytic function results in a GROUP BY AnalysisException



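I have a query that should compute each product's share of sales within its category. To do that, I sum the sales per product and divide by an analytic function partitioned by category.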

SELECT product_id, SUM(sales)/(SUM(sales) OVER(PARTITION BY category))
FROM table1 
GROUP BY product_id

I get this error message:

AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?): sum(sales) / ((sum(sales) OVER (PARTITION BY category)))

Does anyone know how to fix this, and why it happens?

I think I have a good example that fits your question:

Sample table

+----------+---------+---------+--------+
| order_id | cust_id | empl_id | total  |
+----------+---------+---------+--------+
| 1        | c       | 1       | 24.78  |
| 2        | a       | 4       | 28.54  |
| 3        | b       | 3       | 48.69  |
| 4        | b       | 3       | -16.39 |
| 5        | z       | 2       | 29.92  |
| 6        | z       | 3       | 12.50  |
| 7        | a       | 4       | 10.20  |
| 8        | c       | 1       | 5.22   |
+----------+---------+---------+--------+

Query

SELECT order_id, total / (SUM(total) OVER(PARTITION BY cust_id)) AS percentage
FROM orders
ORDER BY order_id;

Result

+----------+------------+
| order_id | percentage |
+----------+------------+
| 1        | 0.82       |
| 2        | 0.73       |
| 3        | 1.50       |
| 4        | -0.50      |
| 5        | 0.70       |
| 6        | 0.29       |
| 7        | 0.26       |
| 8        | 0.17       |
+----------+------------+

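As you can see, for cust_id c, for example, there are two values, 24.78 and 5.22, so the sum is 30.00 and the percentages are 24.78 --> 0.82 and 5.22 --> 0.17 (the results are shown truncated to two decimal places).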
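The arithmetic can be checked with a small script. This is only a sketch: it assumes an in-memory SQLite database (version 3.25 or later, for window functions) as a stand-in for Impala, and loads the sample orders rows shown above.

```python
import sqlite3

# Assumption: SQLite (>= 3.25) stands in for Impala here; the query text is unchanged.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, cust_id TEXT, empl_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "c", 1, 24.78), (2, "a", 4, 28.54), (3, "b", 3, 48.69), (4, "b", 3, -16.39),
        (5, "z", 2, 29.92), (6, "z", 3, 12.50), (7, "a", 4, 10.20), (8, "c", 1, 5.22),
    ],
)

# Each order's total divided by the total of all orders from the same customer.
rows = conn.execute(
    """
    SELECT order_id, total / (SUM(total) OVER (PARTITION BY cust_id)) AS percentage
    FROM orders
    ORDER BY order_id
    """
).fetchall()

# Map order_id -> share of that customer's total, for easy lookup.
percentages = {order_id: pct for order_id, pct in rows}
for order_id, pct in rows:
    print(order_id, f"{pct:.4f}")
```

For order 1 this gives 24.78 / 30.00, about 0.826, and for order 8 it gives 5.22 / 30.00, about 0.174. Note that this query has no GROUP BY, so it computes shares per order row, not per aggregated product.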

Please check whether this is what you want.

You need to nest the aggregate function inside the window function:

SELECT product_id, category,
       SUM(sales) /
       SUM(SUM(sales)) OVER (PARTITION BY category)
       -- inner SUM(sales): aggregate function, computed per GROUP BY group
       -- outer SUM(...) OVER (...): window function, applied to those group totals
FROM table1
GROUP BY product_id, category;

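Combining aggregate and analytic functions looks odd at first (at least it did to me). Just remember that aggregate functions are evaluated first, so the inner SUM(sales) is the result of the aggregation. The other SUM() is the window function.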
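To make the evaluation order concrete, here is a minimal sketch that runs the nested form next to an explicit two-step form (aggregate in a derived table, then apply the window function) and compares the results. It assumes SQLite 3.25 or later as a stand-in for Impala; the rows in table1 and the alias names per_product, prod_sales and pct are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (>= 3.25) stands in for Impala; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (product_id INTEGER, category TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "A", 10.0), (1, "A", 30.0), (2, "A", 60.0),
        (3, "B", 5.0), (3, "B", 15.0), (4, "B", 30.0),
    ],
)

# Nested form: the inner SUM aggregates per group, the outer SUM ... OVER
# then sums those group totals within each category.
nested_rows = conn.execute(
    """
    SELECT product_id, category,
           SUM(sales) / SUM(SUM(sales)) OVER (PARTITION BY category) AS pct
    FROM table1
    GROUP BY product_id, category
    ORDER BY product_id
    """
).fetchall()

# Two-step form: the same logic with the aggregation made explicit in a derived table.
two_step_rows = conn.execute(
    """
    SELECT product_id, category,
           prod_sales / SUM(prod_sales) OVER (PARTITION BY category) AS pct
    FROM (
        SELECT product_id, category, SUM(sales) AS prod_sales
        FROM table1
        GROUP BY product_id, category
    ) AS per_product
    ORDER BY product_id
    """
).fetchall()

print(nested_rows)
print(two_step_rows)
```

With these rows, category A totals 100 (products 1 and 2 contribute 40 and 60) and category B totals 50 (products 3 and 4 contribute 20 and 30), and both queries return the same shares.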

Impala does not seem to support analytic functions used this way, so I solved the problem with a subquery:

SELECT DISTINCT t2.product_id, (t2.prod_sales / t2.tot_sales_cat) AS perc_cat_spend
FROM (
    -- one row per source row: the category total plus that product's total
    SELECT t1.product_id,
           SUM(t1.sales) OVER (PARTITION BY t1.category) AS tot_sales_cat,
           t4.prod_sales
    FROM table1 AS t1
    JOIN (
        SELECT product_id, SUM(sales) AS prod_sales
        FROM table1
        GROUP BY product_id
    ) AS t4 ON t4.product_id = t1.product_id
) AS t2;
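The sketch below shows why this workaround needs DISTINCT: the inner query runs the window function over the raw rows, so it returns one row per source row, not one per product. It assumes SQLite 3.25 or later as a stand-in for Impala, uses made-up rows for table1, divides the per-product total (prod_sales) by the category total, and assumes each product belongs to exactly one category.

```python
import sqlite3

# Assumption: SQLite (>= 3.25) stands in for Impala; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (product_id INTEGER, category TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "A", 10.0), (1, "A", 30.0), (2, "A", 60.0),
        (3, "B", 5.0), (3, "B", 15.0), (4, "B", 30.0),
    ],
)

# Inner query: category total via a window over raw rows, joined to per-product totals.
inner_sql = """
    SELECT t1.product_id,
           SUM(t1.sales) OVER (PARTITION BY t1.category) AS tot_sales_cat,
           t4.prod_sales
    FROM table1 AS t1
    JOIN (
        SELECT product_id, SUM(sales) AS prod_sales
        FROM table1
        GROUP BY product_id
    ) AS t4 ON t4.product_id = t1.product_id
"""
inner_rows = conn.execute(inner_sql).fetchall()

# Outer query: DISTINCT collapses the repeated rows to one row per product.
workaround_rows = conn.execute(
    f"""
    SELECT DISTINCT t2.product_id, t2.prod_sales / t2.tot_sales_cat AS perc_cat_spend
    FROM ({inner_sql}) AS t2
    ORDER BY t2.product_id
    """
).fetchall()

print(len(inner_rows), "rows before DISTINCT")
print(workaround_rows)
```

Here the inner query yields six rows for four products, and the outer DISTINCT reduces them to one share per product.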
