How do I use an OVER (PARTITION BY ...) query in SQL to get the current, average, and maximum values?



I have this table, where each row shows a point done by a device in an area and at a specific location.

working_date    device   points   area   location
19-06-2020        a        1       x       xa   
19-06-2020        a        2       x       xa 
19-06-2020        a        3       x       xa 
19-06-2020        a        4       x       xa
20-06-2020        a        5       x       xa
20-06-2020        a        6       x       xa
20-06-2020        a        7       x       xa
20-06-2020        a        8       x       xa
20-06-2020        a        9       x       xa

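I would like to get the current, average, and maximum quantity of points, grouped by area and location. If I choose any day, the current quantity should show the quantity for the most recent working date. The average quantity should show the overall average across the days the device worked. Lastly, the maximum quantity should show the overall maximum number of points the device has done in a day.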

Based on the table above, if I choose 21-06-2020, the expected result is:

working_date  area  location   device   current_qty  avg_qty   max_qty
21-06-2020     x       xa        a         5           4,5        5

The average quantity comes from total_qty / total_of_date, while the maximum quantity is the largest daily quantity across all dates.
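To make that arithmetic concrete, here is a minimal sketch in PostgreSQL-style SQL. It inlines the nine sample rows in a CTE named sample (a hypothetical name used only for this illustration) and assumes that "quantity" means the number of rows per working date.

```sql
-- Sample rows from the table above, inlined so the query is self-contained.
WITH sample (working_date, device, points, area, location) AS (
    VALUES
        (DATE '2020-06-19', 'a', 1, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 2, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 3, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 4, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 5, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 6, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 7, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 8, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 9, 'x', 'xa')
),
daily AS (
    -- One row per working date: 4 rows on 19-06-2020, 5 rows on 20-06-2020.
    SELECT working_date, COUNT(*) AS qty
    FROM sample
    GROUP BY working_date
)
SELECT
    SUM(qty)                           AS total_qty,      -- 4 + 5 = 9
    COUNT(*)                           AS total_of_date,  -- 2 distinct dates
    CAST(SUM(qty) AS numeric) / COUNT(*) AS avg_qty,      -- 9 / 2 = 4.5
    MAX(qty)                           AS max_qty         -- 5
FROM daily;
```

The cast to numeric is there so that the division is not truncated to an integer on engines where both operands would otherwise be integers.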

The query I have built so far is:
select t1.working_date, t1.device, t1.area, t1.location, t1.points_qty, t1.total_date,
sum(t1.pile_qty) over(partition by t1.working_date) / sum(t1.total_date) over(partition by t1.working_date) as avg_qty,
max(t1.pile_qty) over(partition by t1.working_date) as max_qty
from (
select working_date, device, points, area, location, count(points) as points_qty, count(distinct working_date) as total_date 
from table1 group by device, area, location
group by working_date, device, points, area, location) t1
group by working_date, device, points, area, location, pile_qty, total_date
With the query above, I get:
working_date  area  location   device   current_qty  avg_qty   max_qty
21-06-2020     x       xa        a         5           5          5

How should I write my query to get the desired result?

Thanks in advance.

I think I have a solution. However, I am not sure whether it gives the correct result in different scenarios. Below is my code; please also see the DB-FIDDLE link.

WITH CTE AS
(
SELECT working_date,area,location,device, 
COUNT(working_date) GrpCount
FROM MYTable 
GROUP BY working_date,area,location,device

),y AS
(SELECT area,location,device,GrpCount,
(SELECT GrpCount FROM CTE WHERE working_date<TO_DATE('21-06-2020','DD-MM-YYYY') ORDER BY working_date DESC LIMIT 1)  current_qty  
FROM CTE
)
SELECT TO_DATE('21-06-2020','DD-MM-YYYY'),area,location,device, 
MAX(current_qty) current_qty,
string_agg(GrpCount::text, ',') avg_qty,
Max(GrpCount) max_qty
FROM Y
GROUP BY area,location,device

Note: for current_qty I used your input date 21-06-2020 in the subquery (SELECT GrpCount FROM CTE WHERE working_date<TO_DATE('21-06-2020','DD-MM-YYYY') ORDER BY working_date DESC LIMIT 1) current_qty to find the current quantity. It gives your expected result. Please check the code with different date ranges and data.
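One thing to check: string_agg(GrpCount::text, ',') returns the daily counts joined into a text value ('4,5' for the sample data, in no guaranteed order), which only looks like the decimal 4,5. If avg_qty is meant to be a numeric average, a hedged variant is to aggregate with AVG instead. The sketch below inlines the sample rows so it runs on its own; it is an illustration under that assumption, not a replacement tested against other data.

```sql
WITH MYTable (working_date, device, points, area, location) AS (
    VALUES
        (DATE '2020-06-19', 'a', 1, 'x', 'xa'), (DATE '2020-06-19', 'a', 2, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 3, 'x', 'xa'), (DATE '2020-06-19', 'a', 4, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 5, 'x', 'xa'), (DATE '2020-06-20', 'a', 6, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 7, 'x', 'xa'), (DATE '2020-06-20', 'a', 8, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 9, 'x', 'xa')
),
CTE AS (
    -- Same grouping as in the answer: one count per date and device.
    SELECT working_date, area, location, device,
           COUNT(working_date) AS GrpCount
    FROM MYTable
    GROUP BY working_date, area, location, device
)
SELECT area, location, device,
       AVG(GrpCount) AS avg_qty,  -- numeric mean of the daily counts: (4 + 5) / 2
       MAX(GrpCount) AS max_qty   -- largest daily count
FROM CTE
GROUP BY area, location, device;
```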

demo:db<>fiddle

SELECT
*,
AVG(current_qty) OVER () as avg_qty,             -- 2
MAX(current_qty) OVER () as max_qty
FROM (
SELECT 
working_date,
area,
location,
device,
COUNT(*) as current_qty                      -- 1
FROM mytable
GROUP BY working_date, device, area, location    -- 1
) s
WHERE working_date <= '2020-06-21'                   -- 3
ORDER BY working_date DESC
LIMIT 1
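  1. Normal grouping by the working_date values to calculate the qty for each date.
  2. Using the qty values of the entire grouped data set, add the avg and max qty values to each record with window functions over the whole result set (an empty OVER ()).
  3. Find the latest record for the given date: filter all records with the same or an earlier date, order the most recent of these dates to the top, and use LIMIT to return only the topmost one.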

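The grouping only works as intended if the area, location, and device values are the same for every record of a date, as in the example. If they differ, you can use COUNT() as a window function instead of a group aggregate to add the value to each record: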

demo:db<>fiddle

SELECT
*,
AVG(current_qty) OVER () as avg_qty,
MAX(current_qty) OVER () as max_qty
FROM (
SELECT 
working_date,
area,
location,
device,
COUNT(*) OVER (PARTITION BY working_date) as current_qty
FROM mytable
) s
WHERE working_date <= '2020-06-21'
ORDER BY working_date DESC
LIMIT 1

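However, in this case it is not clear which of the five records of the 2020-06-20 group should be taken. You have to apply your own ordering criteria to sort the expected record to the top.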
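One way to settle which record comes out on top is to add a tie-breaker to the ORDER BY. The sketch below assumes that the record with the highest points value is the one wanted; that criterion is purely an assumption for illustration, since the question does not specify one. The sample rows are inlined under the name mytable so the query runs on its own.

```sql
WITH mytable (working_date, device, points, area, location) AS (
    VALUES
        (DATE '2020-06-19', 'a', 1, 'x', 'xa'), (DATE '2020-06-19', 'a', 2, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 3, 'x', 'xa'), (DATE '2020-06-19', 'a', 4, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 5, 'x', 'xa'), (DATE '2020-06-20', 'a', 6, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 7, 'x', 'xa'), (DATE '2020-06-20', 'a', 8, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 9, 'x', 'xa')
)
SELECT
    *,
    AVG(current_qty) OVER () AS avg_qty,
    MAX(current_qty) OVER () AS max_qty
FROM (
    SELECT
        working_date, area, location, device, points,
        COUNT(*) OVER (PARTITION BY working_date) AS current_qty
    FROM mytable
) s
WHERE working_date <= '2020-06-21'
ORDER BY working_date DESC,
         points DESC          -- assumed tie-breaker: highest points value first
LIMIT 1;
```

Note that in this window variant AVG runs over every source row rather than over one row per date, so dates with more records weigh more heavily and avg_qty can differ from the grouped version.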
