使用单个表中的PARTITION对多列求和



我有一个问题,看起来很简单,但我想不通。我有一张这样的样品表:

Overtime Table (OT)
+----------+------------+----------+-------------+
|EmployeeId|OvertimeDate|HourMargin|OvertimePoint|
+----------+------------+----------+-------------+
|         1|  2020-07-01|     05:00|           15|
|         1|  2020-07-02|     03:00|            9|
|         2|  2020-07-01|     01:00|            3|
|         2|  2020-07-03|     03:00|            9|
|         3|  2020-07-06|     03:00|            9|
|         3|  2020-07-07|     01:00|            3|
+----------+------------+----------+-------------+
OLC Table (OLC)
+----------+------------+-----+------+
|EmployeeId|   OLCDate  | OLC | Trip |
+----------+------------+-----+------+
|         1|  2020-07-01|    2|     0|
|         3|  2020-07-13|    3|     6|
+----------+------------+-----+------+

因此,基于这些表,我想计算OT.HourMargin、OT.OTPoint、OLC.OLC和OLC.Trip的总和,最终结果如下:

Result
+----------+-----------+----------+--------+----------+
|EmployeeId|TotalMargin|TotalPoint|TotalOLC|TotalPoint|
+----------+-----------+----------+--------+----------+
|         1|      08:00|        24|       2|         0|
|         2|      04:00|        12|       0|         0|
|         3|      04:00|        24|       3|         6|
+----------+-----------+----------+--------+----------+

以下是我试图实现的查询结果:

DECLARE @Overtime TABLE (
    EmployeeId      INT,
    OvertimeDate    DATE,
    HourMargin      TIME,
    OvertimePoint   INT
)
DECLARE @OLC TABLE (
    EmployeeId      INT,
    OLCDate         DATE,
    OLC             INT,
    Trip            INT
)
INSERT INTO @Overtime VALUES (1, '2020-07-01', '05:00:00', 15)
INSERT INTO @Overtime VALUES (1, '2020-07-02', '03:00:00', 9)
INSERT INTO @Overtime VALUES (2, '2020-07-01', '01:00:00', 3)
INSERT INTO @Overtime VALUES (2, '2020-07-03', '03:00:00', 9)
INSERT INTO @Overtime VALUES (3, '2020-07-06', '03:00:00', 9)
INSERT INTO @Overtime VALUES (3, '2020-07-07', '01:00:00', 3)
INSERT INTO @OLC VALUES (1, '2020-07-01', 2, 0)
INSERT INTO @OLC VALUES (3, '2020-07-13', 3, 6)
SELECT
    OT.EmployeeId,
    CONVERT(TIME, DATEADD(MS, (SUM(DATEDIFF(MS, '00:00:00.000', OT.HourMargin)) OVER (PARTITION BY OT.EmployeeId)), '00:00:00.000')) AS TotalMargin,
    SUM(OT.OvertimePoint) OVER (PARTITION BY OT.EmployeeId) AS TotalPoint,
    SUM(OLC.OLC) OVER (PARTITION BY OLC.EmployeeId) AS TotalOLC,
    SUM(OLC.Trip) OVER (PARTITION BY OLC.EmployeeId) AS TotalTrip
FROM
    @Overtime OT
    LEFT JOIN @OLC OLC ON OLC.EmployeeId = OT.EmployeeId
                          AND OLC.OLCDate = OT.OvertimeDate
ORDER BY
    EmployeeId

以下是我的查询结果:

+----------+-----------+----------+--------+----------+
|EmployeeId|TotalMargin|TotalPoint|TotalOLC|TotalPoint|
+----------+-----------+----------+--------+----------+
|         1|      08:00|        24|    NULL|      NULL|
|         1|      08:00|        24|       2|         0|
|         2|      04:00|        12|    NULL|      NULL|
|         2|      04:00|        12|    NULL|      NULL|
|         3|      04:00|        12|    NULL|      NULL|
|         3|      04:00|        12|    NULL|      NULL|
+----------+-----------+----------+--------+----------+

当我尝试对单个表中的多列求和时,似乎会在最终结果中创建多行。现在,我想到的是使用CTE,将多列分离为多个CTE,并从所有CTE中进行查询。或者甚至尝试创建临时表/表变量,查询每列的总和并存储/更新它

那么,知道如何在不使用多个CTE或临时表的情况下实现我的结果吗?

谢谢

您希望将属于同一EmployeeID的行分组在一起,因此这意味着聚合而不是窗口函数:

SELECT
    OT.EmployeeId,
    CONVERT(TIME, DATEADD(MS, SUM(DATEDIFF(MS, '00:00:00.000', OT.HourMargin)), '00:00:00.000')) AS TotalMargin,
    SUM(OT.OvertimePoint) AS TotalPoint,
    COALESCE(SUM(OLC.OLC), 0) AS TotalOLC,
    COALESCE(SUM(OLC.Trip), 0) AS TotalTrip
FROM @Overtime OT
LEFT JOIN @OLC OLC ON OLC.EmployeeId = OT.EmployeeId
GROUP BY OT.EmployeeId

我也看不出日期上的联接条件有什么意义,所以我删除了它。最后,对于没有OLC的行,可以使用coalesce()返回0

DB Fiddle上的演示

EmployeeId|TotalMargin|TotalPoint|TotalOLC|TotalTrip---------:|:---------------|--------:|-------:|--------1|08:00:00|24|4|02|04:00:00|12|0|03|04:00:00|12|6|12

您已经决定使用SUM OVER,但您遇到了"问题";多行的。。。这就是总和的作用;你可以想象做一个OVER(PARTITION..(做一个自动连接回驱动表的组,这样你就可以得到驱动表中的所有行以及求和的重复结果

这里有一个简单的数据集:

ProductID, Price
1, 100
1, 200
2, 300
2, 400

以下是一些查询和结果:

--perform a basic group and sum
SELECT ProductID, SUM(Price) S FROM x GROUP BY ProductID
1, 300
2, 700

--perform basic group/sum and join it back to the main table 
SELECT ProductID, Price, S
FROM
  x 
  INNER JOIN 
  (SELECT ProductID, SUM(Price) s FROM x GROUP BY ProductID) y 
  ON x.ProductID = y.ProductID
1, 100, 300
1, 200, 300
2, 300, 700
2, 400, 700

--perform a sum over, the partition here being the same as the earlier group
SELECT ProductID, Price, SUM(Price) OVER(PARTITION BY ProductID) FROM x
1, 100, 300
1, 200, 300
2, 300, 700
2, 400, 700

您可以看到后两者产生相同的结果,即附加了总计的额外行。如果你认为这就是数据库内部所做的,它可能会帮助你理解简单的窗口函数——;"分区";,使用它执行子查询组,并将结果连接回分区中的任何列

看起来你真正想要的是一个简单的小组:

SELECT
  OT.EmployeeId,
  CONVERT(TIME, DATEADD(MS, (SUM(DATEDIFF(MS, '00:00:00.000', OT.HourMargin))), '00:00:00.000')) AS TotalMargin,
  SUM(OT.OvertimePoint) AS TotalPoint,
  SUM(OLC.OLC) AS TotalOLC,
  SUM(OLC.Trip) AS TotalTrip
FROM @Overtime OT
LEFT JOIN @OLC OLC ON OLC.EmployeeId = OT.EmployeeId
                      AND OLC.OLCDate = OT.OvertimeDate
GROUP BY OT.EmployeeID

最新更新