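I have a table that contains a game_id, a category, a date column representing a month of activity, expressed as the first day of that month (date_month, named date in the example table below), and a total amount.
In some months, a given game_id and date_month combination is missing categories. For each game and each month, I need to add a row for every category that appears anywhere in the table but is missing from that group.
Take the following example: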
CREATE TEMPORARY TABLE activity (
    game_id INT,
    category TEXT,
    date DATE,
    amount INT
);
INSERT INTO activity (game_id, category, date, amount) VALUES
(1, 'Up', '2015-12-01', 9)
, (1, 'Down', '2015-12-01', 12)
-- Left Missing for '2015-12-01'
-- Right Missing for '2015-12-01'
, (1, 'Up', '2016-01-01', 12)
, (1, 'Down', '2016-01-01', 4)
, (1, 'Left', '2016-01-01', 7)
, (1, 'Right', '2016-01-01', 3)
, (1, 'Up', '2016-02-01', 3)
, (1, 'Down', '2016-02-01', 11)
, (1, 'Left', '2016-02-01', 4)
, (1, 'Right', '2016-02-01', 8)
, (1, 'Up', '2016-03-01', 3)
, (1, 'Down', '2016-03-01', 11)
-- Left Missing for '2016-03-01'
, (1, 'Right', '2016-03-01', 8)
, (1, 'Up', '2016-04-01', 3)
, (1, 'Down', '2016-04-01', 11)
, (1, 'Left', '2016-04-01', 4)
-- Right Missing for '2016-04-01'
, (2, 'Up', '2020-12-01', 9)
, (2, 'Down', '2020-12-01', 12)
-- Left Missing for '2020-12-01'
-- Right Missing for '2020-12-01'
, (2, 'Up', '2020-01-01', 12)
, (2, 'Down', '2020-01-01', 4)
, (2, 'Left', '2020-01-01', 7)
-- Right Missing for '2020-01-01'
;
In this case, the following missing rows need to be created with an amount of 0. Note that each game_id can have a different set of dates.
(1, 'Left', '2015-12-01', 0)
(1, 'Right', '2015-12-01', 0)
(1, 'Left', '2016-03-01', 0)
(1, 'Right', '2016-04-01', 0)
(2, 'Left', '2020-12-01', 0)
(2, 'Right', '2020-12-01', 0)
(2, 'Right', '2020-01-01', 0)
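My attempt so far has been to use the query below in a UNION back onto the main table. It produces no rows, since the missing groups outside their min and max date range are not generated.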
SELECT
    game_id,
    category,
    generate_series(
        min(date_month),
        max(date_month),
        '1month'
    )::date AS date_month,
    0 as amount
FROM activity
WHERE NOT EXISTS (
    SELECT
        1
    FROM activity
    WHERE game_id=game_id AND category=category AND date_month=date_month
)
GROUP BY 1,2
ORDER BY game_id, date_month
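You must CROSS join the distinct combinations of game_id and date in the table to the distinct categories in the table, which gives every (game_id, date, category) row that should exist, and then LEFT join that result to the table itself. Combinations with no matching row come back with a NULL amount, which COALESCE turns into 0: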
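SELECT d.game_id, c.category, d.date, COALESCE(a.amount, 0) AS amount
-- every game/month pair that appears in the table
FROM (SELECT DISTINCT game_id, date FROM activity) d
-- paired with every category that appears anywhere in the table
CROSS JOIN (SELECT DISTINCT category FROM activity) c
-- existing rows supply the amount; missing ones yield NULL
LEFT JOIN activity a
    ON a.game_id = d.game_id AND a.date = d.date AND a.category = c.category
ORDER BY d.game_id, d.date;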
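As a quick check of the arithmetic behind the cross join, the sketch below counts the rows the scaffold produces and compares that with the rows already present. It assumes the sample activity table from the question, queried before any missing rows are inserted; the column aliases expected_rows and existing_rows are made up for illustration.

```sql
-- Count the full grid of (game_id, date, category) combinations
-- and the rows that actually exist in the sample table.
SELECT
    (SELECT COUNT(*)
     FROM (SELECT DISTINCT game_id, date FROM activity) d
     CROSS JOIN (SELECT DISTINCT category FROM activity) c) AS expected_rows,
    (SELECT COUNT(*) FROM activity) AS existing_rows;
-- With the sample data: expected_rows = 28 (7 game/month pairs x 4 categories)
-- and existing_rows = 21, so 7 rows are missing.
```

The difference of 7 matches the seven zero-amount rows listed in the question.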
If you want to insert the missing rows into the table:
INSERT INTO activity (game_id, category, date, amount)
SELECT d.game_id, c.category, d.date, 0
FROM (SELECT DISTINCT game_id, date FROM activity) d
CROSS JOIN (SELECT DISTINCT category FROM activity) c
LEFT JOIN activity a
    ON a.game_id = d.game_id AND a.date = d.date AND a.category = c.category
WHERE a.game_id IS NULL;
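One way to confirm the result, sketched here as an assumption rather than part of the original answer, is to list any game/month group that still has fewer distinct categories than the table as a whole. The alias categories_present is hypothetical.

```sql
-- List game/month groups that do not contain every category in the table.
SELECT game_id, date, COUNT(DISTINCT category) AS categories_present
FROM activity
GROUP BY game_id, date
HAVING COUNT(DISTINCT category) <> (SELECT COUNT(DISTINCT category) FROM activity);
-- Before the INSERT, the sample data returns the 5 incomplete groups;
-- after the INSERT, it should return no rows.
```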
See the demo.