This is the "tasks" table:
+----+------------------------+-----------+-----------+-----------+----------------+
| id | task                   | subtask_1 | subtask_2 | subtask_3 | active_subtask |
+----+------------------------+-----------+-----------+-----------+----------------+
| 1  | Earn 1 million dollars | Earn 100k | Earn 500k | Earn 1m   | 3              |
+----+------------------------+-----------+-----------+-----------+----------------+
| 2  | Earn 2 million dollars | Earn 1m   | Earn 1.5m | Earn 2m   | 1              |
+----+------------------------+-----------+-----------+-----------+----------------+
| 3  | Earn 3 million dollars | Earn 2m   | Earn 2.5m | Earn 3m   | 2              |
+----+------------------------+-----------+-----------+-----------+----------------+
| 4  | ...                    | ...       | ...       | ...       | ...            |
+----+------------------------+-----------+-----------+-----------+----------------+
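I have two queries that I want to join in order to get a list of tasks whose active subtask is not unique. The first gives me the active subtask of each task (with some WHERE conditions):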
+----+-----------+
| id | subtask   |
+----+-----------+
| 1  | Earn 1m   |
+----+-----------+
| 2  | Earn 1m   |
+----+-----------+
| 3  | Earn 2.5m |
+----+-----------+
| .  | ...       |
+----+-----------+
The second gives me a list of all non-unique active subtasks (i.e. subtasks that are the active subtask of more than one task):
+---------+
| subtask |
+---------+
| Earn 1m |
+---------+
| ...     |
+---------+
I then tried to use an INNER JOIN to get all tasks that have a non-unique active subtask. The expected result is:
+----+---------+
| id | subtask |
+----+---------+
| 1  | Earn 1m |
+----+---------+
| 2  | Earn 1m |
+----+---------+
| .  | ...     |
+----+---------+
But I always get the complete result of the first query.
The simplified SQL query (without the WHERE clauses) looks like this:
SELECT id,
CASE WHEN active_subtask = 1 THEN subtask_1 WHEN active_subtask = 2 THEN subtask_2 WHEN active_subtask = 3 THEN subtask_3 END
AS subtask
FROM "tasks"
INNER JOIN
(SELECT CASE WHEN active_subtask = 1 THEN subtask_1 WHEN active_subtask = 2 THEN subtask_2 WHEN active_subtask = 3 THEN subtask_3 END
AS subtask
FROM "tasks"
GROUP BY subtask
HAVING (COUNT(*) > 1)) AS agg
ON subtask = agg.subtask
What am I doing wrong?
I read your question as "which different tasks share the same active subtask?" If I have that right, here is one way to do it.
with cte (id, subtask) as
(select id, split_part(concat_ws('|',subtask_1,subtask_2,subtask_3),'|', active_subtask)
from tasks)
select a.*
from cte a
join cte b on a.subtask = b.subtask and a.id <> b.id;
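This approach is a bit hacky, so watch out for NULLs (concat_ws skips NULL arguments, which shifts the field positions that split_part sees), and make sure the | separator does not already occur in the data.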
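To make the NULL caveat concrete, here is a small sketch. It uses made-up literal values instead of the real table, and the array subscript in the second query is a possible alternative of my own, not something from the answer above. It assumes the three subtask columns share one type (e.g. text) and that active_subtask is an integer.

```sql
-- subtask_2 is NULL here, so the joined string is 'Earn 100k|Earn 1m'.
-- Asking for field 2 therefore yields subtask_3's value rather than NULL.
select split_part(concat_ws('|', 'Earn 100k', null::text, 'Earn 1m'), '|', 2) as picked;

-- Possible alternative: array elements keep their positions even when some
-- are NULL, and no separator character is involved.
select id,
       (array[subtask_1, subtask_2, subtask_3])[active_subtask] as subtask
from tasks;
```

If none of the subtask columns can be NULL and | never appears in the data, the split_part version should behave the same as the array version.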
Another option is:
with cte (id, subtask) as
(select id,
case when active_subtask=1 then subtask_1
when active_subtask=2 then subtask_2
when active_subtask=3 then subtask_3 end
from tasks)
select *
from cte
where subtask in (select subtask
from cte
group by subtask
having count(distinct id)>1);
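As for why the original join returns every row, my reading (treat it as a likely explanation, not a verified diagnosis) is this: the alias subtask from the SELECT list is not visible inside the ON clause, and the tasks table has no column called subtask. The unqualified subtask in ON subtask = agg.subtask therefore resolves to agg.subtask itself, so the condition is true for every pairing and the join filters nothing. Computing the active subtask once in a CTE and qualifying both sides avoids that. The CTE name active and the alias a below are my own choices.

```sql
-- Compute the active subtask once, so it exists as a real column named subtask.
with active as (
    select id,
           case active_subtask
               when 1 then subtask_1
               when 2 then subtask_2
               when 3 then subtask_3
           end as subtask
    from tasks
)
select a.id, a.subtask
from active a
inner join (
    -- Active subtasks that belong to more than one task.
    select subtask
    from active
    group by subtask
    having count(*) > 1
) agg
    on a.subtask = agg.subtask;   -- both sides qualified, so this really compares rows
```

This keeps the shape of the original query (an inner join against the aggregated list) and only changes where the CASE expression is evaluated.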
I would suggest a different approach:
/* non-unique subtasks */
select
id, task, subtask
from (
select
id, task, subtask, count(*) over(partition by subtask) c
from (
select id , task, subtask_1 as subtask from tasks where subtask_1 is not null union all
select id , task, subtask_2 as subtask from tasks where subtask_2 is not null union all
select id , task, subtask_3 as subtask from tasks where subtask_3 is not null
) fddl
) derived
where c > 1
;
/* unique subtasks */
select
id, task, subtask
from (
select
id, task, subtask, count(*) over(partition by subtask) c
from (
select id , task, subtask_1 as subtask from tasks where subtask_1 is not null union all
select id , task, subtask_2 as subtask from tasks where subtask_2 is not null union all
select id , task, subtask_3 as subtask from tasks where subtask_3 is not null
) fddl
) derived
where c = 1
;
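The main problem here is that you have a denormalized data structure, which limits your ability to write simple queries. In my opinion it would be better to keep the subtasks in a separate table with a many-to-one relationship to the tasks table. In effect, that is what the union subquery above "sort of" simulates.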
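A rough sketch of what such a normalized layout could look like. All table and column names here (tasks_v2, subtasks, pos) are hypothetical and only meant to illustrate the idea; adapt types and constraints to your real data.

```sql
-- Hypothetical normalized layout: one row per subtask instead of one column per subtask.
create table tasks_v2 (
    id             integer primary key,
    task           text    not null,
    active_subtask integer            -- position (pos) of the currently active subtask
);

create table subtasks (
    task_id integer not null references tasks_v2 (id),
    pos     integer not null,         -- 1, 2, 3, ... within a task
    subtask text    not null,
    primary key (task_id, pos)
);

-- Tasks whose active subtask is also the active subtask of at least one other task.
select id, subtask
from (
    select t.id,
           s.subtask,
           count(*) over (partition by s.subtask) as c
    from tasks_v2 t
    join subtasks s
      on s.task_id = t.id
     and s.pos = t.active_subtask
) active
where c > 1;
```

With this layout the CASE expression disappears, and a task can have any number of subtasks without a schema change.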