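I have a database with tables representing "edits" to "pages". Each edit has an ID, a timestamp, and a "status", which takes one of a set of discrete values. Each page has an ID and a "category".
I wish to find the number of pages in each status within a given category, considering only the status as of the most recent edit.
Edits: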
+---------+---------+-----------+--------+
| edit_id | page_id | edit_time | status |
+---------+---------+-----------+--------+
|       1 |      10 |  20210502 |     90 |
|       2 |      10 |  20210503 |     91 |
|       3 |      20 |  20210504 |     91 |
|       4 |      30 |  20210504 |     90 |
|       5 |      30 |  20210505 |     92 |
|       6 |      40 |  20210505 |     90 |
|       7 |      50 |  20210503 |     90 |
+---------+---------+-----------+--------+
Pages:
+---------+--------+
| page_id | cat_id |
+---------+--------+
|      10 |    100 |
|      20 |    100 |
|      30 |    100 |
|      40 |    100 |
|      50 |    200 |
+---------+--------+
For category 100, I would like to get:
+--------+-------+
|   stat | count |
+--------+-------+
|     90 |     1 |
|     91 |     2 |
|     92 |     1 |
+--------+-------+
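Pages 10 and 30 have two edits each, but the later edit "overrides" the earlier one, so only the edits with status 91 and 92 are counted. Pages 20 and 40 account for one each of 91 and 90 respectively, and page 50 is in the wrong category, so it does not feature.
I have tried the following, but it does not seem to work. The idea is to select the maximum (i.e. latest) edit for each page in the right category, then join that to the edits table, group by status, and count the rows: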
SELECT stat, COUNT(*)
FROM edits as out_e
INNER JOIN (
    SELECT edit_id, page_id, max(edit_time) as last_edit
    FROM edits
    INNER JOIN pages on edit_page_id = page_id
    WHERE cat_id = 100
    GROUP BY page_id
) in_e ON out_e.edit_id = in_e.edit_id
GROUP BY stat
ORDER BY stat;
For example, in this fiddle: http://sqlfiddle.com/#!9/42f2ed/1
The result is:
+--------+-------+
|   stat | count |
+--------+-------+
|     90 |     3 |
|     91 |     1 |
+--------+-------+
What is the correct way to get this information?
SELECT cat_id, stat, COUNT(*) cnt
FROM pages
JOIN edits ON pages.page_id = edits.edit_page_id
-- derived table: the latest edit_time for each page
JOIN ( SELECT edit_page_id, MAX(edit_time) edit_time
       FROM edits
       GROUP BY edit_page_id ) last_time
  -- keep only the edit row that carries that latest time
  ON edits.edit_page_id = last_time.edit_page_id
 AND edits.edit_time = last_time.edit_time
GROUP BY cat_id, stat;
Output:
cat_id | stat | cnt
---|---|---
100 | 90 | 1
100 | 91 | 2
100 | 92 | 1
200 | 90 | 1
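For anyone who wants to reproduce this without a fiddle, here is a minimal sketch that runs the same query from Python's built-in `sqlite3` module. Several things here are assumptions rather than part of the original setup: it uses SQLite instead of MySQL/MariaDB, the column names (`edit_page_id`, `stat`) are taken from the queries in this thread, pages 40 and 50 are assumed to belong to categories 100 and 200 respectively (which is what the outputs here imply), and the helper names `build_db` and `latest_status_counts` are made up for illustration. An `ORDER BY` is added only to make the row order deterministic.

```python
import sqlite3

# Sample edits: (edit_id, edit_page_id, edit_time, stat).
EDITS = [
    (1, 10, 20210502, 90),
    (2, 10, 20210503, 91),
    (3, 20, 20210504, 91),
    (4, 30, 20210504, 90),
    (5, 30, 20210505, 92),
    (6, 40, 20210505, 90),
    (7, 50, 20210503, 90),
]

# Sample pages: (page_id, cat_id). Categories of pages 40 and 50 are assumed.
PAGES = [(10, 100), (20, 100), (30, 100), (40, 100), (50, 200)]

# Join each page to the edit whose edit_time equals that page's maximum.
LATEST_STATUS_COUNTS = """
SELECT cat_id, stat, COUNT(*) AS cnt
FROM pages
JOIN edits ON pages.page_id = edits.edit_page_id
JOIN (SELECT edit_page_id, MAX(edit_time) AS edit_time
      FROM edits
      GROUP BY edit_page_id) AS last_time
  ON edits.edit_page_id = last_time.edit_page_id
 AND edits.edit_time = last_time.edit_time
GROUP BY cat_id, stat
ORDER BY cat_id, stat
"""


def build_db():
    """Create an in-memory database holding the sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edits (edit_id INTEGER PRIMARY KEY, "
        "edit_page_id INTEGER, edit_time INTEGER, stat INTEGER)"
    )
    conn.execute(
        "CREATE TABLE pages (page_id INTEGER PRIMARY KEY, cat_id INTEGER)"
    )
    conn.executemany("INSERT INTO edits VALUES (?, ?, ?, ?)", EDITS)
    conn.executemany("INSERT INTO pages VALUES (?, ?)", PAGES)
    return conn


def latest_status_counts(conn):
    """Return (cat_id, stat, cnt) rows based on each page's latest edit."""
    return conn.execute(LATEST_STATUS_COUNTS).fetchall()


if __name__ == "__main__":
    for row in latest_status_counts(build_db()):
        print(row)
```

To restrict the result to a single category, a `WHERE cat_id = 100` before the `GROUP BY` should be enough.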
I don't think you need the second join. See if this query helps.
select
    t1.stat, count(*) count_
from
(
    SELECT
        e.edit_id, p.page_id, e.stat,
        rank() over(partition by e.edit_page_id order by e.edit_time desc) edit_rank
    FROM
        edits e
        INNER JOIN pages p on e.edit_page_id = p.page_id
    WHERE
        p.cat_id = 100
) t1
where
    t1.edit_rank = 1
group by
    t1.stat
Fiddle URL: https://dbfiddle.uk/?rdbms=mariadb_10.3&fiddle=0f681dc8d93cc3eebf9a03c8d84850
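One thing to watch with the window-function approach is ties: `RANK()` gives rank 1 to every edit that shares a page's latest `edit_time`, so such a page is counted once per tied edit. The sketch below illustrates this using Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer) and compares it with `ROW_NUMBER()` using `edit_id` as a tie-breaker. The assumptions: SQLite stands in for MariaDB, edit 8 is a hypothetical row added only to create a tie on page 20, pages 40 and 50 are assumed to be in categories 100 and 200, and the names `build_db`, `status_counts`, `RANK_BY_TIME` and `ROW_NUMBER_TIE_BROKEN` are invented for this example. Whether the highest `edit_id` is the right tie-breaker depends on your data.

```python
import sqlite3

# Sample edits: (edit_id, edit_page_id, edit_time, stat).
EDITS = [
    (1, 10, 20210502, 90),
    (2, 10, 20210503, 91),
    (3, 20, 20210504, 91),
    (4, 30, 20210504, 90),
    (5, 30, 20210505, 92),
    (6, 40, 20210505, 90),
    (7, 50, 20210503, 90),
    (8, 20, 20210504, 92),  # hypothetical: same edit_time as edit 3
]

# Sample pages: (page_id, cat_id). Categories of pages 40 and 50 are assumed.
PAGES = [(10, 100), (20, 100), (30, 100), (40, 100), (50, 200)]

# Ties on edit_time all receive rank 1.
RANK_BY_TIME = (
    "RANK() OVER (PARTITION BY e.edit_page_id ORDER BY e.edit_time DESC)"
)

# Exactly one row per page gets number 1; the highest edit_id wins a tie.
ROW_NUMBER_TIE_BROKEN = (
    "ROW_NUMBER() OVER (PARTITION BY e.edit_page_id "
    "ORDER BY e.edit_time DESC, e.edit_id DESC)"
)


def build_db():
    """Create an in-memory database holding the sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edits (edit_id INTEGER PRIMARY KEY, "
        "edit_page_id INTEGER, edit_time INTEGER, stat INTEGER)"
    )
    conn.execute(
        "CREATE TABLE pages (page_id INTEGER PRIMARY KEY, cat_id INTEGER)"
    )
    conn.executemany("INSERT INTO edits VALUES (?, ?, ?, ?)", EDITS)
    conn.executemany("INSERT INTO pages VALUES (?, ?)", PAGES)
    return conn


def status_counts(conn, window_expr, cat_id):
    """Count pages per status using the edit ranked first by window_expr.

    window_expr is one of the trusted constants above, never user input;
    the category is passed as a bound parameter.
    """
    sql = f"""
        SELECT stat, COUNT(*) AS cnt
        FROM (
            SELECT e.stat, {window_expr} AS edit_rank
            FROM edits AS e
            JOIN pages AS p ON e.edit_page_id = p.page_id
            WHERE p.cat_id = ?
        ) AS ranked
        WHERE edit_rank = 1
        GROUP BY stat
        ORDER BY stat
    """
    return conn.execute(sql, (cat_id,)).fetchall()


if __name__ == "__main__":
    conn = build_db()
    print("RANK:      ", status_counts(conn, RANK_BY_TIME, 100))
    print("ROW_NUMBER:", status_counts(conn, ROW_NUMBER_TIE_BROKEN, 100))
```

With the extra tied edit, the `RANK()` version counts page 20 under both 91 and 92 (five counted rows for four pages), while the `ROW_NUMBER()` version counts each page exactly once. Joining on `MAX(edit_time)` has the same double-counting behaviour when two edits share a timestamp.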
select e1.stat, count(e1.stat) as count
from edits e1
join (
    select edit_page_id, max(edit_time) as edit_time
    from edits
    where edit_page_id in (
        select page_id
        from pages
        where cat_id = 100
    )
    group by edit_page_id
) as e2
on e1.edit_page_id = e2.edit_page_id and e1.edit_time = e2.edit_time
group by e1.stat;
Here is a link to the fiddle: http://sqlfiddle.com/#!9/42f2ed/40/0
Edit: updated to use edit_time rather than stat to find the latest record.