在一个查询OPTIMIZED中显示今天、昨天和其他条件的用户评论摘要



我看到了许多与获取不同日期范围但在单独查询中的数据计数有关的问题。我被要求显示每个用户今天、昨天访问了多少次,以及用户请求"文件部分"的次数。因此,我需要显示许多列,每个列对不同日期范围内的每个用户都有不同的计数,或者有特定的详细信息。

以下是创建和插入初始数据的脚本:

CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(45) DEFAULT NULL,
`realname` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `username_UNIQUE` (`username`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `history` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(45) DEFAULT NULL,
`access_date` datetime DEFAULT NULL,
`detail` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
INSERT INTO users (username,realname) VALUES
("mwazowski","Mike Wazowski"),("jsullivan","James P. Sullivan"),
("rboggs","Randall Boggs"),("hwaternoose","Henry J. Waternoose");
INSERT INTO history (username,access_date,detail) VALUES
("","2020-04-22 12:00:00","publicData"),
("mwazowski","2020-04-22 12:01:00","login"),
("mwazowski","2020-04-22 12:02:00","practice"),
("mwazowski","2020-04-22 12:04:00","files"),
("mwazowski","2020-04-22 12:10:00","logout"),
("","2020-04-23 12:25:00","publicData"),
("","2020-04-23 12:27:00","publicData"),
("jsullivan","2020-04-23 12:30:00","login"),
("jsullivan","2020-04-23 12:35:00","files"),
("jsullivan","2020-04-23 12:40:00","logout"),
("","2020-04-23 12:52:00","publicData"),
("rboggs","2020-04-23 13:00:00","login"),
("rboggs","2020-04-23 13:01:00","files"),
("rboggs","2020-04-23 13:40:00","logout"),
("","2020-04-23 13:43:00","publicData");

这是表Users:的结果集

+----+-------------+---------------------+
| id |  username   |      realname       |
+----+-------------+---------------------+
|  1 | mwazowski   | Mike Wazowski       |
|  2 | jsullivan   | James P. Sullivan   |
|  3 | rboggs      | Randall Boggs       |
|  4 | hwaternoose | Henry J. Waternoose |
|    |             |                     |
+----+-------------+---------------------+

这是表历史的结果集:

+----+-----------+---------------------+------------+
| id | username  |     access_date     |   detail   |
+----+-----------+---------------------+------------+
|  1 |           | 2020-04-22 12:00:00 | publicData |
|  2 | mwazowski | 2020-04-22 12:01:00 | login      |
|  3 | mwazowski | 2020-04-22 12:02:00 | practice   |
|  4 | mwazowski | 2020-04-22 12:04:00 | files      |
|  5 | mwazowski | 2020-04-22 12:10:00 | logout     |
|  6 |           | 2020-04-23 12:25:00 | publicData |
|  7 |           | 2020-04-23 12:27:00 | publicData |
|  8 | jsullivan | 2020-04-23 12:30:00 | login      |
|  9 | jsullivan | 2020-04-23 12:35:00 | files      |
| 10 | jsullivan | 2020-04-23 12:40:00 | logout     |
| 11 |           | 2020-04-23 12:52:00 | publicData |
| 12 | rboggs    | 2020-04-23 13:00:00 | login      |
| 13 | rboggs    | 2020-04-23 13:01:00 | files      |
| 14 | rboggs    | 2020-04-23 13:40:00 | logout     |
| 15 |           | 2020-04-23 13:43:00 | publicData |
+----+-----------+---------------------+------------+

这是我设计的查询,按用户组显示访问记录的计数、今天和昨天的日期范围以及特定的详细数据(添加了字段详细信息="文件"的记录(:

select coalesce(u.realname,"PUBLIC") username,
coalesce(h2.todayCount,0) nToday,
coalesce(h3.yesterdayCount,0) nYesterday,
coalesce(h4.fileCount,0) nFiles
from (select distinct username from history) h1
left join users u on h1.username=u.username
left join (select username,count(1) todayCount from history
where access_date>=current_date() group by username) h2
on h1.username=h2.username
left join (select username,count(1) yesterdayCount from history
where access_date between (current_date()-1) and current_date() group by username) h3
on h1.username=h3.username
left join (select username,count(1) fileCount from history
where detail="files" group by username) h4
on h1.username=h4.username order by h1.username;

这个查询可以工作并根据需要显示数据,但随着数据的增加,速度会变慢,而且我以后可能需要添加更多的列,这意味着要额外读取所有历史记录。历史记录表被多次读取,如果某个时候需要更多的日期范围,情况会变得最糟。这是带有初始数据的查询结果:

+-------------------+--------+------------+--------+
|     username      | nToday | nYesterday | nFiles |
+-------------------+--------+------------+--------+
| PUBLIC            |      4 |          1 |      0 |
| James P. Sullivan |      3 |          0 |      1 |
| Mike Wazowski     |      0 |          4 |      1 |
| Randall Boggs     |      3 |          0 |      1 |
+-------------------+--------+------------+--------+

样本数据中包含的日期仅能产生预期结果,而今天是4月23日。为了让这个测试数据在未来的任何一天都能工作,请运行下一个脚本:

UPDATE history SET access_date=concat((curdate() - INTERVAL 1 DAY)," ",time(access_date)) WHERE id<=5;
UPDATE history SET access_date=concat(curdate()," ",time(access_date)) WHERE id>5;

我肯定还没有完全理解您的要求。尽管如此,如果是我,我会将查询保持为这样简单的内容,并处理应用程序代码中的任何剩余显示问题:

DROP TABLE IF EXISTS users;
CREATE TABLE `users` (
`user_id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(45) DEFAULT NULL,
`realname` varchar(45) DEFAULT NULL,
PRIMARY KEY (`user_id`),
UNIQUE KEY `username_UNIQUE` (`username`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS history;
CREATE TABLE `history` (
`history_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` INT NULL,
`access_date` datetime DEFAULT NULL,
`detail` varchar(45) DEFAULT NULL,
PRIMARY KEY (`history_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
INSERT INTO users (username,realname) VALUES
("mwazowski","Mike Wazowski"),("jsullivan","James P. Sullivan"),
("rboggs","Randall Boggs"),("hwaternoose","Henry J. Waternoose");
INSERT INTO history (user_id,access_date,detail) VALUES
(NULL,"2020-04-22 12:00:00","publicData"),
(1,"2020-04-22 12:01:00","login"),
(1,"2020-04-22 12:02:00","practice"),
(1,"2020-04-22 12:04:00","files"),
(1,"2020-04-22 12:10:00","logout"),
(NULL,"2020-04-23 12:25:00","publicData"),
(NULL,"2020-04-23 12:27:00","publicData"),
(2,"2020-04-23 12:30:00","login"),
(2,"2020-04-23 12:35:00","files"),
(2,"2020-04-23 12:40:00","logout"),
(NULL,"2020-04-23 12:52:00","publicData"),
(3,"2020-04-23 13:00:00","login"),
(3,"2020-04-23 13:01:00","files"),
(3,"2020-04-23 13:40:00","logout"),
(NULL,"2020-04-23 13:43:00","publicData");
SELECT u.user_id
, u.username  
, u.realname          
, DATE(h.access_date) access_date
, COALESCE(COUNT(*),0) total
, COALESCE(SUM(h.detail = 'files'),0) total_files
FROM users u
LEFT
JOIN history h
ON h.user_id = u.user_id
AND h.access_date >= '2020-04-22 00:00:00' 
AND h.access_date < '2020-04-24 00:00:00'
GROUP
BY u.user_id
, DATE(h.access_date);
+---------+-------------+---------------------+-------------+-------+-------------+
| user_id | username    | realname            | access_date | total | total_files |
+---------+-------------+---------------------+-------------+-------+-------------+
|       1 | mwazowski   | Mike Wazowski       | 2020-04-22  |     4 |           1 |
|       2 | jsullivan   | James P. Sullivan   | 2020-04-23  |     3 |           1 |
|       3 | rboggs      | Randall Boggs       | 2020-04-23  |     3 |           1 |
|       4 | hwaternoose | Henry J. Waternoose | NULL        |     1 |           0 |
+---------+-------------+---------------------+-------------+-------+-------------+   

注意,我稍微修改了你的模式;对于(user_id,access_date(上的复合索引,这应该很快。

最新更新