SQL join sums up duplicates



1) The sample dataset looks like this:

create table user(
user_id int,
name varchar(10),
surname varchar(10)
);

insert into user(user_id, name, surname) values 
(1, 'a', 'aa'), 
(2, 'b', 'bb'), 
(3, 'c', 'cc');

create table books(
user_id int,
book_name varchar(10)
);
insert into books(user_id, book_name) values
(1, 'book1'), 
(1, 'book2'), 
(1, 'book3'), 
(2, 'book1');
create table expanses(
id int,
user_id int,
amount_spent int,
date timestamp
);
insert into expanses(id, user_id, amount_spent, date)
values
(1,1,10, '2020-02-03'),
(2,1,10, '2020-02-03'),
(3,1,30, '2020-02-02'),
(4,1,12, '2020-02-01'),
(5,1,15, '2020-01-31'),
(6,1,13, '2020-01-15'),
(7,2,15, '2020-02-01'),
(8,3,20, '2020-02-01');

The result I want:

| CountUsers | amount_spent |
|---------|--------------|
| 2       | 77           |

Explanation: I want to count

a) how many users have book1 or book2, and

b) how much those users spent in total between 2020-02-01 and 2020-02-03.

What should the query look like?

I am using MySQL version 8.

I have tried:

SELECT count(*)
     , sum(amount_spent) AS total_amount_spent
FROM
( SELECT sum(e.amount_spent) AS amount_spent
  FROM expanses e
  LEFT JOIN books b
    ON b.user_id = e.user_id
  WHERE (b.book_name = 'book1' OR b.book_name = 'book2')
    AND e.date BETWEEN '2020-02-01' AND '2020-02-03'
  GROUP BY e.user_id
) src

The result is wrong because the inner select (slightly modified here to show the problem more clearly):

select expanses.user_id, expanses.amount_spent, books.book_name
FROM expanses
LEFT JOIN books ON books.user_id = expanses.user_id
WHERE (books.book_name = 'book1' or books.book_name = 'book2')
  and expanses.date between '2020-02-01' and '2020-02-03'

3) will return something like this:

| user_id | amount_spent | book_name |
|---------|--------------|-----------|
| 1       | 10           | book1     |
| 1       | 10           | book1     |
| 1       | 30           | book1     |
| 1       | 12           | book1     |
| 1       | 10           | book2     |
| 1       | 10           | book2     |
| 1       | 30           | book2     |
| 1       | 12           | book2     |
| 2       | 15           | book1     |

So if we sum these up, we get:

| CountUsers | amount_spent |
|------------|--------------|
| 2          | 139          |

This is wrong because each expense row is repeated once per matching book.
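To make the duplication visible, a small diagnostic query can count how many copies of each expense row the join produces. This is only a sketch against the sample tables above; the alias `copies` is an illustrative name, and it assumes `expanses.id` identifies each expense row uniquely.

```sql
-- For every expense in the date range, count how many joined rows it yields.
-- An expense whose user owns both book1 and book2 appears twice.
SELECT e.id,
       e.amount_spent,
       COUNT(*) AS copies          -- illustrative alias
FROM expanses e
JOIN books b
  ON b.user_id = e.user_id
WHERE b.book_name IN ('book1', 'book2')
  AND e.date BETWEEN '2020-02-01' AND '2020-02-03'
GROUP BY e.id, e.amount_spent
ORDER BY e.id;
```

On the sample data, expenses 1 to 4 (user 1) come back with `copies = 2` and expense 7 (user 2) with `copies = 1`, which accounts for 2 × (10 + 10 + 30 + 12) + 15 = 139.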

Adding DISTINCT, as in sum(DISTINCT amount_spent), is also wrong: it collapses the two genuine expenses of 10 into one and gives the following result:

| CountUsers | amount_spent |
|---------|--------------|
| 2       | 67           |

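To sum up, you can see in table 3) that each amount_spent is repeated for every matching book_name (a one-to-many relationship).

How can I avoid the duplicated amount_spent while keeping the condition on book_name?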
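One possible way to keep the join to books while avoiding the double counting is to deduplicate on the expense row itself rather than on the amount. The sketch below assumes `expanses.id` is unique per expense (the table definition does not declare a key, so this is an assumption); the derived-table alias `t` is illustrative.

```sql
-- Deduplicate on the expense id, not on the amount, so two separate
-- expenses of 10 both survive while per-book copies are dropped.
SELECT COUNT(DISTINCT t.user_id) AS CountUsers,
       SUM(t.amount_spent)       AS amount_spent
FROM (
    SELECT DISTINCT e.id, e.user_id, e.amount_spent
    FROM expanses e
    JOIN books b
      ON b.user_id = e.user_id
    WHERE b.book_name IN ('book1', 'book2')
      AND e.date BETWEEN '2020-02-01' AND '2020-02-03'
) AS t;
```

On the sample data this should return 2 users and a total of 77. Note that it only counts users who have at least one expense in the date range.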

Fiddle

select count(distinct user_id) 
, sum(amount_spent) 
from expanses 
where expanses.date between '2020-02-01' and '2020-02-03'
and user_id in (select user_id from books where book_name in('book1','book2'))

https://www.db-fiddle.com/f/26ifPWyRRKGp9YVQXg1qje/0

a)

SELECT COUNT(DISTINCT user_id) total FROM books WHERE book_name IN ('book1','book2');

b)

SELECT SUM(amount_spent) total_spent 
FROM expanses e
WHERE e.date BETWEEN '2020-02-01' AND '2020-02-03'
AND EXISTS 
( SELECT * 
FROM books b
WHERE b.user_id = e.user_id
AND b.book_name IN ('book1','book2')
);

Here is one idea for combining a) and b):

SELECT SUM(amount_spent) total_spent
, (SELECT COUNT(DISTINCT user_id) total FROM books WHERE book_name IN ('book1','book2')) total_customers
FROM expanses e
WHERE e.date BETWEEN '2020-02-01' AND '2020-02-03'
AND EXISTS 
( SELECT * 
FROM books b
WHERE b.user_id = e.user_id
AND b.book_name IN ('book1','book2')
);
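The combined query states the book filter twice. As a variation, assuming MySQL 8 (which supports common table expressions), the set of matching users could be named once and reused. This is a sketch, not the answerer's query; the CTE name `readers` and the output aliases are illustrative.

```sql
-- Name the set of users who own book1 or book2 once, then reuse it
-- for both the user count and the spending total.
WITH readers AS (
    SELECT DISTINCT user_id
    FROM books
    WHERE book_name IN ('book1', 'book2')
)
SELECT (SELECT COUNT(*) FROM readers)   AS total_customers,
       COALESCE(SUM(e.amount_spent), 0) AS total_spent
FROM expanses e
JOIN readers r
  ON r.user_id = e.user_id
WHERE e.date BETWEEN '2020-02-01' AND '2020-02-03';
```

Because `readers` holds each user_id only once, the join cannot multiply expense rows. Like the combined query above, the user count comes from books alone, so it includes owners of these books even if they spent nothing in the period; on the sample data both approaches give 2 and 77.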
