How to select records from another table using associated records and the SQL AVG function

table1
user_id  prob
1        0.33
2        0.34
3        0.35
11       0.90
88       0.80
7        0.10
8        0.11
10       0.09
12       0.80
17       0.80
18       0.80
19       0.80
20       0.80

table2
user_id  canon_id 
1        456     
2        456     
3        456      
11       4344      
88       4344      
7        2023      
8        2023      
10       2023      
12       1234      
17       1234      
18       1234      
19       1234      
20       1234

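Given the data above, how can I select the records from table2 where the average of table1.prob over the user_ids associated with each table2.canon_id is > 0.50? Here, canon_id 2023 with its associated user_ids 7, 8, 10 has avg(prob) = 0.10, so it should not be selected.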

SELECT table2.canon_id, AVG(table1.prob) AS avg_prob
FROM table2
INNER JOIN table1 ON table1.user_id = table2.user_id
GROUP BY table2.canon_id
HAVING avg_prob > 0.50

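I use GROUP BY to collapse the rows that share a canon_id into one row, so AVG can compute the average over each group; I then use HAVING with the AVG alias to apply the condition. MySQL and SQLite accept a select-list alias in HAVING; on engines that reject it (PostgreSQL, for example), repeat the expression as HAVING AVG(prob) > 0.50.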

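In a SELECT statement, the HAVING clause specifies a filter condition for groups of rows or aggregates.

HAVING is usually combined with GROUP BY to filter groups by a condition. If GROUP BY is omitted, the whole result set is treated as a single group; MySQL additionally lets HAVING reference plain columns in that case, so there it behaves much like WHERE.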
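To make the row-versus-group distinction concrete, here is a minimal sketch that loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the 0.33 threshold are assumptions made only for illustration; the thread does not name a database engine.

```python
import sqlite3

# Rows copied from the question, keyed by canon_id: {user_id: prob}.
DATA = {
    456: {1: 0.33, 2: 0.34, 3: 0.35},
    4344: {11: 0.90, 88: 0.80},
    2023: {7: 0.10, 8: 0.11, 10: 0.09},
    1234: {12: 0.80, 17: 0.80, 18: 0.80, 19: 0.80, 20: 0.80},
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (user_id INTEGER PRIMARY KEY, prob REAL)")
conn.execute("CREATE TABLE table2 (user_id INTEGER, canon_id INTEGER)")
for canon_id, users in DATA.items():
    for user_id, prob in users.items():
        conn.execute("INSERT INTO table1 VALUES (?, ?)", (user_id, prob))
        conn.execute("INSERT INTO table2 VALUES (?, ?)", (user_id, canon_id))

JOIN = "FROM table2 t2 JOIN table1 t1 ON t1.user_id = t2.user_id"

# WHERE tests each joined row before grouping: within canon_id 456,
# user 1 (prob 0.33) fails the illustrative "prob > 0.33" test.
users_passing_where = [r[0] for r in conn.execute(
    f"SELECT t2.user_id {JOIN} "
    "WHERE t2.canon_id = 456 AND t1.prob > 0.33 ORDER BY t2.user_id")]

# HAVING tests each group after aggregation: canon_id 456 averages about
# 0.34, so the whole group passes, user 1 included.
groups_passing_having = [r[0] for r in conn.execute(
    f"SELECT t2.canon_id {JOIN} "
    "GROUP BY t2.canon_id HAVING AVG(t1.prob) > 0.33 ORDER BY t2.canon_id")]

# The question's actual threshold of 0.5.
groups_over_half = [r[0] for r in conn.execute(
    f"SELECT t2.canon_id {JOIN} "
    "GROUP BY t2.canon_id HAVING AVG(t1.prob) > 0.5 ORDER BY t2.canon_id")]

print(users_passing_where, groups_passing_having, groups_over_half)
```

With the question's 0.5 threshold, only the groups whose average clears the bar remain, which is the behaviour the question asks for.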

Join table2 to table1 and aggregate to get the canon_ids you want, then select the matching table2 rows:

select * from table2
where canon_id in (
    select t2.canon_id
    from table2 t2 inner join table1 t1
        on t1.user_id = t2.user_id
    group by t2.canon_id
    having avg(t1.prob) > 0.5
);

Result of the demo:

| user_id | canon_id |
| ------- | -------- |
| 11      | 4344     |
| 88      | 4344     |
| 12      | 1234     |
| 17      | 1234     |
| 18      | 1234     |
| 19      | 1234     |
| 20      | 1234     |

Join the tables, group by canon_id, and keep the groups whose average prob exceeds one half.

Try this:

SELECT
    t2.canon_id
FROM table2 t2
JOIN table1 t1 ON t1.user_id = t2.user_id
GROUP BY t2.canon_id
HAVING AVG(t1.prob) > 0.5
-- will return the canon_ids: 4344, 1234

The query above gives the correct answer as long as the following conditions hold:

user_id is unique in table1
the combination (canon_id, user_id) is unique in table2
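The uniqueness conditions matter because a duplicated row is counted twice in the average. The sketch below uses hypothetical data (user_ids 100 and 200, canon_id 1, none of it from the question) in an in-memory SQLite database to show a duplicate pair in table2 pushing a group below the threshold, and one possible guard using SELECT DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (user_id INTEGER, prob REAL)")
conn.execute("CREATE TABLE table2 (user_id INTEGER, canon_id INTEGER)")
# Hypothetical data: one canon_id with a high-prob and a low-prob user.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(100, 0.9), (200, 0.2)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(100, 1), (200, 1)])

NAIVE = """
SELECT t2.canon_id
FROM table2 t2
JOIN table1 t1 ON t1.user_id = t2.user_id
GROUP BY t2.canon_id
HAVING AVG(t1.prob) > 0.5
"""

# Same query, but duplicate (user_id, canon_id) pairs are collapsed first.
DEDUPED = """
SELECT t2.canon_id
FROM (SELECT DISTINCT user_id, canon_id FROM table2) t2
JOIN table1 t1 ON t1.user_id = t2.user_id
GROUP BY t2.canon_id
HAVING AVG(t1.prob) > 0.5
"""

def canon_ids(sql):
    """Run a query and return the selected canon_ids as a list."""
    return [row[0] for row in conn.execute(sql)]

before = canon_ids(NAIVE)                           # avg = (0.9 + 0.2) / 2 = 0.55
conn.execute("INSERT INTO table2 VALUES (200, 1)")  # duplicate pair
after = canon_ids(NAIVE)                            # avg = (0.9 + 0.2 + 0.2) / 3, below 0.5
fixed = canon_ids(DEDUPED)                          # duplicate ignored again

print(before, after, fixed)
```

Duplicate user_ids in table1 would distort the average in the same way, so a similar guard (or a unique constraint) would be needed there.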

Try the following:

select
    user_id,
    canon_id,
    avg_prob
from
(
    select
        t1.user_id,
        t2.canon_id,
        avg(t1.prob) over (partition by t2.canon_id) as avg_prob
    from table1 t1
    join table2 t2
        on t1.user_id = t2.user_id
) val
where avg_prob > 0.5
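As a rough check that the window-function approach and the IN-subquery approach select the same table2 rows, the following sketch runs both against the question's data in an in-memory SQLite database. It assumes a SQLite build with window-function support (3.25 or later); the ORDER BY clauses are added only to make the comparison deterministic.

```python
import sqlite3

# Rows copied from the question, keyed by canon_id: {user_id: prob}.
DATA = {
    456: {1: 0.33, 2: 0.34, 3: 0.35},
    4344: {11: 0.90, 88: 0.80},
    2023: {7: 0.10, 8: 0.11, 10: 0.09},
    1234: {12: 0.80, 17: 0.80, 18: 0.80, 19: 0.80, 20: 0.80},
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (user_id INTEGER PRIMARY KEY, prob REAL)")
conn.execute("CREATE TABLE table2 (user_id INTEGER, canon_id INTEGER)")
for canon_id, users in DATA.items():
    for user_id, prob in users.items():
        conn.execute("INSERT INTO table1 VALUES (?, ?)", (user_id, prob))
        conn.execute("INSERT INTO table2 VALUES (?, ?)", (user_id, canon_id))

# Window version: every row carries its group's average, then rows are filtered.
WINDOW_SQL = """
SELECT user_id, canon_id
FROM (
    SELECT t2.user_id, t2.canon_id,
           AVG(t1.prob) OVER (PARTITION BY t2.canon_id) AS avg_prob
    FROM table1 t1
    JOIN table2 t2 ON t1.user_id = t2.user_id
) val
WHERE avg_prob > 0.5
ORDER BY canon_id, user_id
"""

# Subquery version: find qualifying canon_ids first, then fetch their rows.
SUBQUERY_SQL = """
SELECT user_id, canon_id
FROM table2
WHERE canon_id IN (
    SELECT t2.canon_id
    FROM table2 t2
    JOIN table1 t1 ON t1.user_id = t2.user_id
    GROUP BY t2.canon_id
    HAVING AVG(t1.prob) > 0.5
)
ORDER BY canon_id, user_id
"""

window_rows = conn.execute(WINDOW_SQL).fetchall()
subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
print(window_rows == subquery_rows, window_rows)
```

The window version keeps the per-group average available on every output row, while the subquery version returns the table2 rows unchanged; which one fits better depends on whether the average itself is needed in the result.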
