Example:
+---------+-------+
| product | store |
+---------+-------+
| abc1    |     2 |
| abc2    |     2 |
| abc1    |     3 |
| abc1    |     5 |
| abc1    |     4 |
| abc2    |     3 |
| abc3    |     3 |
| abc1    |     1 |
| abc4    |     3 |
| abc5    |     3 |
+---------+-------+
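I am trying to get the percentage of products shared between different stores, sorted so that the stores with the greatest overlap come first.
I really don't know how to approach this.
Fiddle: https://www.db-fiddle.com/f/bm5c7k7q5qbQY3Qu8t5Gvx/1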
I think you are looking for a Cartesian join between stores, followed by a count of the matching SKUs.
Here is my query:
-- Total SKU count per store
with data as (
    select store_id, count(sku) as tot_sku
    from products
    group by store_id
)
select a.store_id as store_id
     , b.store_id as other_store_id
       -- count the rows where a SKU in one store matches a SKU in the other
     , count(case when a.sku = b.sku then 1 end) as cnt_overlap
     , max(c.tot_sku) as tot_sku
     , count(case when a.sku = b.sku then 1 end) * 100.00 / max(c.tot_sku) as pct_overlap
from products a
join products b
  on a.store_id <> b.store_id
join data c
  on a.store_id = c.store_id
group by a.store_id, b.store_id
order by 1;
+----------+----------------+-------------+---------+----------------------+
| store_id | other_store_id | cnt_overlap | tot_sku | pct_overlap |
+----------+----------------+-------------+---------+----------------------+
| 1 | 2 | 1 | 1 | 100.0000000000000000 |
| 1 | 3 | 1 | 1 | 100.0000000000000000 |
| 1 | 4 | 1 | 1 | 100.0000000000000000 |
| 1 | 5 | 1 | 1 | 100.0000000000000000 |
| 2 | 1 | 1 | 2 | 50.0000000000000000 |
| 2 | 3 | 2 | 2 | 100.0000000000000000 |
| 2 | 4 | 1 | 2 | 50.0000000000000000 |
| 2 | 5 | 1 | 2 | 50.0000000000000000 |
| 3 | 1 | 1 | 5 | 20.0000000000000000 |
| 3 | 2 | 2 | 5 | 40.0000000000000000 |
| 3 | 4 | 1 | 5 | 20.0000000000000000 |
| 3 | 5 | 1 | 5 | 20.0000000000000000 |
| 4 | 1 | 1 | 1 | 100.0000000000000000 |
| 4 | 2 | 1 | 1 | 100.0000000000000000 |
| 4 | 3 | 1 | 1 | 100.0000000000000000 |
| 4 | 5 | 1 | 1 | 100.0000000000000000 |
| 5 | 1 | 1 | 1 | 100.0000000000000000 |
| 5 | 2 | 1 | 1 | 100.0000000000000000 |
| 5 | 3 | 1 | 1 | 100.0000000000000000 |
| 5 | 4 | 1 | 1 | 100.0000000000000000 |
+----------+----------------+-------------+---------+----------------------+
Here is my db<>fiddle link:
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=151fa1f18ac25bb0362ebc47de02dd09
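For anyone who wants to try the query above without a Postgres instance, here is a minimal sketch in Python that loads the sample rows from the question into an in-memory SQLite database and runs essentially the same SQL. The table and column names (`products`, `sku`, `store_id`) are taken from the query; using SQLite instead of Postgres, and ordering by `pct_overlap` descending to put the largest overlaps first, are my assumptions rather than part of the answer.

```python
import sqlite3

# Sample rows (sku, store_id) copied from the question.
rows = [
    ("abc1", 2), ("abc2", 2), ("abc1", 3), ("abc1", 5), ("abc1", 4),
    ("abc2", 3), ("abc3", 3), ("abc1", 1), ("abc4", 3), ("abc5", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table products (sku text, store_id integer)")
conn.executemany("insert into products values (?, ?)", rows)

query = """
with data as (
    -- total SKU count per store
    select store_id, count(sku) as tot_sku
    from products
    group by store_id
)
select a.store_id,
       b.store_id as other_store_id,
       count(case when a.sku = b.sku then 1 end) as cnt_overlap,
       max(c.tot_sku) as tot_sku,
       count(case when a.sku = b.sku then 1 end) * 100.0 / max(c.tot_sku) as pct_overlap
from products a
join products b on a.store_id <> b.store_id
join data c on a.store_id = c.store_id
group by a.store_id, b.store_id
-- assumption: largest overlap first, ties broken by store ids
order by pct_overlap desc, cnt_overlap desc, a.store_id, b.store_id
"""

results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

Each printed tuple is (store_id, other_store_id, cnt_overlap, tot_sku, pct_overlap), so the rows can be compared directly against the result table above.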
You can use a self-join. The following assumes that you want the ratio relative to the number of products in the first store:
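select t1.store_id
     , t2.store_id as other_store_id
     , t1.num_products
       -- shared products divided by the first store's product count
     , count(*) * 1.0 / t1.num_products as ratio
from (select p.*
           , count(*) over (partition by store_id) as num_products
      from products p
     ) t1
join products t2
  on t1.sku = t2.sku
 and t1.store_id <> t2.store_id  -- do not pair a store with itself
group by t1.store_id, t2.store_id, t1.num_products
order by ratio desc;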
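Note that this ratio is not symmetric: each pair is divided by the first store's product count, so with the sample data stores 2 and 3 give 2/2 = 1.0 in one direction and 2/5 = 0.4 in the other.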
Here is a db<>fiddle.
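To make the asymmetry concrete, here is a small Python sketch over the sample rows from the question. `directional_ratio` mirrors the ratio computed by the self-join; `jaccard` is a hypothetical symmetric alternative (shared products divided by products in either store) that is not part of the answer above and is only one possible choice if a single value per pair of stores is wanted.

```python
from collections import defaultdict
from itertools import combinations

# Sample rows (sku, store_id) copied from the question.
rows = [
    ("abc1", 2), ("abc2", 2), ("abc1", 3), ("abc1", 5), ("abc1", 4),
    ("abc2", 3), ("abc3", 3), ("abc1", 1), ("abc4", 3), ("abc5", 3),
]

# Group the SKUs carried by each store into a set.
skus_by_store = defaultdict(set)
for sku, store_id in rows:
    skus_by_store[store_id].add(sku)


def directional_ratio(first, second):
    """Share of the first store's products that the second store also carries."""
    shared = skus_by_store[first] & skus_by_store[second]
    return len(shared) / len(skus_by_store[first])


def jaccard(first, second):
    """Assumed symmetric alternative: shared products over products in either store."""
    a, b = skus_by_store[first], skus_by_store[second]
    return len(a & b) / len(a | b)


# The directional ratio depends on which store comes first.
print(directional_ratio(2, 3))  # 2 shared / 2 products in store 2
print(directional_ratio(3, 2))  # 2 shared / 5 products in store 3

# One score per unordered pair of stores, greatest overlap first.
pairs = sorted(
    ((jaccard(a, b), a, b) for a, b in combinations(sorted(skus_by_store), 2)),
    reverse=True,
)
for score, a, b in pairs:
    print(a, b, round(score, 2))
```

Which measure fits depends on the question being asked: the directional ratio answers "how much of store A's range does store B cover", while the symmetric score treats both stores equally.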