How to count overlapping values and sort by the percentage of greatest overlap



Example:

--------------------------
| product      | store   |
--------------------------
| abc1         | 2       |
| abc2         | 2       |
| abc1         | 3       |
| abc1         | 5       |
| abc1         | 4       |
| abc2         | 3       |
| abc3         | 3       |
| abc1         | 1       |
| abc4         | 3       |
| abc5         | 3       |
--------------------------

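I am trying to get the percentage of products shared between different stores, sorted by the stores with the greatest overlap.

I really don't know how to approach this.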

Fiddle:https://www.db-fiddle.com/f/bm5c7k7q5qbQY3Qu8t5Gvx/1

I think you are looking for a Cartesian join between the stores, followed by a count of the matching SKUs.

Here is my query:

-- Total number of SKUs per store
with data as (
    select store_id, count(sku) as tot_sku
    from products
    group by store_id
)
select a.store_id as store_id,
       b.store_id as other_store_id,
       -- count the SKUs that store a shares with store b
       count(case when a.sku = b.sku then 1 end) as cnt_overlap,
       max(c.tot_sku) as tot_sku,
       count(case when a.sku = b.sku then 1 end) * 100.00 / max(c.tot_sku) as pct_overlap
from products a
join products b
  on a.store_id <> b.store_id
join data c
  on a.store_id = c.store_id
group by a.store_id, b.store_id
order by 1

+----------+----------------+-------------+---------+----------------------+
| store_id | other_store_id | cnt_overlap | tot_sku |     pct_overlap      |
+----------+----------------+-------------+---------+----------------------+
|        1 |              2 |           1 |       1 | 100.0000000000000000 |
|        1 |              3 |           1 |       1 | 100.0000000000000000 |
|        1 |              4 |           1 |       1 | 100.0000000000000000 |
|        1 |              5 |           1 |       1 | 100.0000000000000000 |
|        2 |              1 |           1 |       2 |  50.0000000000000000 |
|        2 |              3 |           2 |       2 | 100.0000000000000000 |
|        2 |              4 |           1 |       2 |  50.0000000000000000 |
|        2 |              5 |           1 |       2 |  50.0000000000000000 |
|        3 |              1 |           1 |       5 |  20.0000000000000000 |
|        3 |              2 |           2 |       5 |  40.0000000000000000 |
|        3 |              4 |           1 |       5 |  20.0000000000000000 |
|        3 |              5 |           1 |       5 |  20.0000000000000000 |
|        4 |              1 |           1 |       1 | 100.0000000000000000 |
|        4 |              2 |           1 |       1 | 100.0000000000000000 |
|        4 |              3 |           1 |       1 | 100.0000000000000000 |
|        4 |              5 |           1 |       1 | 100.0000000000000000 |
|        5 |              1 |           1 |       1 | 100.0000000000000000 |
|        5 |              2 |           1 |       1 | 100.0000000000000000 |
|        5 |              3 |           1 |       1 | 100.0000000000000000 |
|        5 |              4 |           1 |       1 | 100.0000000000000000 |
+----------+----------------+-------------+---------+----------------------+

Here is my db<>fiddle link:

https://dbfiddle.uk/?rdbms=postgres_12&fiddle=151fa1f18ac25bb0362ebc47de02dd09
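The query above orders by store_id, while the question asks for the largest overlap first. As a minimal sketch, the same query can be run against the sample rows with Python's built-in sqlite3 module and ordered by pct_overlap descending. The table layout products(sku, store_id) is taken from the query; the use of SQLite instead of Postgres, the extra tie-breaking sort columns, and the helper name overlap_rows are assumptions made only for illustration.

```python
import sqlite3

# Sample rows from the question, as (sku, store_id) pairs.
ROWS = [
    ("abc1", 2), ("abc2", 2), ("abc1", 3), ("abc1", 5), ("abc1", 4),
    ("abc2", 3), ("abc3", 3), ("abc1", 1), ("abc4", 3), ("abc5", 3),
]

# Same logic as the query above; only the ORDER BY clause differs.
QUERY = """
with data as (
    select store_id, count(sku) as tot_sku
    from products
    group by store_id
)
select a.store_id,
       b.store_id as other_store_id,
       count(case when a.sku = b.sku then 1 end) as cnt_overlap,
       max(c.tot_sku) as tot_sku,
       count(case when a.sku = b.sku then 1 end) * 100.0 / max(c.tot_sku) as pct_overlap
from products a
join products b on a.store_id <> b.store_id
join data c on a.store_id = c.store_id
group by a.store_id, b.store_id
order by pct_overlap desc, cnt_overlap desc, a.store_id, b.store_id
"""


def overlap_rows():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table products (sku text, store_id integer)")
    conn.executemany("insert into products values (?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in overlap_rows():
        print(row)
```

With this ordering, the pair (2, 3) comes first, because store 2 shares both of its SKUs with store 3.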

You can use a self-join. The following assumes you want the ratio based on the products in the first store:

select t1.store_id, t2.store_id, t1.num_products,
       count(*) * 1.0 / t1.num_products as ratio
from (select p.*, count(*) over (partition by store_id) as num_products
      from products p
     ) t1
join products t2
  on t1.sku = t2.sku
group by t1.store_id, t2.store_id, t1.num_products
order by ratio desc;

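Note that this ratio is not symmetric: the denominator is the number of products in the first store, so with the sample data the pair (2, 3) gives 2/2 = 1.0 while (3, 2) gives 2/5 = 0.4. Because the join only matches on sku, each store is also paired with itself (ratio 1.0); add t1.store_id <> t2.store_id to the join condition if you want to exclude those rows.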

Here is a db<>fiddle.
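The asymmetry of this ratio can be checked without a database. The following is a rough sketch using plain Python sets on the sample rows from the question; the helper names products_by_store and ratio are hypothetical, and it assumes each (sku, store_id) pair appears at most once.

```python
from collections import defaultdict

# Sample rows from the question, as (sku, store_id) pairs.
ROWS = [
    ("abc1", 2), ("abc2", 2), ("abc1", 3), ("abc1", 5), ("abc1", 4),
    ("abc2", 3), ("abc3", 3), ("abc1", 1), ("abc4", 3), ("abc5", 3),
]


def products_by_store(rows):
    """Group the SKUs into one set per store."""
    stores = defaultdict(set)
    for sku, store_id in rows:
        stores[store_id].add(sku)
    return stores


def ratio(stores, first, second):
    """Share of the first store's products that the second store also carries."""
    return len(stores[first] & stores[second]) / len(stores[first])


if __name__ == "__main__":
    stores = products_by_store(ROWS)
    print(ratio(stores, 2, 3))  # 1.0: both of store 2's products are in store 3
    print(ratio(stores, 3, 2))  # 0.4: only 2 of store 3's 5 products are in store 2
```

Swapping the two stores changes the denominator, which is why the two directions give different values.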
