I have a table; see the schema below:
CREATE TEMP TABLE merchants
(
merchant_id varchar(15)
,product_id varchar(15)
);
INSERT INTO merchants VALUES ('Merchant A','Product 1');
INSERT INTO merchants VALUES ('Merchant A','Product 2');
INSERT INTO merchants VALUES ('Merchant A','Product 3');
INSERT INTO merchants VALUES ('Merchant B','Product 1');
INSERT INTO merchants VALUES ('Merchant B','Product 2');
INSERT INTO merchants VALUES ('Merchant B','Product 3');
INSERT INTO merchants VALUES ('Merchant B','Product 4');
INSERT INTO merchants VALUES ('Merchant C','Product 1');
INSERT INTO merchants VALUES ('Merchant C','Product 2');
INSERT INTO merchants VALUES ('Merchant C','Product 3');
INSERT INTO merchants VALUES ('Merchant D','Product 5');
SELECT *
FROM merchants;
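I need to find pairs of merchants that sell (1) the same number of products and (2) exactly the same products.
The answer for the data above is:
Merchant A, Merchant C (in a single row).
To find this I came up with what is, IMO, a very poor solution. It doesn't scale: with more products I can't hard-code each one as a column. Also, my answer comes back as two rows rather than as one pair.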
CREATE TEMP TABLE merchants_profile AS
(
SELECT
merchant_id
, COUNT(DISTINCT product_id) AS num_products
, SUM(CASE WHEN product_id = 'Product 1' THEN 1 ELSE 0 END) AS num_1
, SUM(CASE WHEN product_id = 'Product 2' THEN 1 ELSE 0 END) AS num_2
, SUM(CASE WHEN product_id = 'Product 3' THEN 1 ELSE 0 END) AS num_3
, SUM(CASE WHEN product_id = 'Product 4' THEN 1 ELSE 0 END) AS num_4
, SUM(CASE WHEN product_id = 'Product 5' THEN 1 ELSE 0 END) AS num_5
FROM merchants
GROUP BY 1
);
SELECT A.merchant_id
FROM merchants_profile AS A
INNER JOIN
merchants_profile AS B
ON A.merchant_id <> B.merchant_id
AND (A.num_products = B.num_products
AND A.num_1 = B.num_1
AND A.num_2= B.num_2
AND A.num_3 = B.num_3
AND A.num_4 = B.num_4
AND A.num_5 = B.num_5)
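Result:
merchant_id
Merchant C
Merchant A
Can anyone think of a more creative / more efficient way to do this without using a pivot function? I need to solve this in Redshift.
As you can see, my solution is not correct, and I can't think of a way to get the pairs.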
Does this do what you need?
select distinct
listagg(product_id,',') within group (order by product_id) as product_combos,
listagg(merchant_id,',') within group (order by merchant_id) over (partition by product_combos) as merchant_list,
count(1) over (partition by product_combos) as count_of_merchants
from merchants
group by merchant_id
order by count_of_merchants desc;
The simplest approach is to use listagg(). This produces the list of merchants for each set of products:
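select products,
       listagg(merchant_id, ',') within group (order by merchant_id) as merchants
from (select merchant_id,
             -- one ordered, comma-separated product string per merchant
             listagg(product_id, ',') within group (order by product_id) as products
      from merchants  -- the base table; merchants_profile has no product_id column
      group by merchant_id
     ) m
group by products
having count(*) > 1;  -- keep only product sets shared by two or more merchants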
You can do the same thing without aggregating at the merchant level, but this is simpler.
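For reference, here is a rough sketch of what the version without listagg() might look like. It joins the table to itself on product_id, keeps only merchant pairs with equal product counts, and then checks that the number of shared products equals that count. The SQL is plain ANSI and should carry over to Redshift, but that is an assumption: the sketch only runs it through Python's built-in sqlite3 module as a stand-in, using the sample rows from the question. The names ROWS, PAIR_QUERY, find_matching_pairs, counts, merchant_1 and merchant_2 are made up for illustration, and NULL values in either column are assumed not to occur.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Merchant A", "Product 1"), ("Merchant A", "Product 2"), ("Merchant A", "Product 3"),
    ("Merchant B", "Product 1"), ("Merchant B", "Product 2"), ("Merchant B", "Product 3"),
    ("Merchant B", "Product 4"),
    ("Merchant C", "Product 1"), ("Merchant C", "Product 2"), ("Merchant C", "Product 3"),
    ("Merchant D", "Product 5"),
]

# Hypothetical query: no LISTAGG, no pivot, no hard-coded product columns.
PAIR_QUERY = """
WITH counts AS (
    SELECT merchant_id, COUNT(DISTINCT product_id) AS num_products
    FROM merchants
    GROUP BY merchant_id
)
SELECT a.merchant_id AS merchant_1,
       b.merchant_id AS merchant_2
FROM merchants AS a
JOIN merchants AS b
  ON a.product_id = b.product_id
 AND a.merchant_id < b.merchant_id        -- each pair once, never a merchant with itself
JOIN counts AS ca ON ca.merchant_id = a.merchant_id
JOIN counts AS cb ON cb.merchant_id = b.merchant_id
WHERE ca.num_products = cb.num_products   -- condition 1: same number of products
GROUP BY a.merchant_id, b.merchant_id, ca.num_products
HAVING COUNT(DISTINCT a.product_id) = ca.num_products  -- condition 2: every product is shared
ORDER BY merchant_1, merchant_2
"""


def find_matching_pairs(rows):
    """Load rows into an in-memory SQLite table and return the matching merchant pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE merchants (merchant_id varchar(15), product_id varchar(15))"
    )
    conn.executemany("INSERT INTO merchants VALUES (?, ?)", rows)
    pairs = conn.execute(PAIR_QUERY).fetchall()
    conn.close()
    return pairs


print(find_matching_pairs(ROWS))  # [('Merchant A', 'Merchant C')]
```

Using `a.merchant_id < b.merchant_id` rather than `<>` is what gives one row per pair instead of two mirrored rows. The trade-off against listagg() is that this returns pairs, so three merchants with identical catalogues would show up as three rows rather than one list.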