How to compare complete groups in SQL without using a pivot function in Redshift



I have a table; see the schema below:

CREATE TEMP TABLE merchants
(
merchant_id varchar(15)
,product_id varchar(15)
);
INSERT INTO merchants VALUES ('Merchant A','Product 1');
INSERT INTO merchants VALUES ('Merchant A','Product 2');
INSERT INTO merchants VALUES ('Merchant A','Product 3');
INSERT INTO merchants VALUES ('Merchant B','Product 1');
INSERT INTO merchants VALUES ('Merchant B','Product 2');
INSERT INTO merchants VALUES ('Merchant B','Product 3');
INSERT INTO merchants VALUES ('Merchant B','Product 4');
INSERT INTO merchants VALUES ('Merchant C','Product 1');
INSERT INTO merchants VALUES ('Merchant C','Product 2');
INSERT INTO merchants VALUES ('Merchant C','Product 3');
INSERT INTO merchants VALUES ('Merchant D','Product 5');
SELECT *
FROM merchants;

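I need to find pairs of merchants that sell (1) the same number of products and (2) exactly the same products.

The expected answer for the data above is:

Merchant A, Merchant C (in the same row).

To find this, I came up with what is, in my opinion, a pretty poor solution. It does not scale: with more products, I cannot hard-code each one as a column. Also, my result comes back as two rows rather than as a single pair.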

CREATE TEMP TABLE merchants_profile AS 
(
SELECT 
merchant_id
, COUNT(DISTINCT product_id) AS num_products
, SUM(CASE WHEN product_id = 'Product 1' THEN 1 ELSE 0 END) AS num_1
, SUM(CASE WHEN product_id = 'Product 2' THEN 1 ELSE 0 END) AS num_2
, SUM(CASE WHEN product_id = 'Product 3' THEN 1 ELSE 0 END) AS num_3
, SUM(CASE WHEN product_id = 'Product 4' THEN 1 ELSE 0 END) AS num_4
, SUM(CASE WHEN product_id = 'Product 5' THEN 1 ELSE 0 END) AS num_5
FROM merchants
GROUP BY 1
);
SELECT A.merchant_id
FROM merchants_profile AS A 
INNER JOIN 
merchants_profile AS B 
ON A.merchant_id <> B.merchant_id
AND (A.num_products = B.num_products 
AND A.num_1 = B.num_1 
AND A.num_2= B.num_2 
AND A.num_3 = B.num_3
AND A.num_4 = B.num_4
AND A.num_5 = B.num_5)

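Result:

merchant_id
Merchant C
Merchant A

Can anyone think of a more creative or more efficient way to do this without using a pivot function? I need to solve this in Redshift.

As you can see, my solution is incorrect, and I can't think of a way to return the pairs.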

Does this meet your needs?

select distinct
listagg(product_id,',') within group (order by product_id) as product_combos, 
listagg(merchant_id,',') within group (order by merchant_id) over (partition by product_combos) as merchant_list,
count(1) over (partition by product_combos) as count_of_merchants
from merchants
group by merchant_id
order by count_of_merchants desc;
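If one row per pair is preferred over a comma-separated merchant list, the same listagg() signature can be self-joined. The following is only a sketch, assuming the `merchants` temp table from the question; the names `signatures`, `product_combo`, `merchant_1`, and `merchant_2` are made up for illustration. The inner `SELECT DISTINCT` is there on the assumption that the table may contain duplicate (merchant, product) rows.

```sql
-- Build one sorted, comma-separated product signature per merchant.
WITH signatures AS (
    SELECT merchant_id,
           LISTAGG(product_id, ',') WITHIN GROUP (ORDER BY product_id) AS product_combo
    FROM (SELECT DISTINCT merchant_id, product_id FROM merchants) AS d
    GROUP BY merchant_id
)
-- Merchants with identical signatures sell exactly the same products,
-- which also implies the same number of products.
SELECT a.merchant_id AS merchant_1,
       b.merchant_id AS merchant_2
FROM signatures AS a
JOIN signatures AS b
  ON a.product_combo = b.product_combo
 AND a.merchant_id < b.merchant_id;   -- emit each unordered pair only once
```

Using `<` instead of `<>` in the join is what collapses the two mirrored rows (A, C) and (C, A) into a single pair. Note that the signature comparison is only reliable if the delimiter never appears inside a `product_id`.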

The simplest approach is to use listagg(). This produces the list of merchants for each set of products:

select products,
       listagg(merchant_id, ',') within group (order by merchant_id) as merchants
from (select merchant_id,
             listagg(product_id, ',') within group (order by product_id) as products
      from merchants  -- the raw table; merchants_profile has no product_id column
      group by merchant_id
     ) m
group by products
having count(*) > 1;

You can do the same thing without aggregating at the merchant level, but this is simpler.
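For comparison, here is one way the non-listagg() variant might look: a self-join on `product_id` that counts shared products per merchant pair. This is a sketch under the assumption that the `merchants` table from the question is used as-is; the CTE name `counts` and the output column names are hypothetical.

```sql
-- Number of distinct products each merchant sells.
WITH counts AS (
    SELECT merchant_id, COUNT(DISTINCT product_id) AS num_products
    FROM merchants
    GROUP BY merchant_id
)
SELECT a.merchant_id AS merchant_1,
       b.merchant_id AS merchant_2
FROM merchants AS a
JOIN merchants AS b
  ON a.product_id = b.product_id        -- rows for products both merchants sell
 AND a.merchant_id < b.merchant_id      -- each unordered pair only once
JOIN counts AS ca ON ca.merchant_id = a.merchant_id
JOIN counts AS cb ON cb.merchant_id = b.merchant_id
WHERE ca.num_products = cb.num_products -- condition 1: same number of products
GROUP BY a.merchant_id, b.merchant_id, ca.num_products
-- condition 2: every product of one merchant is shared with the other
HAVING COUNT(DISTINCT a.product_id) = ca.num_products;
```

On the sample data this should return the single row Merchant A, Merchant C: Merchant B shares three products with each of them but sells four, so the count check excludes it. This version avoids string concatenation entirely, at the cost of a potentially large intermediate join when many merchants share popular products.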
