I am trying to find the most frequently purchased item (identified by item_id) for each customer (customer_id).
select distinct on customer_id, most_freq_item from (
select customer_id, item_id as most_freq_item, count(*) as _count
from my_table
group by customer_id, item_id)
order by customer_id, _count desc;
This results in the error:
syntax error at or near "customer_id", right after "select distinct on".
SELECT stat.*
FROM (SELECT customer_id,
             item_id most_freq_item,
             COUNT(*) cnt,
             ROW_NUMBER() OVER (PARTITION BY customer_id
                                ORDER BY COUNT(*) DESC) seqnum
      FROM my_table
      GROUP BY customer_id,
               item_id) stat
WHERE seqnum = 1;
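As for the syntax error itself: in PostgreSQL, DISTINCT ON takes its expression list in parentheses, and a subquery in FROM needs an alias (the alias only became optional in PostgreSQL 16). A minimal sketch of the question's query with those two points fixed, assuming PostgreSQL and the table and column names from the question; the alias `counts` is made up for illustration:

```sql
-- Assumes PostgreSQL; DISTINCT ON is a PostgreSQL extension, not standard SQL.
SELECT DISTINCT ON (customer_id)          -- parentheses are required here
       customer_id, most_freq_item
FROM (
    SELECT customer_id, item_id AS most_freq_item, COUNT(*) AS _count
    FROM my_table
    GROUP BY customer_id, item_id
) counts                                  -- hypothetical alias for the derived table
-- ORDER BY must start with the DISTINCT ON expression; the highest count
-- then decides which row is kept for each customer.
ORDER BY customer_id, _count DESC;
```

Like row_number(), this keeps only one row per customer, so when several items share the top count the one returned is arbitrary.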
You can use window functions:
select ci.*
from (select customer_id, item_id as most_freq_item,
count(*) as cnt,
row_number() over (partition by customer_id order by count(*) desc) as seqnum
from my_table
group by customer_id, item_id
) ci
where seqnum = 1;
In the event of a tie, this returns an arbitrary one of the most frequent items. If you need all of them, use rank() instead of row_number().
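For illustration, the tie-preserving variant might look like the sketch below. It assumes the same my_table as above; the column alias `rnk` is made up here.

```sql
-- rank() gives every item tied for the highest count the value 1,
-- so a customer with a tie gets one row per tied item.
SELECT ci.*
FROM (SELECT customer_id, item_id AS most_freq_item,
             COUNT(*) AS cnt,
             RANK() OVER (PARTITION BY customer_id ORDER BY COUNT(*) DESC) AS rnk
      FROM my_table
      GROUP BY customer_id, item_id
     ) ci
WHERE rnk = 1;
```

If a single but repeatable result is preferred instead, one option is to keep row_number() and add a tie-breaker to the window ordering, for example ORDER BY COUNT(*) DESC, item_id.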