SQL Server 2014
我有一个包含以下内容的表格:-
[id ] [transaction_date] [customer_id]
1 2019-11-18 00:00:00.000 1234
2 2019-11-18 00:00:00.000 1234
3 2019-11-18 00:00:00.000 5687
4 2019-11-19 00:00:00.000 9999
5 2019-11-19 00:00:00.000 8888
...
客户可以在一天内进行多笔交易。
我需要返回自 2019 年 9 月 1 日以来每天的前 10 名客户(基于交易数量(,例如客户 1234 在 2019 年 11 月 18 日进行了 2 笔交易。
我该怎么做?
我从以下的初始查询开始:-
SELECT COUNT(*) as transactions, customer_id, DATEADD(DAY,0, DATEDIFF(day,0, transaction_date)) as 'transaction_date'
FROM InvoiceTable
WHERE transaction_date > {ts '2019-09-01 00:00:00'}
GROUP BY customer_id, DATEADD(DAY,0, DATEDIFF(DAY,0, transaction_date))
ORDER BY transaction_date
但在那之后并没有取得更大的进展。
要提取日期,请使用转换为date
。dateadd()
方法已经过时了十多年。
可以组合窗口函数和聚合,因此查询可能如下所示:
SELECT t.*
FROM (SELECT CONVERT(DATE, transaction_date) as transaction_date, customer_id,
COUNT(*) as transactions,
ROW_NUMBER() OVER (PARTITION BY CONVERT(DATE, transaction_date) ORDER BY COUNT(*) DESC) as seqnum
FROM InvoiceTable
WHERE transaction_date > {ts '2019-09-01 00:00:00'}
GROUP BY CONVERT(DATE, transaction_date), customer_id
) t
WHERE seqnum <= 10
ORDER BY transaction_date;
请注意,这始终返回 10 行(如果每天有 10 个客户(。 如果存在绑定,则返回任意绑定的客户,并且返回的客户可能会因查询的不同运行而异。
可以将现有查询转换为子查询,然后使用窗口函数RANK()
按每天的每日事务对客户进行排名,然后筛选每组的前 10 个客户:
SELECT transactions, customer_id, transaction_date
FROM (
SELECT
t.*,
RANK() OVER(PARTITION BY transaction_date ORDER BY transactions DESC) rn
FROM (
SELECT
COUNT(*) as transactions,
customer_id,
DATEADD(DAY,0, DATEDIFF(day,0, transaction_date)) as transaction_date
FROM InvoiceTable
WHERE transaction_date > {ts '2019-09-01 00:00:00'}
GROUP BY
customer_id,
DATEADD(DAY,0, DATEDIFF(DAY,0, transaction_date))
) t
) t
WHERE rn <= 10
ORDER BY transaction_date, transactions desc
旁注:SQL Server在混合聚合和窗口函数方面非常灵活,因此这也可能有效:
SELECT transactions, customer_id, transaction_date
FROM (
SELECT
COUNT(*) as transactions,
customer_id,
DATEADD(DAY,0, DATEDIFF(day,0, transaction_date)) as transaction_date,
RANK() OVER(PARTITION BY transaction_date ORDER BY COUNT(*) DESC) rn
FROM InvoiceTable
WHERE transaction_date > {ts '2019-09-01 00:00:00'}
GROUP BY
customer_id,
DATEADD(DAY,0, DATEDIFF(DAY,0, transaction_date))
) t
WHERE rn <= 10
ORDER BY transaction_date, transactions desc
您可以使用row_number()
with cte as
(SELECT COUNT(*) as transactions, customer_id, DATEADD(DAY,0, DATEDIFF(day,0, transaction_date)) as 'transaction_date',
FROM InvoiceTable
WHERE transaction_date > {ts '2019-09-01 00:00:00'}
GROUP BY customer_id, DATEADD(DAY,0, DATEDIFF(DAY,0, transaction_date))
ORDER BY transaction_date
)
select * from
(
select *,row_number() over(partition by transaction_date order by transactions desc) as rn from cte
)A where rn<11