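Given a table with the following structure:
subscriber_id, band
11, 1
12, 1
13, 1
…
21, 2
22, 2
23, 2
24, 2
…
n1, n
n2, n
n3, n
…
nm, n
I want to get an n% subgroup of subscribers from each group. For 10%, I should get 10% of the first group, 10% of the second group, and so on up to 10% of group n.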
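It sounds like you want a stratified sample. You can first enumerate the rows within each group and then select the "n" records you need. Here is an example of how to do this in SQL Server: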
select t.subscriber_id, t.band
from (select t.*,
             -- random position of each row within its band
             row_number() over (partition by band order by newid()) as band_seqnum,
             -- number of rows in the same band
             count(*) over (partition by band) as cnt
      from t
     ) t
-- keep the first 10% of each band
where band_seqnum <= 0.10 * cnt;
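To check the per-group logic on a small data set, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. It is not the SQL Server query itself: it assumes SQLite 3.25 or later (for window functions), uses SQLite's random() in place of newid(), and builds a hypothetical table t with three bands of 50, 100 and 200 subscribers. The helper names and the sizes are illustrative and do not come from the question.

```python
import sqlite3


def build_demo_table(sizes):
    """Create an in-memory table t(subscriber_id, band) with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (subscriber_id integer primary key, band integer)")
    for band, size in sizes.items():
        conn.executemany(
            "insert into t (subscriber_id, band) values (?, ?)",
            [(band * 1000 + i, band) for i in range(1, size + 1)],
        )
    return conn


def stratified_sample(conn, fraction):
    """Return {band: [subscriber_id, ...]} holding about `fraction` of each band."""
    rows = conn.execute(
        """
        select subscriber_id, band
        from (select t.*,
                     -- random position of the row inside its band
                     row_number() over (partition by band order by random()) as band_seqnum,
                     -- size of the band the row belongs to
                     count(*) over (partition by band) as cnt
              from t) as s
        where band_seqnum <= ? * cnt
        """,
        (fraction,),
    ).fetchall()
    sample = {}
    for subscriber_id, band in rows:
        sample.setdefault(band, []).append(subscriber_id)
    return sample


# Hypothetical band sizes chosen so that 10% is a whole number in each band.
conn = build_demo_table({1: 50, 2: 100, 3: 200})
sample = stratified_sample(conn, 0.10)
counts = {band: len(ids) for band, ids in sorted(sample.items())}
print(counts)  # {1: 5, 2: 10, 3: 20}
```

Which subscribers are picked changes from run to run, but the number taken from each band stays at 10% of that band. When 10% of a band is not a whole number, the `<=` comparison rounds the sample size down.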
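Try this:
select *
from (
    -- ntile(100) splits each band into up to 100 randomly ordered buckets
    select *, ntile(100) over (partition by band order by newid()) as R
    from t
) t
where t.R <= 10;  -- n = 10 for a 10% sample; bands with fewer than 100 rows are over-sampled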
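One caveat with the NTILE approach, reading the n% placeholder as ntile(100) with a cut at n: when a group has fewer rows than buckets, each row gets its own bucket, so the cut keeps n rows rather than n% of the group. The sketch below illustrates this with Python's sqlite3 module, assuming SQLite 3.25 or later and random() as a stand-in for newid(); the table contents and the helper name are hypothetical.

```python
import sqlite3

# Hypothetical band sizes; band 1 has fewer rows than the 100 buckets.
SIZES = {1: 50, 2: 200}

conn = sqlite3.connect(":memory:")
conn.execute("create table t (subscriber_id integer primary key, band integer)")
for band, size in SIZES.items():
    conn.executemany(
        "insert into t (subscriber_id, band) values (?, ?)",
        [(band * 1000 + i, band) for i in range(1, size + 1)],
    )


def ntile_sample_counts(conn, percent):
    """Count the rows per band kept by `ntile(100) ... where R <= percent`."""
    rows = conn.execute(
        """
        select band, count(*)
        from (select band,
                     ntile(100) over (partition by band order by random()) as R
              from t) as s
        where R <= ?
        group by band
        order by band
        """,
        (percent,),
    ).fetchall()
    return dict(rows)


ntile_counts = ntile_sample_counts(conn, 10)
for band, kept in ntile_counts.items():
    # band, rows kept, share of the band that was kept
    print(band, kept, f"{kept / SIZES[band]:.0%}")
```

In this setup the 200-row band keeps 20 rows (10%), while the 50-row band keeps 10 rows (20%). If small groups matter, comparing row_number() against a per-group count gives tighter control over the sample size.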