I have a dataset of 10 values, and I want to assign each value to a bin according to its percentile.
My data:
stock value
s_001 -0.001932
s_002 0.004001
s_003 0.001323
s_004 -0.006785
s_005 0.004405
s_006 -0.002872
s_007 0.003101
s_008 0.001383
s_009 -0.004785
s_010 0.001405
Percentiles:
breakpoints = [0, 20, 40, 60, 80]
I used df.sort_values to sort the values in order:
stock value
s_001 -0.001932
s_006 -0.002872
s_009 -0.004785
s_004 -0.006785
s_003 0.001323
s_008 0.001383
s_010 0.001405
s_007 0.003101
s_002 0.004001
s_005 0.004405
After sorting, how can I assign the first two values to the first percentile, the next two values to the second percentile, and so on?
You can use pandas.qcut. The breakpoints need to be given as numbers between 0 and 1:
breakpoints = [0. , 0.2, 0.4, 0.6, 0.8]
df['quantile'] = pd.qcut(df['value'],
                         breakpoints+[1],
                         labels=[int(i*100) for i in breakpoints]
                        )
NB. The DataFrame does not need to be sorted for this to work.
Output:
stock value quantile
0 s_001 -0.001932 20
1 s_002 0.004001 80
2 s_003 0.001323 40
3 s_004 -0.006785 0
4 s_005 0.004405 80
5 s_006 -0.002872 20
6 s_007 0.003101 60
7 s_008 0.001383 40
8 s_009 -0.004785 0
9 s_010 0.001405 60
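For anyone who wants to run this end to end, here is a minimal self-contained sketch of the approach above. The DataFrame is rebuilt from the values in the question; the `counts` variable and the per-bin count at the end are my own additions for a quick sanity check, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "stock": [f"s_{i:03d}" for i in range(1, 11)],
    "value": [-0.001932, 0.004001, 0.001323, -0.006785, 0.004405,
              -0.002872, 0.003101, 0.001383, -0.004785, 0.001405],
})

# Lower bound of each bin, expressed as a fraction between 0 and 1.
breakpoints = [0.0, 0.2, 0.4, 0.6, 0.8]

# qcut also needs the right-most edge (1); each bin is labelled with
# its lower percentile bound (0, 20, 40, 60, 80).
df["quantile"] = pd.qcut(
    df["value"],
    breakpoints + [1],
    labels=[int(b * 100) for b in breakpoints],
)

# Sanity check (an addition, not in the original answer):
# number of rows that landed in each bin, in bin order.
counts = df["quantile"].value_counts(sort=False)

print(df)
print(counts)
```

With 10 distinct values and 5 equal-width quantile ranges, each bin should end up with two rows. If the data contained many repeated values, the computed bin edges could coincide and qcut would raise an error unless told how to handle duplicate edges, so this sketch assumes the values are distinct, as they are here.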