I have a CSV file like the one below, with more than 10,000 rows:

```none
ID      Ref_r
235R23  3
56982B  3
62C879  blank
625478  11
9S4284  11
985U12  11
524555  58
99L852  60
1024T4  58
102W49  3
258q34  blank
...
```
I'd like to calculate the frequency of each value in the column Ref_r (1 to 99), which consists of integers and blanks. I tried:

```python
df = pd.DataFrame(data)
df1 = df['Ref_r'].value_counts()
```

However, it doesn't work.

The expected result would be:
```none
Ref_r Frequency
1 0
2 0
3 3
...
11 3
...
58 2
59 0
60 1
...
99 0
blank 2
```
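IIUC, you can use pandas with `value_counts` and `reindex`:

```python
import pandas as pd

# Read every column as a string so the numbers and 'blank' share one dtype
(pd.read_csv('input.csv', sep=r'\s+', dtype='str')['Ref_r']
   .value_counts(sort=False)
   # add the labels that never occur (count 0) and fix the row order
   .reindex(list(map(str, range(1, 100))) + ['blank'], fill_value=0)
   .rename_axis('Ref_r')
   .reset_index(name='Frequency')
   # write a tab-separated file
   .to_csv('out.csv', index=False, sep='\t')
)
```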
Example output:

```none
Ref_r Frequency
1 0
2 0
3 3
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 3
12 0
...
98 0
99 0
blank 2
```
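To check the approach without touching a file, here is a minimal sketch that runs the same chain on the 11 sample rows from the question, read from an in-memory string. The `SAMPLE` string and the helper name `ref_r_frequency` are made up for illustration, and the `fillna('blank')` step is an assumption: it only matters if some rows have a truly empty `Ref_r` cell instead of the literal word `blank`.

```python
import io

import pandas as pd

# The 11 sample rows from the question, whitespace-separated.
SAMPLE = """\
ID Ref_r
235R23 3
56982B 3
62C879 blank
625478 11
9S4284 11
985U12 11
524555 58
99L852 60
1024T4 58
102W49 3
258q34 blank
"""


def ref_r_frequency(source):
    """Count every label '1'..'99' plus 'blank', keeping zero counts."""
    # Read as strings so '3' and 'blank' live in the same column dtype.
    ref = pd.read_csv(source, sep=r"\s+", dtype=str)["Ref_r"]
    # Assumption: a missing cell (NaN) should be counted as 'blank' too.
    ref = ref.fillna("blank")
    labels = [str(i) for i in range(1, 100)] + ["blank"]
    return (
        ref.value_counts()                  # only labels that actually occur
        .reindex(labels, fill_value=0)      # add absent labels as 0, in order
        .rename_axis("Ref_r")
        .reset_index(name="Frequency")
    )


freq = ref_r_frequency(io.StringIO(SAMPLE))
# Show only the labels that occur: 3, 11, 58, 60 and blank.
print(freq[freq["Frequency"] > 0].to_string(index=False))
```

A plain `value_counts()` only returns the labels present in the data, which is why the `reindex` step is needed to get the rows with a frequency of 0. Passing a file path instead of the `StringIO` object should work the same way.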