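I've been trying to group some state data together. Here is what my data looks like, for example, with Date as the index and the remaining columns as features: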
Let's do this with replace and groupby, aggregating with sum and first:
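import numpy as np

# Convert placeholder strings ("''" and 'nan') to real NaN so that 'first' skips them
df.State = df.State.replace({"''": np.nan, 'nan': np.nan})
# Sum the numeric columns per (Region, Date) and keep the first non-null State
out = (df.groupby(['Region', 'Date'], as_index=False)
         .agg({'Population': 'sum',
               'Num_Men': 'sum',
               'Num_Women': 'sum',
               'State': 'first'}))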
Out[99]:
   Region        Date  Population  Num_Men  Num_Women State
0  Middle  2020-02-01        2000      950       1050    GL
1   North  2020-01-01         500      300        200    NY
2   North  2020-02-01         600      400        200    NY
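
Since the original input frame isn't shown here, the following is a minimal, self-contained sketch of the replace-then-groupby approach above. The sample rows are invented for illustration, and Date is assumed to be an ordinary column rather than the index; the values were chosen only so that the aggregated totals match the output shown.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not the asker's real frame); Date is a plain column here.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01", "2020-02-01"],
    "Region": ["North", "North", "North", "Middle", "Middle"],
    "State": ["NY", "''", "NY", "nan", "GL"],
    "Population": [200, 300, 600, 1200, 800],
    "Num_Men": [120, 180, 400, 550, 400],
    "Num_Women": [80, 120, 200, 650, 400],
})

# Turn the placeholder strings into real missing values.
df["State"] = df["State"].replace({"''": np.nan, "nan": np.nan})

# Sum the counts within each (Region, Date) group; 'first' returns the
# first non-null State in the group, so the placeholders are ignored.
out = (
    df.groupby(["Region", "Date"], as_index=False)
      .agg({"Population": "sum",
            "Num_Men": "sum",
            "Num_Women": "sum",
            "State": "first"})
)
print(out)
```

The replace step matters because the strings "''" and 'nan' are not missing values as far as pandas is concerned; without it, 'first' could return one of those placeholders instead of the real state code.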