Adding a new index to a MultiIndex as a count of the first level

I want to add a new index level between the existing levels 'Warnings' and 'equip', holding the sum of the 'count per equip' column for each 'Warnings' value.

import pandas as pd

idx = pd.MultiIndex.from_product([['warning1', 'warning2', 'warning3'],
                                  ['ff0001', 'ff0002', 'ff0003']],
                                 names=['Warnings', 'equip'])
col = ['count per equip']
df = pd.DataFrame([100, 2, 1, 44, 45, 20, 25, 98, 0], idx, col)
df

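The resulting DataFrame should keep the same level-0 'Warnings' entries, each paired with its total, which in this example is [103, 109, 123] respectively.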

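I have managed to compute the sum and to insert an index level in the right place separately, but when I try to do both together, all values come out as NaN: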

df = df.assign(total=df.groupby(level=[0]).size()).set_index('total', append=True).reorder_levels(['Warnings','total','equip'])

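A plain groupby aggregation cannot be used directly in assign here: its result is indexed by 'Warnings' only, so it does not align with the two-level index of df, which is the likely cause of the NaN values. The code below first recreates the same data.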

import pandas as pd

idx = pd.MultiIndex.from_product([['warning1', 'warning2', 'warning3'],
                                  ['ff0001', 'ff0002', 'ff0003']],
                                 names=['Warnings', 'equip'])
col = ['count per equip']
df = pd.DataFrame([100, 2, 1, 44, 45, 20, 25, 98, 0], idx, col)
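
To see why the aggregated result does not fit the frame, it can help to compare the index of an aggregation with the index of a transform. This is a minimal sketch on the same data; the names grouped, agg and aligned are illustrative and not part of the original post.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [['warning1', 'warning2', 'warning3'], ['ff0001', 'ff0002', 'ff0003']],
    names=['Warnings', 'equip'],
)
df = pd.DataFrame(
    [100, 2, 1, 44, 45, 20, 25, 98, 0],
    index=idx,
    columns=['count per equip'],
)

grouped = df.groupby(level='Warnings')['count per equip']

# Aggregation: one row per 'Warnings' value, so a single-level index.
agg = grouped.size()

# Transform: one value per original row, so the two-level index is kept.
aligned = grouped.transform('size')

print(agg.index.nlevels, aligned.index.nlevels)
print(aligned.index.equals(df.index))
```

Because aligned shares the index of df, it can be assigned as a new column without any realignment, whereas agg has only three rows keyed by 'Warnings'.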

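Group on level 0 and broadcast the group size back to every row. Note that x.size is the number of rows in each group (3 here), not the sum of 'count per equip':

# transform returns a result aligned with the original two-level index
df['total'] = df.groupby(level=0)['count per equip'].transform(lambda x: x.size)
# move 'total' into the index, between 'Warnings' and 'equip'
df = df.set_index('total', append=True).reorder_levels(['Warnings', 'total', 'equip'])
print(df)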
                       count per equip
Warnings total equip                  
warning1 3     ff0001              100
               ff0002                2
               ff0003                1
warning2 3     ff0001               44
               ff0002               45
               ff0003               20
warning3 3     ff0001               25
               ff0002               98
               ff0003                0
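
The output above labels each group with its row count. The question asks for the per-group sum instead (103, 109 and 123), so a variant using transform('sum') may be closer to what is wanted. This is a sketch under that assumption, not the original answer's code; the names total and result are illustrative.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [['warning1', 'warning2', 'warning3'], ['ff0001', 'ff0002', 'ff0003']],
    names=['Warnings', 'equip'],
)
df = pd.DataFrame(
    [100, 2, 1, 44, 45, 20, 25, 98, 0],
    index=idx,
    columns=['count per equip'],
)

# Per-group sum, broadcast back onto every row of the original index.
total = df.groupby(level='Warnings')['count per equip'].transform('sum')

result = (
    df.assign(total=total)                     # aligned, so no NaN
      .set_index('total', append=True)         # levels: Warnings, equip, total
      .reorder_levels(['Warnings', 'total', 'equip'])
)
print(result)
```

Here the 'total' level carries 103, 109 and 123 for warning1, warning2 and warning3, and the 'count per equip' values are unchanged.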
