Dividing each row in pandas by the row that shares its identifier


   id   item  val
0  a   tire    5
1  a  brick    5
2  b  wheel    9
3  b  brick    6
4  c    ice    6
5  c  brick    3
6  d  brick    3
7  d  grass    6

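Suppose I have this DataFrame. I want to divide the value of "brick" by the value of each item that shares an id with "brick". The end result should look like this: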

   id   item  val  per/brick
0  a   tire    5  1
1  a  brick    5  1
2  b  wheel    9  0.7
3  b  brick    6  1
4  c    ice    6  .5
5  c  brick    3  1
6  d  brick    3  .5
7  d  grass    6  2

I tried to build the column with a for loop:

perbrick=[]
for x in df['id']:
    if df[df['id']==x & df['item']!='brick']:
        perbrick.append(df[(df['id']==x)&(df['item']=='brick')]['val']/df[df['id']==x]['val'])
    else:
        perbrick.append(1)

However, this only produces TypeError: Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]

Thanks in advance

Edit: I want to divide the brick's value by the value of each of the other items
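On the TypeError itself: the likely cause is operator precedence. In Python, `&` binds more tightly than `==` and `!=`, so `x & df['item']` is evaluated before either comparison, and pandas cannot apply `&` between a string and a column of strings. A minimal sketch of the parenthesized condition, assuming the frame from the question is rebuilt as below (`mask` and `non_brick_rows` are illustrative names):

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "id":   ["a", "a", "b", "b", "c", "c", "d", "d"],
    "item": ["tire", "brick", "wheel", "brick", "ice", "brick", "brick", "grass"],
    "val":  [5, 5, 9, 6, 6, 3, 3, 6],
})

x = "b"
# Parenthesize each comparison so that & combines two boolean Series.
mask = (df["id"] == x) & (df["item"] != "brick")
non_brick_rows = df[mask]
print(non_brick_rows)
```

Even with the parentheses in place, `if df[...]:` would still fail, because the truth value of a DataFrame is ambiguous; a check such as `mask.any()` would be needed instead. The vectorized approaches avoid the loop altogether.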

You can use groupby.transform:

bricks = (df['val']
          .where(df['item'].eq('brick'), 0)
          .groupby(df['id']).transform('sum')
         )
df['per/brick'] = df['val'].rdiv(bricks).round(1)

Output:

  id   item  val  per/brick
0  a   tire    5        1.0
1  a  brick    5        1.0
2  b  wheel    9        0.7
3  b  brick    6        1.0
4  c    ice    6        0.5
5  c  brick    3        1.0
6  d  brick    3        1.0
7  d  grass    6        0.5
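To see why this works, it can help to look at the intermediate steps one at a time. The sketch below rebuilds the example frame and splits the chain apart; `brick_only` is an illustrative name that does not appear in the answer above.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "id":   ["a", "a", "b", "b", "c", "c", "d", "d"],
    "item": ["tire", "brick", "wheel", "brick", "ice", "brick", "brick", "grass"],
    "val":  [5, 5, 9, 6, 6, 3, 3, 6],
})

# Step 1: keep val on brick rows only and put 0 everywhere else.
brick_only = df["val"].where(df["item"].eq("brick"), 0)

# Step 2: summing within each id and broadcasting back with transform
# gives every row the brick value of its own group.
bricks = brick_only.groupby(df["id"]).transform("sum")

# Step 3: rdiv is reversed division, so this computes bricks / val.
df["per/brick"] = df["val"].rdiv(bricks).round(1)

print(bricks.tolist())
print(df)
```

This assumes each id has exactly one brick row. With several brick rows per id the sum would add them together, and with none the group's result would be 0.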

Here is one way to do it:

#create a dictionary of id, val for brick items
d=dict(df.loc[df['item'].eq('brick')][['id','val']].values)

The resulting dictionary:

{'a': 5, 'b': 6, 'c': 3, 'd': 3}
# divide by the brick mapped value
df['perbrick']=df['val'].div( df['id'].map(d)  )
df
    id  item    val  perbrick
0   a   tire    5    1.0
1   a   brick   5    1.0
2   b   wheel   9    1.5
3   b   brick   6    1.0
4   c   ice     6    2.0
5   c   brick   3    1.0
6   d   brick   3    1.0
7   d   grass   6    2.0
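Note that `div` here computes val / brick, which is the inverse of what the edit to the question asks for. A minimal sketch of the same mapping idea in the other direction, assuming one brick row per id (`brick_val` is an illustrative name, and a Series indexed by id stands in for the dictionary):

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "id":   ["a", "a", "b", "b", "c", "c", "d", "d"],
    "item": ["tire", "brick", "wheel", "brick", "ice", "brick", "brick", "grass"],
    "val":  [5, 5, 9, 6, 6, 3, 3, 6],
})

# Brick value per id, as a Series indexed by id (assumes one brick per id).
brick_val = df.loc[df["item"].eq("brick")].set_index("id")["val"]

# rdiv reverses the operands: mapped brick value / val.
df["perbrick"] = df["val"].rdiv(df["id"].map(brick_val)).round(1)
print(df)
```

With this change the column matches the output of the groupby.transform answer.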
