Using a lambda with pandas to compute a new column from existing columns



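I need to create a new column in a pandas DataFrame, computed as the ratio of two existing columns. However, the denominator of the ratio changes depending on the string value in another column of the DataFrame.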

Example. Sample dataset:

import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'],
                        'exp_force' : [25,28,82,84],
                        'left_max'  : [38,38,38,38],
                        'both_max'  : [90,90,90,90]})

I need to create a new DataFrame column df['ratio'] based on the value of df['hand']:

If df['hand']=='left', then df['ratio'] = df['exp_force'] / df['left_max']

If df['hand']=='both', then df['ratio'] = df['exp_force'] / df['both_max']

You can use np.where():

import numpy as np
import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'],
                        'exp_force' : [25,28,82,84],
                        'left_max'  : [38,38,38,38],
                        'both_max'  : [90,90,90,90]})
# Rows where hand == 'left' divide by left_max; all other rows divide by both_max
df['ratio'] = np.where((df['hand']=='left'), df['exp_force'] / df['left_max'], df['exp_force'] / df['both_max'])
df
Out[42]: 
   hand  exp_force  left_max  both_max     ratio
0  left         25        38        90  0.657895
1  left         28        38        90  0.736842
2  both         82        38        90  0.911111
3  both         84        38        90  0.933333

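Alternatively, in real-world code where you have many conditions and results, you can use np.select(), so you don't have to keep repeating np.where() calls the way I did in my older code. In that situation np.select is the better choice: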

import numpy as np
import pandas as pd
df = pd.DataFrame(data={'hand'      : ['left','left','both','both'],
                        'exp_force' : [25,28,82,84],
                        'left_max'  : [38,38,38,38],
                        'both_max'  : [90,90,90,90]})
# One boolean condition per case, paired by position with its result
c1 = (df['hand']=='left')
c2 = (df['hand']=='both')
r1 = df['exp_force'] / df['left_max']
r2 = df['exp_force'] / df['both_max']
conditions = [c1,c2]
results = [r1,r2]
df['ratio'] = np.select(conditions,results)
df
Out[430]: 
   hand  exp_force  left_max  both_max     ratio
0  left         25        38        90  0.657895
1  left         28        38        90  0.736842
2  both         82        38        90  0.911111
3  both         84        38        90  0.933333
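
One thing to watch with np.select(): rows that match none of the conditions receive the default value, which is 0 unless you say otherwise. The sketch below passes default=np.nan so such rows are visibly missing rather than silently zero. The 'right' row is a made-up value added only to trigger the unmatched case; it is not part of the original data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'hand': ['left', 'both', 'right'],  # 'right' is hypothetical: it matches no condition
    'exp_force': [25, 82, 30],
    'left_max': [38, 38, 38],
    'both_max': [90, 90, 90],
})

conditions = [df['hand'] == 'left', df['hand'] == 'both']
results = [df['exp_force'] / df['left_max'], df['exp_force'] / df['both_max']]

# default=np.nan marks rows that match no condition instead of writing 0
df['ratio'] = np.select(conditions, results, default=np.nan)
print(df)
```

With this setup the 'right' row ends up as NaN in df['ratio'], which is easy to spot with df['ratio'].isna().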

Using enumerate():

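for i,e in enumerate(df['hand']):
    # i is a position, while df.at looks rows up by label,
    # so this relies on the default 0..n-1 index
    if e == 'left':
        df.at[i,'ratio'] = df.at[i,'exp_force'] / df.at[i,'left_max']
    elif e == 'both':
        df.at[i,'ratio'] = df.at[i,'exp_force'] / df.at[i,'both_max']
df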

Output:

hand    exp_force   left_max    both_max    ratio
0   left    25            38          90      0.657895
1   left    28            38          90      0.736842
2   both    82            38          90      0.911111
3   both    84            38          90      0.933333
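
The enumerate() loop pairs positional counters with df.at, which addresses rows by index label, so it only lines up when the DataFrame has the default integer index. A minimal sketch of a label-safe variant follows, assuming a non-default index ('a' to 'd' is invented here purely for illustration) and using Series.items() to get each row's label together with its value.

```python
import pandas as pd

df = pd.DataFrame(
    {'hand': ['left', 'left', 'both', 'both'],
     'exp_force': [25, 28, 82, 84],
     'left_max': [38, 38, 38, 38],
     'both_max': [90, 90, 90, 90]},
    index=['a', 'b', 'c', 'd'],  # hypothetical non-default index
)

# Create the target column up front so every assignment hits an existing float column
df['ratio'] = float('nan')

# Series.items() yields (index label, value) pairs, which is what df.at expects
for label, hand in df['hand'].items():
    if hand == 'left':
        df.at[label, 'ratio'] = df.at[label, 'exp_force'] / df.at[label, 'left_max']
    elif hand == 'both':
        df.at[label, 'ratio'] = df.at[label, 'exp_force'] / df.at[label, 'both_max']

print(df)
```

A Python-level loop like this is usually slower than np.where() or np.select() on large frames, so it is mostly useful for small data or for debugging.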

You can use the DataFrame's apply() method:

df['ratio'] = df.apply(
    lambda x: x['exp_force'] / x['left_max'] if x['hand']=='left' else x['exp_force'] / x['both_max'],
    axis=1
)
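
If more hand values appear later, the if/else inside the lambda grows quickly. As a sketch only, the row function below builds the denominator's column name from the value in 'hand'. It assumes every value in 'hand' has a matching '<hand>_max' column, which holds for the sample data but is a naming assumption rather than something the question guarantees. The name ratio_for_row is made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'hand': ['left', 'left', 'both', 'both'],
    'exp_force': [25, 28, 82, 84],
    'left_max': [38, 38, 38, 38],
    'both_max': [90, 90, 90, 90],
})

def ratio_for_row(row):
    # Assumption: the denominator lives in a column named '<hand>_max'
    return row['exp_force'] / row[f"{row['hand']}_max"]

# axis=1 passes each row to the function as a Series
df['ratio'] = df.apply(ratio_for_row, axis=1)
print(df)
```

A row whose 'hand' value has no matching column raises a KeyError here, which may be preferable to silently falling through to the else branch.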
