Calculate a score for each row based on multiple column conditions in pandas, avoiding a for loop



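I want to calculate a score based on conditions on several columns. At the moment I use iteration and a for loop to compute the values, but I would like to know whether there is a more efficient way to get the same result without the for loop.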

import pandas as pd

df = pd.DataFrame([
    {'column1': 1, 'column2': ['low grade'], 'column3': True},
    {'column1': 8, 'column2': ['low grade', 'medium grade'], 'column3': False},
    {'column1': 7, 'column2': ['high grade'], 'column3': True},
])

for index, resource in df.iterrows():
    i = 0
    i = i + df.apply(lambda x: 0 if (x['column1'] == 1)
                     else (3 if x['column1'] > 1 and x['column1'] < 8
                           else (6 if x['column1'] >= 8
                                 else 0)), axis=1)
    i = i + df.apply(lambda x: 1 if ("high" in str(x['column2']))
                     else (2 if "low" in str(x['column2'])
                           else 0), axis=1)
    i = i + df.apply(lambda x: 1 if (x['column3'] == True)
                     else 0, axis=1)
    df["score"] = i
    df["critical"] = df.apply(lambda x: True if ((x['score'] < 5) and
                                                 ("low" in str(x['column2']) or
                                                  "high" in str(x['column2'])))
                              else False, axis=1)

You can do it like this:

def calculate_score(x):
    score = 0
    if x['column1'] > 1 and x['column1'] < 8:
        score += 3
    elif x['column1'] >= 8:
        score += 6
    else:
        score += 0

    if "high" in str(x['column2']):
        score += 1
    elif "low" in str(x['column2']):
        score += 2
    else:
        score += 0

    if x['column3'] == True:
        score += 1
    else:
        score += 0

    return score

df['score'] = df.apply(calculate_score, axis=1)
df["critical"] = df.apply(
    lambda x: True if ((x['score'] < 5) and
                       ("low" in str(x['column2']) or "high" in str(x['column2'])))
    else False, axis=1)

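This avoids iterrows and replaces the several apply calls with a single one. Note that apply with axis=1 still calls the function once per row in Python, so the code is tidier but not truly vectorized.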
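If the row-wise apply is still too slow, the same rules can probably be expressed with column-level operations. The following is only a sketch, assuming the sample data from the question: it uses `numpy.select` for the if/elif chains and pandas string methods for the substring checks. The names `col2`, `has_high`, `has_low` and `score1` to `score3` are illustrative and do not appear in the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    {'column1': 1, 'column2': ['low grade'], 'column3': True},
    {'column1': 8, 'column2': ['low grade', 'medium grade'], 'column3': False},
    {'column1': 7, 'column2': ['high grade'], 'column3': True},
])

# Stringify the list column once, mirroring str(x['column2']) in the original.
# (map(str) is still an element-wise call, but it runs only once per row.)
col2 = df['column2'].map(str)
has_high = col2.str.contains('high', regex=False)
has_low = col2.str.contains('low', regex=False)

# np.select returns the choice for the first condition that is True,
# which matches the if/elif order of calculate_score.
score1 = np.select(
    [(df['column1'] > 1) & (df['column1'] < 8), df['column1'] >= 8],
    [3, 6],
    default=0,
)
score2 = np.select([has_high, has_low], [1, 2], default=0)
score3 = df['column3'].astype(int)  # True -> 1, False -> 0

df['score'] = score1 + score2 + score3
df['critical'] = (df['score'] < 5) & (has_low | has_high)

print(df[['score', 'critical']])
```

On the three sample rows this should produce the same score and critical columns as the apply version. Whether it is noticeably faster depends on the size of the DataFrame, so it is worth timing both on real data.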
