Create a new conditional Series from multiple DataFrame columns



I want to create a new DataFrame column from several existing DataFrame columns in a single statement.

For example, I currently use three separate assignments:

df.loc[(df[['A','B']].mean(axis=1) <= Const1), 'D'] = df['F']
df.loc[(df[['A','B']].mean(axis=1) >  Const1) & (df['E']<=Const2), 'D'] = df['G']
df.loc[(df[['A','B']].mean(axis=1) >  Const1) & (df['E']> Const2), 'D'] = df['H']

df['D'] = 'some statement'  # what I want: one statement replacing the three above

Use np.select, as suggested in the link provided. It works the same way even when the values you choose from are Series rather than constants.

import pandas as pd
import numpy as np
# example data
np.random.seed(0)
df = pd.DataFrame(np.random.randint(0,100, 30).reshape(-1, 6),
                  columns=list('ABEFGH'))
Const1, Const2 = 50, 50
# conditions
conds = [(df[['A','B']].mean(axis=1) <= Const1),
         (df[['A','B']].mean(axis=1) >  Const1) & (df['E']<=Const2),
         (df[['A','B']].mean(axis=1) >  Const1) & (df['E']> Const2)]
# choices that are series
choices = [df['F'], df['G'], df['H']]
#use np.select
df['D'] = np.select(condlist=conds, choicelist=choices)
print(df)
    A   B   E   F   G   H   D
0  44  47  64  67  67   9  67  # value from F
1  83  21  36  87  70  88  70  # value from G
2  88  12  58  65  39  87  65
3  46  88  81  37  25  77  77  # value from H
4  72   9  20  80  69  79  80
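
The conditions above can be shortened a little. This is a minimal sketch, not part of the original answer, and it relies on np.select using the first condition that is True for each row, so the later conditions do not need to repeat the check on the mean. The name mean_ab and the nested np.where cross-check are illustrative additions; the data and constants are the same as in the example above.

```python
import numpy as np
import pandas as pd

# Same example data as in the answer above
np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 100, 30).reshape(-1, 6),
                  columns=list('ABEFGH'))
Const1, Const2 = 50, 50

# Compute the row mean of A and B once instead of three times
# (mean_ab is a name introduced here for illustration)
mean_ab = df[['A', 'B']].mean(axis=1)

# np.select evaluates conditions in order and takes the first match,
# so rows reaching the 2nd and 3rd conditions already have mean_ab > Const1
conds = [mean_ab <= Const1, df['E'] <= Const2, df['E'] > Const2]
choices = [df['F'], df['G'], df['H']]
df['D'] = np.select(conds, choices)

# Cross-check with an equivalent nested np.where
d_where = np.where(mean_ab <= Const1, df['F'],
                   np.where(df['E'] <= Const2, df['G'], df['H']))

print(df['D'].tolist())          # [67, 70, 65, 77, 80]
print((d_where == df['D']).all())  # True
```

Rows that match none of the conditions (for example, if E contained NaN) receive the default value of np.select, which is 0 unless the default argument is set.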
