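I have a DataFrame loaded from a CSV with 5 columns. I want to create a new column based on conditions evaluated across each row. For example, my df =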
col1 col2 col3 col4
1 1 1 1
0 0 1 1
1 1 1 1
nan nan nan nan
Here is a sample of my code:
m1 = df[['col1','col2','col3','col4']].all(axis=1)
m2 = df[['col1','col2','col3','col4']].isna().any(axis=1)
df['STATUS AUTO'] = np.select([m2, m1], ['ZD', 'FIC'],'PARTIALLY IMMUNIZED')
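It never gives me "PARTIALLY IMMUNIZED", even though many rows should get it. In the example above, row 1 should be "FIC", rows 2 and 3 "PARTIALLY IMMUNIZED", and row 4 "ZD". Instead I get "ZD" where "PARTIALLY IMMUNIZED" is expected. Please help. PS: the same code worked on another DF a few months ago, but it does not work on this one.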
It looks like the columns hold strings rather than numbers. Convert them to float and compare with 1 explicitly:
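import numpy as np

cols = ['col1','col2','col3','col4']
# convert the columns to float so the comparisons are numeric
df[cols] = df[cols].astype(float)
# True only where every column equals 1
m1 = df[cols].eq(1).all(axis=1)
# True where at least one column is missing
m2 = df[cols].isna().any(axis=1)
df['STATUS AUTO'] = np.select([m2, m1], ['ZD', 'FIC'], 'PARTIALLY IMMUNIZED')
print(df)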
   col1  col2  col3  col4          STATUS AUTO
0   1.0   1.0   1.0   1.0                  FIC
1   0.0   0.0   1.0   1.0  PARTIALLY IMMUNIZED
2   1.0   1.0   1.0   1.0                  FIC
3   NaN   NaN   NaN   NaN                   ZD
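For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample frame is an assumption: it imitates a CSV whose values were read as strings, which may or may not be what happened in the real file. It also swaps `astype(float)` for `pd.to_numeric(..., errors='coerce')`, a variation that turns values it cannot parse into NaN instead of raising.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the numbers arrive as strings, as they might
# after reading a CSV with mixed or quoted values (an assumption).
df = pd.DataFrame({
    'col1': ['1', '0', '1', np.nan],
    'col2': ['1', '0', '1', np.nan],
    'col3': ['1', '1', '1', np.nan],
    'col4': ['1', '1', '1', np.nan],
})
cols = ['col1', 'col2', 'col3', 'col4']

# A non-empty string such as '0' is truthy in Python, which is one
# plausible reason a truthiness-based check misbehaves on string data.
print(bool('0'))

# Convert to numbers; anything that cannot be parsed becomes NaN.
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

# Compare with 1 explicitly instead of relying on truthiness.
m1 = df[cols].eq(1).all(axis=1)    # every column equals 1
m2 = df[cols].isna().any(axis=1)   # at least one column is missing

# Conditions are checked in order, so missing data wins over "all ones".
df['STATUS AUTO'] = np.select([m2, m1], ['ZD', 'FIC'], 'PARTIALLY IMMUNIZED')
print(df)
```

Checking `df.dtypes` before the conversion is a quick way to confirm whether the columns really are `object` (strings) in the frame that misbehaves.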