Create a new pandas column from multiple columns in a df using different rules



I have the following df:

        SUITE      DET
0       HISTO  HisRep2  
1       HISTO  HisRep3 
2         NaN    store  
3         NaN   OnHold 
4        PMFA   Nemato  
5    bactmate   Nemato 
6      bacto1   abbac4  
7   abbadabba   BhyCPC 
8      HAEMFF   pmbac3
9      INCLIN    Trico 
10     BOCOSP    isol1 

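I want to create a third column. The third column will hold a mix of data from the first two columns, chosen using 3 criteria.

I have written code for each of the 3 criteria separately, but I want to apply them all at once. My 3 bits are: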

  1. Create a new column 'SUITEDET'; if column 'SUITE' contains NaN/is empty, fill the new column with the data from column 'DET'
df['SUITEDET'] = df[df['SUITE'].isnull()]['DET']

which gives the df:

        SUITE      DET SUITEDET
0       HISTO  HisRep2      NaN
1       HISTO  HisRep3      NaN
2         NaN    store    store
3         NaN   OnHold   OnHold
4        PMFA   Nemato      NaN
5    bactmate   Nemato      NaN
6      bacto1   abbac4      NaN
7   abbadabba   BhyCPC      NaN
8      HAEMFF   pmbac3      NaN
9      INCLIN    Trico      NaN
10     BOCOSP    isol1      NaN
  2. Create a new column 'SUITEDET'; if column 'DET' has an entry in the list 'usedet', fill the new column with the data from column 'SUITE'
usedet = ['pmbac1','pmbac2','pmbac3','pmbac4','pmbac5','pmbac6','bact1','bact2','bact3','bact4','bact5','bact6','bactAdd1','bactAdd2','bactAdd3','abbac1','abbac2','abbac3','abbac4','abbac5','abbac6','BrachSmear','Brachy','Brachy1','Brachy10','Brachy2','Brachy3','Brachy4','Brachy5','Brachy6','Brachy7','Brachy8','Brachy9','BhyC','isol','salm1','salm2','salm3','salm4','pmavb1','pmavb2','pmavb3','pmavb4','pmavb5','pmavb6','pmavsh','pmavsp','cult','zncult','pmfoBK','pmfood','Coccn','Coccid','CoccidGoat','Strngy','StoidE','Stoide','Nemato','NematE','TrichE','Tricd','Tricm','Trico','Trics','Tricu','Monspp','Fecn','BhyC21','BhyCID','BhyCPC','BhyCPCTrig','BhyClt','isol1','isol1C','isol1F','isol1M','isol1S','isol2','isol2C','isol2F','isol2M','isol2S','isol3','isol3C','isol3F','isol3M','isol3S','isolA','isolB','isolC']
df['SUITEDET'] = df[df['DET'].isin(usedet)]['SUITE']

which creates the df:

        SUITE      DET   SUITEDET
0       HISTO  HisRep2        NaN
1       HISTO  HisRep3        NaN
2         NaN    store        NaN
3         NaN   OnHold        NaN
4        PMFA   Nemato       PMFA
5    bactmate   Nemato   bactmate
6      bacto1   abbac4     bacto1
7   abbadabba   BhyCPC  abbadabba
8      HAEMFF   pmbac3     HAEMFF
9      INCLIN    Trico     INCLIN
10     BOCOSP    isol1     BOCOSP
  3. Create a new column 'SUITEDET'; if column 'SUITE' has an entry in the list 'usesuite', fill the new column with the data from column 'SUITE'
usesuite = ["HAEMFF","HAEM4F","INCLIN","BOCOSP","OVCOSP","WECOCF","WECOCA","WECOCP","ECOPCR","BLKWEC","BLKWF","BLKFLK","FLKIND","WECFLK","BHYCULT","BHY21","INCH","LEPTMS","GSH-PX","HISTO","CHLEIAS","CHLEIAG","BVD","BVDANT","BVDAB","CLA","IBRMS","SINFAB","JMILKI","JMILKB","BVDMS","BVDIND","BVDPCR","SBVIMB","SBVIMI","RUMENE","FATLIV","DOWNER","BMMETP","FAMIN","OMETAP","BOVPRO","OPRODP","COPBLK","COPRO","RESPCR"]
df['SUITEDET'] = df[df['SUITE'].isin(usesuite)]['SUITE']

which gives the df:

        SUITE      DET SUITEDET
0       HISTO  HisRep2    HISTO
1       HISTO  HisRep3    HISTO
2         NaN    store      NaN
3         NaN   OnHold      NaN
4        PMFA   Nemato      NaN
5    bactmate   Nemato      NaN
6      bacto1   abbac4      NaN
7   abbadabba   BhyCPC      NaN
8      HAEMFF   pmbac3   HAEMFF
9      INCLIN    Trico   INCLIN
10     BOCOSP    isol1   BOCOSP

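What I want to end up with is a df like this, where the third column is filled by using the three criteria to pick the right data from the first two columns: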

        SUITE      DET   SUITEDET
0       HISTO  HisRep2      HISTO
1       HISTO  HisRep3      HISTO
2         NaN    store      store
3         NaN   OnHold     OnHold
4        PMFA   Nemato       PMFA
5    bactmate   Nemato   bactmate
6      bacto1   abbac4     bacto1
7   abbadabba   BhyCPC  abbadabba
8      HAEMFF   pmbac3     HAEMFF
9      INCLIN    Trico     INCLIN
10     BOCOSP    isol1     BOCOSP

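Should I run the three criteria to create 3 dfs and then try to merge them, or is there a Pythonic way to run the 3 bits together? I don't even need a specific answer if that's a hassle, just an idea of what to look up. Thanks.

Try: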

df['SUITEDET'] = df[df['SUITE'].isnull()]['DET']
condition = df['DET'].isin(usedet) | df['SUITE'].isin(usesuite)
df.loc[condition, 'SUITEDET'] = df['SUITE']

Output:

        SUITE      DET   SUITEDET
0       HISTO  HisRep2      HISTO
1       HISTO  HisRep3      HISTO
2         NaN    store      store
3         NaN   OnHold     OnHold
4        PMFA   Nemato       PMFA
5    bactmate   Nemato   bactmate
6      bacto1   abbac4     bacto1
7   abbadabba   BhyCPC  abbadabba
8      HAEMFF   pmbac3     HAEMFF
9      INCLIN    Trico     INCLIN
10     BOCOSP    isol1     BOCOSP
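
For anyone who wants to reproduce this, here is a self-contained sketch of the same three lines. The `usedet` and `usesuite` lists are cut down to just the entries that occur in the sample rows (that shortening is my assumption for brevity; use the full lists from the question), and the first rule is written with `.loc` instead of chained indexing, which should behave the same here.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "SUITE": ["HISTO", "HISTO", np.nan, np.nan, "PMFA", "bactmate",
              "bacto1", "abbadabba", "HAEMFF", "INCLIN", "BOCOSP"],
    "DET": ["HisRep2", "HisRep3", "store", "OnHold", "Nemato", "Nemato",
            "abbac4", "BhyCPC", "pmbac3", "Trico", "isol1"],
})

# Shortened lists (assumption): only the values present in the sample rows
usedet = ["Nemato", "abbac4", "BhyCPC", "pmbac3", "Trico", "isol1"]
usesuite = ["HISTO", "HAEMFF", "INCLIN", "BOCOSP"]

# Rule 1: where SUITE is missing, take DET (all other rows start as NaN)
df["SUITEDET"] = df.loc[df["SUITE"].isnull(), "DET"]

# Rules 2 and 3: where DET is in usedet or SUITE is in usesuite, take SUITE
condition = df["DET"].isin(usedet) | df["SUITE"].isin(usesuite)
df.loc[condition, "SUITEDET"] = df["SUITE"]

print(df)
```

One thing to watch: the second assignment runs last, so a row with a missing 'SUITE' whose 'DET' is in `usedet` would end up with NaN again. The sample data has no such row, but if yours does, you may want to change the order or tighten the condition.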

Solution 2:

import numpy as np

condition = df['DET'].isin(usedet) | df['SUITE'].isin(usesuite)
df['SUITEDET'] = np.where(condition, df['SUITE'], np.where(df['SUITE'].isnull(), df['DET'], np.nan))
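
The same nested logic as the `np.where` version can probably also be written with pandas alone, using `Series.where`. This is only a sketch on a tiny made-up frame: the shortened lists and the last row ('OTHER' / 'misc') are hypothetical, added to show what happens when none of the rules match.

```python
import numpy as np
import pandas as pd

# Small illustrative frame; the last row is hypothetical and matches no rule
df = pd.DataFrame({
    "SUITE": ["HISTO", np.nan, "PMFA", "HAEMFF", "OTHER"],
    "DET": ["HisRep2", "store", "Nemato", "pmbac3", "misc"],
})
usedet = ["Nemato", "pmbac3"]      # shortened list (assumption)
usesuite = ["HISTO", "HAEMFF"]     # shortened list (assumption)

condition = df["DET"].isin(usedet) | df["SUITE"].isin(usesuite)

# Fallback: DET where SUITE is missing, NaN everywhere else
fallback = df["DET"].where(df["SUITE"].isnull())

# Keep SUITE where the condition holds, otherwise use the fallback
df["SUITEDET"] = df["SUITE"].where(condition, fallback)

print(df)
```

Rows that satisfy none of the three rules stay NaN, just as with the `np.nan` default in the `np.where` version.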
