Creating a combined pandas column for mixed value types



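I have a pandas DataFrame in which I want every value that starts with ID to be reduced to a 'user/ID' prefix followed by the number, with any leading zeros removed. I also want to create a third column that takes the values from the same row (the ID value without the user prefix, without leading zeros, and without IDm/IDs, just ID, plus the E value), joins them with an underscore, and then adds a "user/" prefix. Here is an example for reference. Original: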

      item_id_a     item_id_b
0  E00000170630   IDm00010461
1   IDm00010461  E00000170630
2  E00000353915  IDs236274573
3   IDs23627457  E00000353915

Desired:

         item_id_a         item_id_b                       combined
0     E00000170630      user/ID10461      user/E00000170630_ID10461
1     user/ID10461      E00000170630      user/ID10461_E00000170630
2     E00000353915  user/ID236274573  user/E00000353915_ID236274573
3  user/ID23627457      E00000353915   user/ID23627457_E00000353915

This should work:

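(df.replace(r'ID[a-z]?0*', 'ID', regex=True)   # strip the optional letter and leading zeros after "ID"
   .assign(combined=lambda x: 'user/' + x['item_id_a'] + '_' + x['item_id_b'])   # build the combined column
   .replace(r'^ID', 'user/ID', regex=True))   # prefix the bare IDs; "combined" already starts with "user/"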

Output:

item_id_a         item_id_b                       combined
0     E00000170630      user/ID10461      user/E00000170630_ID10461
1     user/ID10461      E00000170630      user/ID10461_E00000170630
2     E00000353915  user/ID236274573  user/E00000353915_ID236274573
3  user/ID23627457      E00000353915   user/ID23627457_E00000353915
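
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question, and the name `result` is only an illustrative choice, since the chain returns a new DataFrame rather than modifying `df` in place.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "item_id_a": ["E00000170630", "IDm00010461", "E00000353915", "IDs23627457"],
    "item_id_b": ["IDm00010461", "E00000170630", "IDs236274573", "E00000353915"],
})

result = (
    # "ID", an optional lowercase letter, then any zeros -> plain "ID"
    df.replace(r"ID[a-z]?0*", "ID", regex=True)
      # join both cleaned values with "_" and add the "user/" prefix
      .assign(combined=lambda x: "user/" + x["item_id_a"] + "_" + x["item_id_b"])
      # only cells that still start with "ID" get the prefix
      .replace(r"^ID", "user/ID", regex=True)
)

print(result)
```

The order of the steps matters here: `combined` is built before the final `replace`, so it is assembled from the bare `ID...` values, and the `^ID` anchor leaves it alone afterwards because it already starts with `user/`.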
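
Another option, assuming the ID values have already been cleaned (letter and leading zeros removed) and that they alternate between the two columns exactly as in the sample, is to add the prefix by row position:

# build the combined column from the cleaned values
df["combined"] = "user/" + df.item_id_a + "_" + df.item_id_b
# odd rows hold the ID in item_id_a, even rows hold it in item_id_b
df.loc[1::2, "item_id_a"] = "user/" + df.loc[1::2, "item_id_a"]
df.loc[0::2, "item_id_b"] = "user/" + df.loc[0::2, "item_id_b"]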
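
If the rows do not strictly alternate, a boolean mask may be more robust than slicing by position. The following is only a sketch under that assumption: `cleaned`, `cols` and `is_id` are illustrative names, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "item_id_a": ["E00000170630", "IDm00010461", "E00000353915", "IDs23627457"],
    "item_id_b": ["IDm00010461", "E00000170630", "IDs236274573", "E00000353915"],
})

cols = ["item_id_a", "item_id_b"]

# Strip the optional lowercase letter and the leading zeros after "ID"
cleaned = df[cols].apply(
    lambda s: s.str.replace(r"^ID[a-z]?0*", "ID", regex=True)
)

# Build the combined column while the IDs are still unprefixed
cleaned["combined"] = "user/" + cleaned["item_id_a"] + "_" + cleaned["item_id_b"]

# Prefix whichever cells hold an ID, regardless of the row they are in
for col in cols:
    is_id = cleaned[col].str.startswith("ID")
    cleaned.loc[is_id, col] = "user/" + cleaned.loc[is_id, col]

print(cleaned)
```

This gives the same result on the sample data, but it decides per cell whether a value is an ID instead of relying on even and odd row numbers.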
