New column based on a rule from another column (value contains a number)



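I want to create a new column (b) whose labels are based on the values in another column (a). The entries in column a are all unique (they are file names), so I need to use part of the file name to define the label. I tried using a def/if function, but I don't know how to write "if the file name contains ...xxx..., then label it xxx" instead of "if filename == xxx".

Here is an example of what I tried: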

df

FileName
C3_828_blahblahblah1.cvs
C3_828_blahblahblah3.cvs
C3_831_blahblahblah2.cvs
C3_c3_blahblahblah1.cvs
C3_c4_blahblahblah2.cvs
C3_c4_blahblahblah3.cvs
C3_831_blahblahblah.cvs

def CellType(c):
    if c['FileName'] == 828:
        return 'mutant1'
    elif c['FileName'] == 831:
        return 'Mutant2'
    else:
        return 'Control'

df['CellType'] = df.apply(CellType, axis=1)

You can assign a different value to the new column for each search string:

# First, fill the new column with the default label for rows that match no condition
df['CellType'] = 'control'
# Then select the rows whose file name contains each search string and overwrite the label
# (the column is named 'FileName' in the question; the lookup is case-sensitive)
df.loc[df['FileName'].str.contains("828"), 'CellType'] = 'mutant1'
df.loc[df['FileName'].str.contains("831"), 'CellType'] = 'mutant2'
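
For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the sample data from the question and shows two equivalent routes: the vectorized `str.contains` assignment, and a row-wise function that uses Python's `in` operator, which is probably the "contains" test the question was looking for. The `LABELS` mapping, the `cell_type` helper, and the `CellTypeApply` column are illustrative names that do not appear in the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "FileName": [
        "C3_828_blahblahblah1.cvs",
        "C3_828_blahblahblah3.cvs",
        "C3_831_blahblahblah2.cvs",
        "C3_c3_blahblahblah1.cvs",
        "C3_c4_blahblahblah2.cvs",
        "C3_c4_blahblahblah3.cvs",
        "C3_831_blahblahblah.cvs",
    ]
})

# Hypothetical mapping from a substring of the file name to a label
LABELS = {"828": "mutant1", "831": "mutant2"}

# Route 1: vectorized. Start with the default, then overwrite matching rows.
df["CellType"] = "control"
for key, label in LABELS.items():
    # regex=False treats the key as a plain substring, not a pattern
    mask = df["FileName"].str.contains(key, regex=False)
    df.loc[mask, "CellType"] = label


# Route 2: row-wise function using the `in` operator on the string
def cell_type(name: str) -> str:
    for key, label in LABELS.items():
        if key in name:
            return label
    return "control"


df["CellTypeApply"] = df["FileName"].apply(cell_type)

print(df)
```

Both routes give the same labels on this data. The vectorized form is usually the better choice for large frames, while the function form is easier to extend if the rules become more involved. Note that a plain substring test would also match "828" anywhere else in the file name, so a stricter pattern may be needed if that can happen in the real data.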
