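I want to create a new column (b) whose label is based on the value in another column (a). The entries in column a are all unique (they are file names), so I need to use part of the file name to define the label. I tried a def/if function, but I don't know how to write "if the file name contains ...xxx... then label it xxx" rather than if filename==xxx.
Here is an example of what I tried: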
df
FileName
C3_828_blahblahblah1.cvs
C3_828_blahblahblah3.cvs
C3_831_blahblahblah2.cvs
C3_c3_blahblahblah1.cvs
C3_c4_blahblahblah2.cvs
C3_c4_blahblahblah3.cvs
C3_831_blahblahblah.cvs
def CellType(c):
    if c['FileName'] == 828:
        return 'mutant1'
    elif c['FileName'] == 831:
        return 'Mutant2'
    else:
        return 'Control'
df['CellType'] = df.apply(CellType, axis=1)
You can assign a different value to the new column for each search string:
# First, create the case not related to a specific condition with the value 'control'
df['CellType'] = 'control'
# Filter the dataframe by your conditions and add the appropriate values
df.loc[df['FileName'].str.contains("828"), 'CellType'] = 'mutant1'
df.loc[df['FileName'].str.contains("831"), 'CellType'] = 'mutant2'
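For anyone who wants to try this end to end, here is a minimal self-contained sketch built on the sample file names from the question. It swaps the repeated `.loc` assignments for `np.select`, which is an alternative and not the method shown above. Passing `regex=False` is my assumption that the search strings are meant as literal text.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"FileName": [
    "C3_828_blahblahblah1.cvs",
    "C3_828_blahblahblah3.cvs",
    "C3_831_blahblahblah2.cvs",
    "C3_c3_blahblahblah1.cvs",
    "C3_c4_blahblahblah2.cvs",
    "C3_c4_blahblahblah3.cvs",
    "C3_831_blahblahblah.cvs",
]})

# One boolean mask per search string; regex=False treats the
# pattern as plain text (an assumption, not from the original answer)
conditions = [
    df["FileName"].str.contains("828", regex=False),
    df["FileName"].str.contains("831", regex=False),
]
labels = ["mutant1", "mutant2"]

# Rows matching no condition fall back to the default label
df["CellType"] = np.select(conditions, labels, default="control")
print(df)
```

If a file name could match more than one search string, `np.select` takes the first matching condition, whereas with sequential `.loc` assignments the last one wins.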