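I have a pandas DataFrame, and I want to create a new column whose value is "in list" or "not in list" depending on whether the entry in the first column appears in a given list. To illustrate, I've put together a toy example below. I have a working solution, but it seems cumbersome and not very Pythonic. I also get a SettingWithCopyWarning. Is there a better or more recommended way to do this in Python?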
import pandas as pd

# create a toy DataFrame with one column
df = pd.DataFrame({'col_1': [1, 2, 3, 4, 6]})
# the list to check the values of col_1 against
list_ = [2, 3, 3, 3]
# create a new, empty column
df['col_2'] = None
   col_1 col_2
0      1  None
1      2  None
2      3  None
3      4  None
4      6  None
My solution is to loop through the first column and fill in the second:
for index, i in enumerate(df['col_1']):
    if i in list_:
        df['col_2'].iloc[index] = 'in list'
    else:
        df['col_2'].iloc[index] = 'not in list'
   col_1        col_2
0      1  not in list
1      2      in list
2      3      in list
3      4  not in list
4      6  not in list
This produces the correct result, but I'd like to learn a more Pythonic way to do it.
Use Series.isin together with Series.map:
In [1197]: df['col_2'] = df.col_1.isin(list_).map({False: 'not in list', True: 'in list'})
In [1198]: df
Out[1198]:
   col_1        col_2
0      1  not in list
1      2      in list
2      3      in list
3      4  not in list
4      6  not in list
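The warning in the question most likely comes from the chained indexing in the loop (df['col_2'].iloc[index] = ...), where pandas cannot tell whether the assignment reaches the original DataFrame. Building a boolean mask with Series.isin and assigning in a single step avoids that pattern. As a sketch of two other ways to use the same mask, here are variants with numpy.where and with a single .loc assignment; the column name col_3 is introduced only for illustration and is not part of the question:

```python
import numpy as np
import pandas as pd

# Same toy data as in the question
df = pd.DataFrame({'col_1': [1, 2, 3, 4, 6]})
list_ = [2, 3, 3, 3]

# Boolean mask: True where the value in col_1 appears in list_
mask = df['col_1'].isin(list_)

# Variant 1: numpy.where picks one of the two labels per row from the mask
df['col_2'] = np.where(mask, 'in list', 'not in list')

# Variant 2 (col_3 is a hypothetical column for comparison):
# fill in a default label, then overwrite the matching rows with one .loc call
df['col_3'] = 'not in list'
df.loc[mask, 'col_3'] = 'in list'
```

Both variants should give the same labels as the isin/map version above. Which one reads best is largely a matter of taste; numpy.where is convenient when there are exactly two outcomes, while .loc with a mask extends more naturally to updating only some rows.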