How to check whether a list element exists in a DataFrame



My DataFrame has a column containing lists of strings of varying length, like this:

names                                            venue

[Instagrammable, Restaurants, Vegan]                          14 Hills
[Date Night, Vibes, Drinks]                                   Upper 14
[Date Night, Drinks, After Work Drinks, Cocktail]             Hills
.                                                   .                  
.                                                   .
.

Now I want to check whether a given list is present in the DataFrame. How can I do that?

Example 1:
Input :
find_list=[Date Night, Vibes, Drinks]
venue = 'Upper 14'
Output:
Record is present in my dataframe
Example 2:
Input :
find_list=[Date Night, Drinks]
venue='Hills 123'
Output:
Record is not present in my dataframe

Example 3:

Input :
find_list=[   Date Night, Vibes, Drinks]
venue = 'Upper 14'
Output:
Record is not present in my dataframe

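You can use .apply() together with .any():

find_list = ["Date Night", "Vibes", "Drinks"]
# Compare each cell's list with find_list; any() is True if at least one row matches.
if df["names"].apply(lambda x: x == find_list).any():
    print("List is present in my dataframe")
else:
    print("List is not present in my dataframe")

Prints: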

List is present in my dataframe
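
For anyone who wants to run the snippet above, here is a minimal sketch that rebuilds the sample data. The rows are copied from the table in the question (only the three rows shown there), and the name found is just for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question's table (only the three rows shown).
df = pd.DataFrame(
    {
        "names": [
            ["Instagrammable", "Restaurants", "Vegan"],
            ["Date Night", "Vibes", "Drinks"],
            ["Date Night", "Drinks", "After Work Drinks", "Cocktail"],
        ],
        "venue": ["14 Hills", "Upper 14", "Hills"],
    }
)

find_list = ["Date Night", "Vibes", "Drinks"]

# Plain list equality: both the order and the length must match exactly.
found = df["names"].apply(lambda x: x == find_list).any()
print(bool(found))
```

Because this relies on ordinary list equality, the same items in a different order would not count as a match.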

Edit: to match the whole record (list and venue):

find_list = ["Date Night", "Vibes", "Drinks"]
venue = "Upper 14"
# axis=1 passes each row to the lambda, so both columns can be tested together.
if df.apply(
    lambda x: x["names"] == find_list and x["venue"] == venue, axis=1
).any():
    print("Record is present in my dataframe")
else:
    print("Record is not present in my dataframe")

Prints:

Record is present in my dataframe
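
If you also need the matching rows rather than just a yes/no answer, the same test can probably be written as a boolean mask. This is a sketch that assumes the data looks like the table in the question; mask and matches are illustrative names.

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame(
    {
        "names": [
            ["Instagrammable", "Restaurants", "Vegan"],
            ["Date Night", "Vibes", "Drinks"],
            ["Date Night", "Drinks", "After Work Drinks", "Cocktail"],
        ],
        "venue": ["14 Hills", "Upper 14", "Hills"],
    }
)

find_list = ["Date Night", "Vibes", "Drinks"]
venue = "Upper 14"

# The list comparison has to go through apply(); the venue test is vectorized.
mask = df["names"].apply(lambda x: x == find_list) & (df["venue"] == venue)

matches = df[mask]  # the rows where both conditions hold
print(len(matches))
```

Here mask.any() gives the same yes/no answer as the row-wise apply, and matches keeps the rows themselves.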

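Edit 2: to ignore leading and trailing whitespace in the list items:

find_list = ["      Date Night", "Vibes", "Drinks"]
venue = "Upper 14"
# zip() stops at the shorter list, so the lengths are compared first.
if df.apply(
    lambda x: len(x["names"]) == len(find_list)
    and all(a.strip() == b.strip() for a, b in zip(x["names"], find_list))
    and x["venue"] == venue,
    axis=1,
).any():
    print("Record is present in my dataframe")
else:
    print("Record is not present in my dataframe")

Prints: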

Record is present in my dataframe
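
One thing to watch with a zip-based comparison: zip() stops at the shorter input, so a list that is only a prefix of the stored one compares equal unless the lengths are checked as well. A small sketch of the difference, using a hypothetical helper named same_items:

```python
def same_items(a, b):
    """Hypothetical helper: compare two lists of strings, ignoring outer whitespace."""
    # zip() stops at the shorter list, so compare the lengths first.
    return len(a) == len(b) and all(x.strip() == y.strip() for x, y in zip(a, b))


row = ["Date Night", "Drinks", "After Work Drinks", "Cocktail"]
prefix = ["Date Night", "Drinks"]
padded = ["  Date Night", "Drinks ", "After Work Drinks", "Cocktail"]

# Without the length check, the two-item prefix looks like a match.
print(all(x.strip() == y.strip() for x, y in zip(row, prefix)))  # True
print(same_items(row, prefix))  # False
print(same_items(row, padded))  # True
```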

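Edit 3: to also collapse extra spaces between words:

import re

find_list = ["      Date     Night", "Vibes", "Drinks"]
venue = "Upper 14"
r = re.compile(r"\s{2,}")  # runs of two or more whitespace characters
# Pattern.sub takes the replacement first, then the string to process.
if df.apply(
    lambda x: len(x["names"]) == len(find_list)
    and all(
        r.sub(" ", a.strip()) == r.sub(" ", b.strip())
        for a, b in zip(x["names"], find_list)
    )
    and x["venue"] == venue,
    axis=1,
).any():
    print("Record is present in my dataframe")
else:
    print("Record is not present in my dataframe")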
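
The whitespace handling can be checked on its own before it goes into the DataFrame test. Below is a minimal sketch with a hypothetical helper named normalize. Two details matter: whitespace is matched with \s (a bare s would match the letter), and the compiled pattern's sub() takes the replacement first and the input string second.

```python
import re

# One or more whitespace characters (spaces, tabs, newlines).
_WS = re.compile(r"\s+")


def normalize(s):
    """Hypothetical helper: trim both ends and collapse inner whitespace to one space."""
    return _WS.sub(" ", s.strip())


print(normalize("      Date     Night"))  # Date Night
print(normalize("Vibes"))  # Vibes
```

Applying normalize to both sides before comparing makes "Date     Night" and "Date Night" count as the same item.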
