How to use endswith() or a regexp in a conditional subset of a pandas df column

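I want to use .endswith() or a regexp on the Sender name / Sender email columns when taking a conditional subset of my dataframe.

The DataFrame df has two columns, Sender email and Sender name, which I use to define a subsetting rule that selects all mail from a specific shop and from one specific email address of that shop: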

df = df[(df["Sender name"] == "Shop_name") & (df["Sender email"] == "reply@shop.com")]

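But then I found that there is also mail from buy@shop.com, noreply@shop.com, and so on. Is there a neat way to cover all of these mailboxes as *@shop.com in the second condition?
I tried endswith(), but could not figure out how to make it work on a Series object. I realized I could first build a list of all the addresses in the column and then check membership with pd.Series.isin, but maybe there is something more elegant?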

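Use Series.str.endswith, or Series.str.contains with the regex anchor $ to match the end of the string. In the regex version, also escape the . because . is a special regex character that matches any character: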

df1 = df[(df["Sender name"] == "Shop_name") & (df["Sender email"].str.endswith("@shop.com"))]

Or:

df1 = df[(df["Sender name"] == "Shop_name") & (df["Sender email"].str.contains(r"@shop\.com$"))]
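
To make the difference between the two variants concrete, here is a minimal sketch on made-up data. The address info@shopxcom is a hypothetical decoy added only to show what an unescaped dot lets through. It is not part of the original question.

```python
import pandas as pd

# Hypothetical sample data; the last address is a decoy that only an
# unescaped "." in the regex would match.
df = pd.DataFrame({
    "Sender email": ["ex@example.com", "reply@shop.com", "buy@shop.com", "info@shopxcom"],
    "Sender name": ["example", "Shop_name", "Shop_name", "Shop_name"],
})

is_shop = df["Sender name"] == "Shop_name"

# Literal suffix match, no regex involved.
by_suffix = df[is_shop & df["Sender email"].str.endswith("@shop.com")]

# Regex anchored at the end of the string; "\." matches a literal dot.
by_regex = df[is_shop & df["Sender email"].str.contains(r"@shop\.com$")]

# Unescaped dot: "." matches any character, so the decoy is selected too.
by_loose_regex = df[is_shop & df["Sender email"].str.contains("@shop.com$")]

print(by_suffix["Sender email"].tolist())
print(by_regex["Sender email"].tolist())
print(by_loose_regex["Sender email"].tolist())
```

For a fixed suffix like this one, str.endswith is probably the simpler choice, since it needs no escaping at all.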

Using `.query`

Since pandas >= 0.25.0, we can use .query with pandas methods (.eq and str.endswith) and use backticks (`) to refer to column names that contain spaces:

df.query('`Sender name`.eq("Shop_name") & `Sender email`.str.endswith("@shop.com")')

Output

       Sender email Sender name
2    reply@shop.com   Shop_name
3      buy@shop.com   Shop_name
4  noreply@shop.com   Shop_name
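
If the shop name and domain live in variables, the same query can reference them with @. The following is a sketch rather than the answer's own code: the variable names shop and domain are made up for illustration, and engine="python" is passed explicitly on the assumption that string accessor methods are not reliably handled by the numexpr engine in every setup.

```python
import pandas as pd

df = pd.DataFrame({
    "Sender email": ["ex@example.com", "ex2@example.com", "reply@shop.com",
                     "buy@shop.com", "noreply@shop.com"],
    "Sender name": ["example", "example", "Shop_name", "Shop_name", "Shop_name"],
})

# Hypothetical local variables, referenced inside the query string with "@".
shop = "Shop_name"
domain = "@shop.com"

# Backticks quote column names that contain spaces (pandas >= 0.25.0).
result = df.query(
    "`Sender name` == @shop and `Sender email`.str.endswith(@domain)",
    engine="python",
)
print(result)
```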

Example dataframe used:

# Example dataframe
import pandas as pd

df = pd.DataFrame({'Sender email': ['ex@example.com', 'ex2@example.com', "reply@shop.com", "buy@shop.com", "noreply@shop.com"],
                   'Sender name': ['example', 'example', 'Shop_name', 'Shop_name', 'Shop_name']})
       Sender email Sender name
0    ex@example.com     example
1   ex2@example.com     example
2    reply@shop.com   Shop_name
3      buy@shop.com   Shop_name
4  noreply@shop.com   Shop_name
