Python pandas DataFrame: delete rows where the value in one column exists in another column



I have a pandas DataFrame (shown as an image in the original post) and would like to remove its duplicate rows.

For example, the same matchup appears twice when team1 and team2 are swapped.

What is the best way to do this?

Thanks!

Code that should work for you:

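import numpy as np

# Build an order-independent key: team_a is the smaller name, team_b the larger
df["team_a"] = np.minimum(df["team1"], df["team2"])
df["team_b"] = np.maximum(df["team1"], df["team2"])
# Keep the first row per (season, week, matchup), then drop the helper columns
df.drop_duplicates(["season", "week", "team_a", "team_b"], inplace=True)
df.drop(columns=["team_a", "team_b"], inplace=True)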

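Check your data before doing this: when team1 and team2 are reversed, the columns team1_score and team2_score are not reversed with them, so the scores may be confusing after one of the two rows is dropped.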
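One way to address this caveat is to swap the score columns together with the team columns before dropping duplicates. The following is a minimal sketch, assuming the score columns are named team1_score and team2_score as mentioned above. The sample data and the helper name normalize_matchups are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 2 are the same game with the teams reversed.
df = pd.DataFrame({
    "season": [2021, 2021, 2021],
    "week": [12, 12, 12],
    "team1": ["A", "B", "F"],
    "team2": ["F", "G", "A"],
    "team1_score": [3, 1, 2],
    "team2_score": [2, 0, 3],
})


def normalize_matchups(frame):
    """Return a copy where team1 <= team2 alphabetically, with scores swapped to match."""
    out = frame.copy()
    swap = out["team1"] > out["team2"]  # rows whose teams are in reverse order
    # Assign from .to_numpy() so values are swapped by position, not re-aligned by column label.
    out.loc[swap, ["team1", "team2"]] = out.loc[swap, ["team2", "team1"]].to_numpy()
    out.loc[swap, ["team1_score", "team2_score"]] = (
        out.loc[swap, ["team2_score", "team1_score"]].to_numpy()
    )
    return out


# After normalization the reversed row is an exact duplicate of its counterpart.
deduped = normalize_matchups(df).drop_duplicates(["season", "week", "team1", "team2"])
print(deduped)
```

Because every row is put into the same team order first, the scores of the surviving row always belong to the teams shown in team1 and team2.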

Since the OP did not provide a reproducible dataset:

import pandas as pd
# dataset where the 1st and 5th observations are team A vs team F:
df = pd.DataFrame({
    "season": [2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021],
    "week": [12, 12, 12, 12, 12, 13, 13, 13, 13, 13],
    "team1": ["A", "B", "C", "D", "F", "A", "B", "C", "D", "F"],
    "team2": ["F", "G", "H", "I", "A", "F", "G", "H", "I", "A"]
})
df
   season  week team1 team2
0    2021    12     A     F
1    2021    12     B     G
2    2021    12     C     H
3    2021    12     D     I
4    2021    12     F     A
5    2021    13     A     F
6    2021    13     B     G
7    2021    13     C     H
8    2021    13     D     I
9    2021    13     F     A
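# solution: treat each (team1, team2) pair as unordered and keep the first row per season and week
pair = df[["team1", "team2"]].apply(frozenset, axis=1)
df[~df.assign(pair=pair).duplicated(["season", "week", "pair"])]
   season  week team1 team2
0    2021    12     A     F
1    2021    12     B     G
2    2021    12     C     H
3    2021    12     D     I
5    2021    13     A     F
6    2021    13     B     G
7    2021    13     C     H
8    2021    13     D     I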
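For comparison, here is a sketch of why a literal "value exists in the other column" filter is probably not what is wanted on this data. It contrasts an isin-based filter with a per-row unordered key. The variable names (by_membership, first, second, key, by_pair) are illustrative, not part of the original answers.

```python
import pandas as pd

# Same sample data as above: every A-vs-F game is listed twice per week.
df = pd.DataFrame({
    "season": [2021] * 10,
    "week": [12] * 5 + [13] * 5,
    "team1": ["A", "B", "C", "D", "F"] * 2,
    "team2": ["F", "G", "H", "I", "A"] * 2,
})

# Literal reading of the title: drop rows whose team1 appears anywhere in team2.
# This removes BOTH copies of each A-vs-F game, not just the reversed one.
by_membership = df[~df["team1"].isin(df["team2"])]

# Per-row unordered key: smaller name first, larger name second.
in_order = df["team1"] <= df["team2"]
first = df["team1"].where(in_order, df["team2"])
second = df["team2"].where(in_order, df["team1"])
key = pd.DataFrame({"season": df["season"], "week": df["week"], "a": first, "b": second})

# Keep the first occurrence of each (season, week, matchup).
by_pair = df[~key.duplicated()]

print(len(by_membership), len(by_pair))  # prints: 6 8
```

The membership filter loses the A-vs-F games entirely, while the per-row key keeps exactly one row per game.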
