I have a DataFrame:
| from id | from group | to id | to group |
| 1 | A | 3 | B |
| 4 | B | 4 | X |
| 5 | F | 5 | J |
| 2 | B | 3 | A |
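Consider the 'from group' and 'to group' columns. I want to drop every row in which A and B, or F and J, appear together across the two columns, in either order.
Expected output: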
| from id | from group | to id | to group |
| 4 | B | 4 | X |
I'm looking for a flexible solution, meaning it should not have to change if a third condition is added.
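Assuming you are using pandas, try something like this, with one isin clause per pair, joined with |:
df.loc[~((df['from group'].isin(['A','B']) & df['to group'].isin(['A','B'])) | (df['from group'].isin(['F','J']) & df['to group'].isin(['F','J'])))]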
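The ~ before the first parenthesis negates the filter that follows, so the rows matching it are dropped rather than kept. Note that because both columns are tested with isin against the same list, a row such as A to A would be dropped as well.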
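To make the negated-mask idea concrete, here is a minimal sketch that rebuilds the sample frame from the question and accumulates one mask over a list of pairs. The names `pairs` and `drop` are illustrative assumptions, not part of the answer above, and this version compares each pair explicitly in both orders instead of using isin.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "from id": [1, 4, 5, 2],
    "from group": ["A", "B", "F", "B"],
    "to id": [3, 4, 5, 3],
    "to group": ["B", "X", "J", "A"],
})

# Pairs to exclude (hypothetical name); append more tuples for extra conditions
pairs = [("A", "B"), ("F", "J")]

# Start with "drop nothing", then OR in each pair in both directions
drop = pd.Series(False, index=df.index)
for a, b in pairs:
    forward = (df["from group"] == a) & (df["to group"] == b)
    backward = (df["from group"] == b) & (df["to group"] == a)
    drop |= forward | backward

# ~ inverts the mask, keeping only the rows that matched no pair
result = df.loc[~drop]
print(result)  # leaves only the row with from id 4
```

Adding a third condition then only means appending another tuple to `pairs`; the filtering code stays the same.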
It looks long, but it's simple:
df.query("not ((`from group` == 'A' and `to group` == 'B') or (`from group` == 'B' and `to group` == 'A') or (`from group` == 'F' and `to group` == 'J') or (`from group` == 'J' and `to group` == 'F'))")
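Since the question asks for something that survives a third condition, the same kind of query string could also be generated from a list of pairs instead of typed by hand. This is only a sketch: `pairs`, `clauses` and `expr` are assumed names, the sample frame is rebuilt from the question, and backtick-quoted column names require a pandas version that supports them in `query`.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "from id": [1, 4, 5, 2],
    "from group": ["A", "B", "F", "B"],
    "to id": [3, 4, 5, 3],
    "to group": ["B", "X", "J", "A"],
})

pairs = [("A", "B"), ("F", "J")]  # hypothetical name for the excluded pairs

# Build one clause per direction of each pair; !r adds the quotes around the value
clauses = []
for a, b in pairs:
    clauses.append(f"(`from group` == {a!r} and `to group` == {b!r})")
    clauses.append(f"(`from group` == {b!r} and `to group` == {a!r})")

# Join the clauses with "or" and negate the whole expression
expr = "not (" + " or ".join(clauses) + ")"
result = df.query(expr)
print(result)
```

The generated string has the same shape as the hand-written query, so it is best suited to simple values; group names that themselves contain quotes would need extra care.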
-=- EDIT -=-
Here is a more complex but more extensible solution. Just define l1 as the list of pairs to exclude:
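import itertools
l1 = [['A', 'B'], ['F', 'J']]
# Expand each pair into both orderings: ('A', 'B'), ('B', 'A'), ('F', 'J'), ('J', 'F')
l2 = list(itertools.chain.from_iterable(itertools.permutations(i, r=2) for i in l1))
# Keep only the rows whose (from group, to group) tuple is not in the excluded list
df[[j not in l2 for j in zip(df['from group'], df['to group'])]]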
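A variation on the same idea, offered as a sketch and not as the answer's own code: if each pair is treated as a frozenset, the comparison ignores order, so the permutations step is not needed. The names `excluded`, `keys` and `keep` are illustrative, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "from id": [1, 4, 5, 2],
    "from group": ["A", "B", "F", "B"],
    "to id": [3, 4, 5, 3],
    "to group": ["B", "X", "J", "A"],
})

# Each excluded pair becomes an unordered, hashable set (hypothetical names)
excluded = {frozenset(pair) for pair in [("A", "B"), ("F", "J")]}

# One unordered key per row, built from the two group columns
keys = [frozenset(t) for t in zip(df["from group"], df["to group"])]

# Boolean list: True for rows whose key is not an excluded pair
keep = [k not in excluded for k in keys]
result = df[keep]
print(result)  # leaves only the row with from id 4
```

A row whose two groups are identical, such as A to A, collapses to a one-element set and is therefore kept, which differs from the isin-based filter.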