pandas: filter a df based on a condition

Suppose I have a dataframe like this:

A B 
11 2             # PASS 
22 4             # FAIL
33 5             # FAIL
44 4             # PASS

And two dicts:

B_column_dct = {2: [2,3,5], 4: [33,22,121], 5: [1,2,3]}    # the dict key will have multiple values in a list
A_column_dct = {11: [3], 22: [4], 33: [5], 44: [22]}  # the dict key will always have a single value in a list

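Now I want to filter the dataframe above so that a row stays in the df only if the value stored in A_column_dct for its A value is present in the list stored in B_column_dct for its B value.

Final result df: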

A B 
11 2            
44 4
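The PASS/FAIL annotations in the table can be checked row by row in plain Python before involving pandas. This is only a minimal sketch: the `rows` list is typed in by hand from the table, and the helper name `row_passes` is made up for illustration.

```python
B_column_dct = {2: [2, 3, 5], 4: [33, 22, 121], 5: [1, 2, 3]}
A_column_dct = {11: [3], 22: [4], 33: [5], 44: [22]}

# (A, B) pairs copied by hand from the table in the question
rows = [(11, 2), (22, 4), (33, 5), (44, 4)]


def row_passes(a, b):
    """Hypothetical helper: is the single value stored for `a` in the list stored for `b`?"""
    (a_value,) = A_column_dct[a]  # relies on every A key mapping to a one-element list
    return a_value in B_column_dct[b]


results = [(a, b, row_passes(a, b)) for a, b in rows]
for a, b, ok in results:
    print(a, b, "PASS" if ok else "FAIL")
```

Note that the key 4 is looked up twice (for rows 22 and 44) even though it appears only once in B_column_dct; a repeated value in a column does not require a repeated key in the dict.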

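Sorry, I could not fully follow your values or the filtered df you are trying to create, mainly because a dict cannot contain duplicate keys (i.e. the value 4, which appears twice in column B of org_df, would not work properly as a key). I tried to make it work anyway, assuming that the key 4 in B_column_dct stands for both occurrences of 4 in column B, but I did not end up with the same filtered df as you. In any case, here is the code I used (probably the longest line I have written so far; for readability, I suggest rewriting it):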

flat_a = list(set().union(*A_column_dct.values()))
flat_b = list(set().union(*B_column_dct.values()))

filtering = [
    any(elem_a in flat_b for elem_a in A_column_dct[i])
    and any(elem_b in flat_a for elem_b in B_column_dct[j])
    for i, j in zip(org_df["A"], org_df["B"])
]
filtered_df = org_df[filtering]
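For comparison, here is a minimal sketch of a stricter per-row check. It assumes the intended rule is that the single value in A_column_dct for a row's A must appear in the B_column_dct list for that same row's B, rather than anywhere in the flattened values. The org_df construction is an assumption rebuilt from the table in the question, and the name `mask` is only illustrative.

```python
import pandas as pd

# Assumed reconstruction of the dataframe shown in the question
org_df = pd.DataFrame({"A": [11, 22, 33, 44], "B": [2, 4, 5, 4]})

B_column_dct = {2: [2, 3, 5], 4: [33, 22, 121], 5: [1, 2, 3]}
A_column_dct = {11: [3], 22: [4], 33: [5], 44: [22]}

# One boolean per row: is the single A value inside the list for this row's B?
# Raises KeyError if a value in column A or B has no entry in its dict.
mask = [
    A_column_dct[a][0] in B_column_dct[b]
    for a, b in zip(org_df["A"], org_df["B"])
]

filtered_df = org_df[mask]
print(filtered_df)
```

With the data above this keeps the rows (11, 2) and (44, 4). The difference from the flattened version is that each row is compared only against its own B list, so a match in some other row's list does not count.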
