Filter a pandas DataFrame by values in a list of dictionaries stored in a column



I have a pandas DataFrame in which one column holds a list of dictionaries. See below.

Example from the first row:

df.iloc[0].colexample 
[{'status': 'married',
'date': datetime.datetime(2022, 10, 1, 6, 27, 31, 118000)},
{'status': 'divorced',
'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)},
{'status': 'sent',
'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)},
{'status': 'other',
'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)}]

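I want to filter the DataFrame so that it keeps only the rows where colexample contains a dictionary with status == other or status == sent.

In this example the row would be kept, but other rows have statuses with different values.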

You can use:

df['check'] = df['colexample'].apply(lambda x: any(i in ['other', 'sent'] for i in [item for sublist in [[list(i.values()) for i in x]][0] for item in sublist]))
df = df[df['check']]

Details:

# For each row, loop over the list, take every value from each dictionary, and collect them in one flat list.
# Here `a` is the value of the column in a single row, e.g. a = df.iloc[0].colexample
values = [item for sublist in [[list(i.values()) for i in a]][0] for item in sublist]
print(values)
'''
['married', datetime.datetime(2022, 10, 1, 6, 27, 31, 118000), 'divorced', datetime.datetime(2022, 10, 1, 6, 27, 52, 47000), 'sent', datetime.datetime(2022, 10, 1, 6, 27, 52, 47000), 'other', datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)]
'''
# Then define the filter list:
filter_list = ['other', 'sent']
# Compare the two lists; any() returns True as soon as one match is found:
match_found = any(i in filter_list for i in values)
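
The approach above compares every value in each dictionary, dates included, against the filter list. If only the status key should count, a narrower check may be easier to read. The following is a minimal sketch, not the answer's original method: the sample rows, the `wanted` set, and the helper name `has_wanted_status` are all assumptions made up for illustration.

```python
import datetime

import pandas as pd

# Hypothetical sample data shaped like the question's `colexample` column.
df = pd.DataFrame({
    "colexample": [
        [
            {"status": "married", "date": datetime.datetime(2022, 10, 1, 6, 27, 31, 118000)},
            {"status": "sent", "date": datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)},
        ],
        [
            {"status": "married", "date": datetime.datetime(2022, 10, 2, 8, 0, 0)},
        ],
        [
            {"status": "other", "date": datetime.datetime(2022, 10, 3, 9, 30, 0)},
        ],
    ]
})

# Statuses that qualify a row for being kept (assumed from the question).
wanted = {"other", "sent"}


def has_wanted_status(entries, wanted=wanted):
    """Return True if any dict in `entries` has a 'status' value in `wanted`."""
    # .get() returns None for dicts without a 'status' key instead of raising.
    return any(entry.get("status") in wanted for entry in entries)


# Build a boolean mask and index with it directly; no helper column is needed.
mask = df["colexample"].apply(has_wanted_status)
filtered = df[mask]
print(filtered.index.tolist())  # [0, 2]
```

Looking only at the status key avoids accidental matches when some other field happens to contain 'sent' or 'other', and indexing with the mask leaves the original DataFrame without an extra check column.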
