I have a pandas DataFrame in which one column holds a list of dictionaries. See below.
Example from the first row:
df.iloc[0].colexample
[{'status': 'married',
'date': datetime.datetime(2022, 10, 1, 6, 27, 31, 118000)},
{'status': 'divorced',
'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)},
{'status': 'sent',
'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)},
{'status': 'other',
'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)}]
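I want to filter the DataFrame so that it keeps only the rows where colexample contains a dictionary with status == 'other' or status == 'sent'.
In this example I would keep the row, but other rows have statuses with different values.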
You can use:
df['check'] = df['colexample'].apply(lambda x: any(v in ['other', 'sent'] for d in x for v in d.values()))
df=df[df['check']==True]
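A possibly more direct variant looks only at the 'status' key of each dictionary, so a value stored under some other key can never produce a false match. This is a minimal sketch rather than a tested drop-in: the sample frame, the `id` column, and the helper name `has_status` are made up for illustration. Only `colexample` and the two status values come from the question.

```python
import datetime
import pandas as pd

# Hypothetical sample data shaped like the column in the question.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'colexample': [
        [{'status': 'married', 'date': datetime.datetime(2022, 10, 1, 6, 27, 31)},
         {'status': 'sent', 'date': datetime.datetime(2022, 10, 1, 6, 27, 52)}],
        [{'status': 'married', 'date': datetime.datetime(2022, 10, 1, 6, 27, 31)}],
        [{'status': 'other', 'date': datetime.datetime(2022, 10, 1, 6, 27, 52)}],
    ],
})

wanted = {'other', 'sent'}

def has_status(entries, wanted=wanted):
    """Return True if any dict in entries has a 'status' value in wanted."""
    return any(d.get('status') in wanted for d in entries)

# Build a boolean mask and index with it; no helper column is needed.
mask = df['colexample'].apply(has_status)
filtered = df[mask]
print(filtered['id'].tolist())  # [1, 3]
```

Using `d.get('status')` instead of `d['status']` means a dictionary without that key is simply treated as a non-match, which may or may not be what you want for your data.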
Details:
# For each row, loop through the list, take every value from each dictionary, and collect them in one flat list.
values = [v for d in a for v in d.values()]
# Here a is the value stored in the row.
print(values)
'''
['married', datetime.datetime(2022, 10, 1, 6, 27, 31, 118000), 'divorced', datetime.datetime(2022, 10, 1, 6, 27, 52, 47000), 'sent', datetime.datetime(2022, 10, 1, 6, 27, 52, 47000), 'other', datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)]
'''
#then define filter list:
filter_list=['other','sent']
# Compare the two lists; any() returns True as soon as a match is found:
any(v in filter_list for v in values)
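The steps above can be run end to end on a single row without pandas. The sketch below assumes a shortened, hypothetical row value `a` with two entries instead of four. The name `found` is also made up for illustration.

```python
import datetime

# Hypothetical row value, shortened from the example in the question.
a = [
    {'status': 'married', 'date': datetime.datetime(2022, 10, 1, 6, 27, 31, 118000)},
    {'status': 'sent', 'date': datetime.datetime(2022, 10, 1, 6, 27, 52, 47000)},
]

# Step 1: flatten the values of every dictionary into one list.
values = [v for d in a for v in d.values()]

# Step 2: define the statuses to look for.
filter_list = ['other', 'sent']

# Step 3: check whether at least one flattened value is in filter_list.
found = any(v in filter_list for v in values)
print(found)  # True, because 'sent' is present
```

The datetime values are compared against the strings too, but they never equal them, so they do not affect the result.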