Pandas DataFrame with complex data



I have a DataFrame with data like the following:

cert                              meta
{"alternate_names": [                  {"asset_name": "",
"audience": "External",
"asset_name": "",              "automation_utility": "",
"audience": "External",             "delegate_owner": "",
"automation_utility": "",               "environment": dev
"delegate_owner": "",               "l2_group_email": null,
"environment": dev              "l3_group_email": null,
"l2_group_email": null,             "requestor_email": "",
"l3_group_email": null,             "support_email": "",
"requestor_email": "",              "tech_delegate_email": null,
"support_email": "",                "tech_owner_email": null
"tech_delegate_email": null,            }
"tech_owner_email": null    
}   
cert does not exists                 cert does not exists
cert does not exists                 cert does not exists

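I checked the dtype of the columns and it shows object. I need to create a DataFrame holding the support_email value for each row, but not every row contains that key.

If the value does not exist, null should be shown instead.

What I have tried: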

df = pd.DataFrame(data)
df["cert"] = df["cert"].apply(lambda x : dict(eval(x)) )
df2 = df["cert"].apply(pd.Series )
print(df) 

Can someone help me with this?

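It looks like the DataFrame contains (broken?) JSON content. You can probably parse it with Python's json library and turn each cell into a dictionary. You can then look up support_email in each dictionary and build a DataFrame from the results.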

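See the example below, where I took your sample data, corrected the JSON errors (the unquoted dev value), and then ran it through the JSON loader.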

import json
s = '''
{"asset_name": "",
"audience": "External",
"automation_utility": "",
"delegate_owner": "",
"environment": "dev",
"l2_group_email": null,
"l3_group_email": null,
"requestor_email": "",
"support_email": "",
"tech_delegate_email": null,
"tech_owner_email": null
}
'''
d1 = json.loads(s)
print(d1['environment'])
# dev
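
To apply the same idea to a whole column, one option is a small helper that returns None whenever a cell is not valid JSON or has no support_email key. The following is only a sketch: the helper name extract_support_email, the sample rows, and the example address are made up for illustration, and it assumes the JSON strings have already been corrected as above (a cell still containing the unquoted dev value would fail to parse and come back as null).

```python
import json

import pandas as pd


def extract_support_email(cell):
    """Return the support_email value from a JSON string, or None if unavailable."""
    try:
        parsed = json.loads(cell)
    except (TypeError, ValueError):
        # TypeError: the cell is not a string (e.g. NaN).
        # ValueError: the cell is not valid JSON (e.g. "cert does not exists").
        return None
    if not isinstance(parsed, dict):
        return None
    # dict.get returns None when the key is missing.
    return parsed.get("support_email")


# Hypothetical sample rows standing in for the real "meta" column.
df = pd.DataFrame({
    "meta": [
        '{"environment": "dev", "support_email": "team@example.com"}',
        '{"environment": "dev", "support_email": null}',
        "cert does not exists",
    ]
})

# One row per input row; missing or unparseable values become null.
result = pd.DataFrame({"support_email": df["meta"].apply(extract_support_email)})
print(result)
```

Using json.loads instead of eval also avoids executing arbitrary text from the column, and it understands null directly, which eval does not.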
