Converting a list of dictionaries to a DataFrame in the same way



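The Zendesk API returns fields as a list of dictionaries, but each list is a single record. I am wondering whether there is a better way to convert it all into a DataFrame. If it were a dictionary of dictionaries, json_normalize would handle it without any problem.

Note: not all records have the same field ids.

Sample data: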

data = [{
    "ticket_id": 4,
    "customer_id": 8,
    "created_at": "2022-05-01",
    "custom_fields": [
        {
            "id": 15,
            "value": "website"
        },
        {
            "id": 16,
            "value": "broken"
        },
        {
            "id": 23,
            "value": None
        },
    ],
    "group_id": 42
}]

Running any form of DataFrame, from_records, from_json, or json_normalize gives most of what I want, but the list ends up in a single column:

t_df = pd.json_normalize(data)
t_df

Output:

... group_id  [{"id": 15, "value": "website"}, {"id": 16, ...

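You should process the data (the dictionaries) first and then construct a DataFrame from it. This will make your life easier, and it is faster than trying to manipulate the data with pandas (i.e., after the DataFrame has been created).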

import pandas as pd

data = [{
    "ticket_id": 4,
    "customer_id": 8,
    "created_at": "2022-05-01",
    "custom_fields": [
        {"id": 15, "value": "website"},
        {"id": 16, "value": "broken"},
        {"id": 23, "value": None},
    ],
    "group_id": 42
}]

# Remove the nested list from the record (this mutates data[0] in place).
custom_fields = data[0].pop('custom_fields')
# Promote each custom field to a top-level key: {id: value}.
data[0].update({rec['id']: rec['value'] for rec in custom_fields})
t_df = pd.DataFrame(data)

Output:

>>> t_df 
ticket_id  customer_id  created_at  group_id       15      16    23
0          4            8  2022-05-01        42  website  broken  None
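The snippet above only touches data[0]. Since the question notes that not all records share the same field ids, here is a minimal sketch of the same idea applied to every record. The second ticket and the helper name flatten_ticket are made up for illustration; they are not part of the Zendesk response shown in the question.

```python
import pandas as pd

# Hypothetical two-record sample: the second ticket is invented for
# illustration; it lacks fields 16 and 23 and adds field 30.
data = [
    {
        "ticket_id": 4,
        "customer_id": 8,
        "created_at": "2022-05-01",
        "custom_fields": [
            {"id": 15, "value": "website"},
            {"id": 16, "value": "broken"},
            {"id": 23, "value": None},
        ],
        "group_id": 42,
    },
    {
        "ticket_id": 5,
        "customer_id": 9,
        "created_at": "2022-05-02",
        "custom_fields": [
            {"id": 15, "value": "email"},
            {"id": 30, "value": "urgent"},
        ],
        "group_id": 42,
    },
]


def flatten_ticket(ticket):
    """Return a new dict with each custom field promoted to a top-level key."""
    # Copy everything except the nested list, so the input is not mutated.
    flat = {key: value for key, value in ticket.items() if key != "custom_fields"}
    for field in ticket.get("custom_fields", []):
        flat[field["id"]] = field["value"]
    return flat


flat_records = [flatten_ticket(ticket) for ticket in data]
# Field ids missing from a record become NaN in that row.
t_df = pd.DataFrame(flat_records)
```

Unlike pop/update, this leaves the original data untouched, which can matter if the raw API response is reused elsewhere.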

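It looks like pandas does not automatically determine which fields are "metadata" and which are "records", so if your data is fixed, I would suggest hard-coding the following: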

>>> t_df = pd.json_normalize(
...     data,
...     meta=["ticket_id", "customer_id", "created_at", "group_id"],
...     record_path=["custom_fields"]
... )
id    value ticket_id customer_id  created_at group_id
0  15  website         4           8  2022-05-01       42
1  16   broken         4           8  2022-05-01       42
2  23     None         4           8  2022-05-01       42

Documentation: https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html
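The json_normalize call above gives one row per custom field (long format), whereas the question asks for one row per ticket. A possible way to get there, sketched under the assumption that ticket_id is unique per ticket, is to pivot the long frame and merge the metadata back in. The names long_df, wide_fields, meta_df and wide_df are only illustrative.

```python
import pandas as pd

data = [{
    "ticket_id": 4,
    "customer_id": 8,
    "created_at": "2022-05-01",
    "custom_fields": [
        {"id": 15, "value": "website"},
        {"id": 16, "value": "broken"},
        {"id": 23, "value": None},
    ],
    "group_id": 42,
}]

meta_cols = ["ticket_id", "customer_id", "created_at", "group_id"]

# Long format: one row per (ticket, custom field) pair.
long_df = pd.json_normalize(data, record_path=["custom_fields"], meta=meta_cols)

# Wide format for the fields: one row per ticket, one column per field id.
# Assumes each (ticket_id, id) pair occurs only once.
wide_fields = long_df.pivot(index="ticket_id", columns="id", values="value")

# One metadata row per ticket, then attach the field columns.
meta_df = long_df[meta_cols].drop_duplicates()
wide_df = meta_df.merge(wide_fields, left_on="ticket_id", right_index=True)
```

For large responses, flattening the dictionaries before building the DataFrame is likely simpler, but this keeps everything inside pandas if that is preferred.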

import pandas as pd

data = [{
    "ticket_id": 4,
    "customer_id": 8,
    "created_at": "2022-05-01",
    "custom_fields": [
        {"id": 15, "value": "website"},
        {"id": 16, "value": "broken"},
        {"id": 23, "value": None},
    ],
    "group_id": 42
}]

df = pd.DataFrame(data)
# Copy each custom field into its own column, named after the field id.
for index in df.index:
    for i in df.loc[index, 'custom_fields']:
        df.loc[index, i['id']] = i['value']
# The nested list is no longer needed once the fields are columns.
df.drop(columns='custom_fields', inplace=True)
df
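If the DataFrame already exists, the cell-by-cell loop can be replaced by building all field columns in one go and joining them back. This is only a sketch of an alternative, not a benchmarked improvement; field_dicts and fields_df are illustrative names.

```python
import pandas as pd

data = [{
    "ticket_id": 4,
    "customer_id": 8,
    "created_at": "2022-05-01",
    "custom_fields": [
        {"id": 15, "value": "website"},
        {"id": 16, "value": "broken"},
        {"id": 23, "value": None},
    ],
    "group_id": 42,
}]

df = pd.DataFrame(data)

# Turn each row's list of {"id", "value"} dicts into a single {id: value} dict.
field_dicts = [
    {field["id"]: field["value"] for field in fields}
    for fields in df["custom_fields"]
]

# Build the field columns in one step, aligned on the original row index.
fields_df = pd.DataFrame(field_dicts, index=df.index)

# Replace the nested column with the expanded field columns.
df = df.drop(columns="custom_fields").join(fields_df)
```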
