如何从数据帧列内的字典中提取值并将其提取到原始数据帧



我有类似的数据帧df

id          uuid                                         infotags  loc
20078612    0136ceb0-aba3-3de2-b577-bd3a8f5620d3

df[infotags]内部的值如下所示。如何提取信息阶段中的值并附加到原始数据帧。

[{'id': 3250638, 'type': {'id': 489211, 'icon': 'far fa-bell', 'type': 'alert_type', 'active': False, 'display': True, 'ordering': 1, 'displayName': 'Alert Status'}, 'value': 'Happens / Occurs', 'svalue': 'happens'}, {'id': 5585386, 'type': {'id': 691718, 'icon': 'far fa-industry', 'type': 'industry', 'active': False, 'display': True, 'ordering': 10, 'displayName': 'Industry'}, 'value': 'Transportation, Logistics'}, {'id': 3568415, 'type': {'id': 3568395, 'icon': 'far fa-sitemap', 'type': 'scope_organisational', 'active': True, 'display': True, 'ordering': 220, 'displayName': 'Scope - Organisational'}, 'value': 'Company'}, {'id': 6819597, 'icon': 'fas fa-prescription-bottle-alt', 'type': {'id': 2284612, 'type': 'event_type', 'active': False, 'display': True, 'ordering': 10001, 'displayName': 'Event Type'}, 'value': 'Corona Virus Update', 'svalue': 'disease_outbreak'}, {'id': 6061632, 'icon': 'fas fa-university', 'type': {'id': 2284612, 'type': 'event_type', 'active': False, 'display': True, 'ordering': 10001, 'displayName': 'Event Type'}, 'value': 'Innovation Activity', 'svalue': 'innovation_activity'}, {'id': 10322, 'type': {'id': 2284613, 'type': 'risk_level', 'active': False, 'display': False, 'ordering': 10002, 'displayName': 'Risk Level'}, 'value': 'High', 'svalue': 'high'}, {'id': 7041979, 'type': {'id': 7041971, 'type': 'datasource', 'active': False, 'display': False, 'ordering': 1000010, 'displayName': 'Datasource'}, 'value': 'Engine', 'svalue': 'engine'}]

Loc

[{'id': 1944901, 'geo': {'type': 'Feature', 'geometry': {'type': 'Point', 'coordinates': [151.20732, -33.86785]}, 'properties': {'name': 'Sydney', 'countryCode': 'AU'}}, 'name': 'Sydney', 'type': {'id': 2314671, 'type': 'location', 'ordering': 40, 'displayName': 'Location'}, 'ctype': {'id': 2290061, 'icon': 'fas fa-map-marker-alt', 'ctype': 'place', 'displayName': 'Place'}, 'sname': '2147714', 'isOwnTarget': False}]

预期输出

id         uuid                                    root_id   id     icon       active .....
20078612    0136ceb0-aba3-3de2-b577-bd3a8f5620d3    3250638  489211  far fa-bell ....

直到所有值都附加到新列。

如果信息标签已经是列的值,则可以先explode,然后用json.normalize:concat

df = df.explode("loc")
df = pd.concat([df.reset_index(drop=True).drop("loc", 1),
json.json_normalize(df["loc"]).rename(columns={"id":"loc_id"})],
axis=1)
df = df.explode("infotags")
print (pd.concat([df.reset_index(drop=True).drop("infotags",1),
json.json_normalize(df["infotags"]).rename(columns={"id":"root_id"})],
axis=1))
id                                  uuid   loc_id    name    sname  isOwnTarget geo.type geo.geometry.type geo.geometry.coordinates geo.properties.name geo.properties.countryCode  type.id type.type  type.ordering type.displayName  ctype.id             ctype.icon ctype.ctype ctype.displayName  root_id                      value               svalue  type.id        type.icon             type.type  type.active  type.display  type.ordering        type.displayName                            icon
0  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place  3250638           Happens / Occurs              happens   489211      far fa-bell            alert_type        False          True              1            Alert Status                             NaN
1  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place  5585386  Transportation, Logistics                  NaN   691718  far fa-industry              industry        False          True             10                Industry                             NaN
2  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place  3568415                    Company                  NaN  3568395   far fa-sitemap  scope_organisational         True          True            220  Scope - Organisational                             NaN
3  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place  6819597        Corona Virus Update     disease_outbreak  2284612              NaN            event_type        False          True          10001              Event Type  fas fa-prescription-bottle-alt
4  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place  6061632        Innovation Activity  innovation_activity  2284612              NaN            event_type        False          True          10001              Event Type               fas fa-university
5  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place    10322                       High                 high  2284613              NaN            risk_level        False         False          10002              Risk Level                             NaN
6  20078612  0136ceb0-aba3-3de2-b577-bd3a8f5620d3  1944901  Sydney  2147714        False  Feature             Point   [151.20732, -33.86785]              Sydney                         AU  2314671  location             40         Location   2290061  fas fa-map-marker-alt       place             Place  7041979                     Engine               engine  7041971              NaN            datasource        False         False        1000010              Datasource                             NaN

最新更新