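Iteratively assigning values to a df.column after a condition

I have a DataFrame with a column that holds a list in each row. The list is either empty or contains a dictionary as its first entry.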

index                                   labels
0                                                     []
1      [{'id': 1178423440, 'node_id': 'MDU6TGFiZWwxMT...
2      [{'id': 1178425127, 'node_id': 'MDU6TGFiZWwxMT...
3      [{'id': 1213670757, 'node_id': 'MDU6TGFiZWwxMj...
4      [{'id': 1178430857, 'node_id': 'MDU6TGFiZWwxMT...

For each row, I want to replace the list with the value stored under key='id'. Here is what I tried:

for i in issues['labels']:
    if not i: continue
    i = i[0]['id']

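I realize this is not actually assigning anything, because the df stays the same (even though the code runs). What am I doing wrong?

Expected output: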

index          labels
0                                                    
1          1178423440
2          1178425127
3          1213670757
4          1178430857

Edit:

Suppose the list in each row contains two or more dictionaries, for example:

[{'id': 1497192821, 'node_id': 'MDU6TGFiZWwxNDk3MTkyODIx', 'url': 'https://api.github.com/repos/chef/chef/labels/Focus:%20knife%20bootstrap', 'name': 'Focus: knife bootstrap', 'color': '92ef98', 'default': False, 'description': ''}, {'id': 1178425127, 'node_id': 'MDU6TGFiZWwxMTc4NDI1MTI3', 'url': 'https://api.github.com/repos/chef/chef/labels/Platform:%20Windows', 'name': 'Platform: Windows', 'color': 'a2c429', 'default': False, 'description': ''}, 
{'id': 1178435805, 'node_id': 'MDU6TGFiZWwxMTc4NDM1ODA1', 'url': 'https://api.github.com/repos/chef/chef/labels/Status:%20Waiting%20on%20Contributor', 'name': 'Status: Waiting on Contributor', 'color': '0052cc', 'default': False, 'description': 'A pull request that has unresolved requested actions from the author.'},
{'id': 525658991, 'node_id': 'MDU6TGFiZWw1MjU2NTg5OTE=', 'url': 'https://api.github.com/repos/chef/chef/labels/Type:%20Bug', 'name': 'Type: Bug', 'color': 'bfe5bf', 'default': False, 'description': "Doesn't work as expected."}]

How can I extract all the values stored under key='id' and put them back in the same position in the labels column?

Expected output:

index     labels
0          []                                  #has no entries
1          [1178423440,1178435805,525658991]    # has 3 dictionaries with 3 different id values (values with key='id')
2          [1178425127,132131,13213]           # similarly, has 3 id values
3          [1389810]                           # has one id value

Use the str accessor here. It handles the empty lists correctly and returns NaN where there is no match:

issues['labels'] = issues['labels'].str[0].str.get('id')
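As a rough illustration of why the loop in the question leaves the DataFrame untouched: `i = i[0]['id']` only rebinds the local name `i` and never writes anything back into the column. The sketch below uses a small made-up `issues` frame (the sample dictionaries are assumptions, not the asker's real data) to contrast the loop with the str accessor.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's column.
issues = pd.DataFrame({
    "labels": [
        [],
        [{"id": 1178423440, "name": "a"}],
        [{"id": 1178425127, "name": "b"}],
    ]
})

# The loop from the question: `i` is just a local name, so rebinding it
# does not change what is stored in the column.
for i in issues["labels"]:
    if not i:
        continue
    i = i[0]["id"]

after_loop = issues["labels"].tolist()  # still the original lists

# .str[0] takes the first list element (NaN for an empty list),
# then .str.get('id') looks up the key in each dictionary.
first_ids = issues["labels"].str[0].str.get("id")
print(first_ids)
```

Assigning `first_ids` back to `issues['labels']` is what actually changes the column.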

If you need integers alongside missing values, use the nullable integer dtype:

issues['labels'] = issues['labels'].str[0].str.get('id').astype('Int64')
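A minimal sketch of the nullable-integer variant, again on made-up sample data: the missing entry becomes `<NA>` while the ids stay integers rather than being shown as floats.

```python
import pandas as pd

# Hypothetical sample data; the first row has no labels.
issues = pd.DataFrame({
    "labels": [
        [],
        [{"id": 1178423440}],
        [{"id": 1178425127}],
    ]
})

# 'Int64' (capital I) is pandas' nullable integer dtype.
ids = issues["labels"].str[0].str.get("id").astype("Int64")
print(ids)
```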

EDIT: If every dictionary has an id, use:

issues['labels'] = issues['labels'].apply(lambda x: [y['id'] for y in x])

If some dictionaries might not have an id, add a membership test:

issues['labels'] = issues['labels'].apply(lambda x: [y['id'] for y in x if 'id' in y])
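To see the multi-dictionary case end to end, here is a small sketch on assumed sample data, including one dictionary without an 'id' key to show what the membership test guards against.

```python
import pandas as pd

# Hypothetical sample data: an empty list, a list of three labels,
# and a list where one dictionary lacks the 'id' key.
issues = pd.DataFrame({
    "labels": [
        [],
        [{"id": 1178423440}, {"id": 1178435805}, {"id": 525658991}],
        [{"id": 1178425127}, {"name": "no id here"}],
    ]
})

# Collect every 'id' per row; dictionaries without the key are skipped,
# and an empty list simply stays an empty list.
all_ids = issues["labels"].apply(lambda x: [y["id"] for y in x if "id" in y])
print(all_ids.tolist())
```

Without the `if 'id' in y` filter, the third row would raise a KeyError.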
