I have a DataFrame df with two columns.
The 'data' column holds some values, for example:
df['data'] = [0, 1, 0, 1]
The 'Dict' column holds one dictionary per row, for example:
df['Dict'] = [{0: ['IPM_0'], 1: ['IPM_1']},
{0: ['SPM_0'], 1: ['SPM_1']},
{0: ['NPM_0'], 1: ['NPM_1']},
{0: ['DPM_0'], 1: ['DPM_1']}]
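Now I want to look up each 'data' value in the dictionary on the same row of 'Dict', so that the result looks like:
df['result'] = [IPM_0, SPM_1, NPM_0, DPM_1]
I tried df['result'] = df.data.map(df.Dict), but I get a column containing the same kind of values as df.Dict (whole dictionaries).
I also tried df['result'] = df.data.apply(lambda rows: rows.map(rows.Dict)), but that raises the error 'int' object has no attribute 'map'.
How can I get this result?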
Starting point:
data Dict
0 0 {0: ['IPM_0'], 1: ['IPM_1']}
1 1 {0: ['SPM_0'], 1: ['SPM_1']}
2 0 {0: ['NPM_0'], 1: ['NPM_1']}
3 1 {0: ['DPM_0'], 1: ['DPM_1']}
Desired result:
data Dict result
0 0 {0: ['IPM_0'], 1: ['IPM_1']} IPM_0
1 1 {0: ['SPM_0'], 1: ['SPM_1']} SPM_1
2 0 {0: ['NPM_0'], 1: ['NPM_1']} NPM_0
3 1 {0: ['DPM_0'], 1: ['DPM_1']} DPM_1
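Some background on why both attempts fail, as a minimal sketch built from the sample data in the question. When Series.map is given another Series, it looks each value up in that Series' index, so a 'data' value of 0 returns the whole dictionary stored at row label 0, not the entry for key 0 in the current row's dictionary. Series.apply passes one scalar at a time to the function, which is why the lambda receives a plain int that has no map method.

```python
import pandas as pd

dicts = [
    {0: ["IPM_0"], 1: ["IPM_1"]},
    {0: ["SPM_0"], 1: ["SPM_1"]},
    {0: ["NPM_0"], 1: ["NPM_1"]},
    {0: ["DPM_0"], 1: ["DPM_1"]},
]
df = pd.DataFrame({"data": [0, 1, 0, 1], "Dict": dicts})

# map() treats df["Dict"] as a lookup table keyed by its index labels,
# so each 'data' value selects a whole dictionary by row label.
mapped = df["data"].map(df["Dict"])
print(mapped.tolist())

# Series.apply hands the function one scalar per call, here an int.
seen_types = df["data"].apply(lambda value: type(value).__name__)
print(seen_types.tolist())
```

The first print shows the dictionaries from rows 0 and 1 repeated, which explains why the column looks like df.Dict rather than the wanted strings.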
A list comprehension is always handy:
df['result'] = [x[y][0] for x, y in zip(df['Dict'], df['data'])]
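For anyone who wants to run this end to end, here is a self-contained sketch using the sample data from the question. It assumes every dictionary contains the key found in 'data' and that every list has at least one element; otherwise a KeyError or IndexError is raised.

```python
import pandas as pd

df = pd.DataFrame({
    "data": [0, 1, 0, 1],
    "Dict": [
        {0: ["IPM_0"], 1: ["IPM_1"]},
        {0: ["SPM_0"], 1: ["SPM_1"]},
        {0: ["NPM_0"], 1: ["NPM_1"]},
        {0: ["DPM_0"], 1: ["DPM_1"]},
    ],
})

# Walk both columns in lockstep: pick the list stored under the row's
# 'data' key, then take its first element.
df["result"] = [d[key][0] for d, key in zip(df["Dict"], df["data"])]

print(df["result"].tolist())  # ['IPM_0', 'SPM_1', 'NPM_0', 'DPM_1']
```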
You can use apply. Keep in mind that storing complex structures such as dictionaries in a DataFrame is inefficient:
df['result'] = df.apply(lambda r: r['Dict'][r['data']], axis=1)
Output:
data Dict result
0 0 {0: ['IPM_0'], 1: ['IPM_1']} [IPM_0]
1 1 {0: ['SPM_0'], 1: ['SPM_1']} [SPM_1]
2 0 {0: ['NPM_0'], 1: ['NPM_1']} [NPM_0]
3 1 {0: ['DPM_0'], 1: ['DPM_1']} [DPM_1]
If you want the first item of each list instead:
df['result'] = df.apply(lambda r: r['Dict'][r['data']][0], axis=1)
Output:
data Dict result
0 0 {0: ['IPM_0'], 1: ['IPM_1']} IPM_0
1 1 {0: ['SPM_0'], 1: ['SPM_1']} SPM_1
2 0 {0: ['NPM_0'], 1: ['NPM_1']} NPM_0
3 1 {0: ['DPM_0'], 1: ['DPM_1']} DPM_1
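On the efficiency remark above: one possible alternative, offered as a sketch and not as the only way, is to expand the dictionaries into ordinary columns once and then select per row with NumPy indexing. The names wide and col_pos are illustrative only, and the code assumes that all dictionaries share the same keys and that each list holds exactly one string.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "data": [0, 1, 0, 1],
    "Dict": [
        {0: ["IPM_0"], 1: ["IPM_1"]},
        {0: ["SPM_0"], 1: ["SPM_1"]},
        {0: ["NPM_0"], 1: ["NPM_1"]},
        {0: ["DPM_0"], 1: ["DPM_1"]},
    ],
})

# Expand each dictionary into one row of a plain table whose columns
# are the dictionary keys (assumption: single-element lists).
wide = pd.DataFrame(
    [{key: values[0] for key, values in d.items()} for d in df["Dict"]],
    index=df.index,
)

# Translate each 'data' value into a column position of `wide`
# (get_indexer returns -1 for values that are not column labels).
col_pos = wide.columns.get_indexer(df["data"])

# Pick one cell per row: row i, column col_pos[i].
df["result"] = wide.to_numpy()[np.arange(len(df)), col_pos]

print(df["result"].tolist())  # ['IPM_0', 'SPM_1', 'NPM_0', 'DPM_1']
```

Once the data lives in wide, the 'Dict' column is no longer needed for lookups, so repeated selections avoid calling a Python function for every row.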