Linking elements in pandas DataFrames



Hi all!

I have a subset "mapSubset" of a pandas DataFrame whose index is non-contiguous because of filtering. The DataFrame contains 'key' values.

mapSubset
   val  key
0   12    0
2   18    1
4   24    2
6   30    3
8   36    4

A second pandas DataFrame, "link", contains 'key' values that correspond to the 'key' values in "mapSubset". For each row of "link", I want to add the index label of the matching row in "mapSubset".

link
   key
0    4
1    2
2    0

Expected output:

link
   key  keyIndex
0    4         8
1    2         4
2    0         0

I tried the following:

import pandas as pd
import numpy as np
# prepare dummy data
mapFull = pd.DataFrame()
mapFull['val'] = list(range(12,40,3))
mapSubset = mapFull[mapFull.val % 2 == 0]
mapSubset['key'] = list(range(len(mapSubset)))
link = pd.DataFrame()
link['key'] = [4, 2, 0]
# fill 'keyIndex' values into "link" DataFrame
# try No. 1:
# link['keyIndex'] = mapSubset.index[mapSubset.loc[:, 'key'] == link.loc[:, 'key']]
# --> ValueError: Can only compare identically-labeled Series objects
# try No. 2:
link['keyIndex'] = 9999
for pos in range(len(link)):
    ii = mapSubset.index[mapSubset.loc[:,'key'] == link.loc[pos,'key']][0]
    link.loc[pos,'keyIndex'] = ii

Try No. 1 results in

ValueError: Can only compare identically-labeled Series objects

Technically, 'try No. 2' works, but it is an ugly workaround.

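In addition,

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

is raised at mapSubset['key'] = list(range(len(mapSubset))).

How can I avoid these messages? Is there a better way to get the expected result?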

You can build a dict from 'key' to index label and map the keys in "link" through it:

link['Keyindex'] = link['key'].map(dict(zip(mapSubset.key,mapSubset.index)))
link
Out[12]: 
   key  Keyindex
0    4         8
1    2         4
2    0         0
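
The question also asked about the two messages. Here is a minimal sketch of one way to avoid both, assuming the dummy data from the question. Try No. 1 fails because `==` between two Series compares them label by label, and the two Series here have different labels and lengths. The SettingWithCopyWarning goes away if the filtered frame is turned into an independent DataFrame with `.copy()` before a column is added. The name `key_to_index` is only illustrative, and using a Series as the lookup table is one alternative to the dict above, not the only option.

```python
import pandas as pd

# Dummy data from the question.
mapFull = pd.DataFrame({'val': list(range(12, 40, 3))})

# .copy() makes mapSubset an independent DataFrame, so adding a column
# afterwards does not trigger SettingWithCopyWarning.
mapSubset = mapFull[mapFull.val % 2 == 0].copy()
mapSubset['key'] = list(range(len(mapSubset)))

link = pd.DataFrame({'key': [4, 2, 0]})

# Lookup table (illustrative name): indexed by 'key', holding the
# corresponding index labels of mapSubset.
key_to_index = pd.Series(mapSubset.index.to_numpy(),
                         index=mapSubset['key'].to_numpy())

# Look up every key of link in the table. Keys that do not occur in
# mapSubset would come back as NaN instead of raising an error.
link['keyIndex'] = link['key'].map(key_to_index)

print(link)
```

With the data above, 'keyIndex' should come out as 8, 4, 0, matching the expected output in the question. If the real 'key' column can contain duplicates, the lookup is no longer unique and this approach would need to be revisited.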
