Pandas: joining two DataFrames where one is a pivot table

Suppose I have two DataFrames like these:

df1:
movieID 1 2 3 4
userID
0       2 0 0 2
1       1 1 4 0
2       0 2 3 0
3       1 2 0 0

and

df2:
userID movieID
0       0       2 
1       0       3
2       0       4
3       1       3

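What I am trying to achieve is to combine the two so that df2 gains a new column holding each user's rating for the given movie. In this example, df2 would become: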

df2:
userID movieID rating
0       0       2      0
1       0       3      0
2       0       4      2
3       1       3      4

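I don't believe simply reshaping df2 into the same shape as df1 would work, because there is no guarantee that it contains every userID or movieID. I have looked at the merge function, but I'm confused about how to set the how and on parameters in this case. I would appreciate it if someone could explain how to do this.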

You can apply() a row-wise lookup that indexes df1.loc[row.userID, row.movieID].

Just make sure the dtype of df1.columns matches df2.movieID and the dtype of df1.index matches df2.userID.

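# Align the dtypes of df1's labels with df2's key columns so the .loc lookups match
df1.columns = df1.columns.astype(df2.movieID.dtype)
df1.index = df1.index.astype(df2.userID.dtype)
# For each row of df2, look up the rating at (userID, movieID) in df1
df2['rating'] = df2.apply(lambda row: df1.loc[row.userID, row.movieID], axis=1)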
#    userID  movieID  rating
# 0       0        2       0
# 1       0        3       0
# 2       0        4       2
# 3       1        3       4
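
Since the question asks about merge specifically, here is a minimal sketch of a merge-based alternative. It is not part of the answer above. It assumes df1's index and columns are named userID and movieID as in the printed table; long_ratings and result are illustrative names only.

```python
import pandas as pd

# Rebuild the example frames from the question (values copied from the post).
df1 = pd.DataFrame(
    [[2, 0, 0, 2], [1, 1, 4, 0], [0, 2, 3, 0], [1, 2, 0, 0]],
    index=pd.Index([0, 1, 2, 3], name="userID"),
    columns=pd.Index([1, 2, 3, 4], name="movieID"),
)
df2 = pd.DataFrame({"userID": [0, 0, 0, 1], "movieID": [2, 3, 4, 3]})

# Unpivot df1 into long form: one row per (userID, movieID, rating).
long_ratings = (
    df1.melt(ignore_index=False, var_name="movieID", value_name="rating")
    .reset_index()
    # Make the key dtypes match df2 so the merge compares like with like.
    .astype({"userID": df2["userID"].dtype, "movieID": df2["movieID"].dtype})
)

# how="left" keeps every row of df2 in its original order;
# on=[...] matches on the (userID, movieID) pair.
result = df2.merge(long_ratings, on=["userID", "movieID"], how="left")
print(result)
```

With how="left", a (userID, movieID) pair in df2 that has no counterpart in df1 would get NaN in the rating column, whereas the .loc lookup would raise a KeyError for such a pair. On large frames a single merge is also likely to be faster than a row-wise apply.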
