Create a 3D matrix from 4 columns of a DataFrame



I want to create a 3D matrix from 4 columns of my DataFrame.

Input:

df = pd.DataFrame({
"u_id": [55218,55218,55218,55222],
"i_id": [0,0,1,1],
"Num": [0,2,1,2]
"rating":[-1,2,0,2]})

x-axis: 'u_id'; y-axis: 'i_id'; z-axis: 'Num'

The values in the matrix should come from 'rating'.

The result should be

[[[NaN,NaN],
[-1 ,NaN]],
[[NaN,NaN],
[  0,NaN]],
[[  2,NaN],
[NaN,2]]]

What I have tried so far:

x = df['u_id']
y = df['i_id']
z = df['Num']
value = df['rating']
Matrix = [[0 for m in len(z)] for m in len(z)] for c in len(x):
Matrix[c][r][m]= value

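But this does not work.

I don't think your expected output represents the information in the DataFrame. However, if you want to place the rating values into a 3D array of shape (3,2,2), using the other three columns as indices, you can proceed as follows.

Set up the input data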

import numpy as np
import pandas as pd
df = pd.DataFrame({
"u_id": [55218,55218,55218,55222],
"i_id": [0,0,1,1],
"Num": [0,2,1,2],      # <-- here was a small typo in your code
"rating":[-1,2,0,2]})
df

Output:

    u_id  i_id  Num  rating
0  55218     0    0      -1
1  55218     0    2       2
2  55218     1    1       0
3  55222     1    2       2

First convert u_id to a suitable index (integer codes starting at 0)

df['u_id'] = df['u_id'].astype('category').cat.codes
df[['Num','u_id','i_id','rating']] # order columns to correspond to coordinates

Output:

   Num  u_id  i_id  rating
0    0     0     0      -1
1    2     0     0       2
2    1     0     1       0
3    2     1     1       2
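
As a side note, the integer codes replace the original ids, so it can help to keep the mapping around. The following is a minimal sketch, not part of the answer's code; the variable names `codes` and `labels` are chosen here for illustration. It relies on `cat.codes` numbering the categories in sorted order, so position i in `cat.categories` is the original id for code i.

```python
import pandas as pd

# Same u_id values as in the question
u_id = pd.Series([55218, 55218, 55218, 55222], name="u_id")

cat = u_id.astype("category")
codes = cat.cat.codes          # integer code per row, usable as an array index
labels = cat.cat.categories    # original ids; labels[i] belongs to code i

print(codes.tolist())   # [0, 0, 0, 1]
print(labels.tolist())  # [55218, 55222]
```

With `labels` kept, an index along the u_id axis of the final array can be translated back to the original user id.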

Then create the output array, filled with NaN, and write the rating values into it

x = np.full(df[['Num','u_id','i_id']].nunique(), np.nan)
x[df['Num'], df['u_id'], df['i_id']] = df['rating']
x

Output:

array([[[-1., nan],
        [nan, nan]],

       [[nan,  0.],
        [nan, nan]],

       [[ 2., nan],
        [nan,  2.]]])
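
Sizing the array with `nunique()` only works when every index column already holds the contiguous values 0..n-1, which happens to be true for Num and i_id here. A slightly more defensive sketch, assuming that is not guaranteed, is to factorize every axis column. The helper name `to_3d` is hypothetical and not from the answer; `pd.factorize` with `sort=True` is the only library call it adds.

```python
import numpy as np
import pandas as pd

def to_3d(df, axes, value):
    """Hypothetical helper: scatter df[value] into a NaN-filled array
    whose axes follow the columns listed in `axes`."""
    codes, labels = [], []
    for col in axes:
        # factorize returns contiguous codes 0..n-1 plus the sorted unique values
        c, u = pd.factorize(df[col], sort=True)
        codes.append(c)
        labels.append(u)
    out = np.full([len(u) for u in labels], np.nan)
    # One integer array per axis; if a coordinate repeats, one value overwrites the other
    out[tuple(codes)] = df[value].to_numpy()
    return out, labels

df = pd.DataFrame({
    "u_id": [55218, 55218, 55218, 55222],
    "i_id": [0, 0, 1, 1],
    "Num": [0, 2, 1, 2],
    "rating": [-1, 2, 0, 2]})

arr, labels = to_3d(df, ["Num", "u_id", "i_id"], "rating")
print(arr.shape)  # (3, 2, 2)
```

For this input the result matches the array shown above, and `labels` holds the original values along each axis so positions can be mapped back.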
