Converting a rec.array to a DataFrame



I have been trying to convert a list of numpy rec.array objects into a DataFrame. The current data looks like this:

[rec.array([([0.2], [ 1.76405235,  0.40015721,  0.97873798,  2.2408932 ]),
([0.2], [ 1.86755799, -0.97727788,  0.95008842, -0.15135721]),
([0.2], [-0.10321885,  0.4105985 ,  0.14404357,  1.45427351]),
([0.2], [ 0.76103773,  0.12167502,  0.44386323,  0.33367433]),
([0.2], [ 1.49407907, -0.20515826,  0.3130677 , -0.85409574])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))]),
rec.array([([0.1], [ 1.76405235,  0.40015721,  0.97873798,  2.2408932 ]),
([0.1], [ 1.86755799, -0.97727788,  0.95008842, -0.15135721]),
([0.1], [-0.10321885,  0.4105985 ,  0.14404357,  1.45427351]),
([0.1], [ 0.76103773,  0.12167502,  0.44386323,  0.33367433]),
([0.1], [ 1.49407907, -0.20515826,  0.3130677 , -0.85409574]),
([0.1], [-2.55298982,  0.6536186 ,  0.8644362 , -0.74216502]),
([0.1], [ 2.26975462, -1.45436567,  0.04575852, -0.18718385]),
([0.1], [ 1.53277921,  1.46935877,  0.15494743,  0.37816252]),
([0.1], [-0.88778575, -1.98079647, -0.34791215,  0.15634897]),
([0.1], [ 1.23029068,  1.20237985, -0.38732682, -0.30230275])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))]),
rec.array([([0.16666667], [ 1.76405235,  0.40015721,  0.97873798,  2.2408932 ]),
([0.16666667], [ 1.86755799, -0.97727788,  0.95008842, -0.15135721]),
([0.16666667], [-0.10321885,  0.4105985 ,  0.14404357,  1.45427351]),
([0.16666667], [ 0.76103773,  0.12167502,  0.44386323,  0.33367433]),
([0.16666667], [ 1.49407907, -0.20515826,  0.3130677 , -0.85409574]),
([0.16666667], [-2.55298982,  0.6536186 ,  0.8644362 , -0.74216502])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))]),
rec.array([([0.05882353], [ 1.76405235,  0.40015721,  0.97873798,  2.2408932 ]),
([0.05882353], [ 1.86755799, -0.97727788,  0.95008842, -0.15135721]),
([0.05882353], [-0.10321885,  0.4105985 ,  0.14404357,  1.45427351]),
([0.05882353], [ 0.76103773,  0.12167502,  0.44386323,  0.33367433]),
([0.05882353], [ 1.49407907, -0.20515826,  0.3130677 , -0.85409574]),
([0.05882353], [-2.55298982,  0.6536186 ,  0.8644362 , -0.74216502]),
([0.05882353], [ 2.26975462, -1.45436567,  0.04575852, -0.18718385]),
([0.05882353], [ 1.53277921,  1.46935877,  0.15494743,  0.37816252]),
([0.05882353], [-0.88778575, -1.98079647, -0.34791215,  0.15634897]),
([0.05882353], [ 1.23029068,  1.20237985, -0.38732682, -0.30230275]),
([0.05882353], [-1.04855297, -1.42001794, -1.70627019,  1.9507754 ]),
([0.05882353], [-0.50965218, -0.4380743 , -1.25279536,  0.77749036]),
([0.05882353], [-1.61389785, -0.21274028, -0.89546656,  0.3869025 ]),
([0.05882353], [-0.51080514, -1.18063218, -0.02818223,  0.42833187]),
([0.05882353], [ 0.06651722,  0.3024719 , -0.63432209, -0.36274117]),
([0.05882353], [-0.67246045, -0.35955316, -0.81314628, -1.7262826 ]),
([0.05882353], [ 0.17742614, -0.40178094, -1.63019835,  0.46278226])]],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))])]

The result should be a five-column DataFrame like this:

weights     v_1          v_2          v_3          v_4
0.2         1.76405235   0.40015721   0.97873798   2.2408932
0.2         1.86755799  -0.97727788   0.95008842  -0.15135721
...
0.05882353  0.17742614  -0.40178094  -1.63019835   0.46278226

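I'm assuming your list of recarrays is stored in a variable named data. You can use pd.DataFrame and pd.concat to convert the arrays into a single DataFrame. You can then use pandas.DataFrame.pop to remove the list-valued columns, pandas.Series.explode to unwrap the single-element weights lists, and a DataFrame built from the four-element lists to spread those values across multiple columns.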
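If you want something runnable to try the steps on, here is a minimal sketch that builds a list with the same dtype and row counts as the one in the question. The helper make_recarray and the random values are my own assumptions for illustration (the question doesn't show how the arrays were created), so the numbers won't match the output shown further down.

```python
import numpy as np

# dtype copied from the question: one weight and four integration values per row
dtype = [('weights', '<f8', (1,)), ('integration', '<f8', (4,))]

def make_recarray(n_rows, values):
    """Hypothetical helper: one recarray of n_rows rows with weight 1/n_rows."""
    arr = np.zeros(n_rows, dtype=dtype)
    arr['weights'] = 1.0 / n_rows          # scalar broadcasts into shape (n_rows, 1)
    arr['integration'] = values[:n_rows]   # first n_rows rows of the shared values
    return arr.view(np.recarray)

rng = np.random.default_rng(0)             # illustrative values, not the question's
values = rng.standard_normal((17, 4))

# Row counts 5, 10, 6 and 17 mirror the four arrays in the question
data = [make_recarray(n, values) for n in (5, 10, 6, 17)]
```

The weights in the question (0.2, 0.1, 0.16666667, 0.05882353) happen to equal 1 divided by each array's row count, which is why the helper derives them that way.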

Reading the data

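import pandas as pd

# One frame per recarray; tolist() yields (weights, integration) tuples,
# so each frame gets the default column labels 0 and 1
frames = [pd.DataFrame(record.tolist()) for record in data]
# ignore_index=True gives the continuous 0..37 index shown in the output
df = pd.concat(frames, ignore_index=True)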

Preprocessing and unpacking the data

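# Spread the 4-element lists in column 1 across four named columns
df[['v_1', 'v_2', 'v_3', 'v_4']] = pd.DataFrame(df[1].tolist(), index=df.index)
# Unwrap the 1-element lists in column 0; explode returns object dtype, so cast
df['weights'] = df.pop(0).explode().astype(float)
# Drop the original list column
df.pop(1)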

Output:

This gives us the expected output:

         v_1       v_2       v_3       v_4   weights
0   1.764052  0.400157  0.978738  2.240893       0.2
1   1.867558 -0.977278  0.950088 -0.151357       0.2
2  -0.103219  0.410598  0.144044  1.454274       0.2
3   0.761038  0.121675  0.443863  0.333674       0.2
4   1.494079 -0.205158  0.313068 -0.854096       0.2
5   1.764052  0.400157  0.978738  2.240893       0.1
6   1.867558 -0.977278  0.950088 -0.151357       0.1
7  -0.103219  0.410598  0.144044  1.454274       0.1
8   0.761038  0.121675  0.443863  0.333674       0.1
9   1.494079 -0.205158  0.313068 -0.854096       0.1
10 -2.552990  0.653619  0.864436 -0.742165       0.1
11  2.269755 -1.454366  0.045759 -0.187184       0.1
12  1.532779  1.469359  0.154947  0.378163       0.1
13 -0.887786 -1.980796 -0.347912  0.156349       0.1
14  1.230291  1.202380 -0.387327 -0.302303       0.1
15  1.764052  0.400157  0.978738  2.240893  0.166667
16  1.867558 -0.977278  0.950088 -0.151357  0.166667
17 -0.103219  0.410598  0.144044  1.454274  0.166667
18  0.761038  0.121675  0.443863  0.333674  0.166667
19  1.494079 -0.205158  0.313068 -0.854096  0.166667
20 -2.552990  0.653619  0.864436 -0.742165  0.166667
21  1.764052  0.400157  0.978738  2.240893  0.058824
22  1.867558 -0.977278  0.950088 -0.151357  0.058824
23 -0.103219  0.410598  0.144044  1.454274  0.058824
24  0.761038  0.121675  0.443863  0.333674  0.058824
25  1.494079 -0.205158  0.313068 -0.854096  0.058824
26 -2.552990  0.653619  0.864436 -0.742165  0.058824
27  2.269755 -1.454366  0.045759 -0.187184  0.058824
28  1.532779  1.469359  0.154947  0.378163  0.058824
29 -0.887786 -1.980796 -0.347912  0.156349  0.058824
30  1.230291  1.202380 -0.387327 -0.302303  0.058824
31 -1.048553 -1.420018 -1.706270  1.950775  0.058824
32 -0.509652 -0.438074 -1.252795  0.777490  0.058824
33 -1.613898 -0.212740 -0.895467  0.386902  0.058824
34 -0.510805 -1.180632 -0.028182  0.428332  0.058824
35  0.066517  0.302472 -0.634322 -0.362741  0.058824
36 -0.672460 -0.359553 -0.813146 -1.726283  0.058824
37  0.177426 -0.401781 -1.630198  0.462782  0.058824
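
To see what pop and explode are each doing in the step above, here is a small sketch on a toy frame. The frame and its values are made up for illustration; only the column layout (labels 0 and 1 holding lists) mirrors the intermediate result.

```python
import pandas as pd

# Toy frame: column 0 holds 1-element lists, column 1 holds 4-element lists
toy = pd.DataFrame({
    0: [[0.2], [0.1]],
    1: [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
})

# pop removes the column from the frame and returns it as a Series
weights = toy.pop(0)

# explode emits one row per list element; with 1-element lists the row
# count is unchanged and each cell becomes a scalar (dtype stays object)
flat = weights.explode().astype(float)

# Spreading the 4-element lists across columns needs a new DataFrame,
# because explode would lengthen the frame rather than widen it
wide = pd.DataFrame(toy.pop(1).tolist(),
                    columns=['v_1', 'v_2', 'v_3', 'v_4'],
                    index=weights.index)
result = wide.assign(weights=flat)
```

After both pop calls the toy frame has no columns left, which is why the original list columns never show up in the final result.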

Alternatively

The same thing can be done with np.hstack, where data is the list of recarrays.

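import numpy as np
import pandas as pd

# Join all recarrays into one structured array, then convert to rows of tuples
df = pd.DataFrame(np.hstack(data).tolist())
df['weights'] = df[0].explode().astype(float)
df[['v_1', 'v_2', 'v_3', 'v_4']] = pd.DataFrame(df[1].tolist())
df.drop([0, 1], inplace=True, axis=1)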

Output

This gives us the same output, with the weights column first:

     weights       v_1       v_2       v_3       v_4
0        0.2  1.764052  0.400157  0.978738  2.240893
1        0.2  1.867558 -0.977278  0.950088 -0.151357
2        0.2 -0.103219  0.410598  0.144044  1.454274
3        0.2  0.761038  0.121675  0.443863  0.333674
4        0.2  1.494079 -0.205158  0.313068 -0.854096
5        0.1  1.764052  0.400157  0.978738  2.240893
6        0.1  1.867558 -0.977278  0.950088 -0.151357
7        0.1 -0.103219  0.410598  0.144044  1.454274
8        0.1  0.761038  0.121675  0.443863  0.333674
9        0.1  1.494079 -0.205158  0.313068 -0.854096
10       0.1 -2.552990  0.653619  0.864436 -0.742165
11       0.1  2.269755 -1.454366  0.045759 -0.187184
12       0.1  1.532779  1.469359  0.154947  0.378163
13       0.1 -0.887786 -1.980796 -0.347912  0.156349
14       0.1  1.230291  1.202380 -0.387327 -0.302303
15  0.166667  1.764052  0.400157  0.978738  2.240893
16  0.166667  1.867558 -0.977278  0.950088 -0.151357
17  0.166667 -0.103219  0.410598  0.144044  1.454274
18  0.166667  0.761038  0.121675  0.443863  0.333674
19  0.166667  1.494079 -0.205158  0.313068 -0.854096
20  0.166667 -2.552990  0.653619  0.864436 -0.742165
21  0.058824  1.764052  0.400157  0.978738  2.240893
22  0.058824  1.867558 -0.977278  0.950088 -0.151357
23  0.058824 -0.103219  0.410598  0.144044  1.454274
24  0.058824  0.761038  0.121675  0.443863  0.333674
25  0.058824  1.494079 -0.205158  0.313068 -0.854096
26  0.058824 -2.552990  0.653619  0.864436 -0.742165
27  0.058824  2.269755 -1.454366  0.045759 -0.187184
28  0.058824  1.532779  1.469359  0.154947  0.378163
29  0.058824 -0.887786 -1.980796 -0.347912  0.156349
30  0.058824  1.230291  1.202380 -0.387327 -0.302303
31  0.058824 -1.048553 -1.420018 -1.706270  1.950775
32  0.058824 -0.509652 -0.438074 -1.252795  0.777490
33  0.058824 -1.613898 -0.212740 -0.895467  0.386902
34  0.058824 -0.510805 -1.180632 -0.028182  0.428332
35  0.058824  0.066517  0.302472 -0.634322 -0.362741
36  0.058824 -0.672460 -0.359553 -0.813146 -1.726283
37  0.058824  0.177426 -0.401781 -1.630198  0.462782
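
A further variation, offered only as a sketch: since all the recarrays share one dtype, you could skip tolist() entirely and index the fields by name after concatenating, which keeps everything as float columns from the start. The small data list here is made up so the snippet runs on its own; with the real list you would use your own data variable instead.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's list (same dtype, made-up values)
dtype = [('weights', '<f8', (1,)), ('integration', '<f8', (4,))]
data = [
    np.array([([0.2], [1.0, 2.0, 3.0, 4.0]),
              ([0.2], [5.0, 6.0, 7.0, 8.0])], dtype=dtype).view(np.recarray),
    np.array([([0.1], [9.0, 10.0, 11.0, 12.0])], dtype=dtype).view(np.recarray),
]

# One structured array holding every row from every recarray
combined = np.concatenate(data)

# The 'integration' field is a plain (n, 4) float array -> four columns
df = pd.DataFrame(combined['integration'], columns=['v_1', 'v_2', 'v_3', 'v_4'])

# The 'weights' field has shape (n, 1); ravel flattens it to (n,)
df.insert(0, 'weights', combined['weights'].ravel())
```

This avoids the intermediate object-dtype columns, at the cost of relying on the field names weights and integration rather than on positional columns 0 and 1.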
