I have been trying to convert numpy rec.array data into a DataFrame. The current data (a list of rec.array objects) looks like:
[rec.array([([0.2], [ 1.76405235, 0.40015721, 0.97873798, 2.2408932 ]),
([0.2], [ 1.86755799, -0.97727788, 0.95008842, -0.15135721]),
([0.2], [-0.10321885, 0.4105985 , 0.14404357, 1.45427351]),
([0.2], [ 0.76103773, 0.12167502, 0.44386323, 0.33367433]),
([0.2], [ 1.49407907, -0.20515826, 0.3130677 , -0.85409574])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))]),
rec.array([([0.1], [ 1.76405235, 0.40015721, 0.97873798, 2.2408932 ]),
([0.1], [ 1.86755799, -0.97727788, 0.95008842, -0.15135721]),
([0.1], [-0.10321885, 0.4105985 , 0.14404357, 1.45427351]),
([0.1], [ 0.76103773, 0.12167502, 0.44386323, 0.33367433]),
([0.1], [ 1.49407907, -0.20515826, 0.3130677 , -0.85409574]),
([0.1], [-2.55298982, 0.6536186 , 0.8644362 , -0.74216502]),
([0.1], [ 2.26975462, -1.45436567, 0.04575852, -0.18718385]),
([0.1], [ 1.53277921, 1.46935877, 0.15494743, 0.37816252]),
([0.1], [-0.88778575, -1.98079647, -0.34791215, 0.15634897]),
([0.1], [ 1.23029068, 1.20237985, -0.38732682, -0.30230275])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))]),
rec.array([([0.16666667], [ 1.76405235, 0.40015721, 0.97873798, 2.2408932 ]),
([0.16666667], [ 1.86755799, -0.97727788, 0.95008842, -0.15135721]),
([0.16666667], [-0.10321885, 0.4105985 , 0.14404357, 1.45427351]),
([0.16666667], [ 0.76103773, 0.12167502, 0.44386323, 0.33367433]),
([0.16666667], [ 1.49407907, -0.20515826, 0.3130677 , -0.85409574]),
([0.16666667], [-2.55298982, 0.6536186 , 0.8644362 , -0.74216502])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))]),
rec.array([([0.05882353], [ 1.76405235, 0.40015721, 0.97873798, 2.2408932 ]),
([0.05882353], [ 1.86755799, -0.97727788, 0.95008842, -0.15135721]),
([0.05882353], [-0.10321885, 0.4105985 , 0.14404357, 1.45427351]),
([0.05882353], [ 0.76103773, 0.12167502, 0.44386323, 0.33367433]),
([0.05882353], [ 1.49407907, -0.20515826, 0.3130677 , -0.85409574]),
([0.05882353], [-2.55298982, 0.6536186 , 0.8644362 , -0.74216502]),
([0.05882353], [ 2.26975462, -1.45436567, 0.04575852, -0.18718385]),
([0.05882353], [ 1.53277921, 1.46935877, 0.15494743, 0.37816252]),
([0.05882353], [-0.88778575, -1.98079647, -0.34791215, 0.15634897]),
([0.05882353], [ 1.23029068, 1.20237985, -0.38732682, -0.30230275]),
([0.05882353], [-1.04855297, -1.42001794, -1.70627019, 1.9507754 ]),
([0.05882353], [-0.50965218, -0.4380743 , -1.25279536, 0.77749036]),
([0.05882353], [-1.61389785, -0.21274028, -0.89546656, 0.3869025 ]),
([0.05882353], [-0.51080514, -1.18063218, -0.02818223, 0.42833187]),
([0.05882353], [ 0.06651722, 0.3024719 , -0.63432209, -0.36274117]),
([0.05882353], [-0.67246045, -0.35955316, -0.81314628, -1.7262826 ]),
([0.05882353], [ 0.17742614, -0.40178094, -1.63019835, 0.46278226])],
dtype=[('weights', '<f8', (1,)), ('integration', '<f8', (4,))])]
The result should be a five-column DataFrame, like this:
weights | v_1 | v_2 | v_3 | v_4 |
---|---|---|---|---|
0.2 | 1.76405235 | 0.40015721 | 0.97873798 | 2.2408932 |
0.2 | 1.86755799 | -0.97727788 | 0.95008842 | -0.15135721 |
0.05882353 | 0.17742614 | -0.40178094 | -1.63019835 | 0.46278226 |
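I assume your list of recarrays is stored in a variable named data. You can use pd.DataFrame and pd.concat to turn the arrays into a single DataFrame. Each row then holds two list-valued cells: the single-element weights and the four-element integration values. pandas.DataFrame.pop removes those raw columns, explode (called here on the popped column, i.e. pandas.Series.explode) unwraps the single-element weights, and passing the integration lists to pd.DataFrame spreads them across four columns.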
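For anyone who wants to run the steps without the original data, a stand-in for data can be sketched as below. The helper name make_block, the use of a fixed seed restarted for every block, and the weight of 1/n_rows are all assumptions inferred from the printed arrays in the question, not something the question states.

```python
import numpy as np

# Field layout copied from the dtype shown in the question.
DTYPE = [("weights", "<f8", (1,)), ("integration", "<f8", (4,))]


def make_block(n_rows):
    """Hypothetical helper: one rec.array with n_rows rows and weight 1/n_rows."""
    rng = np.random.RandomState(0)  # assumption: same seed restarted per block
    block = np.zeros(n_rows, dtype=DTYPE)
    block["weights"] = 1.0 / n_rows              # broadcast into the (n_rows, 1) field
    block["integration"] = rng.randn(n_rows, 4)  # four values per record
    return block.view(np.recarray)


# Block sizes 5, 10, 6 and 17 match the row counts in the question.
data = [make_block(n) for n in (5, 10, 6, 17)]
print(len(data), sum(len(block) for block in data))
```

The resulting list has the same shape and dtype as the data in the question, which is all the conversion steps depend on.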
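Reading the data
import pandas as pd

# Build one frame per rec.array, then stack them with a fresh 0..N-1 index
# (ignore_index=True matches the index shown in the output).
frames = [pd.DataFrame(record.tolist()) for record in data]
df = pd.concat(frames, ignore_index=True)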
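Preprocessing and unpacking the data
# Column 1 holds the four integration values per row: spread them over v_1..v_4.
df[['v_1', 'v_2', 'v_3', 'v_4']] = pd.DataFrame(df[1].tolist(), index=df.index)
# Column 0 holds single-element lists: pop it and unwrap the value.
df['weights'] = df.pop(0).explode()
# Discard the raw integration column now that it has been expanded.
df.pop(1)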
Output:
This gives the expected output:
v_1 v_2 v_3 v_4 weights
0 1.764052 0.400157 0.978738 2.240893 0.2
1 1.867558 -0.977278 0.950088 -0.151357 0.2
2 -0.103219 0.410598 0.144044 1.454274 0.2
3 0.761038 0.121675 0.443863 0.333674 0.2
4 1.494079 -0.205158 0.313068 -0.854096 0.2
5 1.764052 0.400157 0.978738 2.240893 0.1
6 1.867558 -0.977278 0.950088 -0.151357 0.1
7 -0.103219 0.410598 0.144044 1.454274 0.1
8 0.761038 0.121675 0.443863 0.333674 0.1
9 1.494079 -0.205158 0.313068 -0.854096 0.1
10 -2.552990 0.653619 0.864436 -0.742165 0.1
11 2.269755 -1.454366 0.045759 -0.187184 0.1
12 1.532779 1.469359 0.154947 0.378163 0.1
13 -0.887786 -1.980796 -0.347912 0.156349 0.1
14 1.230291 1.202380 -0.387327 -0.302303 0.1
15 1.764052 0.400157 0.978738 2.240893 0.166667
16 1.867558 -0.977278 0.950088 -0.151357 0.166667
17 -0.103219 0.410598 0.144044 1.454274 0.166667
18 0.761038 0.121675 0.443863 0.333674 0.166667
19 1.494079 -0.205158 0.313068 -0.854096 0.166667
20 -2.552990 0.653619 0.864436 -0.742165 0.166667
21 1.764052 0.400157 0.978738 2.240893 0.058824
22 1.867558 -0.977278 0.950088 -0.151357 0.058824
23 -0.103219 0.410598 0.144044 1.454274 0.058824
24 0.761038 0.121675 0.443863 0.333674 0.058824
25 1.494079 -0.205158 0.313068 -0.854096 0.058824
26 -2.552990 0.653619 0.864436 -0.742165 0.058824
27 2.269755 -1.454366 0.045759 -0.187184 0.058824
28 1.532779 1.469359 0.154947 0.378163 0.058824
29 -0.887786 -1.980796 -0.347912 0.156349 0.058824
30 1.230291 1.202380 -0.387327 -0.302303 0.058824
31 -1.048553 -1.420018 -1.706270 1.950775 0.058824
32 -0.509652 -0.438074 -1.252795 0.777490 0.058824
33 -1.613898 -0.212740 -0.895467 0.386902 0.058824
34 -0.510805 -1.180632 -0.028182 0.428332 0.058824
35 0.066517 0.302472 -0.634322 -0.362741 0.058824
36 -0.672460 -0.359553 -0.813146 -1.726283 0.058824
37 0.177426 -0.401781 -1.630198 0.462782 0.058824
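One detail about the weights column above: explode unwraps the lists but, per the pandas documentation, leaves the result with object dtype. If the column is going to be used for arithmetic, converting it explicitly may be worthwhile. A minimal sketch on a toy Series (the values are made up for illustration):

```python
import pandas as pd

# Toy column shaped like the raw weights: one single-element list per row.
raw = pd.Series([[0.2], [0.2], [0.1]])

exploded = raw.explode()          # unwraps each list, dtype stays object
weights = exploded.astype(float)  # explicit conversion to float64

print(exploded.dtype, weights.dtype)
```

In the steps above this would amount to appending .astype(float) after .explode().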
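Alternatively
The same thing can be done with np.hstack, where data is the list of recarrays.
import numpy as np
import pandas as pd

# Stack all recarrays into one array first, then build a single frame.
df = pd.DataFrame(np.hstack(data).tolist())
df['weights'] = df[0].explode()
df[['v_1', 'v_2', 'v_3', 'v_4']] = pd.DataFrame(df[1].tolist())
# Drop the two raw list-valued columns.
df = df.drop(columns=[0, 1])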
Output
This gives the same output, with weights as the first column:
weights v_1 v_2 v_3 v_4
0 0.2 1.764052 0.400157 0.978738 2.240893
1 0.2 1.867558 -0.977278 0.950088 -0.151357
2 0.2 -0.103219 0.410598 0.144044 1.454274
3 0.2 0.761038 0.121675 0.443863 0.333674
4 0.2 1.494079 -0.205158 0.313068 -0.854096
5 0.1 1.764052 0.400157 0.978738 2.240893
6 0.1 1.867558 -0.977278 0.950088 -0.151357
7 0.1 -0.103219 0.410598 0.144044 1.454274
8 0.1 0.761038 0.121675 0.443863 0.333674
9 0.1 1.494079 -0.205158 0.313068 -0.854096
10 0.1 -2.552990 0.653619 0.864436 -0.742165
11 0.1 2.269755 -1.454366 0.045759 -0.187184
12 0.1 1.532779 1.469359 0.154947 0.378163
13 0.1 -0.887786 -1.980796 -0.347912 0.156349
14 0.1 1.230291 1.202380 -0.387327 -0.302303
15 0.166667 1.764052 0.400157 0.978738 2.240893
16 0.166667 1.867558 -0.977278 0.950088 -0.151357
17 0.166667 -0.103219 0.410598 0.144044 1.454274
18 0.166667 0.761038 0.121675 0.443863 0.333674
19 0.166667 1.494079 -0.205158 0.313068 -0.854096
20 0.166667 -2.552990 0.653619 0.864436 -0.742165
21 0.058824 1.764052 0.400157 0.978738 2.240893
22 0.058824 1.867558 -0.977278 0.950088 -0.151357
23 0.058824 -0.103219 0.410598 0.144044 1.454274
24 0.058824 0.761038 0.121675 0.443863 0.333674
25 0.058824 1.494079 -0.205158 0.313068 -0.854096
26 0.058824 -2.552990 0.653619 0.864436 -0.742165
27 0.058824 2.269755 -1.454366 0.045759 -0.187184
28 0.058824 1.532779 1.469359 0.154947 0.378163
29 0.058824 -0.887786 -1.980796 -0.347912 0.156349
30 0.058824 1.230291 1.202380 -0.387327 -0.302303
31 0.058824 -1.048553 -1.420018 -1.706270 1.950775
32 0.058824 -0.509652 -0.438074 -1.252795 0.777490
33 0.058824 -1.613898 -0.212740 -0.895467 0.386902
34 0.058824 -0.510805 -1.180632 -0.028182 0.428332
35 0.058824 0.066517 0.302472 -0.634322 -0.362741
36 0.058824 -0.672460 -0.359553 -0.813146 -1.726283
37 0.058824 0.177426 -0.401781 -1.630198 0.462782
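A third variation, not part of the answer above, is to skip tolist() and read the fields by name from the stacked array. This is only a sketch: recarrays_to_frame is a hypothetical helper name, and the tiny input is made up to keep the example self-contained.

```python
import numpy as np
import pandas as pd


def recarrays_to_frame(blocks):
    """Hypothetical helper: stack the blocks and read each field by name."""
    stacked = np.concatenate(blocks)  # one long 1-D structured array
    frame = pd.DataFrame(
        np.asarray(stacked["integration"]),  # shape (n_rows, 4)
        columns=["v_1", "v_2", "v_3", "v_4"],
    )
    # The weights field has shape (n_rows, 1); take its only column.
    frame.insert(0, "weights", np.asarray(stacked["weights"])[:, 0])
    return frame


# Tiny stand-in input with the same dtype as in the question.
dtype = [("weights", "<f8", (1,)), ("integration", "<f8", (4,))]
a = np.zeros(2, dtype=dtype).view(np.recarray)
a["weights"] = 0.5
b = np.zeros(3, dtype=dtype).view(np.recarray)
b["weights"] = 0.25

df = recarrays_to_frame([a, b])
print(df.shape)
```

Because the values never pass through Python lists, every column comes out as float64 directly, so no explode or dtype conversion is needed.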