Pandas Series to 2D array

So I used the answer from "Putting a 2D array into a pandas Series" to put a 2D numpy array into a pandas Series. In short, it is:

import numpy as np
import pandas as pd

a = np.zeros((5,2))
s = pd.Series(list(a))

Now, what is the cheapest way to convert the pandas Series back to a 2D array? If I try s.values, I get an array with object dtype.

So far I have tried np.vstack(s.values), but of course it copies the data.
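
To make the problem concrete, here is a minimal sketch of what the question describes, using the same 5x2 array of zeros. The variable names raw and stacked are only illustrative and do not appear in the original post.

```python
import numpy as np
import pandas as pd

a = np.zeros((5, 2))
s = pd.Series(list(a))          # each Series element is a 1-D row array

raw = s.values                  # 1-D array of Python objects (the row arrays)
stacked = np.vstack(s.values)   # rows stacked into a new 2-D array

print(raw.dtype, raw.shape)          # dtype and shape of the object array
print(stacked.dtype, stacked.shape)  # dtype and shape after stacking
# If this prints False, the stacked result does not share memory with a,
# i.e. np.vstack produced a copy.
print(np.shares_memory(stacked, a))
```

The object dtype arises because the Series stores references to separate row arrays rather than one contiguous 2D buffer, so any conversion back to a 2D array has to gather those rows.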

I believe you need:

a = np.array(s.values.tolist())
print (a)
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]
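
As a quick sanity check of the approach above, the following sketch wraps it in a small helper and verifies the round trip on non-zero data. The name series_to_2d is hypothetical, chosen only for this illustration, and the sketch assumes every element of the Series is a 1-D array of the same length.

```python
import numpy as np
import pandas as pd


def series_to_2d(s):
    """Hypothetical helper: rebuild a 2-D array from a Series whose
    elements are equal-length 1-D arrays."""
    return np.array(s.values.tolist())


original = np.arange(10, dtype=float).reshape(5, 2)
s = pd.Series(list(original))
restored = series_to_2d(s)

print(restored.shape, restored.dtype)
print(np.array_equal(restored, original))      # same values as the input?
print(np.shares_memory(restored, original))    # shared buffer or a copy?
```

Note that this route also builds a new array, so it is faster than np.vstack in the timings below but is not a zero-copy conversion.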

a = np.zeros((50000,2))
s = pd.Series(list(a))
In [131]: %timeit (np.vstack(s.values))
10 loops, best of 3: 107 ms per loop
In [132]: %timeit (np.array(s.values.tolist()))
10 loops, best of 3: 19.7 ms per loop
In [133]: %timeit (np.array(s.tolist()))
100 loops, best of 3: 19.6 ms per loop

However, with the transposed shape the difference is small (though the timings are affected by caching):

a = np.zeros((2,50000))
s = pd.Series(list(a))
#print (s)
In [159]: %timeit (np.vstack(s.values))
The slowest run took 23.31 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 55.7 µs per loop
In [160]: %timeit (np.array(s.values.tolist()))
The slowest run took 7.20 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 49.8 µs per loop
In [161]: %timeit (np.array(s.tolist()))
The slowest run took 7.31 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 62.6 µs per loop
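
The %timeit lines above only work inside IPython. A rough way to repeat the comparison in a plain script is sketched below with the standard timeit module. The function name bench and the choice of number=5 are assumptions made for this example, and absolute timings will differ by machine and by library version.

```python
import timeit

import numpy as np
import pandas as pd


def bench(shape, number=5):
    """Return best-of-3 average seconds per call for each conversion."""
    a = np.zeros(shape)
    s = pd.Series(list(a))
    candidates = {
        "np.vstack(s.values)": lambda: np.vstack(s.values),
        "np.array(s.values.tolist())": lambda: np.array(s.values.tolist()),
        "np.array(s.tolist())": lambda: np.array(s.tolist()),
    }
    return {
        name: min(timeit.repeat(fn, number=number, repeat=3)) / number
        for name, fn in candidates.items()
    }


# The two shapes timed in the answer: many short rows, then few long rows.
for shape in [(50000, 2), (2, 50000)]:
    print(shape)
    for name, seconds in bench(shape).items():
        print(f"  {name}: {seconds * 1e3:.3f} ms per call")
```

Taking the minimum over repeats is the usual way to reduce noise from caching and other processes, which is what the "slowest run took ... longer" warnings above are pointing at.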
