Extract lag features from a numpy array (+ expand dimension) | Reshape a numpy array with stride = 1



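I have an array of time-series data with shape (#timestamps, #features). For each row (timestamp) I want to extract the n_lag preceding rows and reshape the array so that it has the shape (#samples, #lags+now, #features) expected as input by Keras's LSTM layer.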

Consider this toy example:

import numpy as np
n_rows = 6
n_feat= 3
n_lag = 2
a = np.array(range(n_rows*n_feat)).reshape(n_rows, n_feat)
>>> a.shape
(6, 3)
>>> a
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14],
       [15, 16, 17]])

By iterating over the rows, I obtained the expected output:

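b = np.empty(shape=(0, n_lag + 1, n_feat))
for idx, row in enumerate(a):
    # Window of the n_lag previous rows plus the current row, with a leading sample axis
    temp = np.expand_dims(a[max(0, idx - n_lag):idx + 1, :], 0)
    # Skip the incomplete windows at the start (fewer than n_lag previous rows)
    if temp.shape[1:] == b.shape[1:]:
        b = np.append(b, temp, axis=0)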

>>> b.shape
(4, 3, 3)
>>> b
array([[[ 0.,  1.,  2.],
        [ 3.,  4.,  5.],
        [ 6.,  7.,  8.]],

       [[ 3.,  4.,  5.],
        [ 6.,  7.,  8.],
        [ 9., 10., 11.]],

       [[ 6.,  7.,  8.],
        [ 9., 10., 11.],
        [12., 13., 14.]],

       [[ 9., 10., 11.],
        [12., 13., 14.],
        [15., 16., 17.]]])

Note: the first n_lag rows (indices 0 to n_lag-1) do not have enough preceding data and are dropped from the final output.

Question: I would like to know whether there is a more elegant way to do this than iterating over the rows.

You can use the new np.lib.stride_tricks.sliding_window_view (added in NumPy 1.20) for this:

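import numpy as np

n_rows = 6
n_feat = 3
n_lag = 2
a = np.array(range(n_rows*n_feat)).reshape(n_rows, n_feat)
# Each window spans the n_lag previous rows plus the current row, and all features
b = np.lib.stride_tricks.sliding_window_view(a, window_shape=(n_lag + 1, n_feat))
b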

Output:

array([[[[ 0,  1,  2],
         [ 3,  4,  5],
         [ 6,  7,  8]]],

       [[[ 3,  4,  5],
         [ 6,  7,  8],
         [ 9, 10, 11]]],

       [[[ 6,  7,  8],
         [ 9, 10, 11],
         [12, 13, 14]]],

       [[[ 9, 10, 11],
         [12, 13, 14],
         [15, 16, 17]]]])
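
Note that the result printed above has shape (4, 1, 3, 3) rather than the (#samples, #lags+now, #features) shape asked for in the question. Because the window spans the whole feature axis, sliding_window_view leaves an axis of length 1 in that position. Here is a minimal sketch of dropping that axis. The helper name lagged_windows is only an illustrative choice and not part of NumPy:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # needs NumPy >= 1.20


def lagged_windows(a, n_lag):
    """Return a view of shape (n_rows - n_lag, n_lag + 1, n_feat).

    `lagged_windows` is a hypothetical helper name used only for illustration.
    """
    n_feat = a.shape[1]
    # The window covers the n_lag previous rows plus the current row, and all features.
    windows = sliding_window_view(a, window_shape=(n_lag + 1, n_feat))
    # windows has shape (n_rows - n_lag, 1, n_lag + 1, n_feat); drop the length-1 axis.
    return windows[:, 0]


a = np.arange(6 * 3).reshape(6, 3)
b = lagged_windows(a, n_lag=2)
print(b.shape)  # (4, 3, 3)
```

Indexing with [:, 0] keeps the result a view, so this step does not copy any data either.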

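b is a view that only changes the shape and strides of a, so it refers to the same memory locations of a multiple times. In other words, no new array needs to be allocated.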
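
The memory sharing can be checked directly. The following is a small sketch, assuming the same toy array as above. It also shows that the view is read-only by default, and that an explicit copy is one option if an independent, writable array is needed downstream:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # needs NumPy >= 1.20

a = np.arange(6 * 3).reshape(6, 3)
b = sliding_window_view(a, window_shape=(3, 3))

shares = np.shares_memory(a, b)  # True: b is a view onto a's buffer
writeable = b.flags.writeable    # False: the view is read-only by default

# Changes to a are visible through b, because no data was copied.
a[1, 0] = -1
# Row 1 of a appears as row 1 of window 0 and as row 0 of window 1.
seen_in_b = (b[0, 0, 1, 0], b[1, 0, 0, 0])

# An explicit copy gives an independent, writable array (at the cost of memory).
c = b.copy()
```

Since the windows overlap, the copy c stores each row of a up to n_lag + 1 times, which is the memory cost the view avoids.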
