How to use to_categorical to convert [[4,7,10],[10,20,30]] into one-hot encoding



I am working on an LSTM.

The output is categorical.

Its format is [[t11,t12,t13],[t21,t22,t23]].

I can do this for a 1D array, but I am finding it hard to do for a 2D array.

from keras.utils import to_categorical
print(to_categorical([[9,10,11],[10,11,12]]))

Output:

[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]

There are two separate inputs, each with 3 time steps, but in the output they are all merged into a single list of rows.

I need it as:

[[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]],
[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]]

If the shape is unusual, try flattening the data to 1D, applying the function, and then reshaping the result back:

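from keras.utils import to_categorical

# myData: NumPy array of integer class indices, e.g. shape (2, 3)
originalShape = myData.shape
totalFeatures = myData.max() + 1      # class indices run from 0 to max
categorical = myData.reshape((-1,))   # flatten to 1D
categorical = to_categorical(categorical)
categorical = categorical.reshape(originalShape + (totalFeatures,))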
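The flatten, encode, reshape steps above can be sketched without Keras installed. This is a minimal sketch, not the library's implementation: `one_hot_keep_shape` is a hypothetical helper name, and plain NumPy indexing stands in for `to_categorical`, assuming the labels are non-negative integers.

```python
import numpy as np

def one_hot_keep_shape(labels, num_classes=None):
    """Hypothetical helper: one-hot encode an integer array of any shape."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = int(labels.max()) + 1   # class indices run from 0 to max
    flat = labels.reshape(-1)                 # step 1: flatten to 1D
    encoded = np.zeros((flat.size, num_classes), dtype=np.float32)
    # step 2: set one entry per row (NumPy stand-in for to_categorical)
    encoded[np.arange(flat.size), flat] = 1.0
    # step 3: restore the original shape, with the class axis appended
    return encoded.reshape(labels.shape + (num_classes,))

result = one_hot_keep_shape([[9, 10, 11], [10, 11, 12]])
print(result.shape)  # (2, 3, 13)
```

Each of the two inputs keeps its own block of 3 time steps, which is the layout an LSTM target of shape (samples, time steps, classes) expects.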

I realized I can get what I want by reshaping (here a is the array returned by to_categorical):

print(a.reshape(2,3,13))

[[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]]
[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]]

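Reshaping raises an error if you assume 12 classes: the highest class index is 12, so there are 13 classes (0, 1, …, 12). To avoid this kind of mistake, you can let the last dimension be inferred by calling one_hot.reshape(sparse.shape + (-1,)), where one_hot is the one-hot encoded array produced by to_categorical() and sparse is the original array. Note that the shape of a NumPy array is a tuple, so the extra axis must be appended as the tuple (-1,) rather than the list [-1].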
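The following sketch illustrates both points, assuming `sparse` is a NumPy array. The flattened `one_hot` array is built with plain NumPy as a stand-in for the (6, 13) output shown in the question, and `wrong_size_failed` is just an illustrative flag.

```python
import numpy as np

sparse = np.array([[9, 10, 11], [10, 11, 12]])

# Stand-in for the flattened (6, 13) one-hot array shown in the question.
flat = sparse.ravel()
one_hot = np.zeros((flat.size, sparse.max() + 1))
one_hot[np.arange(flat.size), flat] = 1.0

# Hard-coding 12 classes fails: 6 * 13 = 78 values do not fit a (2, 3, 12) array.
try:
    one_hot.reshape(2, 3, 12)
    wrong_size_failed = False
except ValueError:
    wrong_size_failed = True

# Letting NumPy infer the last axis avoids guessing the class count.
restored = one_hot.reshape(sparse.shape + (-1,))
print(wrong_size_failed, restored.shape)  # True (2, 3, 13)
```

Using -1 for the last axis means the code keeps working if the highest class index changes.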
