DeepLearning4J: wrong shape of weights array



I have a model that I built and trained with TensorFlow/Keras in Python, and I need to use it from Java in Eclipse. It is configured as follows:

from tensorflow import keras

### define model
model = keras.Sequential()
## leakage of the leaky relu
LRU = 0.01
### the first set of convolutional layers
model.add(keras.layers.Conv1D(filters=6, kernel_size=2, strides=1, padding="same", input_shape=(6, 1)))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
model.add(keras.layers.Conv1D(filters=12, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
model.add(keras.layers.Conv1D(filters=24, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
model.add(keras.layers.Conv1D(filters=48, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
### the second set of convolutional layers
model.add(keras.layers.Conv1D(filters=24, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.Dropout(0.02))
model.add(keras.layers.Conv1D(filters=12, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.Dropout(0.01))
model.add(keras.layers.Conv1D(filters=6, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.Flatten())

For better readability, here is the output of model.summary():

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d (Conv1D)              (None, 6, 6)              18        
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 6, 6)              0         
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 5, 6)              0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 5, 12)             156       
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 5, 12)             0         
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 4, 12)             0         
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 4, 24)             600       
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 4, 24)             0         
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 3, 24)             0         
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 3, 48)             2352      
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 3, 48)             0         
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 2, 48)             0         
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 2, 24)             2328      
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 2, 24)             0         
_________________________________________________________________
dropout (Dropout)            (None, 2, 24)             0         
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 2, 12)             588       
_________________________________________________________________
leaky_re_lu_5 (LeakyReLU)    (None, 2, 12)             0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 2, 12)             0         
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 2, 6)              150       
_________________________________________________________________
leaky_re_lu_6 (LeakyReLU)    (None, 2, 6)              0         
_________________________________________________________________
flatten (Flatten)            (None, 12)                0         
=================================================================
Total params: 6,192
Trainable params: 6,192
Non-trainable params: 0

I saved the trained model with model.save('my_model.h5'). In Eclipse I am using DeepLearning4J, since it provides Keras import functionality. I load the model with

MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights("my_model.h5");
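For reference, the surrounding boilerplate looks roughly like the sketch below. The class name LoadKerasModel is just a placeholder; the imports and checked exceptions are, to the best of my knowledge, the ones the DL4J Keras import requires.

import java.io.IOException;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.modelimport.keras.exceptions.InvalidKerasConfigurationException;
import org.deeplearning4j.nn.modelimport.keras.exceptions.UnsupportedKerasConfigurationException;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

public class LoadKerasModel {
    public static void main(String[] args)
            throws IOException, InvalidKerasConfigurationException, UnsupportedKerasConfigurationException {
        // Import the Keras Sequential model that was saved from Python as my_model.h5.
        MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights("my_model.h5");
        // Print DL4J's view of the imported architecture (shown below).
        System.out.println(model.summary());
    }
}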

The output of model.summary() in Eclipse is:

=======================================================================================
LayerName (LayerType)                  nIn,nOut   TotalParams   ParamsShape            
=======================================================================================
conv1d (Convolution1DLayer)            1,6        18            b:{1,6}, W:{6,1,2,1}   
leaky_re_lu (ActivationLayer)          -,-        0             -                      
max_pooling1d (Subsampling1DLayer)     -,-        0             -                      
conv1d_1 (Convolution1DLayer)          6,12       156           b:{1,12}, W:{12,6,2,1} 
leaky_re_lu_1 (ActivationLayer)        -,-        0             -                      
max_pooling1d_1 (Subsampling1DLayer)   -,-        0             -                      
conv1d_2 (Convolution1DLayer)          12,24      600           b:{1,24}, W:{24,12,2,1}
leaky_re_lu_2 (ActivationLayer)        -,-        0             -                      
max_pooling1d_2 (Subsampling1DLayer)   -,-        0             -                      
conv1d_3 (Convolution1DLayer)          24,48      2,352         b:{1,48}, W:{48,24,2,1}
leaky_re_lu_3 (ActivationLayer)        -,-        0             -                      
max_pooling1d_3 (Subsampling1DLayer)   -,-        0             -                      
conv1d_4 (Convolution1DLayer)          48,24      2,328         b:{1,24}, W:{24,48,2,1}
leaky_re_lu_4 (ActivationLayer)        -,-        0             -                      
dropout (DropoutLayer)                 -,-        0             -                      
conv1d_5 (Convolution1DLayer)          24,12      588           b:{1,12}, W:{12,24,2,1}
leaky_re_lu_5 (ActivationLayer)        -,-        0             -                      
dropout_1 (DropoutLayer)               -,-        0             -                      
conv1d_6 (Convolution1DLayer)          12,6       150           b:{1,6}, W:{6,12,2,1}  
leaky_re_lu_6 (ActivationLayer)        -,-        0             -                      
flatten_loss (LossLayer)               -,-        0             -                      
---------------------------------------------------------------------------------------
Total Parameters:  6,192
Trainable Parameters:  6,192
Frozen Parameters:  0
=======================================================================================

Now I am trying to use the network to make a prediction on a specific input (the required shape for a single input is (1, 6, 1)):

INDArray input = Nd4j.create(new double[][][] {{{0.0702},{0.1191},{0.1702},{0.1310},{0.2248},{0.3205}}});
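(Side note: the shape of this array can be checked directly with ND4J's shape() accessor; for the literal above it comes out as [1, 6, 1], as required.)

// Optional sanity check: print the input shape before calling the network.
System.out.println(java.util.Arrays.toString(input.shape()));   // [1, 6, 1]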

Running the command

INDArray output = model.output(input,false);

now results in the following error:

Error at [/home/runner/work/deeplearning4j/deeplearning4j/libnd4j/include/ops/declarable/generic/nn/convo/conv1d.cpp:70:0]:
CUSTOM CONV1D OP: wrong shape of weights array, expected is [2, 3, 12], but got [2, 6, 12] instead !
Exception in thread "main" 14:04:45.925 [main] ERROR org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner - Failed to execute op conv1d. Attempted to execute with 3 inputs, 1 outputs, 0 targs,0 bargs and 6 iargs. Inputs: [(FLOAT,[1,3,5],c), (FLOAT,[2,6,12],c), (FLOAT,[12],c)]. Outputs: [(FLOAT,[1,12,5],c)]. tArgs: -. iArgs: [2, 1, 0, 1, 1, 0]. bArgs: -. Op own name: "d3142824-92ad-41c4-99e0-1111f4f75a5f" - Please see above message (printed out from c++) for a possible cause of error.

I really don't understand why it expects the weights to have shape [2, 3, 12]. That shape doesn't appear anywhere in the summary() output…

Does anyone know what is going wrong here?

Edit: I tried doing the same thing in Python, and there everything seems to work:

import tensorflow as tf
import pandas as pd
import numpy as np
input = np.array([0.0702, 0.1191, 0.1702, 0.1310, 0.2248, 0.3205])
input = input.reshape((1,6,1))
model = tf.keras.models.load_model('my_model.h5')
model.predict(input)

which gives the output

array([[0.01571162, 0.02778796, 0.07743346, 0.1355067 , 0.19171888,
0.25232363, 0.32630318, 0.3799559 , 0.42786202, 0.46094775,
0.50001675, 0.54188776]], dtype=float32)

This is an issue with the CNN 1D import.

For posterity: what happened here is that the expected number of channels was actually twice the actual number of channels. This was a bug and it has been fixed.

It is tracked here: https://github.com/eclipse/deeplearning4j/issues/9476

The fix is here: https://github.com/eclipse/deeplearning4j/pull/9477

Once the fix is merged you can use snapshots, or, as a workaround, try our TensorFlow import.
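For the TensorFlow import route, the rough shape of the code is sketched below. This is only an illustration under assumptions: the model would first have to be exported from Python as a frozen TensorFlow graph (here called my_model.pb), and the placeholder and output node names ("input", "output") are hypothetical and depend on how the graph was exported.

import java.io.File;
import java.util.Collections;
import java.util.Map;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class TfImportSketch {
    public static void main(String[] args) {
        // Import a frozen TensorFlow graph (the file name and node names are assumptions).
        SameDiff sd = SameDiff.importFrozenTF(new File("my_model.pb"));
        // Same single example as above, shape [1, 6, 1].
        INDArray input = Nd4j.create(new double[][][] {{{0.0702},{0.1191},{0.1702},{0.1310},{0.2248},{0.3205}}});
        // "input" and "output" are hypothetical placeholder / output node names.
        Map<String, INDArray> result = sd.output(Collections.singletonMap("input", input), "output");
        System.out.println(result.get("output"));
    }
}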
