I have a model built and trained with TensorFlow in Python, and I need to use it from Java in Eclipse. Its configuration looks like this:
from tensorflow import keras

### define model
model = keras.Sequential()
## leakage of the leaky relu
LRU = 0.01
### the first set of convolutional layers
model.add(keras.layers.Conv1D(filters=6, kernel_size=2, strides=1, padding="same", input_shape=(6, 1)))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
model.add(keras.layers.Conv1D(filters=12, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
model.add(keras.layers.Conv1D(filters=24, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
model.add(keras.layers.Conv1D(filters=48, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.MaxPooling1D(pool_size=2, strides=1))
### the second set of convolutional layers
model.add(keras.layers.Conv1D(filters=24, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.Dropout(0.02))
model.add(keras.layers.Conv1D(filters=12, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.Dropout(0.01))
model.add(keras.layers.Conv1D(filters=6, kernel_size=2, strides=1, padding="same"))
model.add(keras.layers.LeakyReLU(LRU))
model.add(keras.layers.Flatten())
For better readability, the output of model.summary() is:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d (Conv1D) (None, 6, 6) 18
_________________________________________________________________
leaky_re_lu (LeakyReLU) (None, 6, 6) 0
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 5, 6) 0
_________________________________________________________________
conv1d_1 (Conv1D) (None, 5, 12) 156
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 5, 12) 0
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 4, 12) 0
_________________________________________________________________
conv1d_2 (Conv1D) (None, 4, 24) 600
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 4, 24) 0
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 3, 24) 0
_________________________________________________________________
conv1d_3 (Conv1D) (None, 3, 48) 2352
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 3, 48) 0
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 2, 48) 0
_________________________________________________________________
conv1d_4 (Conv1D) (None, 2, 24) 2328
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU) (None, 2, 24) 0
_________________________________________________________________
dropout (Dropout) (None, 2, 24) 0
_________________________________________________________________
conv1d_5 (Conv1D) (None, 2, 12) 588
_________________________________________________________________
leaky_re_lu_5 (LeakyReLU) (None, 2, 12) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 2, 12) 0
_________________________________________________________________
conv1d_6 (Conv1D) (None, 2, 6) 150
_________________________________________________________________
leaky_re_lu_6 (LeakyReLU) (None, 2, 6) 0
_________________________________________________________________
flatten (Flatten) (None, 12) 0
=================================================================
Total params: 6,192
Trainable params: 6,192
Non-trainable params: 0
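(As a cross-check on the summary above: a Keras Conv1D layer with kernel size k, c_in input channels and f filters has k*c_in*f weights plus f biases. A small sketch, with the layer sizes taken from the summary, reproduces every parameter count:

```python
# Conv1D parameter count: kernel_size * in_channels * filters + filters (bias)
def conv1d_params(kernel_size, in_channels, filters):
    return kernel_size * in_channels * filters + filters

# (in_channels, filters) of each Conv1D layer, read off the summary above
layers = [(1, 6), (6, 12), (12, 24), (24, 48), (48, 24), (24, 12), (12, 6)]
counts = [conv1d_params(2, c_in, f) for c_in, f in layers]
print(counts)       # [18, 156, 600, 2352, 2328, 588, 150]
print(sum(counts))  # 6192, matching "Total params: 6,192"
```

)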
I saved the trained model with model.save('my_model.h5'). In Eclipse I use Deeplearning4j, since it provides Keras import support. I load the model with
MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights("my_model.h5");
The output of model.summary() in Eclipse is:
=======================================================================================
LayerName (LayerType) nIn,nOut TotalParams ParamsShape
=======================================================================================
conv1d (Convolution1DLayer) 1,6 18 b:{1,6}, W:{6,1,2,1}
leaky_re_lu (ActivationLayer) -,- 0 -
max_pooling1d (Subsampling1DLayer) -,- 0 -
conv1d_1 (Convolution1DLayer) 6,12 156 b:{1,12}, W:{12,6,2,1}
leaky_re_lu_1 (ActivationLayer) -,- 0 -
max_pooling1d_1 (Subsampling1DLayer) -,- 0 -
conv1d_2 (Convolution1DLayer) 12,24 600 b:{1,24}, W:{24,12,2,1}
leaky_re_lu_2 (ActivationLayer) -,- 0 -
max_pooling1d_2 (Subsampling1DLayer) -,- 0 -
conv1d_3 (Convolution1DLayer) 24,48 2,352 b:{1,48}, W:{48,24,2,1}
leaky_re_lu_3 (ActivationLayer) -,- 0 -
max_pooling1d_3 (Subsampling1DLayer) -,- 0 -
conv1d_4 (Convolution1DLayer) 48,24 2,328 b:{1,24}, W:{24,48,2,1}
leaky_re_lu_4 (ActivationLayer) -,- 0 -
dropout (DropoutLayer) -,- 0 -
conv1d_5 (Convolution1DLayer) 24,12 588 b:{1,12}, W:{12,24,2,1}
leaky_re_lu_5 (ActivationLayer) -,- 0 -
dropout_1 (DropoutLayer) -,- 0 -
conv1d_6 (Convolution1DLayer) 12,6 150 b:{1,6}, W:{6,12,2,1}
leaky_re_lu_6 (ActivationLayer) -,- 0 -
flatten_loss (LossLayer) -,- 0 -
---------------------------------------------------------------------------------------
Total Parameters: 6,192
Trainable Parameters: 6,192
Frozen Parameters: 0
=======================================================================================
Now I am trying to use the network to predict on a specific input (the required shape for a single input is (1,6,1)):
INDArray input = Nd4j.create(new double[][][] {{{0.0702},{0.1191},{0.1702},{0.1310},{0.2248},{0.3205}}});
Running
INDArray output = model.output(input, false);
now results in the following error:
Error at [/home/runner/work/deeplearning4j/deeplearning4j/libnd4j/include/ops/declarable/generic/nn/convo/conv1d.cpp:70:0]:
CUSTOM CONV1D OP: wrong shape of weights array, expected is [2, 3, 12], but got [2, 6, 12] instead !
Exception in thread "main" 14:04:45.925 [main] ERROR org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner - Failed to execute op conv1d. Attempted to execute with 3 inputs, 1 outputs, 0 targs,0 bargs and 6 iargs. Inputs: [(FLOAT,[1,3,5],c), (FLOAT,[2,6,12],c), (FLOAT,[12],c)]. Outputs: [(FLOAT,[1,12,5],c)]. tArgs: -. iArgs: [2, 1, 0, 1, 1, 0]. bArgs: -. Op own name: "d3142824-92ad-41c4-99e0-1111f4f75a5f" - Please see above message (printed out from c++) for a possible cause of error.
I really don't understand why it expects the weights to have shape [2,3,12]. That shape doesn't appear anywhere in the summary() output...
Does anyone know what is going wrong here?
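(For context on the shapes in that error message: Keras stores a Conv1D kernel as (kernel_size, in_channels, filters), so the second convolution's kernel really is (2, 6, 12), exactly the "[2, 6, 12]" the op received; the expected "[2, 3, 12]" corresponds to half the real channel count. A small sketch, using NumPy only to illustrate the layout, not tied to DL4J:

```python
import numpy as np

# Keras Conv1D kernel layout: (kernel_size, in_channels, filters).
# For conv1d_1 (kernel_size=2, 6 input channels, 12 filters):
kernel = np.zeros((2, 6, 12))
print(kernel.shape)  # (2, 6, 12) -- matches "got [2, 6, 12]" in the error

# The op instead expected (2, 3, 12), i.e. the in_channels dimension halved:
expected_shape = (kernel.shape[0], kernel.shape[1] // 2, kernel.shape[2])
print(expected_shape)  # (2, 3, 12)
```

)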
Edit: I tried doing the same thing in Python, and there everything seems to work:
import tensorflow as tf
import pandas as pd
import numpy as np
input = np.array([0.0702, 0.1191, 0.1702, 0.1310, 0.2248, 0.3205])
input = input.reshape((1,6,1))
model = tf.keras.models.load_model('my_model.h5')
model.predict(input)
which gives the output
array([[0.01571162, 0.02778796, 0.07743346, 0.1355067 , 0.19171888,
0.25232363, 0.32630318, 0.3799559 , 0.42786202, 0.46094775,
0.50001675, 0.54188776]], dtype=float32)
This was an issue with the CNN 1D import.
For posterity, what happened is that the expected number of channels was actually double the actual number of channels. This was a bug, and it has been fixed.
It is documented here: https://github.com/eclipse/deeplearning4j/issues/9476
The fix: https://github.com/eclipse/deeplearning4j/pull/9477
Once that fix is merged, you can use snapshots, or, as a workaround, try our TensorFlow import.
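(If you want to pick up the snapshot builds once the fix is merged, the usual route for Maven projects is to add the Sonatype snapshot repository to your pom.xml and depend on a -SNAPSHOT version; the exact version string to use is an assumption here and should be checked against the current DL4J snapshot docs:

```xml
<repositories>
  <!-- Sonatype snapshot repository hosting Deeplearning4j SNAPSHOT builds -->
  <repository>
    <id>snapshots-repo</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases><enabled>false</enabled></releases>
    <snapshots><enabled>true</enabled></snapshots>
  </repository>
</repositories>
```

)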