Problem with Caffe network initialization



The output of the Caffe training initialization is attached at the end.

My network consists of 4 convolutional layers and 3 fully connected layers.

I computed the filter and feature-map sizes, and my calculations show that they are consistent.

  1. The input is an image of size (195 x 65)
  2. The conv1 filters are (32 x 13 x 5), where 32 is the number of filters; the stride is 1 along both axes
  3. Each output map of conv1 should be of size (183 x 61)
  4. A max-pooling layer of size (2 x 2) with stride 2
  5. The output of that pooling layer should be of size (92 x 31)
  6. Next comes a conv2 layer of size (48 x 7 x 3)
  7. Each feature map of conv2 should be of size (86 x 29)
  8. Then average pooling with window size (2 x 2)
  9. The output of that pooling layer is (43 x 15)
  10. Next comes a conv3 layer of size (64 x 7 x 3), where 64 is the number of filters
  11. The output of conv3 should be (37 x 13)
  12. This is immediately followed by a conv4 layer of size (64 x 5 x 3)
  13. Each feature map of the conv4 output should be of size (33 x 11)
  14. conv4 is followed by average pooling with window size (2 x 2)
  15. The output is of size (17 x 6)
  16. Then come 3 fully connected layers, each with 300 output neurons
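As a sanity check on the arithmetic above, here is a short Python sketch of the two formulas involved: a valid (unpadded) convolution, and Caffe-style pooling, which rounds *up* (this is why pool1 gives 92 x 31 rather than 91 x 30). The function names are illustrative, not part of Caffe:

```python
import math

def conv_out(size, kernel, stride=1):
    # Valid convolution, no padding: floor((size - kernel) / stride) + 1
    return (size - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # Caffe pooling rounds up: ceil((size - kernel) / stride) + 1
    return math.ceil((size - kernel) / stride) + 1

h, w = 195, 65                             # input image
h, w = conv_out(h, 13), conv_out(w, 5)     # conv1 -> (183, 61)
h, w = pool_out(h), pool_out(w)            # pool1 -> (92, 31)
h, w = conv_out(h, 7), conv_out(w, 3)      # conv2 -> (86, 29)
h, w = pool_out(h), pool_out(w)            # pool2 -> (43, 15)
h, w = conv_out(h, 7), conv_out(w, 3)      # conv3 -> (37, 13)
h, w = conv_out(h, 5), conv_out(w, 3)      # conv4 -> (33, 11)
h, w = pool_out(h), pool_out(w)            # pool4 -> (17, 6)
print(h, w)  # 17 6
```

Every intermediate size matches both my hand calculation and the `Top shape` lines in the log below.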

No padding is used in any layer. Below is the output of the Caffe training initialization. It shows some errors at conv2, conv3, and fc6. I cannot understand where these errors come from. Can someone clarify?

I1119 17:55:45.301447 14339 caffe.cpp:156] Using GPUs 0
I1119 17:55:45.447502 14339 solver.cpp:33] Initializing solver from parameters:
test_iter: 970
test_interval: 2908
base_lr: 0.01
display: 50
max_iter: 87240
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 28790
snapshot: 2908
snapshot_prefix: "snapshot"
solver_mode: GPU
device_id: 0
net: "train_val.prototxt"
solver_type: SGD
I1119 17:55:45.447551 14339 solver.cpp:81] Creating training net from net file: train_val.prototxt
I1119 17:55:45.448559 14339 net.cpp:316] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I1119 17:55:45.448586 14339 net.cpp:316] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I1119 17:55:45.448849 14339 net.cpp:47] Initializing net from parameters:
state {
phase: TRAIN
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
mean_file: "/home/uujjwal/libraries/digits/DIGITS/digits/jobs/20151118-160932-e0c7/mean.binaryproto"
crop_h: 195
crop_w: 65
}
data_param {
source: "/home/uujjwal/libraries/digits/DIGITS/digits/jobs/20151118-160932-e0c7/train_db"
batch_size: 100
backend: LMDB
prefetch: 4
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
kernel_h: 13
kernel_w: 5
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 0
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 48
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
kernel_h: 7
kernel_w: 3
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: AVE
kernel_size: 2
stride: 2
pad: 0
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 0
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
kernel_h: 7
kernel_w: 3
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 0
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
kernel_h: 5
kernel_w: 3
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "norm4"
type: "LRN"
bottom: "conv4"
top: "norm4"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool4"
type: "Pooling"
bottom: "norm4"
top: "pool4"
pooling_param {
pool: AVE
kernel_size: 2
stride: 2
pad: 0
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool4"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 300
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 300
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8_retrain_retrain_retrain"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
I1119 17:55:45.448966 14339 layer_factory.hpp:75] Creating layer data
I1119 17:55:45.449409 14339 net.cpp:99] Creating Layer data
I1119 17:55:45.449422 14339 net.cpp:409] data -> data
I1119 17:55:45.449450 14339 net.cpp:409] data -> label
I1119 17:55:45.449457 14339 net.cpp:131] Setting up data
I1119 17:55:45.453604 14349 db.cpp:34] Opened lmdb /home/uujjwal/libraries/digits/DIGITS/digits/jobs/20151118-160932-e0c7/train_db
I1119 17:55:45.454463 14339 data_layer.cpp:37] Decoding Datum
I1119 17:55:45.454485 14339 data_layer.cpp:65] output data size: 100,1,195,65
I1119 17:55:45.454495 14339 data_transformer.cpp:28] Loading mean file from: /home/uujjwal/libraries/digits/DIGITS/digits/jobs/20151118-160932-e0c7/mean.binaryproto
I1119 17:55:45.466743 14339 net.cpp:140] Top shape: 100 1 195 65 (1267500)
I1119 17:55:45.466761 14339 net.cpp:140] Top shape: 100 (100)
I1119 17:55:45.466770 14339 layer_factory.hpp:75] Creating layer conv1
I1119 17:55:45.466789 14339 net.cpp:99] Creating Layer conv1
I1119 17:55:45.466796 14339 net.cpp:453] conv1 <- data
I1119 17:55:45.466811 14339 net.cpp:409] conv1 -> conv1
I1119 17:55:45.466825 14339 net.cpp:131] Setting up conv1
I1119 17:55:45.468318 14339 net.cpp:140] Top shape: 100 32 183 61 (35721600)
I1119 17:55:45.468335 14339 layer_factory.hpp:75] Creating layer relu1
I1119 17:55:45.468343 14339 net.cpp:99] Creating Layer relu1
I1119 17:55:45.468348 14339 net.cpp:453] relu1 <- conv1
I1119 17:55:45.468354 14339 net.cpp:396] relu1 -> conv1 (in-place)
I1119 17:55:45.468360 14339 net.cpp:131] Setting up relu1
I1119 17:55:45.468371 14339 net.cpp:140] Top shape: 100 32 183 61 (35721600)
I1119 17:55:45.468375 14339 layer_factory.hpp:75] Creating layer norm1
I1119 17:55:45.468382 14339 net.cpp:99] Creating Layer norm1
I1119 17:55:45.468386 14339 net.cpp:453] norm1 <- conv1
I1119 17:55:45.468391 14339 net.cpp:409] norm1 -> norm1
I1119 17:55:45.468399 14339 net.cpp:131] Setting up norm1
I1119 17:55:45.468408 14339 net.cpp:140] Top shape: 100 32 183 61 (35721600)
I1119 17:55:45.468412 14339 layer_factory.hpp:75] Creating layer pool1
I1119 17:55:45.468422 14339 net.cpp:99] Creating Layer pool1
I1119 17:55:45.468426 14339 net.cpp:453] pool1 <- norm1
I1119 17:55:45.468431 14339 net.cpp:409] pool1 -> pool1
I1119 17:55:45.468438 14339 net.cpp:131] Setting up pool1
I1119 17:55:45.468456 14339 net.cpp:140] Top shape: 100 32 92 31 (9126400)
I1119 17:55:45.468461 14339 layer_factory.hpp:75] Creating layer conv2
I1119 17:55:45.468468 14339 net.cpp:99] Creating Layer conv2
I1119 17:55:45.468472 14339 net.cpp:453] conv2 <- pool1
I1119 17:55:45.468478 14339 net.cpp:409] conv2 -> conv2
I1119 17:55:45.468509 14339 net.cpp:131] Setting up conv2
OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in Mat, file /home/uujjwal/libraries/opencv-3.0.0/modules/core/src/matrix.cpp, line 495
terminate called after throwing an instance of 'cv::Exception'
what():  /home/uujjwal/libraries/opencv-3.0.0/modules/core/src/matrix.cpp:495: error: (-215) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function Mat
*** Aborted at 1447952145 (unix time) try "date -d @1447952145" if you are using GNU date ***
PC: @     0x7f9446ec75d7 __GI_raise
*** SIGABRT (@0x9f71700003803) received by PID 14339 (TID 0x7f9430a6f700) from PID 14339; stack trace: ***
I1119 17:55:45.470507 14339 net.cpp:140] Top shape: 100 48 86 29 (11971200)
I1119 17:55:45.470520 14339 layer_factory.hpp:75] Creating layer relu2
I1119 17:55:45.470535 14339 net.cpp:99] Creating Layer relu2
I1119 17:55:45.470541 14339 net.cpp:453] relu2 <- conv2
I1119 17:55:45.470547 14339 net.cpp:396] relu2 -> conv2 (in-place)
I1119 17:55:45.470553 14339 net.cpp:131] Setting up relu2
I1119 17:55:45.470559 14339 net.cpp:140] Top shape: 100 48 86 29 (11971200)
I1119 17:55:45.470563 14339 layer_factory.hpp:75] Creating layer norm2
I1119 17:55:45.470569 14339 net.cpp:99] Creating Layer norm2
I1119 17:55:45.470573 14339 net.cpp:453] norm2 <- conv2
I1119 17:55:45.470580 14339 net.cpp:409] norm2 -> norm2
I1119 17:55:45.470587 14339 net.cpp:131] Setting up norm2
I1119 17:55:45.470593 14339 net.cpp:140] Top shape: 100 48 86 29 (11971200)
I1119 17:55:45.470598 14339 layer_factory.hpp:75] Creating layer pool2
I1119 17:55:45.470604 14339 net.cpp:99] Creating Layer pool2
I1119 17:55:45.470614 14339 net.cpp:453] pool2 <- norm2
I1119 17:55:45.470619 14339 net.cpp:409] pool2 -> pool2
I1119 17:55:45.470625 14339 net.cpp:131] Setting up pool2
I1119 17:55:45.470633 14339 net.cpp:140] Top shape: 100 48 43 15 (3096000)
I1119 17:55:45.470636 14339 layer_factory.hpp:75] Creating layer conv3
I1119 17:55:45.470643 14339 net.cpp:99] Creating Layer conv3
I1119 17:55:45.470648 14339 net.cpp:453] conv3 <- pool2
I1119 17:55:45.470654 14339 net.cpp:409] conv3 -> conv3
I1119 17:55:45.470661 14339 net.cpp:131] Setting up conv3
@     0x7f945ba27130 (unknown)
@     0x7f9446ec75d7 __GI_raise
@     0x7f9446ec8cc8 __GI_abort
@     0x7f94477cb9b5 (unknown)
@     0x7f94477c9926 (unknown)
@     0x7f94477c9953 (unknown)
@     0x7f94477c9b73 (unknown)
I1119 17:55:45.473515 14339 net.cpp:140] Top shape: 100 64 37 13 (3078400)
I1119 17:55:45.473528 14339 layer_factory.hpp:75] Creating layer relu3
I1119 17:55:45.473542 14339 net.cpp:99] Creating Layer relu3
I1119 17:55:45.473546 14339 net.cpp:453] relu3 <- conv3
I1119 17:55:45.473552 14339 net.cpp:396] relu3 -> conv3 (in-place)
I1119 17:55:45.473558 14339 net.cpp:131] Setting up relu3
I1119 17:55:45.473565 14339 net.cpp:140] Top shape: 100 64 37 13 (3078400)
I1119 17:55:45.473569 14339 layer_factory.hpp:75] Creating layer conv4
I1119 17:55:45.473577 14339 net.cpp:99] Creating Layer conv4
I1119 17:55:45.473580 14339 net.cpp:453] conv4 <- conv3
I1119 17:55:45.473587 14339 net.cpp:409] conv4 -> conv4
I1119 17:55:45.473592 14339 net.cpp:131] Setting up conv4
@     0x7f945269b58d cv::error()
@     0x7f945269b6f0 cv::error()
I1119 17:55:45.475775 14339 net.cpp:140] Top shape: 100 64 33 11 (2323200)
I1119 17:55:45.475786 14339 layer_factory.hpp:75] Creating layer relu4
I1119 17:55:45.475793 14339 net.cpp:99] Creating Layer relu4
I1119 17:55:45.475797 14339 net.cpp:453] relu4 <- conv4
I1119 17:55:45.475803 14339 net.cpp:396] relu4 -> conv4 (in-place)
I1119 17:55:45.475810 14339 net.cpp:131] Setting up relu4
I1119 17:55:45.475816 14339 net.cpp:140] Top shape: 100 64 33 11 (2323200)
I1119 17:55:45.475819 14339 layer_factory.hpp:75] Creating layer norm4
I1119 17:55:45.475826 14339 net.cpp:99] Creating Layer norm4
I1119 17:55:45.475829 14339 net.cpp:453] norm4 <- conv4
I1119 17:55:45.475834 14339 net.cpp:409] norm4 -> norm4
I1119 17:55:45.475841 14339 net.cpp:131] Setting up norm4
I1119 17:55:45.475847 14339 net.cpp:140] Top shape: 100 64 33 11 (2323200)
I1119 17:55:45.475852 14339 layer_factory.hpp:75] Creating layer pool4
I1119 17:55:45.475858 14339 net.cpp:99] Creating Layer pool4
I1119 17:55:45.475862 14339 net.cpp:453] pool4 <- norm4
I1119 17:55:45.475867 14339 net.cpp:409] pool4 -> pool4
I1119 17:55:45.475873 14339 net.cpp:131] Setting up pool4
I1119 17:55:45.475880 14339 net.cpp:140] Top shape: 100 64 17 6 (652800)
I1119 17:55:45.475884 14339 layer_factory.hpp:75] Creating layer fc6
I1119 17:55:45.475894 14339 net.cpp:99] Creating Layer fc6
I1119 17:55:45.475899 14339 net.cpp:453] fc6 <- pool4
I1119 17:55:45.475905 14339 net.cpp:409] fc6 -> fc6
I1119 17:55:45.475914 14339 net.cpp:131] Setting up fc6
@     0x7f945253219a cv::Mat::Mat()
@     0x7f945c14a3ea caffe::DataTransformer<>::Transform()
@     0x7f945c1cb478 caffe::DataLayer<>::load_batch()
@     0x7f945c1b0f98 caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@     0x7f945c144d64 caffe::InternalThread::entry()
@     0x7f945bc4124a (unknown)
@     0x7f945ba1fdf5 start_thread
@     0x7f9446f881ad __clone

Look at the stack trace of the error you got (I pieced it together from the different places where it appears in your log):

OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in Mat, file /home/uujjwal/libraries/opencv-3.0.0/modules/core/src/matrix.cpp, line 495
terminate called after throwing an instance of 'cv::Exception'
what():  /home/uujjwal/libraries/opencv-3.0.0/modules/core/src/matrix.cpp:495: error: (-215) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function Mat
*** Aborted at 1447952145 (unix time) try "date -d @1447952145" if you are using GNU date ***
PC: @     0x7f9446ec75d7 __GI_raise
*** SIGABRT (@0x9f71700003803) received by PID 14339 (TID 0x7f9430a6f700) from PID 14339; stack trace: ***
@     0x7f945ba27130 (unknown)
@     0x7f9446ec75d7 __GI_raise
@     0x7f9446ec8cc8 __GI_abort
@     0x7f94477cb9b5 (unknown)
@     0x7f94477c9926 (unknown)
@     0x7f94477c9953 (unknown)
@     0x7f94477c9b73 (unknown)
@     0x7f945269b58d cv::error()
@     0x7f945269b6f0 cv::error()
@     0x7f945253219a cv::Mat::Mat()
@     0x7f945c14a3ea caffe::DataTransformer<>::Transform()
@     0x7f945c1cb478 caffe::DataLayer<>::load_batch()
@     0x7f945c1b0f98 caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@     0x7f945c144d64 caffe::InternalThread::entry()
@     0x7f945bc4124a (unknown)
@     0x7f945ba1fdf5 start_thread
@     0x7f9446f881ad __clone

You can clearly see that:

  1. It is a single error, not multiple errors
  2. It occurred in a thread other than the main thread (which is why its error message and trace are interleaved with the output lines of the main thread that is building the model)
  3. The error happens when the prefetch thread of the data layer tries to transform one of the input images
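The failing OpenCV assertion is the ROI bounds check in the `cv::Mat` constructor: the requested crop rectangle must lie entirely inside the source image. A plausible cause, consistent with that assertion text, is that some image in your LMDB is smaller than the requested crop of `crop_h: 195` x `crop_w: 65`. A minimal Python mirror of the check (the function name is made up for illustration):

```python
def crop_fits(img_h, img_w, crop_h, crop_w, x=0, y=0):
    # Mirrors the failing OpenCV assertion: the ROI (x, y, crop_w, crop_h)
    # must lie fully inside an image of size (img_h, img_w).
    return (0 <= x and 0 <= crop_w and x + crop_w <= img_w
            and 0 <= y and 0 <= crop_h and y + crop_h <= img_h)

print(crop_fits(195, 65, 195, 65))  # True: image exactly matches the crop
print(crop_fits(190, 60, 195, 65))  # False: image smaller than the crop
```

Applying this check to the decoded size of every datum in the LMDB would identify any offending image.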

What can you do?

  • You can try changing the prefetch: 4 parameter in the input data layer
  • You can try testing with a different input LMDB, a freshly built one in which you can verify that all images in the new dataset are indeed valid
  • Rebuild the input LMDB using the convert_imageset utility; do not trust DIGITS to do this work for you

You can read the tutorial at the page below, which explains the layers and shows how to compute the inner_product layer. You can check the dimensions of fc6, or debug the code to trace the relevant variables. http://caffe.berkeleyvision.org/tutorial/layers.html
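For that fc6 check: an InnerProduct layer flattens its bottom blob, so fc6 sees 64 x 17 x 6 values per sample (the pool4 top shape `100 64 17 6` in the log). A quick sketch of the expected sizes:

```python
channels, h, w = 64, 17, 6           # pool4 output per the log: 100 64 17 6
fc6_inputs = channels * h * w        # flattened input to the InnerProduct layer
num_output = 300                     # from inner_product_param
weights = fc6_inputs * num_output    # weight matrix entries
biases = num_output                  # one bias per output neuron
print(fc6_inputs, weights + biases)  # 6528 1958700
```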
