Does this mean I need more VRAM? I am currently training the model on a 2 GB GTX 1050.
2021-02-12 22:51:38.033037: W tensorflow/core/common_runtime/bfc_allocator.cc:419] Allocator (GPU_0_bfc) ran out of memory trying to allocate 84.38MiB (rounded to 88473600). Current allocation summary follows.
When I run the script, it fetches the data, builds the images, splits off the training data, and starts training; it prints roughly 20 results and then crashes.
This is the model I am using for training:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D)

img_width, img_height = 150, 150

# Number of samples, training + validation (x1, y1, x2, y2 are the
# per-class counts, filled in elsewhere in the script)
nb_train_samples = x1 + y1
nb_validation_samples = x2 + y2

# Network hyperparameters
nb_filter1 = 16
nb_filter2 = 16
nb_filter3 = 32
conv1_size = 3
conv2_size = 2
conv3_size = 5
pool_size = 2
classes_num = 2  # we have 2 classes
batch_size = 10
lr = 0.001

model = Sequential()

# Block 1
model.add(Conv2D(nb_filter1, (conv1_size, conv1_size), padding='same',
                 input_shape=(img_height, img_width, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))

# Block 2
model.add(Conv2D(nb_filter2, (conv2_size, conv2_size), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))

# Block 3
model.add(Conv2D(nb_filter3, (conv3_size, conv3_size), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))

# Classifier head
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(classes_num, activation='softmax'))
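For reference, a minimal sketch of how a model like this would be compiled and trained with the settings above, assuming TF 2.x; train_generator, validation_generator, and epochs=30 are placeholder names and values, not taken from the original script:

from tensorflow.keras.optimizers import Adam

# Compile with the learning rate defined above
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=lr),
              metrics=['accuracy'])

# Train from data generators; names and epoch count are placeholders
model.fit(train_generator,
          steps_per_epoch=nb_train_samples // batch_size,
          validation_data=validation_generator,
          validation_steps=nb_validation_samples // batch_size,
          epochs=30)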
Is there any way to fix this?
Limit: 1406277838
InUse: 1388972288
MaxInUse: 1388972288
NumAllocs: 667963
MaxAllocSize: 176941568
2021-02-12 22:51:38.055777: W tensorflow/core/common_runtime/bfc_allocator.cc:424] *******x*******xx********_**********************************************************************xxxx
2021-02-12 22:51:38.055870: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at cwise_ops_common.cc:82 : Resource exhausted: OOM when allocating tensor with shape[21600,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
With this model architecture and data size, there is no way around exhausting the VRAM: the log shows InUse already at ~1.39 GB of the 1.40 GB limit, and the failing 84.38 MiB allocation is a [21600, 1024] float32 tensor (21600 × 1024 × 4 bytes = 88,473,600 bytes), the weight matrix feeding the 1024-unit Dense layer. You can try reducing the number of filters, especially in the first layer, which shrinks every downstream activation. Technically you could train on the CPU instead, but it will most likely be very slow.
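A minimal sketch of that suggestion, assuming the same TF 2.x Keras setup as the question's code; the halved filter counts (8/8/16) and the smaller 256-unit head are illustrative starting points rather than tuned values:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D)

# Same layout as the original, with every filter count halved and a
# smaller Dense head; both changes shrink the Flatten -> Dense weight
# matrix that the OOM log shows failing to allocate.
model = Sequential()
model.add(Conv2D(8, (3, 3), padding='same', input_shape=(150, 150, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(8, (2, 2), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(16, (5, 5), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())  # 18 * 18 * 16 = 5184 features with this config
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))

If it still does not fit in 2 GB, lowering batch_size below 10 trims activation memory further.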