Why is the shape of my predictions wrong when I train a network with Keras?



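I am using TensorFlow Keras to train a model that classifies whether an image shows an a or a b. I have 20,000 randomly generated images for training (half a, half b). [Image: example of an a image] [Image: example of a b image]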

First, I import the necessary packages:

import tensorflow
from matplotlib import pyplot as plt
import cv2
import random
from tensorflow.keras import models 
from tensorflow.keras import layers 
import numpy as np

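After that, I load the images from my folders and process them, converting each one into an array that contains only 0s and 1s. I store each array together with a label: 1 if the image is an a, 0 if it is a b. Once that is done, I put everything into one list and shuffle it to randomize the order.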

a_letters = []
b_letters = []
folder_path_a = 'C:/path/to/folder/'
folder_path_b = 'C:/path/to/folder/'
count = 0
while count < 10000:
    path = folder_path_a + f'a{count}.png'
    img = cv2.imread(path)
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    for row_number, row in enumerate(gray_image):
        for column_number, column in enumerate(row):
            if gray_image[row_number][column_number] > 50:
                gray_image[row_number][column_number] = 1
            else:
                gray_image[row_number][column_number] = 0
    #gray_image = np.expand_dims(gray_image, axis=2)
    image_and_label = [gray_image, 1]
    a_letters.append(image_and_label)
    count = count + 1

count = 0
while count < 10000:
    path = folder_path_b + f'b{count}.png'
    img = cv2.imread(path)
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    for row_number, row in enumerate(gray_image):
        for column_number, column in enumerate(row):
            if gray_image[row_number][column_number] > 50:
                gray_image[row_number][column_number] = 1
            else:
                gray_image[row_number][column_number] = 0
    # gray_image = np.expand_dims(gray_image, axis=2)
    image_and_label = [gray_image, 0]
    b_letters.append(image_and_label)
    count = count + 1

unified_list = a_letters + b_letters
random.shuffle(unified_list)
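
As an aside, the per-pixel loops above can be slow across 20,000 images. A minimal sketch of an equivalent vectorized step, assuming the same threshold of 50 and 8-bit grayscale arrays; `binarize` is a hypothetical helper name, not something from the original post:

```python
import numpy as np

def binarize(gray_image, threshold=50):
    """Return an array with 1 where a pixel is above the threshold, else 0."""
    # The comparison yields a boolean array; the cast keeps the input dtype.
    return (gray_image > threshold).astype(gray_image.dtype)

# Tiny stand-in for a grayscale image (real ones would come from cv2.cvtColor).
sample = np.array([[0, 50, 51], [200, 49, 255]], dtype=np.uint8)
binary = binarize(sample)
print(binary.tolist())
```

Like the loop version, a pixel equal to the threshold maps to 0, since the comparison is strictly greater-than.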

Next, I separate the labels and images into their own lists and split them into training and validation data.

images = []
labels = []
for image, label in unified_list:
    images.append(image)
    labels.append(float(label))
x_train = images[:15000]
y_train = labels[:15000]
x_val = images[15000:]
y_val = labels[15000:]

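Then I convert the lists to NumPy arrays and expand the dimensions of the labels. (When I tried to train the model earlier, I got an error saying that logits and labels must have the same dimensions, so I expanded the labels' dimensions to make them match.)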

x_train_array = np.asarray(x_train)
y_train_array = np.asarray(y_train)
x_val_array = np.asarray(x_val)
y_val_array = np.asarray(y_val)
y_train_array = np.expand_dims(y_train_array, axis=1)
y_val_array = np.expand_dims(y_val_array, axis=1)

Next, I build a model and train it:

model = models.Sequential()
model.add(layers.Dense(512, activation='relu', input_shape=(169,191,)))
model.add(layers.Dense(150, activation='relu'))
model.add(layers.Dense(250, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train_array, y_train_array, epochs=10, batch_size=500, validation_data=(x_val_array, y_val_array))

The model summary looks like this: [Image: model summary]

When I try to make predictions with my model using the following code:

predictions = model.predict(x_val_array)

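I get a predictions.shape of (5000, 169, 1). It seems that instead of one prediction per image, I am getting 169? I have been looking into this for a while, but I still can't figure it out.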

The 169 in the shape comes from the first axis of your input images (the number of pixel rows).

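It carries through to the output because a Dense layer is applied only along the last dimension of the previous tensor; every other dimension is passed along unchanged.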
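
To make that concrete, here is a minimal NumPy sketch of the shape behavior. It is not Keras itself: `dense` is a hypothetical stand-in that only multiplies the last axis by a random weight matrix (no bias, no activation), and the batch size of 4 and the 8 hidden units are arbitrary choices.

```python
import numpy as np

def dense(x, units, rng):
    """Mimic the shape behavior of a Dense layer: transform the LAST axis only."""
    weights = rng.standard_normal((x.shape[-1], units))
    return x @ weights

rng = np.random.default_rng(0)
batch = np.zeros((4, 169, 191))            # 4 images, not flattened

hidden = dense(batch, 8, rng)              # only the 191 axis is consumed
unflattened_out = dense(hidden, 1, rng)
print(unflattened_out.shape)               # (4, 169, 1)

flat = batch.reshape(len(batch), -1)       # what Flatten does: (4, 32279)
flattened_out = dense(dense(flat, 8, rng), 1, rng)
print(flattened_out.shape)                 # (4, 1)
```

Without flattening, the 169 axis survives every layer, which matches the (5000, 169, 1) predictions in the question.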

The first thing you can try is flattening your images with a Flatten layer:

model = models.Sequential()
model.add(layers.Flatten(input_shape = (169,191,)))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(150, activation='relu'))
model.add(layers.Dense(250, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
# example: assumed to be a batch of 50 images, shape (50, 169, 191)
predictions = model.predict(example)
predictions.shape
Model: "sequential_12"
_________________________________________________________________
Layer (type)                Output Shape              Param #   
=================================================================
flatten_7 (Flatten)         (None, 32279)             0         

dense_40 (Dense)            (None, 512)               16527360  

dense_41 (Dense)            (None, 150)               76950     

dense_42 (Dense)            (None, 250)               37750     

dense_43 (Dense)            (None, 1)                 251       

=================================================================
Total params: 16,642,311
Trainable params: 16,642,311
Non-trainable params: 0
_________________________________________________________________
(50, 1)

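However, this is not recommended, because the model is far too large for the amount of information it has to capture. It has about 16.6M parameters, which is computationally inefficient. I would rather use a ResNet-18, which has roughly 11.7M parameters.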
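
The parameter count in the summary above can be reproduced by hand, since each Dense layer has one weight per input-output pair plus one bias per unit. A short sketch of that arithmetic; `dense_params` is a hypothetical helper name:

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

# Flattened input size, followed by the width of each Dense layer.
layer_sizes = [169 * 191, 512, 150, 250, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer)   # [16527360, 76950, 37750, 251]
print(total)       # 16642311
```

Almost all of the parameters sit in the first layer, because every one of the 32,279 pixels is connected to each of the 512 units.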

Alternatively, you can use convolutional layers. Here is an example:

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(169,191,1)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPool2D(pool_size=(4, 4)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPool2D(pool_size=(4, 4)))
model.add(layers.Flatten())
model.add(layers.Dense(150, activation='relu'))
model.add(layers.Dense(250, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
# example: assumed to be a batch of 50 images, shape (50, 169, 191, 1)
predictions = model.predict(example)
predictions.shape
Model: "sequential_17"
_________________________________________________________________
Layer (type)                Output Shape              Param #   
=================================================================
conv2d_24 (Conv2D)          (None, 167, 189, 32)      320       

conv2d_25 (Conv2D)          (None, 165, 187, 32)      9248      

max_pooling2d_9 (MaxPooling  (None, 41, 46, 32)       0         
2D)                                                             

conv2d_26 (Conv2D)          (None, 39, 44, 32)        9248      

conv2d_27 (Conv2D)          (None, 37, 42, 32)        9248      

max_pooling2d_10 (MaxPoolin  (None, 9, 10, 32)        0         
g2D)                                                            

flatten_12 (Flatten)        (None, 2880)              0         

dense_56 (Dense)            (None, 150)               432150    

dense_57 (Dense)            (None, 250)               37750     

dense_58 (Dense)            (None, 1)                 251       

=================================================================
Total params: 498,215
Trainable params: 498,215
Non-trainable params: 0
_________________________________________________________________
(50, 1)

It is about 33 times smaller, yet it should perform much better, because convolutional layers are good at extracting features.
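
The output shapes in the summary above follow from two rules, sketched below under the assumption of the Keras defaults used in this model: a 3x3 convolution with no padding and stride 1 shrinks each side by 2, and 4x4 max pooling divides each side by 4, rounding down. `conv_valid` and `pool` are hypothetical helper names.

```python
def conv_valid(size, kernel=3):
    """Output length of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool(size, window=4):
    """Output length of max pooling whose stride equals its window."""
    return size // window

height, width = 169, 191
for _ in range(2):                         # two conv-conv-pool blocks
    height = pool(conv_valid(conv_valid(height)))
    width = pool(conv_valid(conv_valid(width)))

flattened = height * width * 32            # 32 filters in the last Conv2D
print((height, width), flattened)          # (9, 10) 2880
```

Note that this model expects a trailing channel axis, input_shape=(169,191,1). Assuming arrays shaped like x_train_array in the question, something like np.expand_dims(x_train_array, axis=-1) would add it before training or predicting.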
