How to load a dataset into PyTorch or Keras

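I'm learning to build neural networks with PyTorch or Keras. My images are in two separate folders, one for training and one for testing, and the corresponding labels are in two csv files. I'm stuck on the basic step of loading them with PyTorch or Keras so that I can start building the NN. I've tried tutorials such as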

https://towardsdatascience.com/training-neural-network-from-scratch-using-pytorch-in-just-7-cells-e6e904070a1d

and

https://www.tensorflow.org/tutorials/keras/classification

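and a few others, but they all seem to use pre-existing datasets such as MNIST, which can be imported directly or downloaded from a link. I've tried something like this: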

import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
from tqdm import tqdm
DATADIR = r"Path to my image folder"
CATEGORIES = ["High", "Low"]
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
        plt.imshow(img_array, cmap='gray')
        plt.show()
        break
    break

but what I'm after is something more like:

import tensorflow as tf
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

Does anyone have any ideas?

Thanks, C

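If your labels are in a csv file and the images are in separate folders, one of the best approaches is the flow_from_dataframe generator from the Keras library. Here is an example, and a more detailed example for the Keras library. Here is the documentation as well.

Here is some sample code: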

import pandas as pd
from tensorflow import keras

df = pd.read_csv(r".train.csv")  # read the csv file (adjust the path to your file)

# rescale=1./255 scales pixel values from [0, 255] to [0, 1]
datagen = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)

train_generator = datagen.flow_from_dataframe(
    dataframe=df,              # the DataFrame defined above
    directory=".train_imgs",   # the directory where your images are stored
    x_col="id",                # column of image file names
    y_col="label",             # column of class names
    class_mode="categorical",  # type of problem (labels are one-hot encoded)
    target_size=(32, 32),      # resize images to match your model's input
    batch_size=32)             # number of images per batch

You can then pass it to model.fit():

model.fit(train_generator, epochs=10)
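
If you would rather end up with plain arrays, in the spirit of what fashion_mnist.load_data() returns, a minimal sketch could look like the following. It is only an illustration under some assumptions: the csv has the same "id" and "label" columns as above, the images are read with Pillow as grayscale, and load_split is a hypothetical helper name, not part of Keras or PyTorch.

```python
import csv
from pathlib import Path

import numpy as np
from PIL import Image


def load_split(csv_path, image_dir, size=(32, 32), x_col="id", y_col="label",
               class_names=None):
    """Load one split (train or test) into an image array and a label array.

    Hypothetical helper: assumes the csv has a column of image file names
    (x_col) and a column of class names (y_col).
    """
    images, labels = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Open the image named in this row, convert it to grayscale and
            # resize it. Pillow's size is (width, height).
            with Image.open(Path(image_dir) / row[x_col]) as img:
                img = img.convert("L").resize(size)
                images.append(np.asarray(img, dtype=np.uint8))
            labels.append(row[y_col])

    # Map class names (e.g. "High", "Low") to integer ids. Pass the class
    # names from the training split when loading the test split so that both
    # splits share the same mapping.
    if class_names is None:
        class_names = sorted(set(labels))
    label_ids = np.array([class_names.index(name) for name in labels])

    # Resulting image array has shape (num_images, height, width).
    return np.stack(images), label_ids, class_names
```

You would call it once per split, for example train_images, train_labels, names = load_split("train.csv", "train_imgs") (both paths are placeholders), and then pass class_names=names when loading the test split. The arrays can be scaled to floats and given to model.fit in Keras, or converted with torch.from_numpy and wrapped in a torch.utils.data.TensorDataset for PyTorch. This loads everything into memory, so for large datasets a generator such as flow_from_dataframe is the better fit.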
