Importing a dataset of jpg images into a Jupyter notebook



I have to build a machine learning model in Python with TensorFlow and Keras, using a Jupyter notebook. I have a dataset of 1000 images; I want to use 800 of them for training the model and 200 for testing and validation. It is a gender and age prediction model. How do I import my dataset, i.e. how do I write the path in a Jupyter notebook or in Google Colab to import the dataset?

What I have done so far is import the packages for my project.

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.utils import to_categorical, plot_model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization, Conv2D, MaxPooling2D, Activation, Flatten, Dropout, Dense
from tensorflow.keras import backend as K
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import numpy as np
import random
import cv2
import os
import glob
import pandas as pd

Kind regards.

If the dataset is on your local system, there are two ways to upload it to Google Colab.

  1. You can upload the dataset to Google Drive

The easiest way to share files is to mount your Google Drive in your Google Colab notebook.

To do this, run the following in a code cell:

from google.colab import drive
drive.mount('/content/drive')

It will ask you to follow a link to ALLOW "Google Files Stream" to access your drive. After that, a long alphanumeric authentication code will be shown, which you need to enter in your Colab notebook.

Once that is done, your Drive files will be mounted and you can browse them with the file browser in the side panel.
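Once the drive is mounted, you can point your data-loading code at the mounted folder. Below is a minimal sketch, assuming a hypothetical folder /content/drive/MyDrive/dataset with one subfolder per class, and using the ImageDataGenerator from your import list; the validation_split of 0.2 gives the 800/200 split you mentioned.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

DATASET_DIR = '/content/drive/MyDrive/dataset'  # hypothetical path, replace with your own folder

# 80/20 split: 800 images for training, 200 for validation with a 1000-image dataset
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

train_gen = datagen.flow_from_directory(DATASET_DIR, target_size=(96, 96), batch_size=32,
                                        class_mode='categorical', subset='training')
val_gen = datagen.flow_from_directory(DATASET_DIR, target_size=(96, 96), batch_size=32,
                                      class_mode='categorical', subset='validation')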

  2. You can upload files manually by browsing your local file system

Uploading this way takes longer.

from google.colab import files
uploaded = files.upload()
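files.upload() returns a dict mapping each uploaded file name to its raw bytes, so the jpg files can be decoded directly from memory. A small sketch, assuming OpenCV and NumPy as in the imports above:

import cv2
import numpy as np

images = []
for name, data in uploaded.items():
    arr = np.frombuffer(data, dtype=np.uint8)   # raw bytes -> uint8 array
    img = cv2.imdecode(arr, cv2.IMREAD_COLOR)   # decode jpg/png into a BGR image
    images.append(img)
    print(name, img.shape)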

Here are two examples for your reference:

  • https://colab.research.google.com/drive/1srw_HFWQ2SMgmWIawucXfusGzrj1_U0q
  • Importing a dataset of jpg images into a Jupyter notebook

Here I explain, in a simple way, how to load images and labels directly from a TXT file in TensorFlow. I hope this helps. The code below shows how I did it; that does not mean it is the best method, but it should help with the following steps.

For example, I load the labels as single integer values {0, 1}, whereas the documentation uses one-hot vectors [0, 1].
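If you need the one-hot form instead, the Keras to_categorical helper (already in your import list) converts the integer labels; a quick sketch:

from tensorflow.keras.utils import to_categorical

labels = [0, 1, 1, 0]             # integer labels in {0, 1}
one_hot = to_categorical(labels)  # -> [[1, 0], [0, 1], [0, 1], [1, 0]]
print(one_hot)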

#Learning how to import images and labels from a TXT file
#
#TXT file format
#
#path/to/imagefile_1 label_1
#path/to/imagefile_2 label_2
#...                 ...
#where label_X is either {0,1}
#Importing Libraries
import os
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
#File containing the path to images and the labels [path/to/images label]
filename = '/path/to/List.txt'
#Lists where to store the paths and labels
filenames = []
labels = []
#Reading file and extracting paths and labels
with open(filename, 'r') as File:
    infoFile = File.readlines() #Reading all the lines from File
    for line in infoFile: #Reading line-by-line
        words = line.split() #Splitting lines in words using space character as separator
        filenames.append(words[0])
        labels.append(int(words[1]))
NumFiles = len(filenames)
#Converting filenames and labels into tensors
tfilenames = ops.convert_to_tensor(filenames, dtype=dtypes.string)
tlabels = ops.convert_to_tensor(labels, dtype=dtypes.int32)
#Creating a queue which contains the list of files to read and the value of the labels
filename_queue = tf.train.slice_input_producer([tfilenames, tlabels], num_epochs=10, shuffle=True, capacity=NumFiles)
#Reading the image files and decoding them
rawIm= tf.read_file(filename_queue[0])
decodedIm = tf.image.decode_png(rawIm) # png or jpg decoder
#Extracting the labels queue
label_queue = filename_queue[1]
#Initializing Global and Local Variables so we avoid warnings and errors
init_op = tf.group(tf.local_variables_initializer() ,tf.global_variables_initializer())
#Creating an InteractiveSession so we can run in iPython
sess = tf.InteractiveSession()
with sess.as_default():
    sess.run(init_op)

    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(NumFiles): #length of your filenames list
        nm, image, lb = sess.run([filename_queue[0], decodedIm, label_queue])

        print(image.shape)
        print(nm)
        print(lb)

        #Showing the current image
        plt.imshow(image)
        plt.show()
    coord.request_stop()
    coord.join(threads)
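Note that the queue-runner API used above (slice_input_producer, start_queue_runners, tf.read_file) only exists in TensorFlow 1.x. As a rough sketch under TensorFlow 2, the same TXT-file loading can be written with tf.data, reusing the filenames and labels lists parsed above; the 96x96 resize is only an assumption so that the images can be batched:

import tensorflow as tf

def load_image(path, label):
    raw = tf.io.read_file(path)
    image = tf.image.decode_image(raw, channels=3, expand_animations=False)  # png or jpg
    image = tf.image.resize(image, [96, 96])  # fixed size so the images can be batched
    return image, label

dataset = (tf.data.Dataset.from_tensor_slices((filenames, labels))
           .shuffle(NumFiles)
           .map(load_image)
           .batch(32))

for images, batch_labels in dataset.take(1):
    print(images.shape, batch_labels.numpy())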

If you are using pandas to locate a CSV file, try giving the full path:

df = pd.read_csv(r"C:\Users\mahe\Desktop\homeprices.csv")

Do it this way, or:

import matplotlib.pyplot as plt
import os
import cv2
from tqdm import tqdm
DATADIR = "X:/Datasets/PetImages" #(give your full path)
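As a rough sketch of how that snippet usually continues (assuming DATADIR holds one subfolder per class, e.g. Cat and Dog in the PetImages example, and an assumed target size of 96 pixels), the images can then be read with cv2:

CATEGORIES = ["Cat", "Dog"]   # assumed subfolder names inside DATADIR
IMG_SIZE = 96                 # assumed target size

training_data = []
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)
    label = CATEGORIES.index(category)
    for img_name in tqdm(os.listdir(path)):
        img = cv2.imread(os.path.join(path, img_name))
        if img is None:                  # skip unreadable files
            continue
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        training_data.append([img, label])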
