Problem with PyTorch datasets.ImageFolder and a custom dataset in Google Colab



I am trying to load a dataset for a classification task with PyTorch. This is the code I am using:

import os

import torch
from torchvision import datasets, transforms

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(2.8),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        # (0.5) is a plain float, so the same mean/std is applied to every channel
        transforms.Normalize((0.5), (0.5))
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize((0.5), (0.5))
    ])
}
print(os.listdir())
# TODO: Load the datasets with ImageFolder
image_datasets = {x: datasets.ImageFolder(os.path.join("/content/drive/MyDrive/DatasetPersonale", x),
                                          data_transforms[x])
                  for x in ['train', 'valid']}
# TODO: Using the image datasets and the transforms, define the dataloaders
batch_size = 32
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
                                              shuffle=True, num_workers=4)
               for x in ['train', 'valid']}
class_names = image_datasets['train'].classes
print(class_names)
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid']}

The code works fine, but since my dataset is grayscale I need to convert it to RGB, so I used the following code:


import os

from PIL import Image

rootdir = '/content/drive/MyDrive/DatasetPersonale/trainRGB'
print("Train")
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        filePath = os.path.join(subdir, file)
        img = Image.open(filePath, mode="r")
        print(img.mode)
        if img.mode != "RGB":
            # Convert grayscale ("L") images to RGB and overwrite the original file
            RGBimg = img.convert("RGB")
            RGBimg.save(filePath, format="JPEG")  # format must be passed as a string


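My images are still JPEG files, but their mode is now RGB instead of L. The problem is that when I rerun the code that loads the dataset, I get this error: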

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-15-3dace4b0f21b> in <module>()
19 image_datasets = {x: datasets.ImageFolder(os.path.join("/content/drive/MyDrive/DatasetPersonale", x),
20                                           data_transforms[x])
---> 21                   for x in ['trainRGB', 'validRGB']}
22 
23 # TODO: Using the image datasets and the trainforms, define the dataloaders
4 frames
<ipython-input-15-3dace4b0f21b> in <dictcomp>(.0)
19 image_datasets = {x: datasets.ImageFolder(os.path.join("/content/drive/MyDrive/DatasetPersonale", x),
20                                           data_transforms[x])
---> 21                   for x in ['trainRGB', 'validRGB']}
22 
23 # TODO: Using the image datasets and the trainforms, define the dataloaders
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in __init__(self, root, transform, target_transform, loader, is_valid_file)
311                                           transform=transform,
312                                           target_transform=target_transform,
--> 313                                           is_valid_file=is_valid_file)
314         self.imgs = self.samples
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in __init__(self, root, loader, extensions, transform, target_transform, is_valid_file)
144                                             target_transform=target_transform)
145         classes, class_to_idx = self.find_classes(self.root)
--> 146         samples = self.make_dataset(self.root, class_to_idx, extensions, is_valid_file)
147 
148         self.loader = loader
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
190                 "The class_to_idx parameter cannot be None."
191             )
--> 192         return make_dataset(directory, class_to_idx, extensions=extensions, is_valid_file=is_valid_file)
193 
194     def find_classes(self, directory: str) -> Tuple[List[str], Dict[str, int]]:
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
100         if extensions is not None:
101             msg += f"Supported extensions are: {', '.join(extensions)}"
--> 102         raise FileNotFoundError(msg)
103 
104     return instances
FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

Does anyone know why this error occurs? I checked the extensions of all the files and they are all jpeg.

Thanks.

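Problem: ImageFolder treats every subdirectory of the root as a class, so the .ipynb_checkpoints folder inside /content/drive/MyDrive/DatasetPersonale/trainRGB is picked up as a class as well. That folder contains no valid images, that is, no files that can be read as images with a supported extension (.jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp), which is exactly what the error message reports.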
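To check which directories would be picked up as classes, something like the following may help. It is only a sketch using the standard library: it approximates how subdirectories are listed as classes, the helper name `count_images_per_class` is my own, and the extension list is copied from the error message rather than taken from torchvision itself.

```python
import os

# Extensions copied from the error message above (an assumption, not read from torchvision).
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp")


def count_images_per_class(root):
    """Return {subdirectory name: number of files with a supported extension}."""
    counts = {}
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if not entry.is_dir():
            continue  # files placed directly in the root are not classes
        n = 0
        for _, _, files in os.walk(entry.path):
            n += sum(f.lower().endswith(IMG_EXTENSIONS) for f in files)
        counts[entry.name] = n
    return counts


# Example call (path taken from the question):
# print(count_images_per_class("/content/drive/MyDrive/DatasetPersonale/trainRGB"))
```

Any directory reported with a count of 0 is a candidate for the "Found no valid file for the classes" error.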

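Solution: You can keep all of your images in a subfolder, e.g. "images", and then change the root folder to /content/drive/MyDrive/DatasetPersonale/trainRGB/images, so that the .ipynb_checkpoints folder is no longer read together with the images.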
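If you would rather keep the current folder layout, another option is to delete the stray folder before building the dataset. The snippet below is a minimal sketch of that idea, not part of torchvision; `remove_checkpoint_dirs` is a hypothetical helper name, and it permanently deletes the matching directories, so check the path first.

```python
import os
import shutil


def remove_checkpoint_dirs(root, name=".ipynb_checkpoints"):
    """Delete every directory called `name` under `root` and return the removed paths."""
    removed = []
    for dirpath, dirnames, _ in os.walk(root):
        if name in dirnames:
            target = os.path.join(dirpath, name)
            shutil.rmtree(target)
            dirnames.remove(name)  # do not descend into the directory just deleted
            removed.append(target)
    return removed


# Example call (path taken from the question):
# remove_checkpoint_dirs("/content/drive/MyDrive/DatasetPersonale/trainRGB")
```

Note that the notebook environment may recreate the folder later, in which case the call would need to be repeated before reloading the dataset.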
