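I have a dataset in which one folder contains the images and another folder contains the corresponding text files. Each text file contains the class label for the image with the same name.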
Images folder
image_0000.jpeg
image_0001.jpeg
Label folder
image_0000.txt
image_0001.txt
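The value in each label text file is 0, 1, or 2.
I want to save the images that correspond to label 0 in a separate folder, and likewise for the remaining labels 1 and 2,
as shown in the image below.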
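[Image: dataset description]

def read_image_list(image_list_file):
    # Each line is expected to look like "<filename> <label>"
    filenames = []
    with open(image_list_file, 'r') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            filename, label = line.rsplit(' ', 1)
            filenames.append(filename)
    return filenames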
Just change the first three directory variables and the code should work fine. The answer given by Prem J is also cool.
import os
import shutil
# assuming that the names of the images and the labels are the same except for the extension
image_dir = "f1/im" # Images directory path
label_dir = "f1/la" # Labels directory path
final_dir = "f1/dataset" # the dataset directory where you want to save the images as label folders
images = os.listdir(image_dir) # list of all the images names
label = os.listdir(label_dir) # list of all the label names
# Sorting it to make sure the images and the labels are on the same index
images = sorted(images) # ['test 1.jpg', 'test 2.jpg', 'test 3.jpg', 'test 4.jpg', 'test 5.jpg']
label = sorted(label) # ['test 1.txt', 'test 2.txt', 'test 3.txt', 'test 4.txt', 'test 5.txt']
os.makedirs(final_dir, exist_ok=True)
for c, txt in enumerate(label):
    txt_path = os.path.join(label_dir, txt)
    img_path = os.path.join(image_dir, images[c])
    with open(txt_path, "r") as r:
        # strip whitespace so a trailing newline does not end up in the folder name
        l = r.read().strip()
    dst = os.path.join(final_dir, l)  # e.g. f1/dataset/0
    os.makedirs(dst, exist_ok=True)
    shutil.copy(img_path, dst)
    shutil.copy(txt_path, dst)
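The code above pairs images and labels by their position in the sorted lists, so it relies on both folders containing exactly matching file sets. If that might not hold, here is a rough sketch that pairs them by file name instead. The function name sort_images_by_label and its return value are my own choices for illustration, not something from the question, and it only copies the images (not the label files).

```python
import shutil
from pathlib import Path


def sort_images_by_label(image_dir, label_dir, final_dir):
    """Copy each image into final_dir/<label>/, pairing files by name stem.

    Hypothetical helper: returns {image file name: label} for the copied images.
    """
    image_dir, label_dir, final_dir = Path(image_dir), Path(label_dir), Path(final_dir)

    # Map "image_0000" -> Path(".../image_0000.jpeg")
    images_by_stem = {p.stem: p for p in image_dir.iterdir() if p.is_file()}

    copied = {}
    for txt_path in sorted(label_dir.glob("*.txt")):
        img_path = images_by_stem.get(txt_path.stem)
        if img_path is None:
            continue  # label file without a matching image: skip it

        label = txt_path.read_text().strip()
        if not label:
            continue  # empty label file: skip it

        dst = final_dir / label
        dst.mkdir(parents=True, exist_ok=True)
        shutil.copy(img_path, dst)
        copied[img_path.name] = label
    return copied
```

You would call it with the same three paths as above, e.g. sort_images_by_label("f1/im", "f1/la", "f1/dataset"). Images that have no label file are simply left where they are.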