Filter images with the same label and save them to another folder on Google Drive using Google Colaboratory



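I have downloaded a dataset that contains images of floods, fires, volcanoes, and so on. From these images, I want to filter out the ones labelled as flood and save them in another folder. All of my images are in a single folder on Google Drive. How can I do this?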

Dataset link: https://drive.google.com/drive/folders/1lGD1LSnPnyoCOLfPXiZ_Y4zWgyh93ltn?usp=sharing

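Since the images are already labelled, it is best to use those labels to sort them. The labels folder contains JSON files with the label data for each image. You can read the image name and the associated disaster type from each JSON file.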

"metadata": {
"sensor": "GEOEYE01",
"provider_asset_type": "GEOEYE01",
"gsd": 2.0916247,
"capture_date": "2018-09-20T16:04:41.000Z",
"off_nadir_angle": 28.017313,
"pan_resolution": 0.52282465,
"sun_azimuth": 153.94543,
"sun_elevation": 53.722378,
"target_azimuth": 190.82309,
"disaster": "hurricane-florence",
"disaster_type": "flooding",
"catalog_id": "1050010012411600",
"original_width": 1024,
"original_height": 1024,
"width": 1024,
"height": 1024,
"id": "MjU0Njk0MQ.clApx1C8IcFymibsGi1JLu1eKhU",
"img_name": "hurricane-florence_00000324_post_disaster.png"
}
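
Before copying anything, it can help to check how many images each disaster type has. The following is a minimal sketch, assuming every file in the labels folder is a JSON object with a top-level "metadata" key as in the excerpt above. The function name count_disaster_types is hypothetical and not part of any library.

```python
import json
import os
from collections import Counter


def count_disaster_types(labels_dir):
    """Count label files per metadata["disaster_type"] (assumed layout)."""
    counts = Counter()
    for name in sorted(os.listdir(labels_dir)):
        if not name.endswith(".json"):
            continue  # skip anything that is not a label file
        with open(os.path.join(labels_dir, name), "r") as f:
            metadata = json.load(f).get("metadata", {})
        # labels without a disaster_type are tallied under "unknown"
        counts[metadata.get("disaster_type", "unknown")] += 1
    return counts
```

Calling it on the labels folder returns a Counter keyed by disaster type, which shows whether "flooding" is spelled the way you expect in your copy of the data.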

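You can use the following snippet. It copies each image into a folder for its category (for example, an image with disaster_type 'fire' goes to /categorized/fire/). In the end, all images are sorted into separate folders.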

from google.colab import drive
import os
import json
import shutil

drive.mount('/content/drive')

# change paths according to yours
main_folder_path = "/content/drive/My Drive/Backup/train"
images_folder_path = main_folder_path + "/images"
labels_folder_path = main_folder_path + "/labels"
categorized_folder_path = "/content/drive/My Drive/Backup/categorized"

for json_filename in os.listdir(labels_folder_path):
    # read the image name and disaster type from the label file
    json_path = os.path.join(labels_folder_path, json_filename)
    with open(json_path, 'r') as f:
        data = json.load(f)
    disaster_type = data["metadata"]["disaster_type"]
    img_name = data["metadata"]["img_name"]
    print("disaster:", disaster_type, "image:", img_name)

    img_filepath = os.path.join(images_folder_path, img_name)
    category_folderpath = os.path.join(categorized_folder_path, disaster_type)
    if os.path.exists(img_filepath):
        # makedirs also creates the parent "categorized" folder if it is missing
        os.makedirs(category_folderpath, exist_ok=True)
        shutil.copy(img_filepath, category_folderpath)
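
Since the question asks only for the flood images, the same loop can be narrowed to a single label. This is a sketch under the same assumptions as the code above (a labels folder of JSON files next to an images folder). The name copy_images_with_label and its parameters are illustrative, not part of any library.

```python
import json
import os
import shutil


def copy_images_with_label(labels_dir, images_dir, dest_dir, wanted="flooding"):
    """Copy images whose label has disaster_type == wanted; return copied names."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for json_filename in sorted(os.listdir(labels_dir)):
        if not json_filename.endswith(".json"):
            continue
        with open(os.path.join(labels_dir, json_filename), "r") as f:
            metadata = json.load(f)["metadata"]
        if metadata["disaster_type"] != wanted:
            continue
        src = os.path.join(images_dir, metadata["img_name"])
        if os.path.exists(src):  # a label may point to an image that is missing
            shutil.copy(src, dest_dir)
            copied.append(metadata["img_name"])
    return copied
```

In Colab, after mounting the drive, this might be called as copy_images_with_label(labels_folder_path, images_folder_path, "/content/drive/My Drive/Backup/flood_images"). The destination path here is only an example.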

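The dataset you posted covers six types of natural disaster: hurricane, volcano, earthquake, flooding, tsunami, and wildfire. Each image file name contains one of these words, so the flood-related images can easily be filtered by name.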

from google.colab import drive
import os
import shutil

drive.mount('/content/drive')

i_path = "/content/drive/My Drive/images"
flood_dir = "/content/drive/My Drive/flood_images"
os.makedirs(flood_dir, exist_ok=True)  # the target folder must exist before moving

for file_name in os.listdir(i_path):
    if "flooding" in file_name:
        # move the image out of the source folder into flood_dir
        shutil.move(os.path.join(i_path, file_name), flood_dir)
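
One caveat is worth checking on your own copy of the data. The metadata excerpt earlier shows hurricane-florence_00000324_post_disaster.png with disaster_type "flooding", so matching on the file name and matching on the label may not select the same images. The sketch below lists images that are labelled with a keyword but whose file name does not contain it. It assumes the same JSON label layout, and label_name_mismatches is a hypothetical helper name.

```python
import json
import os


def label_name_mismatches(labels_dir, keyword="flooding"):
    """Return img_names labelled `keyword` whose file name lacks that word."""
    missed = []
    for name in sorted(os.listdir(labels_dir)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(labels_dir, name), "r") as f:
            metadata = json.load(f)["metadata"]
        img_name = metadata["img_name"]
        if metadata["disaster_type"] == keyword and keyword not in img_name:
            missed.append(img_name)
    return missed
```

If the returned list is empty, the two approaches agree for that keyword. If not, filtering by label is probably the safer choice.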
