How to vectorize a task in Python

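I have a list of coordinates. Using Python's Pillow module, I want to save a series of smaller (cropped) images to disk. Currently, I use a for loop to determine one coordinate at a time, then crop and save the image before moving on to the next coordinate.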

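Is there a way to split up this work so that multiple images are cropped and saved at the same time? I understand this would use more RAM, but it should reduce the run time.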

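I'm sure this is possible, but I'm not sure whether it's simple. I've heard terms like "vectorization" and "multithreading" that sound like a good fit for this situation, but those topics are beyond my experience.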

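I've attached my code for reference. However, I'm mainly asking for a recommended strategy (i.e., which techniques should I learn to better adapt my approach, take multiple crops at once, etc.?).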


from PIL import Image

def parse_image(source, square_size, count, captures, offset=0, offset_type=0, print_coords=False):
    """
    Starts at top left corner of image. Iterates through image by square_size (width = height)
    across x values and after exhausting the row, begins next row lower by function of
    square_size. Offset parameter is available such that, with multiple function calls,
    overlapping images could be generated.
    """
    src = Image.open(source)
    dimensions = src.size
    max_down = int(src.height/square_size) * square_size + square_size
    max_right = int(src.width/square_size) * square_size + square_size
    if offset_type == 1:
        tl_x = 0 + offset
        tl_y = 0
        br_x = square_size + offset
        br_y = square_size
        for y in range(square_size,max_down,square_size):
            for x in range(square_size + offset,max_right - offset,square_size):
                if (tl_x,tl_y) not in captures:
                    sample = src.crop((tl_x,tl_y,br_x,br_y))
                    sample.save(f"{source[:-4]}_sample_{count}_x{tl_x}_y{tl_y}.jpg")
                    captures.append((tl_x,tl_y))
                    if print_coords == True:
                        print(f"image {count}: top-left (x,y): {(tl_x,tl_y)}, bottom-right (x,y): {(br_x,br_y)}")
                    tl_x = x
                    br_x = x + square_size
                    count +=1
                else:
                    continue
            tl_x = 0 + offset
            br_x = square_size + offset
            tl_y = y
            br_y = y + square_size
    else:
        tl_x = 0
        tl_y = 0 + offset
        br_x = square_size
        br_y = square_size + offset
        for y in range(square_size + offset,max_down - offset,square_size):
            for x in range(square_size,max_right,square_size):
                if (tl_x,tl_y) not in captures:
                    sample = src.crop((tl_x,tl_y,br_x,br_y))
                    sample.save(f"{source[:-4]}_sample_{count}_x{tl_x}_y{tl_y}.jpg")
                    captures.append((tl_x,tl_y))
                    if print_coords == True:
                        print(f"image {count}: top-left (x,y): {(tl_x,tl_y)}, bottom-right (x,y): {(br_x,br_y)}")
                    tl_x = x
                    br_x = x + square_size
                    count +=1
                else:
                    continue
            tl_x = 0
            br_x = square_size
            tl_y = y + offset
            br_y = y + square_size + offset
    return count

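What you want to achieve here is a higher degree of parallelism. The first step is to work out the smallest unit of work you need to do, and from there think about how to distribute it better.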

The first thing to notice is that there are two behaviors: one when offset_type is 0 and another when offset_type is 1. Split these into two separate functions.
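As a rough sketch of that split, the two behaviors can be reduced to two small functions that only compute the top-left corners of the tiles. The names origins_offset_x and origins_offset_y are hypothetical, and the bounds are a simplifying assumption: only tiles that fit entirely inside the image are produced, which is not necessarily identical to the loop bounds in the question.

```python
def origins_offset_x(width, height, square_size, offset=0):
    """Top-left corners with the grid shifted right by `offset` (offset_type == 1)."""
    # Assumption: keep only tiles that fit entirely inside the image.
    return [
        (x, y)
        for y in range(0, height - square_size + 1, square_size)
        for x in range(offset, width - square_size + 1, square_size)
    ]


def origins_offset_y(width, height, square_size, offset=0):
    """Top-left corners with the grid shifted down by `offset` (any other offset_type)."""
    # Same assumption as above, with the shift applied to y instead of x.
    return [
        (x, y)
        for y in range(offset, height - square_size + 1, square_size)
        for x in range(0, width - square_size + 1, square_size)
    ]
```

Because these functions touch no image data, they are cheap to run up front, and the resulting coordinate list is what gets distributed across workers.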

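The second thing: given an image, you take crops of a given size at given offsets (x, y) across the whole image. You could simplify this function so that, given an offset (x, y), it performs a single crop of the image. You can then call that function in parallel for all x and y of the image. This is pretty much what most image processing frameworks try to achieve, especially those that run code on the GPU: small blocks of code that operate locally on the image.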
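A minimal sketch of such a single-crop function, assuming Pillow is installed. The name crop and its parameters match the signature this answer uses; the body is one possible implementation on top of Pillow's Image.crop, which takes a (left, upper, right, lower) box.

```python
from PIL import Image


def crop(img, x, y, crop_size_x, crop_size_y):
    """Return the crop_size_x by crop_size_y region whose top-left corner is (x, y)."""
    # Pillow expects the box as (left, upper, right, lower) in pixel coordinates.
    return img.crop((x, y, x + crop_size_x, y + crop_size_y))
```

Each call depends only on its own arguments, which is what makes it safe to run many of them at the same time.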

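Suppose your image has a width of 100 and a height of 100, and you want to make crops of w=10, h=10. Given the simple function I described, which I'll call crop(img, x, y, crop_size_x, crop_size_y), all you have to do is create the crops: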

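from PIL import Image

img = Image.open(source)
crop_size_x = 10
crop_size_y = 10
# Step one tile at a time over both axes; zip() would only pair x with y along the diagonal.
crops = [crop(img, x, y, crop_size_x, crop_size_y)
         for y in range(0, img.height, crop_size_y)
         for x in range(0, img.width, crop_size_x)]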

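Later, you can replace the list comprehension with the multiprocessing library, which can actually spawn many processes and achieve true parallelism, or even write this kind of code as a GPU kernel/shader and use GPU parallelism for high performance.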
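One way to move from the list comprehension to real processes is sketched below, using the standard library's concurrent.futures rather than calling multiprocessing directly. The function names, the pool_cls parameter, and the output file naming are all assumptions made for illustration. Each task handles one row of tiles and reopens the source file itself, so only file paths and plain tuples are sent between processes.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

from PIL import Image


def save_row_of_crops(source, boxes, out_dir):
    """Worker: open the image once, then crop and save every box in `boxes`."""
    stem = Path(source).stem
    saved = []
    with Image.open(source) as img:
        for left, top, right, bottom in boxes:
            out_path = Path(out_dir) / f"{stem}_x{left}_y{top}.jpg"
            img.crop((left, top, right, bottom)).save(out_path)
            saved.append(str(out_path))
    return saved


def save_crops_parallel(source, square_size, out_dir,
                        pool_cls=ProcessPoolExecutor, workers=None):
    """Split the image into full square tiles and save them, one task per tile row."""
    with Image.open(source) as img:
        width, height = img.size
    # Build one batch of (left, upper, right, lower) boxes per row of tiles.
    batches = [
        [(x, y, x + square_size, y + square_size)
         for x in range(0, width - square_size + 1, square_size)]
        for y in range(0, height - square_size + 1, square_size)
    ]
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with pool_cls(max_workers=workers) as pool:
        futures = [pool.submit(save_row_of_crops, source, batch, out_dir)
                   for batch in batches]
        # Collect results in submission order; worker exceptions are re-raised here.
        return [path for future in futures for path in future.result()]
```

When a process pool is used on platforms that start workers with spawn (Windows, macOS), the call to save_crops_parallel has to sit under an `if __name__ == "__main__":` guard. Passing pool_cls=ThreadPoolExecutor instead runs the same code on threads, which may already be enough if most of the time goes into writing files; measuring both is the only reliable way to tell.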
