我有一个3通道numpy数组，我想对每个像素应用一个函数。具体来说，我想处理一张图像，并返回一张灰度图像，突出显示图像中出现的特定颜色。如果红色、绿色和蓝色通道与颜色的L2距离在10英寸以内：(30,70130(，则将灰度图像上的像素值设置为255，否则为0。

我目前的做法是：

def L2_dist(p1,p2):
dist = ( (p1[0]-p2[0] )**2 + (p1[1]-p2[1] )**2 + (p1[2]-p2[2] )**2 ) **0.5
if dist<10: return 255
return 0
def colour_img(image):
colour = my_colour
img_dim = image.shape
new_img = np.zeros((img_dim[0],img_dim[1])) # no alpha channel replica
for c in range(img_dim[0]):
for r in range(img_dim[1]):
pixel = image[r,c,:3]
new_img[r,c] = L2_dist(colour,pixel)
return new_img

但速度很慢。我怎样才能更快地完成这项工作而不是使用循环？

简单的单行解决方案

你可以在这样的一行中做你想做的事：

new_img = (((image - color)**2).sum(axis=2)**.5 <= 10) * 255

优化的双线解决方案

上面的行并不是执行OP想要的所有操作的最有效的方式。这里有一个明显更快的方法(感谢Paul Panzer在评论中建议的优化，可读性没有保证(：

d = image - color
new_img = (np.einsum('...i, ...i', d, d) <= 100) * 255

时间安排：

给定一些100x100像素的测试数据：

import numpy as np
color = np.array([30, 70, 130])
# random data within [20,60,120]-[40,80,140] for demo purposes
image = np.random.randint(10*2 + 1, size=[100,100,3]) + color - 10

以下是OP方法的时间安排与这个答案的解决方案的比较。单线解决方案比OP快约100倍，而完全优化的版本快约300倍：

%%timeit
# OP's code
img_dim = image.shape
new_img = np.zeros((img_dim[0],img_dim[1])) # no alpha channel replica
for c in range(img_dim[0]):
for r in range(img_dim[1]):
pixel = image[r,c,:3]
new_img[r,c] = L2_dist(color,pixel)
43.8 ms ± 502 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
# one line solution
new_img = (((image - color)**2).sum(axis=2)**.5 <= 10) * 255
439 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
# fully optimized solution
d = image - color
new_img = (np.einsum('...i, ...i', d, d) <= 100) * 255
145 µs ± 2.29 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

简单单线解决方案的说明

作为第一个解决方案给出的简单的一个线性：

求image(将是形状(m, n, 3)的阵列(和color(将是颜色(3)的阵列(中每个像素之间的欧几里得距离。
检查这些距离中是否有任何一个在10内，并在满足条件的地方返回一个布尔数组True，否则返回False。
布尔数组实际上只是0s和1s的数组，所以我们将布尔数组乘以255，得到您想要的最终结果。

优化解决方案说明

以下是使用的优化列表：

使用einsum计算距离计算所需的平方和。在引擎盖下，einsum利用Numpy封装的BLAS库来计算所需的和积，因此它应该更快。
通过将距离的平方与阈值的平方进行比较，跳过取平方根。

我试图找到一种方法来最大限度地减少数组的分配/复制，但这实际上让事情变得更慢。这是一个优化解决方案的版本，它只分配两个数组(一个用于中间结果，一个用于最终结果(，并且不进行其他复制：

%%timeit
# fully optimized solution, makes as few array copies as possible
scr = image.astype(np.double)
new_img = np.zeros(image.shape[:2], dtype=np.uint8)
np.multiply(np.less_equal(np.einsum('...i,...i', np.subtract(image, color, out=scr), scr, out=scr[:,:,0]), 100, out=new_img), 255, out=new_img)
232 µs ± 7.72 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

您可以执行类似的操作

color = np.array([30, 70, 130])
L2 = np.sqrt(np.sum((image - color) ** 2, axis=2))  # L2 distance of each pixel from color
img_dim = image.shape
new_img = np.zeros((img_dim[0], img_dim[1]))
new_img[L2 < 10] = 255

但正如您所看到的，我们对数组进行了两次迭代，首先是计算L2，然后在L2 < 10中进行阈值处理。我们可以通过嵌套循环对其进行改进，就像您的代码中所做的那样。但是，python中的循环很慢。因此，JIT编译函数以获得最快的版本。下面我用麻木：

import numba as nb
@nb.njit(cache=True)
def L2_dist(p1,p2):
dist = (p1[0]-p2[0] )**2 + (p1[1]-p2[1] )**2 + (p1[2]-p2[2] )**2
if dist < 100: return 255
return 0
@nb.njit(cache=True)
def color_img(image):
n_rows, n_cols, _ = image.shape
new_img = np.zeros((n_rows, n_cols), dtype=np.int32)
for c in range(n_rows):
for r in range(n_cols):
pixel = image[r, c, :3]
new_img[r,c] = L2_dist(color,pixel)
return new_img

计时：

# @tel's fully optimised solution(using einsum to short circuit np to get to BLAS directly, the no sqrt trick)
128 µs ± 6.94 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# JITed version without the sqrt trick
30.8 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
# JITed version with the sqrt trick
24.8 µs ± 11.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

HTH。

如何遍历数组将阈值应用于每个像素

简单的单行解决方案

优化的双线解决方案

时间安排：

简单单线解决方案的说明

优化解决方案说明

相关内容

最新更新

热门标签：