Using cv2.resize() on an image array makes it possible to hash it without converting to bytes

hashlib.md5() does not accept my ndarray as an argument, so I have to use .tobytes() to convert the ndarray to bytes. If I don't, I get the following error:

ValueError: ndarray is not C-contiguous

However, if I call cv2.resize() on the ndarray, I can hash the result without converting it to bytes, even though the output of cv2.resize() is still an ndarray:

import cv2
image = cv2.imread("img.png")
r_img = cv2.resize(image, (0, 0), fx=1, fy=1)
print(type(r_img))

This gives the output:

<class 'numpy.ndarray'>

Here is my full code:

import timeit
import cv2
import hashlib
image = cv2.imread("img.png")

def hash_without_resize(img):
    # Copy the pixel data into a bytes object before hashing
    b_img = img.tobytes()
    return hashlib.md5(b_img).hexdigest()

def hash_with_resize(img):
    # Resize the image with scale factor 1 in both directions (the image stays exactly the same)
    r_img = cv2.resize(img, (0, 0), fx=1, fy=1)
    # The resized array is passed to md5 directly, without tobytes()
    return hashlib.md5(r_img).hexdigest()

print(timeit.timeit('hash_without_resize(image)', 'from __main__ import hash_without_resize, image', number=1000))
print(timeit.timeit('hash_with_resize(image)', 'from __main__ import hash_with_resize, image', number=1000))
print(hash_without_resize(image), hash_with_resize(image))

This gives the output:

0.4001011
0.40305579999999985
62da8968d9cb37790811ff16624d8cc7 62da8968d9cb37790811ff16624d8cc7

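As you can see, although the resized ndarray is still an ndarray, hashing it does not raise an error. The hashes are also identical, so resizing did not change the image at all. Can someone explain why this happens? I am quite confused. It also works with cv2.flip() and other OpenCV functions. I tried other hash functions as well and got the same result.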

The error message for the original image already says it all:

ValueError: ndarray is not C-contiguous

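The hash functions only work on a contiguous region of memory, but ndarrays (and Python buffers in general) can be non-contiguous. Many functions that create a new ndarray create it as contiguous, which is why the output of cv2.resize() can be hashed directly. You can always use one of those functions to get an ndarray that the hash functions accept.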
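To illustrate the point without OpenCV, here is a minimal sketch using only NumPy. The synthetic array and the strided slice are assumptions made for the example (they are not the asker's image), and np.ascontiguousarray is just one of the functions that returns a contiguous array:

import hashlib
import numpy as np

# Synthetic stand-in for an image: 4 rows, 6 columns, 3 channels (assumption)
image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# Taking every second column yields a view whose bytes are not adjacent in memory
view = image[:, ::2]
is_contiguous_before = bool(view.flags["C_CONTIGUOUS"])
print("view is C-contiguous:", is_contiguous_before)

# Passing the non-contiguous view straight to md5 is expected to fail
try:
    hashlib.md5(view)
except (ValueError, BufferError, TypeError) as exc:
    print("hashing the view failed:", exc)

# ascontiguousarray copies the data into one contiguous block (only if needed)
packed = np.ascontiguousarray(view)
is_contiguous_after = bool(packed.flags["C_CONTIGUOUS"])
print("packed is C-contiguous:", is_contiguous_after)

# The contiguous copy can be hashed directly and matches the tobytes() digest
digest_packed = hashlib.md5(packed).hexdigest()
digest_bytes = hashlib.md5(view.tobytes()).hexdigest()
print(digest_packed == digest_bytes)

Checking img.flags["C_CONTIGUOUS"] on your own image should tell you whether a direct hash will work before you try it.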

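However, you should be aware that hashing an image is rarely useful unless you want to identify pixel-exact copies. As soon as the image is modified even slightly, for example by compressing it to JPEG and decompressing it again, it will have a completely different hash even if it looks exactly the same.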
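A small sketch of that sensitivity, again with a synthetic array instead of a real image (an assumption for the example). Changing a single channel value of a single pixel by 1 is enough to produce a different MD5 digest:

import hashlib
import numpy as np

# Synthetic 8x8 grayscale-like image with 3 channels, all pixels set to 128 (assumption)
original = np.full((8, 8, 3), 128, dtype=np.uint8)

# Copy the image and nudge one channel of one pixel by 1
modified = original.copy()
modified[0, 0, 0] += 1

digest_original = hashlib.md5(original.tobytes()).hexdigest()
digest_modified = hashlib.md5(modified.tobytes()).hexdigest()

# The two images look identical to the eye, but the digests do not match
print(digest_original)
print(digest_modified)
print(digest_original == digest_modified)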
