Fastest way to get the Hamming distance for integer arrays



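Let a and b be vectors of the same size containing 8-bit integers (0-255). I want to compute the number of bits in which these vectors differ, i.e. the Hamming distance between the vectors formed by concatenating the binary representations of those numbers. For example: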

a = [127,255]
b = [127,240]

Using the numpy library:

np.bitwise_xor(a,b)
# Output: array([ 0, 15])

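What I need now is the binary representation of each element of the array above, and the total count of 1s across all of its elements. The example above gives a Hamming distance of 0 + 4 = 4.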
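To make the 0 + 4 = 4 arithmetic concrete, here is a small plain-Python sketch that prints the bit patterns for the example values. It is only an illustration of the definition, not a fast implementation; the names `per_element` and `total` are chosen here for the example.

```python
a = [127, 255]
b = [127, 240]

# XOR sets a bit to 1 exactly where the two inputs differ.
for x, y in zip(a, b):
    diff = x ^ y
    print(f"{x:08b} ^ {y:08b} = {diff:08b} ({bin(diff).count('1')} differing bits)")

# Count the 1 bits of each XOR result, then add the counts up.
per_element = [bin(x ^ y).count("1") for x, y in zip(a, b)]
total = sum(per_element)
print(per_element, total)
```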

Approach #1: We can broadcast the inputs against the eight single-bit masks and count the bit positions that differ, like so -

def hamming_distance(a, b):
    r = (1 << np.arange(8))[:,None]
    return np.count_nonzero( (a & r) != (b & r) )

Sample run -

In [144]: a = [127,255]
     ...: b = [127,240]
     ...: 
In [145]: hamming_distance(a, b)
Out[145]: 4

Approach #2: Using the bitwise-XOR operation, we can count the number of binary bits that differ between a and b -

def hamming_distance_v2(a, b):
    r = (1 << np.arange(8))[:,None]
    return np.count_nonzero((np.bitwise_xor(a,b) & r) != 0)
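To see what the broadcasting in these two functions does, the following sketch prints the intermediate shapes for the example inputs. It assumes the inputs are converted to NumPy arrays first; `differs` and `per_element` are names introduced here for illustration.

```python
import numpy as np

a = np.array([127, 255])
b = np.array([127, 240])

# One mask per bit position, stored as a column: shape (8, 1).
r = (1 << np.arange(8))[:, None]

# Broadcasting (8, 1) against (2,) gives one row per bit and one column per element.
differs = (np.bitwise_xor(a, b) & r) != 0
print(r.shape, differs.shape)

# Summing down each column gives the differing-bit count per element.
per_element = differs.sum(axis=0)
total = np.count_nonzero(differs)
print(per_element, total)
```

The intermediate boolean array is eight times the size of the input, which is worth keeping in mind for long vectors.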

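If you are going to call the distance function many times during one execution of your program, you can gain some speed by using a precomputed table of bit counts. Here is another version of the Hamming distance function: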

# _nbits[k] is the number of 1s in the binary representation of k for 0 <= k < 256.
_nbits = np.array(
      [0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3,
       4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4,
       4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2,
       3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5,
       4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4,
       5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3,
       3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2,
       3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6,
       4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5,
       6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5,
       5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6,
       7, 7, 8], dtype=np.uint8)

def hamming_distance1(a, b):
    c = np.bitwise_xor(a, b)
    n = _nbits[c].sum()
    return n
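If typing out the 256-entry literal is undesirable, the same table can be generated. This is a sketch under the assumption that a slower one-time initialization is acceptable; its setup cost is not the one measured in this answer. The names `nbits_generated` and `hamming_distance_table` are hypothetical.

```python
import numpy as np

# Assumption: the table is built once at import time instead of being written out by hand.
# nbits_generated[k] is the number of 1s in the binary representation of k, 0 <= k < 256.
nbits_generated = np.array([bin(k).count("1") for k in range(256)], dtype=np.uint8)

def hamming_distance_table(a, b):
    # XOR marks the differing bits; the table lookup counts them per element.
    c = np.bitwise_xor(a, b)
    return int(nbits_generated[c].sum())

print(hamming_distance_table([127, 255], [127, 240]))
```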

In the following, a and b are the Python lists of length 32 given in a comment on the question. divakar_hamming_distance() and divakar_hamming_distance_v2() are from @Divakar's answer.

Here are the timings of @Divakar's functions:

In [116]: %timeit divakar_hamming_distance(a, b)
The slowest run took 5.57 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.3 µs per loop
In [117]: %timeit divakar_hamming_distance_v2(a, b)
The slowest run took 5.35 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.3 µs per loop

hamming_distance1(a, b) is faster:

In [118]: %timeit hamming_distance1(a, b)
The slowest run took 6.04 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.42 µs per loop

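On my computer, initializing _nbits takes about 11 µs, so there is no advantage to using hamming_distance1 if you only call the function once. If you call it three or more times, there is a net gain in performance.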
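The break-even point can be checked with a little arithmetic on the list-input timings quoted above. This is only a rough sketch: the figures come from a single machine, and `break_even_calls` is a helper made up for this illustration.

```python
import math

# Timings quoted in the text, in microseconds (Python-list inputs).
init_table = 11.0   # one-time cost of building _nbits
t_v1 = 11.3         # divakar_hamming_distance
t_v2 = 10.3         # divakar_hamming_distance_v2
t_table = 7.42      # hamming_distance1

def break_even_calls(init_cost, t_old, t_new):
    """Smallest n for which init_cost + n * t_new <= n * t_old."""
    return math.ceil(init_cost / (t_old - t_new))

calls_vs_v1 = break_even_calls(init_table, t_v1, t_table)
calls_vs_v2 = break_even_calls(init_table, t_v2, t_table)
print(calls_vs_v1, calls_vs_v2)
```

With these numbers the table pays for itself after three calls when compared with the first function, and after four when compared with the v2 variant.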

If the inputs are already numpy arrays, all the functions are significantly faster:

In [119]: aa = np.array(a)
In [120]: bb = np.array(b)
In [121]: %timeit divakar_hamming_distance_v2(aa, bb)
The slowest run took 8.22 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.72 µs per loop
In [122]: %timeit hamming_distance1(aa, bb)
The slowest run took 12.67 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.77 µs per loop

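Of course, if you always do that conversion immediately before computing the Hamming distance, the time to do the conversion must be included in the overall timing. However, if you write the code that generates a and b so that it uses numpy from the start, you may already have them as numpy arrays by the time you compute the Hamming distance.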
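The effect of the conversion can be measured outside IPython with the standard `timeit` module. The sketch below uses random length-32 lists as a stand-in for the lists from the question's comments (an assumption, since those lists are not reproduced here), and it rebuilds the bit-count table programmatically. Absolute numbers will vary by machine.

```python
import timeit
import numpy as np

# Hypothetical stand-in data: two Python lists of 32 values in 0..255.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=32).tolist()
b = rng.integers(0, 256, size=32).tolist()

nbits = np.array([bin(k).count("1") for k in range(256)], dtype=np.uint8)

def hamming_distance1(a, b):
    return nbits[np.bitwise_xor(a, b)].sum()

# Convert once, up front.
aa = np.array(a)
bb = np.array(b)

n = 10000
t_lists = timeit.timeit(lambda: hamming_distance1(a, b), number=n)
t_arrays = timeit.timeit(lambda: hamming_distance1(aa, bb), number=n)
# Conversion repeated on every call, so its cost is included in the measurement.
t_convert = timeit.timeit(lambda: hamming_distance1(np.array(a), np.array(b)), number=n)

for label, t in [("lists", t_lists), ("arrays", t_arrays), ("convert each call", t_convert)]:
    print(f"{label}: {t / n * 1e6:.2f} µs per call")
```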


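(I also experimented a bit with a 2-D array of precomputed Hamming distances between 8-bit values, i.e. an array with shape (256, 256), but the initialization cost is higher and the performance gain is small.)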
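For reference, one way such a 2-D table could be built and used is sketched below. This is a guess at the idea, not the code that was actually benchmarked; `dist_table` and `hamming_distance_2d` are hypothetical names.

```python
import numpy as np

# Per-byte bit counts, generated rather than typed out.
nbits = np.array([bin(k).count("1") for k in range(256)], dtype=np.uint8)

# dist_table[i, j] is the number of bits in which the 8-bit values i and j differ.
values = np.arange(256)
dist_table = nbits[values[:, None] ^ values]

def hamming_distance_2d(a, b):
    # Pairs of values index the table directly, so no XOR is needed per call.
    a = np.asarray(a)
    b = np.asarray(b)
    return int(dist_table[a, b].sum())

print(dist_table.shape, hamming_distance_2d([127, 255], [127, 240]))
```

The table holds 65536 entries instead of 256, which is consistent with the higher initialization cost mentioned above.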

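Maybe not the most efficient way, but the easiest IMO is to convert the elements of your output array to strings in binary form, then count the '1' characters in each string and sum the counts...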

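import numpy as np

# Example data: ten random integers in the range 0..62
output = np.random.randint(0,63,10)
# Number of 1 bits in each element
hamming = ['{:b}'.format(x).count('1') for x in output]
# Total Hamming distance, assuming a and b are defined as in the question
sum(bin(x).count("1") for x in np.bitwise_xor(a,b))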
