Constants for computing CRC32 using PCLMULQDQ

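I'm reading the following paper on how to implement CRC32 efficiently using the PCLMULQDQ instruction introduced in Intel Westmere and AMD Bulldozer:

V. Gopal et al., "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction", 2009. http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf

I understand the algorithm, but one thing I'm not sure about is how to calculate the constants $k_i$. For example, they provide the constant values for the IEEE 802.3 polynomial: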

k1 = x^(4*128+64) mod P(x) = 0x8833794C
k4 = x^128 mod P(x) = 0xE8A45605
μ = x^64 div P(x) = 0x104D101DF

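and so on. I can just use these constants, since I only need to support the one polynomial, but I'm interested: how did they compute those numbers? I can't just use a typical bignum implementation (e.g. the one provided by Python), because the arithmetic must happen in GF(2).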

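It's just like regular division, except that you exclusive-or instead of subtract. Start with the most significant 1 in the dividend. Exclusive-or the dividend with the polynomial, lining up the most significant 1 of the polynomial with that 1 in the dividend so that it becomes zero. Repeat until you have eliminated all of the 1s above the low n bits, where n is the order of the polynomial. The result is the remainder.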

Make sure that your polynomial has the high term in the (n+1)th bit. That is, use 0x104C11DB7, not 0x4C11DB7.

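If you want the quotient (which you wrote as "div"), keep track of the positions of the 1s you eliminated. That set, shifted down by n, is the quotient.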
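The steps above can be traced on a dividend small enough to fit in one machine word. The following is a minimal sketch, not part of the original answer: `gf2_divmod` and `top_bit` are hypothetical helper names, and the sketch assumes both polynomials fit in a `uint64_t` with the divisor's high term included.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the most significant set bit of v; v must be nonzero. */
static int top_bit(uint64_t v)
{
    int i = 0;

    while ((v >>= 1) != 0)
        i++;
    return i;
}

/* Long division over GF(2).  dividend and divisor are bit masks of polynomial
   coefficients; divisor must include its high term (e.g. 0x104C11DB7, not
   0x4C11DB7).  Returns the remainder and stores the quotient in *quot. */
uint64_t gf2_divmod(uint64_t dividend, uint64_t divisor, uint64_t *quot)
{
    int n;
    uint64_t q = 0;

    assert(divisor != 0);
    n = top_bit(divisor);               /* order of the divisor */
    while (dividend != 0 && top_bit(dividend) >= n) {
        int shift = top_bit(dividend) - n;

        dividend ^= divisor << shift;   /* cancel the top 1 with exclusive-or */
        q |= (uint64_t)1 << shift;      /* eliminated position, moved down by n */
    }
    *quot = q;
    return dividend;
}
```

For example, dividing x^7 (0x80) by x^3 + x + 1 (0xB) cancels the 1s at positions 7, 5, 4 and 3, so the quotient has bits 4, 2, 1 and 0 set (0x17) and the remainder is 1. This direct form only works while the dividend fits in a word, which x^(4*128+64) does not.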

Here is how:

/* Placed in the public domain by Mark Adler, Jan 18, 2014. */

#include <stdio.h>
#include <inttypes.h>

/* Polynomial type -- must be an unsigned integer type. */
typedef uintmax_t poly_t;
#define PPOLY PRIxMAX

/* Return x^n mod p(x) over GF(2).  x^deg is the highest power of x in p(x).
   The positions of the bits set in poly represent the remaining powers of x in
   p(x).  In addition, returned in *div are as many of the least significant
   quotient bits as will fit in a poly_t. */
static poly_t xnmodp(unsigned n, poly_t poly, unsigned deg, poly_t *div)
{
    poly_t mod, mask, high;

    if (n < deg) {
        *div = 0;
        return (poly_t)1 << n;      /* x^n is already reduced when n < deg */
    }
    mask = ((poly_t)1 << deg) - 1;
    poly &= mask;
    mod = poly;                     /* x^deg mod p(x) */
    *div = 1;                       /* x^deg div p(x) */
    deg--;
    while (--n > deg) {             /* multiply by x, n - deg times in total */
        high = (mod >> deg) & 1;
        *div = (*div << 1) | high;  /* quotient bits may be lost off the top */
        mod <<= 1;
        if (high)
            mod ^= poly;
    }
    return mod & mask;
}

/* Compute and show x^n modulo the IEEE 802.3 CRC-32 polynomial.  If showdiv is
   true, also show the low bits of the quotient. */
static void show(unsigned n, int showdiv)
{
    poly_t div;

    printf("x^%u mod p(x) = %#" PPOLY "\n", n, xnmodp(n, 0x4C11DB7, 32, &div));
    if (showdiv)
        printf("x^%u div p(x) = %#" PPOLY "\n", n, div);
}

/* Compute the constants required to use PCLMULQDQ to compute the IEEE 802.3
   32-bit CRC.  These results appear on page 16 of the Intel paper "Fast CRC
   Computation Using PCLMULQDQ Instruction". */
int main(void)
{
    show(4*128+64, 0);
    show(4*128, 0);
    show(128+64, 0);
    show(128, 0);
    show(96, 0);
    show(64, 1);
    return 0;
}
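
One way to sanity-check the quotient printed by `show(64, 1)` against the value of μ quoted in the question is the division identity x^64 = μ(x)·P(x) + r(x), where r has degree below 32. The following is a minimal sketch, not part of the original answer: `clmul_lo64`, `is_x64_quotient` and `check_ieee_mu` are hypothetical helpers, and the multiply is a plain software emulation of the low 64 bits of a carry-less product rather than a use of the PCLMULQDQ instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of the carry-less (GF(2)) product of a and b.  Product bits at
   position 64 and above are discarded. */
uint64_t clmul_lo64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a << i;        /* exclusive-or in a shifted copy of a */
    return r;
}

/* Return 1 if mu is x^64 div p(x) for a degree-32 polynomial p whose high term
   is included.  Since mu * p = x^64 + r and the x^64 term falls off the top,
   the low 64 bits of the product are r, which must have degree below 32. */
int is_x64_quotient(uint64_t mu, uint64_t p)
{
    return (mu >> 32) == 1 && (p >> 32) == 1 &&
           (clmul_lo64(mu, p) >> 32) == 0;
}

/* Self-check using the IEEE 802.3 values quoted in the question. */
void check_ieee_mu(void)
{
    assert(is_x64_quotient(0x104D101DF, 0x104C11DB7));
}
```

When the check passes, the low 32 bits of `clmul_lo64(mu, p)` are x^64 mod P(x), so the same call also cross-checks the remainder printed for n = 64.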
