Incorrect use of byte strings in a Python CRC16 implementation



I am trying to implement my own cyclic redundancy check (CRC) in Python. My program is laid out as follows:

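  1. random_message(n) generates a random byte message of length n
  2. Generate the checksum value using the CRC code crc16
  3. Run the corruption code corrupt_data on the generated message
  4. Check whether the checksum is different (I did this using ==)
  5. Repeat steps 1 to 4 many times and see how often errors (i.e. corruption) go unnoticed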

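I believe the functions crc16 and corrupt_data are correct, so I don't think there is much reason to analyse them too closely. I think the problem starts where I use byte strings in the latter part of the program, after these two functions.

My code is as follows: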

from random import random
from random import choice
from string import ascii_uppercase

CORRUPTION_RATE = 0.25

def crc16(data: bytes):
    xor_in = 0x0000  # initial value
    xor_out = 0x0000  # final XOR value
    poly = 0x8005  # generator polynomial (normal form)

    reg = xor_in
    for octet in data:
        # reflect in
        for i in range(8):
            topbit = reg & 0x8000
            if octet & (0x80 >> i):
                topbit ^= 0x8000
            reg <<= 1
            if topbit:
                reg ^= poly
        reg &= 0xFFFF
        # reflect out
    return reg ^ xor_out

from random import randbytes

def corrupt_data(data : bytes):
    '''
    some random corruption of byte data
    can be modified as needed using the CORRUPTION_RATE global constant
    '''
    temp = data[:]
    while True:
        location = int(len(temp) * random())
        data_list = list(temp)
        if random() < 0.5:
            data_list[location] = (data_list[location] + 1) % 256
        else:
            data_list[location] = (data_list[location] - 1) % 256
        temp = bytes(data_list)
        if random() < CORRUPTION_RATE and temp != data:
            break
    return temp

# Generate random byte message of length n
def random_message(n):
    randomBytes = ''.join(choice(ascii_uppercase) for i in range(n)).encode()
    print("randomBytes is " + str(randomBytes))
    print("The class type of randomBytes is " + str(type(randomBytes)))
    return randomBytes


numberOfErrors = 0;
for i in range(10000):
    # generating random byte message of length n
    randomMessage = random_message(5)
    # generating the checksum value using the CRC code
    checksumValue = crc16(randomMessage)
    #print("checksumValue is " + str(checksumValue))
    #print("The class type of checksumValue is " + str(type(checksumValue)))
    # running the corruption on the generated message
    #print("The class type of bchecksumValue is " + str(type(b"checksumValue")))
    corrupt = corrupt_data(b"checksumValue")
    #print("The class type of corrupt_data(bchecksumValue) is " + str(type(corrupt)))
    #print("Checking whether the checksum is different ... ")
    different = (b"checksumValue" == corrupt)
    #print("bchecksumValue == corrupt is " + str(different))
    #print("bchecksumValue was " + str(b"checksumValue") + ", and corrupt was " + str(corrupt))

    if(different == False):
        numberOfErrors += 1

print("numberOfErrors is " + str(numberOfErrors))

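As you can see, I inserted various print statements (now commented out) to help me debug.

The problem is that when I run the code above, I get numberOfErrors is 10000. Obviously this cannot be right, since we expect at least some of the comparisons to match. We would therefore expect numberOfErrors to be slightly less than 10000.

As I said, I am confident that the crc16 and corrupt_data functions are correct, and I suspect the problem arises where I use byte strings in the for loop: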

numberOfErrors = 0;
for i in range(10000):
    # generating random byte message of length n
    randomMessage = random_message(5)
    # generating the checksum value using the CRC code
    checksumValue = crc16(randomMessage)
    #print("checksumValue is " + str(checksumValue))
    #print("The class type of checksumValue is " + str(type(checksumValue)))
    # running the corruption on the generated message
    #print("The class type of bchecksumValue is " + str(type(b"checksumValue")))
    corrupt = corrupt_data(b"checksumValue")
    #print("The class type of corrupt_data(bchecksumValue) is " + str(type(corrupt)))
    #print("Checking whether the checksum is different ... ")
    different = (b"checksumValue" == corrupt)
    #print("bchecksumValue == corrupt is " + str(different))
    #print("bchecksumValue was " + str(b"checksumValue") + ", and corrupt was " + str(corrupt))

    if(different == False):
        numberOfErrors += 1

print("numberOfErrors is " + str(numberOfErrors))

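I have never really programmed with bytes or byte strings, and I only recently started learning Python, so I don't understand what I'm doing wrong. Where is my error, and how do I fix it?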


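Edit

As user2357112 supports Monica mentioned in the comments, the problem is probably the b"checksumValue" in corrupt = corrupt_data(b"checksumValue"). The issue I ran into is that the function crc16 returns an int, so, to convert it back to bytes for passing to corrupt_data(data : bytes), I tried using the b prefix. I guess this shows my inexperience with Python.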
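
The b prefix only marks a bytes literal; it never looks up a variable with the same name as the quoted text. A minimal sketch of the difference, using 10073 purely as an example checksum (the name checksum_value is illustrative):

```python
checksum_value = 10073  # example integer, like the int that crc16 returns

# The b prefix builds a bytes object from the characters between the quotes.
# It does not refer to any variable named checksumValue.
literal = b"checksumValue"

print(literal)       # the ASCII characters of the name itself
print(len(literal))  # 13, the number of characters in "checksumValue"

# Converting the integer's decimal text gives something entirely different.
print(literal == str(checksum_value).encode("ascii"))  # False
```

So every iteration of the loop passes the same 13 constant bytes to corrupt_data, regardless of what crc16 returned.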


Edit 2

OK, so I am trying the solution provided in this answer. The modified code is as follows:

# running the corruption on the generated message
bs = str(checksumValue).encode('ascii')
print("str(checksumValue).encode('ascii') is " + str(bs))
#print("The class type of bchecksumValue is " + str(type(b"checksumValue")))
print("The class type of str(checksumValue).encode('ascii') is " + str(type(bs)))
#corrupt = corrupt_data(b"checksumValue")
corrupt = corrupt_data(bs)
#print("The class type of corrupt_data(bchecksumValue) is " + str(type(corrupt)))
print("The class type of corrupt_data(bs) is " + str(type(corrupt)))

The output is:

randomBytes is b'BBVFC'
The class type of randomBytes is <class 'bytes'>
checksumValue is 10073
The class type of checksumValue is <class 'int'>
str(checksumValue).encode('ascii') is b'10073'
The class type of str(checksumValue).encode('ascii') is <class 'bytes'>
The class type of corrupt_data(bs) is <class 'bytes'>

So the types seem to match what we would expect.
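
For comparison, str(...).encode('ascii') produces the decimal digits as text, one byte per digit. If a fixed-width binary form of the checksum is wanted instead, int.to_bytes is one option. The sketch below assumes a 2-byte, big-endian layout, which is a choice made here for illustration and not something the question specifies:

```python
checksum_value = 10073  # example value taken from the output above

# Decimal text: one ASCII byte per digit, so the length depends on the value.
as_text = str(checksum_value).encode("ascii")

# Fixed-width binary: always 2 bytes for a 16-bit CRC.
# Big-endian byte order is an assumption made for this sketch.
as_binary = checksum_value.to_bytes(2, "big")

print(as_text)                           # b'10073'
print(as_binary.hex())                   # 2759
print(int.from_bytes(as_binary, "big"))  # 10073, the round trip back to int
```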


Edit 3

After implementing the changes from Edit 2 in the for loop, I still get numberOfErrors is 10000 as my output. The code is as follows:

numberOfErrors = 0;
for i in range(10000):
    # generating random byte message of length n
    randomMessage = random_message(5)
    # generating the checksum value using the CRC code
    checksumValue = crc16(randomMessage)
    #print("checksumValue is " + str(checksumValue))
    #print("The class type of checksumValue is " + str(type(checksumValue)))
    # running the corruption on the generated message
    bs = str(checksumValue).encode('ascii')
    #print("str(checksumValue).encode('ascii') is " + str(bs))
    #print("The class type of str(checksumValue).encode('ascii') is " + str(type(bs)))
    corrupt = corrupt_data(bs)
    #print("The class type of corrupt_data(bs) is " + str(type(corrupt)))

    #print("Checking whether the checksum is different ... ")
    different = (bs == corrupt)
    #print("bs == corrupt is " + str(different))
    #print("bs was " + str(bs) + ", and corrupt was " + str(corrupt))

    if(different == False):
        numberOfErrors += 1

print("numberOfErrors is " + str(numberOfErrors))

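Your problem actually has nothing to do with byte strings; it is a logic error. You are trying to corrupt the wrong thing. You don't want to corrupt the checksum; you want to corrupt the original message and then compute the checksum of the corrupted version. Then you can compare whether the two checksums match.

Try: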
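
The count of exactly 10000 follows from corrupt_data itself: its loop only exits once temp != data, so comparing its input with its output with == is always False. A small sketch of that, restating the question's corrupt_data in compact form so the snippet runs on its own (the test value b"10073" is just an example):

```python
from random import random

CORRUPTION_RATE = 0.25

def corrupt_data(data: bytes) -> bytes:
    """Same logic as corrupt_data in the question, restated compactly."""
    temp = data
    while True:
        values = list(temp)
        location = int(len(values) * random())
        step = 1 if random() < 0.5 else -1
        values[location] = (values[location] + step) % 256
        temp = bytes(values)
        # The function can only return a result that differs from the input.
        if random() < CORRUPTION_RATE and temp != data:
            return temp

bs = b"10073"
# Count how often the "corrupted" bytes equal the input: this can never happen.
unchanged = sum(corrupt_data(bs) == bs for _ in range(1000))
print(unchanged)  # 0
```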

undetected_errors = 0
for i in range(10000):
    good_message = random_message(5)
    good_checksum = crc16(good_message)
    corrupted_message = corrupt_data(good_message)
    corrupted_checksum = crc16(corrupted_message)
    if good_checksum == corrupted_checksum:
        undetected_errors += 1
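
For reference, here is a self-contained sketch of the whole experiment with that loop in place. It reuses the logic of crc16, corrupt_data and random_message from the question, minus the debug prints. The helper name count_undetected is hypothetical and introduced here only to wrap the loop:

```python
from random import choice, random
from string import ascii_uppercase

CORRUPTION_RATE = 0.25

def crc16(data: bytes) -> int:
    """Bitwise CRC-16: polynomial 0x8005, zero initial value, no reflection."""
    poly = 0x8005
    reg = 0x0000
    for octet in data:
        for i in range(8):
            topbit = reg & 0x8000
            if octet & (0x80 >> i):
                topbit ^= 0x8000
            reg <<= 1
            if topbit:
                reg ^= poly
        reg &= 0xFFFF
    return reg

def corrupt_data(data: bytes) -> bytes:
    """Nudge random bytes by +/-1 until the result differs from the input."""
    temp = data
    while True:
        values = list(temp)
        location = int(len(values) * random())
        step = 1 if random() < 0.5 else -1
        values[location] = (values[location] + step) % 256
        temp = bytes(values)
        if random() < CORRUPTION_RATE and temp != data:
            return temp

def random_message(n: int) -> bytes:
    """Random message of n uppercase ASCII letters, as in the question."""
    return "".join(choice(ascii_uppercase) for _ in range(n)).encode("ascii")

def count_undetected(trials: int, length: int) -> int:
    """Hypothetical helper: count corruptions that leave the CRC unchanged."""
    undetected = 0
    for _ in range(trials):
        good_message = random_message(length)
        corrupted_message = corrupt_data(good_message)
        # Compare the checksums of the two messages, not the messages themselves.
        if crc16(good_message) == crc16(corrupted_message):
            undetected += 1
    return undetected

trials = 10000
undetected_errors = count_undetected(trials, 5)
print(f"{undetected_errors} of {trials} corruptions went undetected")
```

The printed count varies from run to run because both the messages and the corruption are random.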
