What statistical tests can I run in Python to test the randomness of a binary string?



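I am having trouble implementing the block frequency test in Python to assess the randomness of a binary string. Could someone help me understand why the code does not run?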

Also, are there any statistical tests available in Python, or possibly MATLAB, for testing the randomness of a binary string?

from importlib import import_module
import_module
from tokenize import Special
import math
def block_frequency(self, bin_data: str, block_size=4):
    """
    Note that this description is taken from the NIST documentation [1]
    [1] http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
    The focus of this tests is the proportion of ones within M-bit blocks. The purpose of this tests is to determine
    whether the frequency of ones in an M-bit block is approximately M/2, as would be expected under an assumption
    of randomness. For block size M=1, this test degenerates to the monobit frequency test.
    :param bin_data: a binary string
    :return: the p-value from the test
    :param block_size: the size of the blocks that the binary sequence is partitioned into
    """
    # Work out the number of blocks, discard the remainder
    (num_blocks)= math.floor((1010110001001011011010111110010000000011010110111000001101) /4)
    block_start, block_end = 0, 4
    # Keep track of the proportion of ones per block
    proportion_sum = 0.0
    for i in range(num_blocks):
        # Slice the binary string into a block
        block_data = (101010001001011011010111110010000000011010110111000001101)[block_start:block_end]
        # Keep track of the number of ones
        ones_count = 0
        for char in block_data:
            if char == '1':
                ones_count += 1
        pi = ones_count / 4
        proportion_sum += pow(pi - 0.5, 2.0)
        # Update the slice locations
        block_start += 4
        block_end += 4
    # Calculate the p-value
    chi_squared = 4.0 * 4 * proportion_sum
    p_val = Special.gammaincc(num_blocks / 2, chi_squared / 2)
    print(p_val)

I see three problems in your code.

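  1. Values are hard-coded in two different places. This is bad practice and error-prone. I know this may not be what the OP was asking about, but it is worth fixing while we are at it.
  2. The string of binary digits (especially since its characters are compared against '1') should be wrapped in quotes, not parentheses. This is one of the errors being thrown: as written, you have one big integer that you are trying to "index". (This goes together with using len where necessary and a few other small changes.)
  3. You are using the wrong module. You probably meant scipy.special.gammaincc, the regularized upper incomplete gamma function that the NIST document calls igamc. tokenize.Special is just a string and has no gammaincc attribute. Note the double "c": scipy.special.gammainc is the lower function and would return one minus the p-value.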
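To make the second point concrete, here is a minimal sketch of the difference between the two literals. The variable names (`bits_as_int`, `bits_as_str`, `error_message`, `first_block`) are made up for this illustration and do not come from the original code.

```python
# Without quotes the digits form one large decimal integer, not a sequence of bits.
bits_as_int = 101010001001011011010111110010000000011010110111000001101
# With quotes the same digits form a string that can be sliced and measured with len().
bits_as_str = "101010001001011011010111110010000000011010110111000001101"

error_message = ""
try:
    bits_as_int[0:4]  # slicing an int is not allowed
except TypeError as exc:
    error_message = str(exc)

first_block = bits_as_str[0:4]            # the first 4-bit block
num_blocks = len(bits_as_str) // 4        # whole blocks only, remainder discarded

print(error_message)
print(first_block, num_blocks)
```

The integer version fails with a TypeError as soon as it is sliced, while the string version supports both slicing and `len`, which is what the block loop needs.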

Putting it all together, try something like this:

import math

# gammaincc is the regularized upper incomplete gamma function (igamc in the NIST text)
from scipy.special import gammaincc


def block_frequency(bin_data: str, block_size: int = 4) -> float:
    """
    Note that this description is taken from the NIST documentation [1]
    [1] http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
    The focus of this test is the proportion of ones within M-bit blocks. The purpose of this test is to determine
    whether the frequency of ones in an M-bit block is approximately M/2, as would be expected under an assumption
    of randomness. For block size M=1, this test degenerates to the monobit frequency test.
    :param bin_data: a binary string
    :param block_size: the size of the blocks that the binary sequence is partitioned into
    :return: the p-value from the test
    """
    # Work out the number of blocks, discard the remainder
    num_blocks = math.floor(len(bin_data) / block_size)
    block_start, block_end = 0, block_size
    # Keep track of the proportion of ones per block
    proportion_sum = 0.0
    for i in range(num_blocks):
        # Slice the binary string into a block
        block_data = bin_data[block_start:block_end]
        # Keep track of the number of ones
        ones_count = 0
        for char in block_data:
            if char == '1':
                ones_count += 1
        pi = ones_count / block_size
        proportion_sum += pow(pi - 0.5, 2.0)
        # Update the slice locations
        block_start += block_size
        block_end += block_size
    # Calculate the p-value
    chi_squared = 4.0 * block_size * proportion_sum
    p_val = gammaincc(num_blocks / 2, chi_squared / 2)
    return p_val


# The function is standalone here, so the unused "self" parameter is dropped
my_binary_string = '101010001001011011010111110010000000011010110111000001101'
print(block_frequency(my_binary_string))
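As a sanity check that does not depend on SciPy, the block frequency p-value can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions, not the NIST reference code: the function names (`upper_gamma_q_half`, `block_frequency_p_value`, `monobit_p_value`) are made up here, and the incomplete gamma helper only handles first arguments of the form k/2, which is all this test needs. It also includes the monobit frequency test that the docstring mentions, as one more test from the same NIST suite.

```python
import math


def upper_gamma_q_half(k: int, x: float) -> float:
    """Regularized upper incomplete gamma Q(k/2, x) for a positive integer k (hypothetical helper)."""
    if x <= 0:
        return 1.0
    # Base cases: Q(1/2, x) = erfc(sqrt(x)) and Q(1, x) = exp(-x).
    if k % 2 == 1:
        s, q = 0.5, math.erfc(math.sqrt(x))
    else:
        s, q = 1.0, math.exp(-x)
    # Recurrence: Q(s + 1, x) = Q(s, x) + x**s * exp(-x) / Gamma(s + 1),
    # evaluated in log space to avoid overflow for long inputs.
    while s < k / 2:
        q += math.exp(s * math.log(x) - x - math.lgamma(s + 1))
        s += 1.0
    return q


def block_frequency_p_value(bits: str, block_size: int) -> float:
    """Block frequency test p-value; trailing bits that do not fill a block are discarded."""
    num_blocks = len(bits) // block_size
    if num_blocks == 0:
        raise ValueError("input is shorter than one block")
    deviation_sum = 0.0
    for i in range(num_blocks):
        block = bits[i * block_size:(i + 1) * block_size]
        proportion = block.count("1") / block_size
        deviation_sum += (proportion - 0.5) ** 2
    chi_squared = 4.0 * block_size * deviation_sum
    return upper_gamma_q_half(num_blocks, chi_squared / 2)


def monobit_p_value(bits: str) -> float:
    """Monobit frequency test p-value: erfc(|#ones - #zeros| / sqrt(2 * n))."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(n) / math.sqrt(2))


# Worked examples from NIST SP 800-22 (sections 2.2.4 and 2.1.4).
print(round(block_frequency_p_value("0110011010", 3), 6))  # NIST reports 0.801252
print(round(monobit_p_value("1011010101"), 6))             # NIST reports 0.527089
```

In both tests the sequence is usually treated as non-random at the 1% level when the p-value falls below 0.01. These toy inputs are far shorter than the lengths NIST recommends, so they only serve to check the arithmetic.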
