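I wrote a function that defines a test statistic, and I want to use it in a permutation test in Python. It resamples from an existing sample (e.g. matrix2, which is just a single column) 1000 times and takes the mode of each resample. Essentially, it bootstraps the mode to build a sampling distribution of the mode for matrix2 and for matrix3. The two distributions are then compared using a KS test.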
import numpy as np
from scipy import stats
from scipy.stats import ks_2samp, permutation_test

def newTestStat(matrix2, matrix3):
    num_samples = 1000
    # Bootstrap matrix2 and take the mode of each resample
    sample_size_2 = len(matrix2)
    replications_2 = np.array([np.random.choice(matrix2, sample_size_2, replace=True) for _ in range(num_samples)])
    mode_2 = stats.mode(replications_2, axis=1)
    sampleModes2 = mode_2.mode.flatten().tolist()
    # Same for matrix3
    sample_size_3 = len(matrix3)
    replications_3 = np.array([np.random.choice(matrix3, sample_size_3, replace=True) for _ in range(num_samples)])
    mode_3 = stats.mode(replications_3, axis=1)
    sampleModes3 = mode_3.mode.flatten().tolist()
    # Compare the two bootstrap distributions of the mode
    return ks_2samp(np.array(sampleModes2), np.array(sampleModes3))

dataToUseMatrix = (matrix2, matrix3)
pTest = permutation_test(dataToUseMatrix, newTestStat, n_resamples=1000)
print('exact p-value:', pTest.pvalue)
However, I currently get the following error, even though np.array(sampleModes2) and np.array(sampleModes3) have the same shape, (1000,):
Traceback (most recent call last):
pTest = permutation_test(dataToUseMatrix,newTestStat,n_resamples=1000)
pvalues = compare[alternative](null_distribution, observed)
pvalues_less = less(null_distribution, observed)
cmps = null_distribution <= observed + gamma
ValueError: operands could not be broadcast together with shapes (2,1000) (2,)
Does anyone see what the problem is here?
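The problem is that ks_2samp returns a KstestResult object rather than a single number. That object behaves like a (statistic, pvalue) pair, so permutation_test treats the statistic as having two components, which appears to be where the 2 in the shapes (2,1000) and (2,) comes from. If you need a numeric return value, select the statistic explicitly: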
return ks_2samp(np.array(sampleModes2), np.array(sampleModes3)).statistic
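For reference, here is a minimal end-to-end sketch of the fix, assuming a SciPy version that provides scipy.stats.permutation_test. The names bootstrap_modes and new_test_stat, the made-up integer data, and the reduced resample counts are illustrative and not from the original post. Seeding the generator inside the statistic is also an assumption, made so that the statistic is deterministic for a given pair of inputs.

```python
import numpy as np
from scipy import stats
from scipy.stats import ks_2samp, permutation_test


def bootstrap_modes(sample, num_samples=100, rng=None):
    """Return the mode of each of num_samples bootstrap resamples (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    sample = np.asarray(sample)
    # One row per bootstrap resample, drawn with replacement
    reps = rng.choice(sample, size=(num_samples, sample.size), replace=True)
    # ravel() copes with SciPy versions that keep a trailing axis of length 1
    return np.asarray(stats.mode(reps, axis=1).mode).ravel()


def new_test_stat(x, y):
    # Assumption: a fixed seed so the statistic is deterministic for given inputs
    rng = np.random.default_rng(0)
    result = ks_2samp(bootstrap_modes(x, rng=rng), bootstrap_modes(y, rng=rng))
    # Return one float, not the whole (statistic, pvalue) result object
    return result.statistic


# Made-up integer data, so that the mode is meaningful
data_rng = np.random.default_rng(42)
x = data_rng.integers(0, 5, size=30)
y = data_rng.integers(2, 7, size=30)

res = permutation_test((x, y), new_test_stat, n_resamples=30)
print("KS statistic:", res.statistic)
print("permutation p-value:", res.pvalue)
```

Because the statistic now returns a scalar, the null distribution and the observed value have compatible shapes and the comparison inside permutation_test no longer fails. Raise num_samples and n_resamples back towards 1000 for real use; they are kept small here only so the sketch runs quickly.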