I'm trying to write code that computes the covariance matrix of a dataset using for loops instead of NumPy. So far, my code produces an error:
import numpy as np

def cov_naive(X):
    """Compute the covariance for a dataset of size (D,N)
    where D is the dimension and N is the number of data points"""
    D, N = X.shape
    ### Edit the code below to compute the covariance matrix by iterating over the dataset.
    covariance = np.zeros((D, D))
    mean = np.mean(X, axis=1)
    for i in range(D):
        for j in range(D):
            covariance[i,j] += (X[:,i] - mean[i]) @ (X[:,j] - mean[j])
    return covariance/N
I'm running the following test to check that it works:
# Let's first test the functions on some hand-crafted dataset.
X_test = np.arange(6).reshape(2,3)
expected_test_mean = np.array([1., 4.]).reshape(-1, 1)
expected_test_cov = np.array([[2/3., 2/3.], [2/3.,2/3.]])
print('X:\n', X_test)
print('Expected mean:\n', expected_test_mean)
print('Expected covariance:\n', expected_test_cov)
np.testing.assert_almost_equal(mean(X_test), expected_test_mean)
np.testing.assert_almost_equal(mean_naive(X_test), expected_test_mean)
np.testing.assert_almost_equal(cov(X_test), expected_test_cov)
np.testing.assert_almost_equal(cov_naive(X_test), expected_test_cov)
and I get the following error:
AssertionError:
Arrays are not almost equal to 7 decimals
AssertionError Traceback (most recent call last)
<ipython-input-21-6a6498089109> in <module>()
12
13 np.testing.assert_almost_equal(cov(X_test), expected_test_cov)
---> 14 np.testing.assert_almost_equal(cov_naive(X_test), expected_test_cov)
Any help would be greatly appreciated!
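The mean is not the problem. With X of shape (D, N), the line
mean = np.mean(X, axis=1)
already gives one mean per dimension (row), which is what the expected mean [1., 4.] in your test requires. Changing it to
mean = np.mean(X, axis=0)
would average across the dimensions and return N values, one per data point.
The error is in the indexing inside the loop. X[:,i] selects column i, which is a single data point of length D, not dimension i. Index the rows instead:
covariance[i,j] += (X[i,:] - mean[i]) @ (X[j,:] - mean[j])
With that change, cov_naive(X_test) returns 2/3 for every entry, which matches expected_test_cov.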
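Since the question asks for for loops rather than NumPy operations, here is a minimal sketch of a loop-only version. It assumes the (D, N) layout from the docstring and the population covariance (dividing by N, as the expected values in the test imply). The name `cov_loops` is made up for illustration and is not part of the original assignment.

```python
import numpy as np

def cov_loops(X):
    """Population covariance of X with shape (D, N), using explicit loops.

    Hypothetical helper for illustration: rows are dimensions,
    columns are data points, and the result is divided by N.
    """
    D, N = X.shape

    # Mean of each dimension (row), accumulated by hand.
    mean = np.zeros(D)
    for i in range(D):
        for n in range(N):
            mean[i] += X[i, n]
        mean[i] /= N

    # Sum the products of deviations over all data points.
    covariance = np.zeros((D, D))
    for i in range(D):
        for j in range(D):
            for n in range(N):
                covariance[i, j] += (X[i, n] - mean[i]) * (X[j, n] - mean[j])
    return covariance / N

X_test = np.arange(6).reshape(2, 3)
expected_test_cov = np.array([[2/3., 2/3.], [2/3., 2/3.]])
np.testing.assert_almost_equal(cov_loops(X_test), expected_test_cov)
# np.cov treats rows as variables by default; bias=True divides by N.
np.testing.assert_almost_equal(cov_loops(X_test), np.cov(X_test, bias=True))
```

The innermost loop over n replaces the `@` dot product, so NumPy is only used to allocate the arrays and to run the checks.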