Mahalanobis distance between a point and the mean vector is always the same

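I have recently been trying to implement some data-cleaning algorithms. When I compute the Mahalanobis distance between each point in my dataset and the mean vector, the result seems to be the same for every point.

For example, given a dataset such as: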

{{2,2,3},{4,5,9},{7,8,9}}

the mean vector is:

{13/3,5,7}

and the covariance matrix is:

{{6.333333333333333,7.5,7.0},{7.5,9.0,9.0},{7.0,9.0,12.0}}

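The distance between each of {2,2,3}, {4,5,9}, {7,8,9} and the mean vector then comes out as 8290542, which is very strange. Computing it on paper gives the same result.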

Does anyone know what is wrong with my code or my approach? I would be very grateful for any help. Below is the code I am using.

import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.correlation.Covariance;
import org.apache.mahout.math.*;
import org.apache.mahout.common.distance.MahalanobisDistanceMeasure;
public class Test {
    public static void main(String[] args) {
        double[] a = {2,2,3};
        Vector aVector = new DenseVector(a);
        double[] b = {4,5,9};
        Vector bVector = new DenseVector(b);
        double[] c = {7,8,9};
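        Vector cVector = new DenseVector(c); // was new DenseVector(b), which silently reused b

        double[] mean = {13.0 / 3, 5, 7}; // 13/3 is integer division in Java and evaluates to 4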
        Vector meanVector = new DenseVector(mean);
        MahalanobisDistanceMeasure measure = new MahalanobisDistanceMeasure();
        double[][] ma = {{2,2,3},{4,5,9},{7,8,9}};
        RealMatrix matrix = new Covariance(ma).getCovarianceMatrix();
        Matrix math = new DenseMatrix(matrix.getData());
        measure.setCovarianceMatrix(math);
        measure.setMeanVector(meanVector);
        System.out.println(matrix.toString());
        System.out.println(measure.distance(meanVector,cVector));
    }

}

You need to use more data.

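Otherwise the mean vector and covariance matrix overfit the data and give the same distance for every point. With only 3 points in 3 dimensions, the sample covariance matrix has rank at most 2, so it is singular and has no proper inverse.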
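To spell out why too few points produce identical distances, here is a short derivation. It assumes the sample covariance with the n - 1 denominator, centered points in general position, and a pseudo-inverse standing in for the inverse that a singular matrix does not have. Whether a particular library behaves this way is an assumption, not something the question establishes.

```latex
% Mahalanobis distance of a point x from the mean \mu with covariance \Sigma
d(x) = \sqrt{(x - \mu)^{\top} \Sigma^{-1} (x - \mu)}

% Sample covariance of n points in p dimensions, X_c = centered n x p data matrix
S = \frac{1}{n-1} X_c^{\top} X_c, \qquad \operatorname{rank}(S) \le n - 1

% With n = 3 and p = 3 the rank is at most 2 < 3, so S^{-1} does not exist.
% Replacing S^{-1} by the pseudo-inverse S^{+} gives, for the i-th data point,
d^2(x_i) = (n-1) \left[ X_c \left( X_c^{\top} X_c \right)^{+} X_c^{\top} \right]_{ii}
         = (n-1) \left( 1 - \frac{1}{n} \right)
         = \frac{(n-1)^2}{n}

% For n = 3: d^2 = 4/3 and d \approx 1.1547 for every point, whatever its coordinates.
```

The middle step uses the fact that the bracketed matrix is the orthogonal projection onto the column space of X_c, which for rank n - 1 is the complement of the all-ones vector, so each diagonal entry is 1 - 1/n.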

For 3D data, use at least 20 points.
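A minimal sketch of this point in plain Java, with no Mahout or Commons Math dependency. The class name `MahalanobisSketch`, its helper methods, and the three extra rows appended to the dataset are all made up for illustration. It checks that the 3-point covariance matrix has a determinant of (numerically) zero, then shows that with more points than dimensions plus one the matrix becomes invertible and the distances differ. Six points are only enough to make the inversion possible, not to estimate the covariance well.

```java
class MahalanobisSketch {

    /** Column-wise mean; rows are observations, columns are variables. */
    public static double[] mean(double[][] data) {
        double[] mu = new double[data[0].length];
        for (double[] row : data) {
            for (int j = 0; j < mu.length; j++) {
                mu[j] += row[j] / data.length;
            }
        }
        return mu;
    }

    /** Sample covariance matrix (divides by n - 1). */
    public static double[][] sampleCovariance(double[][] data) {
        double[] mu = mean(data);
        int d = mu.length;
        double[][] cov = new double[d][d];
        for (double[] row : data) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    cov[i][j] += (row[i] - mu[i]) * (row[j] - mu[j]) / (data.length - 1);
                }
            }
        }
        return cov;
    }

    /** Determinant of a 3x3 matrix by cofactor expansion along the first row. */
    public static double det3(double[][] m) {
        return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
             - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
             + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    }

    /** Inverse of a 3x3 matrix via the adjugate; only meaningful when det3(m) is far from zero. */
    public static double[][] inverse3(double[][] m) {
        double det = det3(m);
        double[][] inv = new double[3][3];
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                // Signed cofactor of element (j, i), written with cyclic indices.
                int r0 = (j + 1) % 3, r1 = (j + 2) % 3;
                int c0 = (i + 1) % 3, c1 = (i + 2) % 3;
                inv[i][j] = (m[r0][c0] * m[r1][c1] - m[r0][c1] * m[r1][c0]) / det;
            }
        }
        return inv;
    }

    /** Squared Mahalanobis distance of x from mu, given the inverse covariance matrix. */
    public static double mahalanobisSquared(double[] x, double[] mu, double[][] inv) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x.length; j++) {
                sum += (x[i] - mu[i]) * inv[i][j] * (x[j] - mu[j]);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // The dataset from the question: 3 points in 3 dimensions.
        double[][] three = {{2, 2, 3}, {4, 5, 9}, {7, 8, 9}};
        System.out.println("det(cov), 3 points: " + det3(sampleCovariance(three)));

        // Same data plus three invented rows, purely for illustration.
        double[][] six = {{2, 2, 3}, {4, 5, 9}, {7, 8, 9}, {1, 6, 2}, {5, 1, 4}, {8, 3, 7}};
        double[][] cov = sampleCovariance(six);
        System.out.println("det(cov), 6 points: " + det3(cov));

        double[] mu = mean(six);
        double[][] inv = inverse3(cov);
        for (double[] p : six) {
            System.out.println("distance: " + Math.sqrt(mahalanobisSquared(p, mu, inv)));
        }
    }
}
```

Checking the determinant (or the smallest singular value) before inverting is a cheap guard: a value near zero means any distance computed from that matrix should not be trusted.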
