Finding the distance to the nearest neighbor in a 2D array



I have a 2D array, and for each (x, y) point I want to find the distance to its nearest neighbor.

I can do this with scipy.spatial.distance.cdist:

import numpy as np
from scipy.spatial.distance import cdist
# Random data
data = np.random.uniform(0., 1., (1000, 2))
# Distance between the array and itself
dists = cdist(data, data)
# Sort by distances
dists.sort()
# Take column 1 of each sorted row: column 0 is always 0
# (the distance from a point to itself)
nn_dist = dists[:, 1]

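This works, but it feels like far more work than necessary. KDTree should be able to handle this, but I'm not sure how. I'm not interested in the coordinates of the nearest neighbor; I only want the distance (and I want it as fast as possible).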

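KDTree can do this, and the procedure is almost the same as with cdist. However, cdist is much faster than KDTree. As pointed out in the comments, cKDTree is faster still: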

import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import KDTree
from scipy.spatial import cKDTree
import timeit
# Random data
data = np.random.uniform(0., 1., (1000, 2))
def scipy_method():
    # Distance between the array and itself
    dists = cdist(data, data)
    # Sort by distances
    dists.sort()
    # Take column 1 of each sorted row: column 0 is always 0
    # (the distance from a point to itself)
    nn_dist = dists[:, 1]
    return nn_dist
def KDTree_method():
    # You have to create the tree to use this method.
    tree = KDTree(data)
    # Then you find the closest two as the first is the point itself
    dists = tree.query(data, 2)
    nn_dist = dists[0][:, 1]
    return nn_dist
def cKDTree_method():
    tree = cKDTree(data)
    dists = tree.query(data, 2)
    nn_dist = dists[0][:, 1]
    return nn_dist
print(timeit.timeit('cKDTree_method()', number=100, globals=globals()))
print(timeit.timeit('scipy_method()', number=100, globals=globals()))
print(timeit.timeit('KDTree_method()', number=100, globals=globals()))

Output (seconds for 100 runs each, in the order cKDTree, cdist, KDTree):

0.34952507635557595
7.904083715193579
20.765962179145546

Once again, proof (not that any was needed) that C is great!
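
If only the distances are needed, the tree-based approach can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: the function name nearest_neighbor_distances is made up for illustration, and the data is assumed to be an (n, 2) array of distinct points. As a sanity check, it compares the result against a brute-force cdist variant that masks the diagonal and takes the row minimum instead of sorting every row.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist


def nearest_neighbor_distances(points):
    """Distance from each row of `points` to its nearest other row.

    Hypothetical helper for illustration; not part of SciPy.
    """
    tree = cKDTree(points)
    # Ask for k=2 neighbors: the closest match to each query point
    # is the point itself (distance 0), so the real neighbor is second.
    dists, _ = tree.query(points, k=2)
    return dists[:, 1]


# Seeded random data so the run is reproducible
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, (1000, 2))
nn_dist = nearest_neighbor_distances(data)

# Brute-force cross-check: set self-distances to infinity,
# then take the minimum of each row (no full sort needed).
full = cdist(data, data)
np.fill_diagonal(full, np.inf)
brute = full.min(axis=1)

print(np.allclose(nn_dist, brute))
```

Note that if the data contains duplicate points, both approaches report a nearest-neighbor distance of 0 for them. Also, the timings above depend on the SciPy version: since SciPy 1.6, KDTree and cKDTree are documented as functionally identical, so the gap between them may not show up on a recent install.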
