How to find the minimum distance from each point in a list of points to all other points in Python



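As shown in the image, I have an outlier that I want to remove (not the red point, but the green one above it, which is not aligned with the other points), so I am trying to find the minimum distances and then eliminate it. With such a large dataset, though, this takes a very long time to run. My code is below. Any helpful solution would be appreciated, thanks!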

import math
#list of 11600 points
dataset = [[2478, 3534], [4217, 953],......,11600 points]  
copy_dataset = dataset
Indices =[]
Min_Dists =[]
Distance = []
Copy_Dist=[]
for p1 in range(len(dataset)):
    p1_x= dataset[p1][0]
    p1_y= dataset[p1][1]
    for p2 in range(len(copy_dataset)):

        p2_x= copy_dataset[p2][0]
        p2_y= copy_dataset[p2][1]
        dist = math.sqrt((p1_x - p2_x) ** 2 + (p1_y - p2_y) ** 2)
        Distance.append(dist)
        Copy_Dist.append(dist)

    min_dist_1= min(Distance)
    Distance.remove(min_dist_1)

    if(min_dist_1 !=0):
        Min_Dists.append(min_dist_1)
        ind_1 = Copy_Dist.index(min_dist_1)
        Indices.append(ind_1)
    min_dist_2=min(Distance)
    Distance.remove(min_dist_2)
    if(min_dist_2 !=0):
        Min_Dists.append(min_dist_2)
        ind_2 = Copy_Dist.index(min_dist_2)
        Indices.append(ind_2)
    To_Remove = copy_dataset.index([p1_x, p1_y])
    copy_dataset.remove(copy_dataset[To_Remove])

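I'm not sure how to solve the outlier problem itself, but computing the distances in a vectorized way will probably be much faster.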

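import numpy as np
dataset = np.asarray(dataset, dtype=float)  # (N, 2) array of points
dataset_copy = dataset[:, np.newaxis]  # (N, 1, 2), broadcasts against (N, 2)
# distance[i, j] is the Euclidean distance between point i and point j
distance = np.sqrt(np.sum(np.square(dataset - dataset_copy), axis=-1))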
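
For reference, here is a minimal self-contained sketch of the same broadcasting idea, taken one step further to pull out each point's nearest-neighbor distance. The five sample points and the names `points`, `nearest_dist` and `nearest_idx` are made up for illustration; the diagonal is set to infinity so that a point is not matched with itself.

```python
import numpy as np

# Hypothetical sample data: four points on a line plus one far away.
points = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [10, 10]], dtype=float)

# (N, 1, 2) - (1, N, 2) broadcasts to (N, N, 2): all pairwise differences.
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))  # dist[i, j] = distance between i and j

# Ignore each point's zero distance to itself.
np.fill_diagonal(dist, np.inf)

nearest_dist = dist.min(axis=1)    # distance to the nearest other point
nearest_idx = dist.argmin(axis=1)  # index of that point
```

Keep in mind that the intermediate `(N, N, 2)` array grows quadratically: for 11600 points in float64 it holds 11600 × 11600 × 2 values, which is roughly 2 GB.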

Thanks for your answers, guys! I tried the following approach to solve this, and it runs quickly.

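from statistics import mean
import numpy as np
from scipy.spatial import distance
D = distance.squareform(distance.pdist(dataset))
closest = np.argsort(D, axis=1)
# column 0 of closest is the point itself (distance 0), so column 1 is its nearest neighbor
nearest = []
for i in range(len(dataset)):
    nearest.append(D[i][closest[i][1]])
avg_dist = int(mean(nearest))
outliers = set()
for i in range(len(dataset)):
    d1 = D[i][closest[i][1]]
    d2 = D[i][closest[i][2]]
    # the threshold of 2 is specific to this dataset
    if abs(avg_dist - d1) > 2:
        if abs(avg_dist - d2) > 2:
            print(dataset[i])
            outliers.add(i)
# remove the outliers after the loop so the indices into D stay valid
dataset = [p for i, p in enumerate(dataset) if i not in outliers]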
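
A k-d tree is another way to get nearest-neighbor distances without building the full N × N matrix. The sketch below uses `scipy.spatial.cKDTree` instead of `pdist`, so it is a different technique from the code above, not a rewrite of it. The sample points and the rule "nearest neighbor more than 3 times the median distance away" are assumptions for illustration only; the threshold would need tuning for real data.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical sample data: a row of evenly spaced points plus one stray point.
points = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [2, 5]], dtype=float)

tree = cKDTree(points)
# k=2 returns, per point, itself (distance 0) and its nearest other point.
dists, idx = tree.query(points, k=2)
d1 = dists[:, 1]  # nearest-neighbor distance for every point

# Assumed rule: flag a point whose nearest neighbor is much farther away
# than is typical; the factor 3 is an arbitrary illustrative threshold.
typical = np.median(d1)
is_outlier = d1 > 3 * typical

cleaned = points[~is_outlier]
```

Because the mask is built first and applied once at the end, nothing is removed while the distances are still being read.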

If you need all the distances at once:

import scipy.spatial
distances = scipy.spatial.distance_matrix(dataset, dataset)

If you need the distances from one point to all the other points:

import numpy as np
for pt in dataset:
    distances = scipy.spatial.distance_matrix([pt], dataset)[0]
    # distances.min() will be 0 because the point has 0 distance to itself
    # the nearest neighbor will be the second element in sorted order
    indices = np.argpartition(distances, 1) # or use argsort for a complete sort
    nearest_neighbor = indices[1]
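
A possible middle ground between the two variants is to process the rows in blocks, which bounds memory while keeping most of the work vectorized. This is only a sketch: the function name `nearest_neighbor_distances`, the `block_size` parameter and the sample data are hypothetical.

```python
import numpy as np
from scipy.spatial import distance_matrix

def nearest_neighbor_distances(points, block_size=1000):
    """Return each point's distance to its nearest other point, block by block."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    result = np.empty(n)
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Distances from this block of rows to every point: shape (stop - start, n).
        block = distance_matrix(points[start:stop], points)
        # Mask each point's zero distance to itself.
        block[np.arange(stop - start), np.arange(start, stop)] = np.inf
        result[start:stop] = block.min(axis=1)
    return result

# Hypothetical sample data.
sample = [[0, 0], [1, 0], [2, 0], [3, 0], [10, 10]]
nn = nearest_neighbor_distances(sample, block_size=2)
```

Only one `block_size × N` slice of the distance matrix exists at any time, so the peak memory is set by `block_size` and not by N × N.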

Documentation: distance_matrix, argpartition
