Find each subset of the 3 closest coordinates (lat and long), Python



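I have a bike-share dataset. Each station's record has a latitude and a longitude. A sample of the data is shown below. I want to find every group of 3 stations that are close to each other in terms of coordinates, and sum the count for each group (the 3 closest points).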

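I know how to compute the distance between two points, but I don't know how to program the search for each subset of 3 stations with the closest coordinates.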

Code to compute the distance between 2 points:

from math import cos, asin, sqrt, pi

def distance(lat1, lon1, lat2, lon2):
    # Haversine distance in kilometres; 12742 km is the Earth's diameter.
    p = pi/180
    a = 0.5 - cos((lat2-lat1)*p)/2 + cos(lat1*p) * cos(lat2*p) * (1-cos((lon2-lon1)*p))/2
    return 12742 * asin(sqrt(a))
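As a quick sanity check, the function can be applied to two stations from the sample data in this question. This is only a minimal sketch; the variable name `d_km` is an assumption introduced for illustration.

```python
from math import cos, asin, sqrt, pi

def distance(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two (lat, lon) points given in degrees."""
    p = pi / 180
    a = (0.5 - cos((lat2 - lat1) * p) / 2
         + cos(lat1 * p) * cos(lat2 * p) * (1 - cos((lon2 - lon1) * p)) / 2)
    return 12742 * asin(sqrt(a))

# Schous plass vs. Schous plass trikkestopp (coordinates taken from the sample data).
d_km = distance(59.920259, 10.760629, 59.920728, 10.759486)
print(f"{d_km * 1000:.0f} m apart")
```

The two stops share a name and lie well under 100 m apart, which is a plausible result for neighbouring docks.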

Data:


    start_station_name         start_station_latitude  start_station_longitude  count
0   Schous plass               59.920259               10.760629                2
1   Pilestredet                59.926224               10.729625                4
2   Kirkeveien                 59.933558               10.726426                8
3   Hans Nielsen Hauges plass  59.939244               10.774319                0
4   Fredensborg                59.920995               10.750358                8
5   Marienlyst                 59.932454               10.721769                9
6   Sofienbergparken nord      59.923229               10.766171                3
7   Stensparken                59.927140               10.730981                4
8   Vålerenga                  59.908576               10.786856                6
9   Schous plass trikkestopp   59.920728               10.759486                5
10  Griffenfeldts gate         59.933703               10.751930                4
11  Hallénparken               59.931530               10.762169                8
12  Alexander Kiellands Plass  59.928058               10.751397                3
13  Uranienborgparken          59.922485               10.720896                2
14  Sommerfrydhagen            59.911453               10.776072                1
15  Vestkanttorvet             59.924403               10.713069                8
16  Bislettgata                59.923834               10.734638                9
17  Biskop Gunnerus' gate      59.912334               10.752292                1
18  Botanisk Hage sør          59.915282               10.769620                1
19  Hydroparken                59.914145               10.715505                1
20  Bøkkerveien                59.927375               10.796015                1

What I want:


closest                                   count_sum

Schous plass, Pilestredet, Kirkeveien       14
.
.
.


The Error: 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-1a4d3a72c23d> in <module>
7     for idx_1, idx_2 in [(0, 1), (1, 2), (0, 2)]:
8         total_distance += distance(
----> 9             combination[idx_1]['start_station_latitude'],
10             combination[idx_1]['start_station_longitude'],
11             combination[idx_2]['start_station_latitude'],
TypeError: 'int' object is not subscriptable

You can use itertools.combinations() to try every possible combination and keep the triple of stations with the smallest total distance.

from itertools import combinations

# `data` must be a sequence of dict-like rows; for a pandas DataFrame,
# pass data.to_dict('records') rather than the DataFrame itself.
best = (float('inf'), None)
for combination in combinations(data, 3):
    total_distance = 0
    # Sum the three pairwise distances within the triple.
    for idx_1, idx_2 in [(0, 1), (1, 2), (0, 2)]:
        total_distance += distance(
            combination[idx_1]['start_station_latitude'],
            combination[idx_1]['start_station_longitude'],
            combination[idx_2]['start_station_latitude'],
            combination[idx_2]['start_station_longitude'],
        )
    if total_distance < best[0]:
        best = (total_distance, combination)
print(f'Best combination is {best[1]}, total distance: {best[0]}')
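For reference, here is a self-contained sketch of the same brute-force search, run on six rows copied from the question's sample data. The names `rows`, `stations`, `total_pairwise_distance`, `closest` and `count_sum` are assumptions introduced for illustration; the rows are stored as plain dicts, which is the shape `DataFrame.to_dict('records')` would produce if the data lives in pandas.

```python
from itertools import combinations
from math import cos, asin, sqrt, pi

def distance(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two (lat, lon) points given in degrees."""
    p = pi / 180
    a = (0.5 - cos((lat2 - lat1) * p) / 2
         + cos(lat1 * p) * cos(lat2 * p) * (1 - cos((lon2 - lon1) * p)) / 2)
    return 12742 * asin(sqrt(a))

# Six rows from the sample data: (name, latitude, longitude, count).
rows = [
    ("Schous plass", 59.920259, 10.760629, 2),
    ("Pilestredet", 59.926224, 10.729625, 4),
    ("Kirkeveien", 59.933558, 10.726426, 8),
    ("Sofienbergparken nord", 59.923229, 10.766171, 3),
    ("Stensparken", 59.927140, 10.730981, 4),
    ("Schous plass trikkestopp", 59.920728, 10.759486, 5),
]
keys = ("start_station_name", "start_station_latitude",
        "start_station_longitude", "count")
stations = [dict(zip(keys, row)) for row in rows]

def total_pairwise_distance(triple):
    """Sum of the three pairwise distances inside a triple of stations."""
    return sum(
        distance(a["start_station_latitude"], a["start_station_longitude"],
                 b["start_station_latitude"], b["start_station_longitude"])
        for a, b in combinations(triple, 2)
    )

# Try every triple and keep the one with the smallest total distance.
best = min(combinations(stations, 3), key=total_pairwise_distance)
closest = ", ".join(s["start_station_name"] for s in best)
count_sum = sum(s["count"] for s in best)
print(closest, count_sum)
```

With n stations this checks n·(n−1)·(n−2)/6 triples, which is fine for a few dozen stations but grows quickly for larger datasets.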

Keep in mind that there is still room for optimization, for example caching the distance between two stations:

from functools import lru_cache

@lru_cache(maxsize=None)
def distance(lat1, lon1, lat2, lon2):
    p = pi/180
    ...
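A complete, runnable version of the cached function might look like the sketch below. It assumes the body is the same haversine formula as in the question; `first`, `second` and `info` are illustrative names. Note that `lru_cache` keys on the exact argument tuple, so calling with the two points swapped is stored as a separate entry.

```python
from functools import lru_cache
from math import cos, asin, sqrt, pi

@lru_cache(maxsize=None)
def distance(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres; results are memoised per argument tuple."""
    p = pi / 180
    a = (0.5 - cos((lat2 - lat1) * p) / 2
         + cos(lat1 * p) * cos(lat2 * p) * (1 - cos((lon2 - lon1) * p)) / 2)
    return 12742 * asin(sqrt(a))

# Schous plass -> Pilestredet, computed once and then served from the cache.
first = distance(59.920259, 10.760629, 59.926224, 10.729625)
second = distance(59.920259, 10.760629, 59.926224, 10.729625)

# cache_info() reports how many calls were answered from the cache.
info = distance.cache_info()
print(info)
```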
