I have a pandas DataFrame with GPS points that looks like this:
import pandas as pd
d = {'user': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'], 'lat': [ 37.75243634842733, 37.75344580658182, 37.75405656449232, 37.753649393112181,37.75409897804892, 37.753937806404586, 37.72767062183685, 37.72710631810977, 37.72605407110467, 37.71141865080228, 37.712199505873926, 37.713285899241896, 37.71428740401767, 37.712810604103346], 'lon': [-122.41924881935118, -122.42006421089171, -122.419216632843, -122.41784334182738, -122.4169099330902, -122.41549372673035, -122.3878937959671, -122.3884356021881, -122.38841414451599, -122.44688630104064, -122.44474053382874, -122.44361400604248, -122.44260549545288, -122.44156479835509]}
df = pd.DataFrame(data=d)
   user        lat         lon
0     A  37.752436 -122.419249
1     A  37.753446 -122.420064
2     A  37.754057 -122.419217
3     A  37.753649 -122.417843
4     A  37.754099 -122.416910
5     A  37.753938 -122.415494
6     B  37.727671 -122.387894
7     B  37.727106 -122.388436
8     B  37.726054 -122.388414
9     C  37.711419 -122.446886
10    C  37.712200 -122.444741
11    C  37.713286 -122.443614
12    C  37.714287 -122.442605
13    C  37.712811 -122.441565
Using the functions below, I can feed all of these coordinates from df directly into an OSRM request to map-match the GPS points:
import numpy as np
from typing import Dict, Any, List, Tuple
import requests

# Format NumPy array of (lat, lon) coordinates into a concatenated string formatted for OSRM server
def format_coords(coords: np.ndarray) -> str:
    coords = ";".join([f"{lon:f},{lat:f}" for lat, lon in coords])
    return coords

# Forward request to the OSRM server and return a dictionary of the JSON response.
def make_request(
    coords: np.ndarray,
) -> Dict[str, Any]:
    coords = format_coords(coords)
    url = f"http://router.project-osrm.org/match/v1/car/{coords}"
    r = requests.get(url)
    return r.json()

coords=df[['lat','lon']].values
# Make request against the OSRM HTTP server
output = make_request(coords)
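However, since df contains separate GPS trajectories generated by different users, I would like to write a function that loops over this DataFrame and sends each user's set of coordinates in its own request, rather than sending everything at once. What is the best way to do this?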
You can groupby the DataFrame on the user column, then apply make_request to each group and save the result in the output dict (keyed by user):
output = {}
for user, g in df.groupby('user'):
    output[user] = make_request(g[['lat', 'lon']].values)
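
Before sending anything to the server, it can help to check that each group produces the coordinate string you expect. The sketch below is only an illustration: `sample` is a hypothetical, shortened stand-in for df (two users, rounded values), and `coord_strings` is a name introduced here. It reuses the `format_coords` logic from the question and makes no network call.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened stand-in for df (same columns, rounded values)
sample = pd.DataFrame({
    "user": ["A", "A", "B", "B"],
    "lat": [37.752436, 37.753446, 37.727671, 37.727106],
    "lon": [-122.419249, -122.420064, -122.387894, -122.388436],
})

def format_coords(coords: np.ndarray) -> str:
    # Same logic as in the question: "lon,lat" pairs joined by ";"
    return ";".join(f"{lon:f},{lat:f}" for lat, lon in coords)

# One coordinate string per user; groupby yields (key, sub-DataFrame) pairs
coord_strings = {
    user: format_coords(g[["lat", "lon"]].to_numpy())
    for user, g in sample.groupby("user")
}

for user, s in coord_strings.items():
    print(user, s)
```

Each value in `coord_strings` is what `make_request` would place in the URL for that user, so the groups can be inspected before any request goes out.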