Loop through a DataFrame and make a URL request for each user group



I have a pandas DataFrame of GPS points that looks like this:

import pandas as pd
d = {'user': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'], 'lat': [ 37.75243634842733, 37.75344580658182, 37.75405656449232, 37.753649393112181,37.75409897804892, 37.753937806404586, 37.72767062183685, 37.72710631810977, 37.72605407110467, 37.71141865080228, 37.712199505873926, 37.713285899241896, 37.71428740401767, 37.712810604103346], 'lon': [-122.41924881935118, -122.42006421089171, -122.419216632843, -122.41784334182738, -122.4169099330902, -122.41549372673035, -122.3878937959671, -122.3884356021881, -122.38841414451599, -122.44688630104064, -122.44474053382874, -122.44361400604248, -122.44260549545288, -122.44156479835509]}
df = pd.DataFrame(data=d)

    user    lat         lon
0   A       37.752436   -122.419249
1   A       37.753446   -122.420064
2   A       37.754057   -122.419217
3   A       37.753649   -122.417843
4   A       37.754099   -122.416910
5   A       37.753938   -122.415494
6   B       37.727671   -122.387894
7   B       37.727106   -122.388436
8   B       37.726054   -122.388414
9   C       37.711419   -122.446886
10  C       37.712200   -122.444741
11  C       37.713286   -122.443614
12  C       37.714287   -122.442605
13  C       37.712811   -122.441565

Using the functions below, I can feed all of these coordinates from df directly into a single OSRM request to map-match the GPS points:

import numpy as np
from typing import Dict, Any, List, Tuple
import requests
# Format NumPy array of (lat, lon) coordinates into a concatenated string formatted for OSRM server
def format_coords(coords: np.ndarray) -> str:
    coords = ";".join([f"{lon:f},{lat:f}" for lat, lon in coords])
    return coords

# Forward request to the OSRM server and return a dictionary of the JSON response.
def make_request(
    coords: np.ndarray,
) -> Dict[str, Any]:
    coords = format_coords(coords)
    url = f"http://router.project-osrm.org/match/v1/car/{coords}"
    r = requests.get(url)
    return r.json()

coords = df[['lat', 'lon']].values
# Make request against the OSRM HTTP server
output = make_request(coords)

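However, since df contains separate GPS traces generated by different users, I would like to write a function that loops through the DataFrame and sends each user's set of coordinates in its own request, rather than sending everything at once. What is the best way to do this?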

You can group the DataFrame by the user column, call make_request on each group, and store the results in a dict named output, keyed by user:

output = {}
for user, g in df.groupby('user'):
    output[user] = make_request(g[['lat', 'lon']].values)
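
If some traces are very short or the server cannot match them, the loop above will still store whatever comes back. The following is only a sketch of one way to guard against that, not part of the original answer. The name match_per_user, the min_points threshold, and passing the request function in as an argument are all hypothetical choices made for illustration. It also assumes that a successful OSRM response carries "code": "Ok".

```python
from typing import Any, Callable, Dict

import numpy as np
import pandas as pd


def match_per_user(
    df: pd.DataFrame,
    request_fn: Callable[[np.ndarray], Dict[str, Any]],
    min_points: int = 2,  # hypothetical threshold: skip traces shorter than this
) -> Dict[str, Dict[str, Any]]:
    """Call request_fn once per user and keep only successful responses."""
    results: Dict[str, Dict[str, Any]] = {}
    # sort=False keeps users in order of first appearance in df
    for user, group in df.groupby("user", sort=False):
        coords = group[["lat", "lon"]].to_numpy()
        if len(coords) < min_points:
            continue  # too few points to form a trace
        response = request_fn(coords)
        # Assumption: a successful response has "code" == "Ok"
        if response.get("code") == "Ok":
            results[user] = response
    return results
```

With the question's function this would be called as output = match_per_user(df, make_request). Taking the request function as a parameter also makes it possible to try the loop with a stub instead of hitting the public server.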
