From a pandas DataFrame to tuples (for the haversine module)



I have a pandas DataFrame my_df with the following columns:

id  lat1 lon1 lat2 lon2
1   45   0    41   3
2   40   1    42   4
3   42   2    37   1

Basically, I'd like to do the following:

import haversine
haversine.haversine((45, 0), (41, 3)) # just to show syntax of haversine()
> 507.20410687342115
# what I'd like to do
my_df["dist"] = haversine.haversine((my_df["lat1"], my_df["lon1"]),(my_df["lat2"], my_df["lon2"]))

TypeError: cannot convert the series to <class 'float'>

Working from that, I tried the following:

my_df['dist'] = haversine.haversine(
list(zip(*[my_df[['lat1','lon1']][c].values.tolist() for c in my_df[['lat1','lon1']]]))
, 
list(zip(*[my_df[['lat2','lon2']][c].values.tolist() for c in my_df[['lat2','lon2']]]))
)

File "blabla\lib\site-packages\haversine\__init__.py", line 20, in haversine
    lat1, lng1 = point1

ValueError: too many values to unpack (expected 2)

Any idea what I'm doing wrong, or how I can achieve what I want?

Use apply with axis=1:

my_df["dist"] = my_df.apply(lambda row : haversine.haversine((row["lat1"], row["lon1"]),(row["lat2"], row["lon2"])), axis=1)

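This calls the haversine function on each row. The function understands scalar values, not array-like values, which is why you get the error. Calling apply with axis=1 iterates row by row, so each column value can be accessed and passed in the form the function expects.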
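To make the scalar-versus-column distinction concrete, here is a minimal sketch. It assumes a stand-in function, scalar_haversine (a hypothetical name, not part of the haversine package), that mimics the package's two-tuple call signature using only the standard math module and an Earth radius of 6371 km. The DataFrame is rebuilt from the sample data in the question.

```python
import math

import pandas as pd


def scalar_haversine(point1, point2, earth_radius=6371):
    """Hypothetical stand-in for haversine.haversine: two (lat, lon) tuples of scalars."""
    lat1, lon1 = point1
    lat2, lon2 = point2
    # math.radians only accepts scalars, so a whole Series raises TypeError here
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return earth_radius * 2 * math.asin(math.sqrt(a))


my_df = pd.DataFrame({
    "id": [1, 2, 3],
    "lat1": [45, 40, 42], "lon1": [0, 1, 2],
    "lat2": [41, 42, 37], "lon2": [3, 4, 1],
})

# Passing whole columns fails, because each tuple element is a Series
try:
    scalar_haversine((my_df["lat1"], my_df["lon1"]),
                     (my_df["lat2"], my_df["lon2"]))
    column_call_failed = False
except TypeError:
    column_call_failed = True

# Row-wise apply hands the function one scalar per column
my_df["dist"] = my_df.apply(
    lambda row: scalar_haversine((row["lat1"], row["lon1"]),
                                 (row["lat2"], row["lon2"])),
    axis=1,
)
print(my_df)
```

Row-wise apply is convenient but runs a Python-level call per row, so it may be slow on large frames.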

Also, I don't know how much of a difference it makes, but there is a vectorized version of the haversine formula.

How about using a vectorized approach:

import numpy as np
import pandas as pd

# vectorized haversine function
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    slightly modified version of http://stackoverflow.com/a/29546836/2901002

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians).

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        # convert all four inputs from degrees to radians in one call
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    # haversine term, computed element-wise over whole columns
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)

    return earth_radius * 2 * np.arcsin(np.sqrt(a))

Demo:

In [38]: df
Out[38]:
id  lat1  lon1  lat2  lon2
0   1    45     0    41     3
1   2    40     1    42     4
2   3    42     2    37     1
In [39]: df['dist'] = haversine(df.lat1, df.lon1, df.lat2, df.lon2)
In [40]: df
Out[40]:
id  lat1  lon1  lat2  lon2        dist
0   1    45     0    41     3  507.204107
1   2    40     1    42     4  335.876312
2   3    42     2    37     1  562.543582
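As a quick sanity check, the sketch below compares a vectorized call against a row-by-row loop over the same sample data. It is a self-contained illustration, not part of the original answer: haversine_np is a hypothetical name for a trimmed copy of the function (degrees only, radius assumed to be 6371 km), written against NumPy directly.

```python
import numpy as np
import pandas as pd


def haversine_np(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; accepts scalars or equal-length arrays in degrees."""
    lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))


df = pd.DataFrame({
    "id": [1, 2, 3],
    "lat1": [45, 40, 42], "lon1": [0, 1, 2],
    "lat2": [41, 42, 37], "lon2": [3, 4, 1],
})

# One call over whole columns
vectorized = haversine_np(df.lat1, df.lon1, df.lat2, df.lon2)

# The same function called once per row with scalar arguments
cols = ["lat1", "lon1", "lat2", "lon2"]
rowwise = [haversine_np(*row) for row in df[cols].itertuples(index=False)]

print(np.allclose(vectorized, rowwise))
```

Because the arithmetic is all NumPy ufuncs, the same function body serves both the scalar and the column case; only the vectorized call avoids the per-row Python overhead.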
