Find the starting X and Y of a LineString in GeoPandas



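I couldn't find this anywhere, so I hope I don't get flamed too much here.

I have a polyline shapefile and I'm trying to extract the start and end XY as new columns, but I can't work out how to do this with geopandas.

I'd like to end up with four new columns: StartX, StartY, EndX, EndY.

Does anyone know how to get this?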

Here's an MRE with 100 LineStrings of random length:

import numpy as np
import shapely.geometry
import geopandas as gpd

gdf = gpd.GeoDataFrame(
    geometry=[
        # each line gets between 2 and 19 random vertices
        shapely.geometry.LineString([tuple(i) for i in np.cumsum(np.random.random(size=(np.random.randint(2, 20), 2)), axis=1)])
        for _ in range(100)
    ],
)

The dataframe looks like this:

In [2]: gdf
Out[2]:
geometry
0   LINESTRING (0.36610 1.03088, 0.06126 0.29416, ...
1   LINESTRING (0.46164 1.26251, 0.16294 0.45719, ...
2   LINESTRING (0.45853 1.00003, 0.81500 0.92658, ...
3   LINESTRING (0.89925 1.11712, 0.22847 0.97792, ...
4   LINESTRING (0.05748 1.04220, 0.19561 0.86062, ...
..                                                ...
95  LINESTRING (0.62349 0.71080, 0.91981 1.44771, ...
96  LINESTRING (0.18924 0.91123, 0.94212 1.39855, ...
97  LINESTRING (0.79314 1.29408, 0.20462 0.73740, ...
98  LINESTRING (0.07744 0.87544, 0.87101 0.97909, ...
99  LINESTRING (0.31411 0.53442, 0.63755 0.78146, ...
[100 rows x 1 columns]

You can use gdf.boundary to get the boundary of each LineString as a MultiPoint:

In [3]: gdf.boundary
Out[3]:
0     MULTIPOINT (0.36610 1.03088, 0.32418 0.81727)
1     MULTIPOINT (0.46164 1.26251, 0.30703 0.95910)
2     MULTIPOINT (0.45853 1.00003, 0.95016 1.53127)
3     MULTIPOINT (0.89925 1.11712, 0.95730 1.13740)
4     MULTIPOINT (0.05748 1.04220, 0.42954 1.36282)
...
95    MULTIPOINT (0.62349 0.71080, 0.93710 1.55117)
96    MULTIPOINT (0.18924 0.91123, 0.48047 1.08956)
97    MULTIPOINT (0.79314 1.29408, 0.24173 0.56003)
98    MULTIPOINT (0.07744 0.87544, 0.23844 1.23815)
99    MULTIPOINT (0.31411 0.53442, 0.00648 0.76329)
Length: 100, dtype: geometry
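
To make it clearer what the boundary holds, here is a minimal sketch on a single hand-written line (the coordinates are made up for illustration and are not part of the MRE). For an open LineString, the boundary is a MultiPoint holding its first and last vertex:

```python
from shapely.geometry import LineString

# Hypothetical three-vertex line, used only for illustration
line = LineString([(0, 0), (1, 2), (3, 1)])

# The boundary of an open line is a MultiPoint of its two endpoints
boundary = line.boundary

# .geoms iterates over the member Points: start first, then end
first, last = boundary.geoms

print(boundary.geom_type)   # MultiPoint
print(first.x, first.y)     # coordinates of the first vertex
print(last.x, last.y)       # coordinates of the last vertex
```

The middle vertex (1, 2) does not appear in the boundary, which is why this works for lines with any number of vertices.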

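You can then combine this with explode(), which converts any multi-part geometries into separate rows, and then unstack the extra index level this produces to create new columns, 0 and 1, for the start and end points. Each column will contain shapely.geometry.Point objects: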

In [4]: bounds = gdf.geometry.boundary.explode(index_parts=True).unstack()
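
If the reshaping is hard to picture, the same chain can be tried on two small hand-written lines. This is only a sketch: the data and the names lines, endpoints, start_xy and end_xy are made up for illustration, and it assumes a geopandas version whose explode() accepts index_parts (as used above):

```python
import geopandas as gpd
from shapely.geometry import LineString

# Two hypothetical lines, not taken from the MRE
lines = gpd.GeoSeries([
    LineString([(0, 0), (1, 1), (2, 0)]),
    LineString([(5, 5), (6, 7)]),
])

# boundary: one MultiPoint (start, end) per line
# explode(index_parts=True): one Point per row, indexed by (row, part)
# unstack(): the part number becomes the column label, 0 = start, 1 = end
endpoints = lines.boundary.explode(index_parts=True).unstack()

# Pair up the coordinates of each column for a quick check
start_xy = list(zip(endpoints[0].x, endpoints[0].y))
end_xy = list(zip(endpoints[1].x, endpoints[1].y))

print(start_xy)
print(end_xy)
```

Each row of endpoints lines up with the original index, which is what makes the column assignment in the next step safe.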

Finally, we can read the x and y values of these points directly:

In [5]: gdf['StartX'] = bounds[0].x
   ...: gdf['StartY'] = bounds[0].y
   ...: gdf['EndX'] = bounds[1].x
   ...: gdf['EndY'] = bounds[1].y

This gives the final result you're looking for:

In [6]: gdf
Out[6]:
geometry    StartX    StartY      EndX      EndY
0   LINESTRING (0.36610 1.03088, 0.06126 0.29416, ...  0.366098  1.030880  0.324176  0.817272
1   LINESTRING (0.46164 1.26251, 0.16294 0.45719, ...  0.461642  1.262513  0.307032  0.959099
2   LINESTRING (0.45853 1.00003, 0.81500 0.92658, ...  0.458530  1.000032  0.950164  1.531267
3   LINESTRING (0.89925 1.11712, 0.22847 0.97792, ...  0.899254  1.117123  0.957299  1.137399
4   LINESTRING (0.05748 1.04220, 0.19561 0.86062, ...  0.057482  1.042202  0.429537  1.362817
..                                                ...       ...       ...       ...       ...
95  LINESTRING (0.62349 0.71080, 0.91981 1.44771, ...  0.623486  0.710795  0.937098  1.551175
96  LINESTRING (0.18924 0.91123, 0.94212 1.39855, ...  0.189237  0.911229  0.480470  1.089565
97  LINESTRING (0.79314 1.29408, 0.20462 0.73740, ...  0.793135  1.294084  0.241726  0.560030
98  LINESTRING (0.07744 0.87544, 0.87101 0.97909, ...  0.077441  0.875441  0.238441  1.238148
99  LINESTRING (0.31411 0.53442, 0.63755 0.78146, ...  0.314106  0.534418  0.006481  0.763287
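
One caveat worth knowing: a closed LineString (one whose last vertex equals its first) has an empty boundary, so the boundary-based approach has no endpoints to return for it. If your shapefile might contain such lines, a possible alternative is to read the first and last vertex straight from each geometry's coords. The sketch below uses made-up data and assumes every geometry is a plain, non-empty LineString (not a MultiLineString):

```python
import geopandas as gpd
from shapely.geometry import LineString

# Hypothetical data: one open line and one closed ring-like line
gdf = gpd.GeoDataFrame(geometry=[
    LineString([(0, 0), (1, 1), (2, 0)]),
    LineString([(0, 0), (1, 0), (1, 1), (0, 0)]),  # closed: start == end
])

# coords[0] is the first vertex, coords[-1] the last, as (x, y) tuples
starts = [line.coords[0] for line in gdf.geometry]
ends = [line.coords[-1] for line in gdf.geometry]

gdf["StartX"] = [x for x, y in starts]
gdf["StartY"] = [y for x, y in starts]
gdf["EndX"] = [x for x, y in ends]
gdf["EndY"] = [y for x, y in ends]

print(gdf[["StartX", "StartY", "EndX", "EndY"]])
```

This loops in Python rather than using vectorized operations, so it may be slower on very large files, but it does not depend on the boundary being non-empty.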
