Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/51071365/

Date: 2020-09-14 05:44:53  Source: igfitidea

Convert Points to Lines Geopandas

Tags: python, pandas, geopandas

Asked by mm_nieder

Hello, I am trying to convert a list of X and Y coordinates to lines. I want to map this data by grouping on the IDs and also on time. My code executes successfully as long as I groupby one column, but with two columns I run into errors. I referred to this question.

Here's some sample data:

ID  X           Y           Hour
1   -87.78976   41.97658    16
1   -87.66991   41.92355    16
1   -87.59887   41.708447   17
2   -87.73956   41.876827   16
2   -87.68161   41.79886    16
2   -87.5999    41.7083     16
3   -87.59918   41.708485   17
3   -87.59857   41.708393   17
3   -87.64391   41.675133   17

Here's my code:

import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point, LineString

df = pd.read_csv("snow_gps.csv", sep=';')

# zip the coordinates into Point objects and convert to a GeoDataFrame
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = GeoDataFrame(df, geometry=geometry)

# aggregate these points into lines with groupby
geo_df = geo_df.groupby(['track_seg_point_id', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df = GeoDataFrame(geo_df, geometry='geometry')

Here is the error: ValueError: LineStrings must have at least 2 coordinate tuples

This is the final result I am trying to get:

ID          Hour     geometry
1           16       LINESTRING (-87.78976 41.97658, -87.66991 41.9... 
1           17       LINESTRING (-87.78964000000001 41.976634999999... 
1           18       LINESTRING (-87.78958 41.97663499999999, -87.6... 
2           16       LINESTRING (-87.78958 41.976612, -87.669785 41... 
2           17       LINESTRING (-87.78958 41.976624, -87.66978 41.... 
3           16       LINESTRING (-87.78958 41.97666, -87.6695199999... 
3           17       LINESTRING (-87.78954 41.976665, -87.66927 41.... 

Any suggestions or ideas on how to group by multiple columns would be great.

Answered by Tom G.

Your code is fine; the problem is your data.

If you group by ID and Hour, only 1 point falls into the group with an ID of 1 and an Hour of 17. A LineString has to consist of 2 or more Points (it must have at least 2 coordinate tuples), so that group raises the error. I added another point to your sample data:

    ID   X          Y           Hour
    1   -87.78976   41.97658    16
    1   -87.66991   41.92355    16
    1   -87.59887   41.708447   17
    1   -87.48234   41.677342   17
    2   -87.73956   41.876827   16
    2   -87.68161   41.79886    16
    2   -87.5999    41.7083     16
    3   -87.59918   41.708485   17
    3   -87.59857   41.708393   17
    3   -87.64391   41.675133   17
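
One way to find the groups that trigger the error is to count the points in each (ID, Hour) group before building any lines. This is only a minimal sketch: it assumes the column names from the sample data, and the inline DataFrame is a stand-in for the CSV file, holding the original sample from the question (before the extra point was added).

```python
import pandas as pd

# Original sample data from the question (stand-in for snow_gps.csv).
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "X":    [-87.78976, -87.66991, -87.59887, -87.73956, -87.68161,
             -87.5999, -87.59918, -87.59857, -87.64391],
    "Y":    [41.97658, 41.92355, 41.708447, 41.876827, 41.79886,
             41.7083, 41.708485, 41.708393, 41.675133],
    "Hour": [16, 16, 17, 16, 16, 16, 17, 17, 17],
})

# Number of points in each (ID, Hour) group.
group_sizes = df.groupby(["ID", "Hour"]).size()

# Groups with fewer than 2 points cannot form a LineString.
too_small = group_sizes[group_sizes < 2]
print(too_small)
```

For the original sample this reports a single offending group: ID 1, Hour 17, with one point.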

As you can see, the code below is almost identical to yours:

    import pandas as pd
    import geopandas as gpd
    from shapely.geometry import Point, LineString

    # a regex separator needs a raw string and the python parser engine
    df = pd.read_csv("snow_gps.csv", sep=r'\s*,\s*', engine='python')

    # zip the coordinates into Point objects and convert to a GeoDataFrame
    geometry = [Point(xy) for xy in zip(df.X, df.Y)]
    geo_df = gpd.GeoDataFrame(df, geometry=geometry)

    # build one LineString per (ID, Hour) group
    geo_df2 = geo_df.groupby(['ID', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
    geo_df2 = gpd.GeoDataFrame(geo_df2, geometry='geometry')
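
If editing the data is not an option, an alternative is to drop the groups that have fewer than 2 points before building the lines. The sketch below is not part of the original answer. It assumes the sample column names, uses an inline DataFrame as a stand-in for the CSV file, and the names `counts`, `valid`, `lines` and `lines_gdf` are chosen only for illustration.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point, LineString

# Original sample data from the question (stand-in for snow_gps.csv).
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "X":    [-87.78976, -87.66991, -87.59887, -87.73956, -87.68161,
             -87.5999, -87.59918, -87.59857, -87.64391],
    "Y":    [41.97658, 41.92355, 41.708447, 41.876827, 41.79886,
             41.7083, 41.708485, 41.708393, 41.675133],
    "Hour": [16, 16, 17, 16, 16, 16, 17, 17, 17],
})

geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = gpd.GeoDataFrame(df, geometry=geometry)

# Size of the (ID, Hour) group each row belongs to, aligned with the rows.
counts = df.groupby(["ID", "Hour"])["X"].transform("size")

# Keep only rows whose group has at least 2 points.
valid = geo_df[counts >= 2]

# Build one LineString per remaining (ID, Hour) group.
lines = valid.groupby(["ID", "Hour"])["geometry"].apply(
    lambda pts: LineString(pts.tolist())
)
lines_gdf = gpd.GeoDataFrame(lines.reset_index(), geometry="geometry")
print(lines_gdf)
```

With the original sample this silently drops the single point for ID 1 at Hour 17 and yields three lines. Whether discarding such points is acceptable depends on the use case.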