pandas 合并 geopandas 中的地理数据框(CRS 不匹配)

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/34420285/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-14 00:25:15  来源:igfitidea点击:

Merging geodataframes in geopandas (CRS do not match)

pandasgisgeopandas

提问by Alexis Eggermont

I am trying to merge two geodataframes (want to see which polygon each point is in).

我正在尝试合并两个地理数据框(想查看每个点所在的多边形)。

The following code gets me a warning first ("CRS does not match!") and then an error ("RTreeError: Coordinates must not have minimums more than maximums").

下面的代码首先给我一个警告(“ CRS does not match!”),然后是一个错误(“ RTreeError: Coordinates must not have minimums more than maximums”)。

What exactly is wrong in there? Are CRS coordinates systems? If so, why are they not loaded the same way?

究竟是哪里出了问题?CRS 是坐标系统吗?如果是这样,为什么它们的加载方式不同?

import geopandas as gpd
from shapely.geometry import Point, mapping,shape
from geopandas import GeoDataFrame, read_file
#from geopandas.tools import overlay
from geopandas.tools import sjoin

print('Reading points...')
points=pd.read_csv(points_csv)
points['geometry'] = points.apply(lambda z: Point(z.Latitude, z.Longitude), axis=1)
PointsGeodataframe = gpd.GeoDataFrame(points)
print PointsGeodataframe.head()
print('Reading polygons...')
PolygonsGeodataframe = gpd.GeoDataFrame.from_file(china_shapefile+".shp")
print PolygonsGeodataframe.head()
print('Merging GeoDataframes...')
merged=sjoin(PointsGeodataframe, PolygonsGeodataframe, how='left', op='intersects')

#merged = PointsGeodataframe.merge(PolygonsGeodataframe, left_on='iso_alpha2', right_on='ISO2', how='left')
print(merged.head(5))

Link to data for reproduction: Shapefile, GPS points

复制数据的链接: ShapefileGPS 点

回答by Gord Stephen

As noted in the comments on the question, you can eliminate the CRS does not match!warning by manually setting PointsGeodataframe.crs = PolygonsGeodataframe.crs(assuming the CRSs are indeed the same for both datasets).

如对该问题的评论所述,您可以CRS does not match!通过手动设置来消除警告PointsGeodataframe.crs = PolygonsGeodataframe.crs(假设两个数据集的 CRS 确实相同)。

However, that doesn't address the RTreeError. It's possible that you have missing lat/lon data in points_csv- in that case you would end up creating Pointobjects containing NaN values (i.e. Point(nan nan)), which go on to cause issues in rtree. I had a similar problem and the fix was just to filter out rows with missing coordinate data when loading in the CSV:

但是,这并没有解决RTreeError. 您可能缺少 lat/lon 数据points_csv- 在这种情况下,您最终会创建Point包含 NaN 值的对象(即Point(nan nan)),这会继续导致rtree. 我有一个类似的问题,修复只是在加载 CSV 时过滤掉缺少坐标数据的行:

points=pd.read_csv(points_csv).dropna(subset=["Latitude", "Longitude"])

回答by sawyer

I'll add an answer here since I was recently struggling with this and couldn't find a great answer here. The Geopandas documentationhas a good example for how to solve the "CRS does not match" issue.

我会在这里添加一个答案,因为我最近在为此苦苦挣扎并且在这里找不到很好的答案。该Geopandas文档对如何解决“CRS不匹配”的问题一个很好的例子。

I copied the entire code chunk from the documentation below, but the most relevant line is this one, where the to_crs()method is used to reproject a geodataframe. You can call mygeodataframe.crsto find the CRS of each dataframe, and then to_crs()to reproject one to match the other, like so:

我从下面的文档中复制了整个代码块,但最相关的是这一行,该to_crs()方法用于重新投影地理数据框。您可以调用mygeodataframe.crs以查找每个数据帧的 CRS,然后to_crs()重新投影一个以匹配另一个,如下所示:

world = world.to_crs({'init': 'epsg:3395'})

Simply setting PointsGeodataframe.crs = PolygonsGeodataframe.crswill stop the error from being thrown, but will not correctly reproject the geometry.

简单的设置PointsGeodataframe.crs = PolygonsGeodataframe.crs将阻止抛出错误,但不会正确地重新投影几何体。

Full documentation code for reference:

完整文档代码供参考:

# load example data
In [1]: world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))

# Check original projection
# (it's Platte Carre! x-y are long and lat)
In [2]: world.crs
Out[2]: {'init': 'epsg:4326'}

# Visualize
In [3]: ax = world.plot()

In [4]: ax.set_title("WGS84 (lat/lon)");

# Reproject to Mercator (after dropping Antartica)
In [5]: world = world[(world.name != "Antarctica") & (world.name != "Fr. S. Antarctic Lands")]

In [6]: world = world.to_crs({'init': 'epsg:3395'}) # world.to_crs(epsg=3395) would also work

In [7]: ax = world.plot()

In [8]: ax.set_title("Mercator");