Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original address: StackOverflow http://stackoverflow.com/questions/14035148/

Time: 2020-09-13 20:33:23  Source: igfitidea

Import netCDF file to Pandas dataframe

Tags: python, dataframe, pandas, netcdf

Asked by user1911866

Merry Christmas! I am still very new to Python and Pandas, so any help is appreciated. I am trying to read in a netCDF file, which I can do, and then import it into a Pandas DataFrame. The netCDF file is 2D, so I just want to 'dump it in'. I have tried the DataFrame method, but it doesn't recognize the object. Presumably I need to convert the netCDF object to a 2D numpy array? Again, thanks for any ideas on the best way to do this.

Answered by naught101

The xarray library handles arbitrary-dimensional netCDF data and retains metadata. Xarray provides a simple method of opening netCDF files and converting them to pandas dataframes:

import xarray as xr

ds = xr.open_dataset('/path/to/netcdf')
df = ds.to_dataframe()

This will create a dataframe with a multi-index containing all of the dimensions. Unfortunately, Pandas doesn't support arbitrary metadata, so that will be lost in the conversion, but you can keep ds around and use the metadata from that.
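
As a minimal sketch of what this conversion produces, the snippet below builds a small in-memory dataset instead of opening a file. The variable name temperature, the station dimension, and the units attribute are made up for illustration; a real file would supply its own names.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 2D variable (time x station) standing in for a netCDF file's contents.
times = pd.date_range("2020-01-01", periods=3, freq="D")
ds = xr.Dataset(
    {"temperature": (("time", "station"), np.arange(6.0).reshape(3, 2))},
    coords={"time": times, "station": [0, 1]},
)
ds["temperature"].attrs["units"] = "degC"  # metadata lives on the xarray object

# One row per (time, station) combination; the dimensions become index levels.
df = ds.to_dataframe()

# Pivot one index level into columns to get a conventional 2D table.
wide = df["temperature"].unstack("station")

# The dataframe does not carry the attributes, so read them from ds.
units = ds["temperature"].attrs["units"]
print(wide)
print("units:", units)
```

If a flat table is easier to work with than a multi-index, df.reset_index() turns the index levels into ordinary columns.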

Answered by Rich Signell

If your NetCDF file (or OPeNDAP dataset) follows the CF Metadata conventions, you can take advantage of them by using the NetCDF4-Python package, which makes accessing the data in Pandas really easy. (I'm using the Enthought Python Distribution, which includes both Pandas and NetCDF4-Python.)

In the example below, the NetCDF file is being served via OPeNDAP, and the NetCDF4-Python library lets you open and work with a remote OPeNDAP dataset just as if it were a local NetCDF file, which is pretty slick. If you want to see the attributes of the NetCDF4 file, point your browser at this link: http://geoport-dev.whoi.edu/thredds/dodsC/HUDSON_SVALLEY/5951adc-a1h.nc.html

You should be able to run this without changes:

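from matplotlib import pyplot as plt
import pandas as pd
import netCDF4

# Remote OPeNDAP dataset, the variable to read, and the station index to extract
url = 'http://geoport-dev.whoi.edu/thredds/dodsC/HUDSON_SVALLEY/5951adc-a1h.nc'
vname = 'Tx_1211'
station = 0

# Open the dataset and look up the data and time variables
nc = netCDF4.Dataset(url)
h = nc.variables[vname]
times = nc.variables['time']

# Decode the time values using the units attribute of the time variable
jd = netCDF4.num2date(times[:], times.units)

# Build a time-indexed Series for the selected station
hs = pd.Series(h[:, station], index=jd)

# Plot the series, labelling it from the variable and dataset attributes
fig = plt.figure(figsize=(12, 4))
ax = fig.add_subplot(111)
hs.plot(ax=ax, title='%s at %s' % (h.long_name, nc.id))
ax.set_ylabel(h.units)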

The result may be seen here in the IPython Notebook: http://nbviewer.ipython.org/4615153/
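
Depending on the installed version, num2date may return cftime objects rather than standard datetimes, which Pandas may not treat as a regular time index. As a hedged alternative for the time decoding step only, the sketch below decodes the times with pandas itself. It is a swapped-in technique, not what the example above does, and it assumes a standard calendar and a units string of the form 'days since <date>'. The sample values and units are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for times[:] and times.units from a netCDF time variable.
time_values = np.array([0.0, 0.5, 1.0])
time_units = "days since 2000-01-01 00:00:00"

# Split the units string into the step size and the reference date.
unit_name, origin_text = time_units.split(" since ")
if unit_name != "days":
    raise ValueError("this sketch only handles 'days since ...' units")

# Interpret the numbers as day offsets from the reference date.
index = pd.to_datetime(time_values, unit="D", origin=pd.Timestamp(origin_text))

# Hypothetical data values for one station, indexed by the decoded times.
hs = pd.Series([10.1, 10.4, 9.8], index=index)
df = hs.to_frame(name="Tx_1211")
print(df)
```

Calling to_frame gives the DataFrame the question asked for; non-standard calendars such as noleap or 360_day still need a calendar-aware decoder.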

Answered by joaquin

You can use a library like PyNIO to read your file into, e.g., numpy arrays and feed them to pandas.
PyNIO allows reading several file formats, including classic netCDF3 and netCDF4.
netcdf4-python can also read these netCDF formats and is py3.3 compatible.
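
A minimal sketch of the "feed them to pandas" step, assuming the 2D variable has already been read into a numpy masked array (the form netcdf4-python typically returns when a variable has a fill value). The array contents, the -999.0 fill value, and the row and column labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical 2D array standing in for a variable read from the file;
# -999.0 plays the role of the file's fill value and is masked out.
data = np.ma.masked_values([[1.0, 2.0], [-999.0, 4.0], [5.0, 6.0]], -999.0)

# Labels for the two dimensions (made up here; normally read from the file).
rows = pd.date_range("2020-01-01", periods=3, freq="D")
columns = ["station_0", "station_1"]

# Replace masked entries with NaN so pandas treats them as missing values.
df = pd.DataFrame(np.ma.filled(data, np.nan), index=rows, columns=columns)
print(df)
```

Filling the mask with NaN before building the DataFrame keeps missing values visible to pandas methods such as dropna and isna.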