pandas: get column names (headers) from an HDF file

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/25495041/

Date: 2020-09-13 22:23:25  Source: igfitidea


Tags: python, pandas, hdf5, hdfstore

Asked by Cenoc

I was wondering how to get the column names (seemingly stored in the hdf header) of an hdf file; for example, a file might have columns named [a,b,c,d] while another file has columns [a,b,c] and yet another has columns [b,e,r,z]; and I would like to find out which ones have which. Any help would be very much appreciated!

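If the files were written by pandas itself, you can inspect the column names without loading any data by reading zero rows: the resulting empty DataFrame still carries the header. A minimal sketch (the file path, key name, and columns here are hypothetical, and reading with start/stop requires the file to have been written with format='table'):

```python
import os
import tempfile

import pandas as pd

# Create a small example file to inspect (hypothetical path and key).
path = os.path.join(tempfile.mkdtemp(), 'example.h5')
pd.DataFrame({'a': [1], 'b': [2], 'c': [3]}).to_hdf(path, key='data', format='table')

# List the keys (datasets/tables) stored in the file.
with pd.HDFStore(path, mode='r') as store:
    print(store.keys())  # ['/data']

# Read zero rows: cheap way to get just the header.
cols = list(pd.read_hdf(path, key='data', start=0, stop=0).columns)
print(cols)  # ['a', 'b', 'c']
```

Comparing `cols` across several files then tells you which file has which columns.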

Answered by ssnobody

To do this outside of Python you can use h5dump, via something like h5dump --header my.hdf5


In Python you can use h5py.


As an example this is how I might access field names for my HDF-EOS5 file:


>>> import h5py
>>> f = h5py.File('/tmp/temp.hdf','r')
>>> f.keys()
[u'HDFEOS', u'HDFEOS INFORMATION']
>>> f.values()
[<HDF5 group "/HDFEOS" (2 members)>, <HDF5 group "/HDFEOS INFORMATION" (2 members)>]
>>> grpname = f.require_group('/HDFEOS')
>>> grpname.keys()
[u'ADDITIONAL', u'GRIDS']
>>> grpname.values()
[<HDF5 group "/HDFEOS/ADDITIONAL" (1 members)>, <HDF5 group "/HDFEOS/GRIDS" (9 members)>]
>>> subgrpname = grpname.require_group('/HDFEOS/GRIDS')
>>> subgrpname.keys()
[u'355nm_band', u'380nm_band', u'445nm_band', u'470nm_band', u'555nm_band', u'660nm_band', u'865nm_band', u'935nm_band', u'Ancillary']
>>> group_660 = subgrpname.require_group('660nm_band')
>>> group_660.keys()
[u'Data Fields']
>>> group_660.values()
[<HDF5 group "/HDFEOS/GRIDS/660nm_band/Data Fields" (20 members)>]
>>> fields_660 = group_660.require_group('Data Fields')
>>> fields_660.keys()
[u'AOLP_meridian', u'AOLP_scatter', u'DOLP', u'Glint_angle', u'I', u'I.mask', u'IPOL', u'Q.mask', u'Q_meridian', u'Q_scatter', u'RDQI', u'Scattering_angle', u'Sun_azimuth', u'Sun_zenith', u'Time_in_seconds_from_epoch', u'U.mask', u'U_meridian', u'U_scatter', u'View_azimuth', u'View_zenith']
>>> fields_660.values()
[<HDF5 dataset "AOLP_meridian": shape (3072, 3072), type "<f4">, <HDF5 dataset "AOLP_scatter": shape (3072, 3072), type "<f4">, <HDF5 dataset "DOLP": shape (3072, 3072), type "<f4">, <HDF5 dataset "Glint_angle": shape (3072, 3072), type "<f4">, <HDF5 dataset "I": shape (3072, 3072), type "<f4">, <HDF5 dataset "I.mask": shape (3072, 3072), type "<i4">, <HDF5 dataset "IPOL": shape (3072, 3072), type "<f4">, <HDF5 dataset "Q.mask": shape (3072, 3072), type "<i4">, <HDF5 dataset "Q_meridian": shape (3072, 3072), type "<f4">, <HDF5 dataset "Q_scatter": shape (3072, 3072), type "<f4">, <HDF5 dataset "RDQI": shape (3072, 3072), type "<f4">, <HDF5 dataset "Scattering_angle": shape (3072, 3072), type "<f4">, <HDF5 dataset "Sun_azimuth": shape (3072, 3072), type "<f4">, <HDF5 dataset "Sun_zenith": shape (3072, 3072), type "<f4">, <HDF5 dataset "Time_in_seconds_from_epoch": shape (3072, 3072), type "<f8">, <HDF5 dataset "U.mask": shape (3072, 3072), type "<i4">, <HDF5 dataset "U_meridian": shape (3072, 3072), type "<f4">, <HDF5 dataset "U_scatter": shape (3072, 3072), type "<f4">, <HDF5 dataset "View_azimuth": shape (3072, 3072), type "<f4">, <HDF5 dataset "View_zenith": shape (3072, 3072), type "<f4">]