pandas: get column names (headers) from an HDF file

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/25495041/

Date: 2020-09-13 22:23:25  Source: igfitidea


Tags: python, pandas, hdf5, hdfstore

Asked by Cenoc

I was wondering how to get the column names (seemingly stored in the hdf header) of an hdf file; for example, a file might have columns named [a,b,c,d] while another file has columns [a,b,c] and yet another has columns [b,e,r,z]; and I would like to find out which ones have which. Any help would be very much appreciated!

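If the files were written by pandas itself, you can inspect the column names without loading any data by reading zero rows: the resulting empty DataFrame still carries the header. A minimal sketch (the file path, key name, and columns here are hypothetical, and reading with start/stop requires the file to have been written with format='table'):

```python
import os
import tempfile

import pandas as pd

# Create a small example file to inspect (hypothetical path and key).
path = os.path.join(tempfile.mkdtemp(), 'example.h5')
pd.DataFrame({'a': [1], 'b': [2], 'c': [3]}).to_hdf(path, key='data', format='table')

# List the keys (datasets/tables) stored in the file.
with pd.HDFStore(path, mode='r') as store:
    print(store.keys())  # ['/data']

# Read zero rows: cheap way to get just the header.
cols = list(pd.read_hdf(path, key='data', start=0, stop=0).columns)
print(cols)  # ['a', 'b', 'c']
```

Comparing `cols` across several files then tells you which file has which columns.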

Answered by ssnobody

To do this outside of Python you can use h5dump, via something like h5dump --header my.hdf5


In Python you can use h5py.


As an example this is how I might access field names for my HDF-EOS5 file:


>>> import h5py
>>> f = h5py.File('/tmp/temp.hdf','r')
>>> f.keys()
[u'HDFEOS', u'HDFEOS INFORMATION']
>>> f.values()
[<HDF5 group "/HDFEOS" (2 members)>, <HDF5 group "/HDFEOS INFORMATION" (2 members)>]
>>> grpname = f.require_group('/HDFEOS')
>>> grpname.keys()
[u'ADDITIONAL', u'GRIDS']
>>> grpname.values()
[<HDF5 group "/HDFEOS/ADDITIONAL" (1 members)>, <HDF5 group "/HDFEOS/GRIDS" (9 members)>]
>>> subgrpname = grpname.require_group('/HDFEOS/GRIDS')
>>> subgrpname.keys()
[u'355nm_band', u'380nm_band', u'445nm_band', u'470nm_band', u'555nm_band', u'660nm_band', u'865nm_band', u'935nm_band', u'Ancillary']
>>> group_660 = subgrpname.require_group('660nm_band')
>>> group_660.keys()
[u'Data Fields']
>>> group_660.values()
[<HDF5 group "/HDFEOS/GRIDS/660nm_band/Data Fields" (20 members)>]
>>> fields_660 = group_660.require_group('Data Fields')
>>> fields_660.keys()
[u'AOLP_meridian', u'AOLP_scatter', u'DOLP', u'Glint_angle', u'I', u'I.mask', u'IPOL', u'Q.mask', u'Q_meridian', u'Q_scatter', u'RDQI', u'Scattering_angle', u'Sun_azimuth', u'Sun_zenith', u'Time_in_seconds_from_epoch', u'U.mask', u'U_meridian', u'U_scatter', u'View_azimuth', u'View_zenith']
>>> fields_660.values()
[<HDF5 dataset "AOLP_meridian": shape (3072, 3072), type "<f4">, <HDF5 dataset "AOLP_scatter": shape (3072, 3072), type "<f4">, <HDF5 dataset "DOLP": shape (3072, 3072), type "<f4">, <HDF5 dataset "Glint_angle": shape (3072, 3072), type "<f4">, <HDF5 dataset "I": shape (3072, 3072), type "<f4">, <HDF5 dataset "I.mask": shape (3072, 3072), type "<i4">, <HDF5 dataset "IPOL": shape (3072, 3072), type "<f4">, <HDF5 dataset "Q.mask": shape (3072, 3072), type "<i4">, <HDF5 dataset "Q_meridian": shape (3072, 3072), type "<f4">, <HDF5 dataset "Q_scatter": shape (3072, 3072), type "<f4">, <HDF5 dataset "RDQI": shape (3072, 3072), type "<f4">, <HDF5 dataset "Scattering_angle": shape (3072, 3072), type "<f4">, <HDF5 dataset "Sun_azimuth": shape (3072, 3072), type "<f4">, <HDF5 dataset "Sun_zenith": shape (3072, 3072), type "<f4">, <HDF5 dataset "Time_in_seconds_from_epoch": shape (3072, 3072), type "<f8">, <HDF5 dataset "U.mask": shape (3072, 3072), type "<i4">, <HDF5 dataset "U_meridian": shape (3072, 3072), type "<f4">, <HDF5 dataset "U_scatter": shape (3072, 3072), type "<f4">, <HDF5 dataset "View_azimuth": shape (3072, 3072), type "<f4">, <HDF5 dataset "View_zenith": shape (3072, 3072), type "<f4">]