Reading hdf5 datasets with pandas
Original question: http://stackoverflow.com/questions/38018186/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by hsnee
I'm trying to open a group-less hdf5 file with pandas:
import pandas as pd
foo = pd.read_hdf('foo.hdf5')
but I get an error:
TypeError: cannot create a storer if the object is not existing nor a value are passed
I tried solving this by passing a key:
foo = pd.read_hdf('foo.hdf5','key')
which works if key is a group, but this file has no groups; instead it has several datasets at the top level of the HDF hierarchy. That is, the structure of the working file is Groups --> Datasets, while the structure of the non-working file is just Datasets. Both open fine with h5py, where I would use:
import h5py
f = h5py.File('foo.hdf5','r')
and
dset = f['dataset']
to view a dataset. Any ideas how to read this in pandas?
Answered by MaxU
I think you are confused by the differing terminology: the pandas HDF store key is a full path, i.e. Group + DataSet_name ...
demo:
In [67]: store = pd.HDFStore(r'D:\temp\.data\hdf\test.h5')
In [68]: store.append('dataset1', df)
In [69]: store.append('/group1/sub_group1/dataset2', df)
In [70]: store.groups
Out[70]:
<bound method HDFStore.groups of <class 'pandas.io.pytables.HDFStore'>
File path: D:\temp\.data\hdf\test.h5
/dataset1 frame_table (typ->appendable,nrows->9,ncols->2,indexers->[index])
/group1/sub_group1/dataset2 frame_table (typ->appendable,nrows->9,ncols->2,indexers->[index])>
In [71]: store.items
Out[71]:
<bound method HDFStore.items of <class 'pandas.io.pytables.HDFStore'>
File path: D:\temp\.data\hdf\test.h5
/dataset1 frame_table (typ->appendable,nrows->9,ncols->2,indexers->[index])
/group1/sub_group1/dataset2 frame_table (typ->appendable,nrows->9,ncols->2,indexers->[index])>
In [72]: store.close()
In [73]: x = pd.read_hdf(r'D:\temp\.data\hdf\test.h5', 'dataset1')
In [74]: x.shape
Out[74]: (9, 2)
In [75]: x = pd.read_hdf(r'D:\temp\.data\hdf\test.h5', '/group1/sub_group1/dataset2')
In [76]: x.shape
Out[76]: (9, 2)
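For anyone who wants to reproduce the session above as a script, here is a minimal sketch. It rests on a few assumptions that are not in the original answer: the contents of df (the demo never shows how it was built), the temporary file location, and the use of store.keys() to list the stored paths. It also assumes the PyTables package is installed, since pandas relies on it for HDF support.

```python
import os
import tempfile

import pandas as pd  # HDF support also needs the PyTables package ("tables")

# Hypothetical sample data: 9 rows x 2 columns, chosen only to match the
# shapes shown in the demo. The original df is not shown.
df = pd.DataFrame({"a": range(9), "b": range(9)})

# Write to a throwaway file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "test.h5")

with pd.HDFStore(path, mode="w") as store:
    # A dataset stored directly under the root of the file.
    store.append("dataset1", df)
    # A dataset stored inside nested groups; the key is the full path.
    store.append("/group1/sub_group1/dataset2", df)
    # keys() lists the full path of every pandas object in the store.
    keys = sorted(store.keys())

# Read each dataset back by passing its key to read_hdf.
top = pd.read_hdf(path, "dataset1")
nested = pd.read_hdf(path, "/group1/sub_group1/dataset2")

print(keys)
print(top.shape, nested.shape)
```

The point of the sketch is that both a top-level dataset and a nested one are addressed the same way: by the path under which pandas stored them.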