Reading an HDF5 dataset with Pandas

Question

Asked by hsnee

I'm trying to open a group-less hdf5 file with pandas:

import pandas as pd
foo = pd.read_hdf('foo.hdf5')

but I get an error:

TypeError: cannot create a storer if the object is not existing nor a value are passed

I tried solving this by assigning a key:

foo = pd.read_hdf('foo.hdf5','key')

which works if key is a group, but the file has no groups, only several datasets at the top level of the HDF5 hierarchy. That is, the structure of the working file is Groups --> Datasets, while the structure of the non-working file is just Datasets. Both open fine with h5py, where I would use:

f = h5py.File('foo.hdf5','r')

and

dset = f['dataset']

to view a dataset. Any ideas how to read this in pandas?

Answer 1

Answered by MaxU

I think you are confused by the differing terminology: a Pandas HDF store key is a full path, i.e. Group + DataSet_name...

demo:

In [67]: store = pd.HDFStore(r'D:\temp\.data\hdf\test.h5')

In [68]: store.append('dataset1', df)

In [69]: store.append('/group1/sub_group1/dataset2', df)

In [70]: store.groups
Out[70]:
<bound method HDFStore.groups of <class 'pandas.io.pytables.HDFStore'>
File path: D:\temp\.data\hdf\test.h5
/dataset1                              frame_table  (typ->appendable,nrows->9,ncols->2,indexers->[index])
/group1/sub_group1/dataset2            frame_table  (typ->appendable,nrows->9,ncols->2,indexers->[index])>

In [71]: store.items
Out[71]:
<bound method HDFStore.items of <class 'pandas.io.pytables.HDFStore'>
File path: D:\temp\.data\hdf\test.h5
/dataset1                              frame_table  (typ->appendable,nrows->9,ncols->2,indexers->[index])
/group1/sub_group1/dataset2            frame_table  (typ->appendable,nrows->9,ncols->2,indexers->[index])>

In [72]: store.close()

In [73]: x = pd.read_hdf(r'D:\temp\.data\hdf\test.h5', 'dataset1')

In [74]: x.shape
Out[74]: (9, 2)

In [75]: x = pd.read_hdf(r'D:\temp\.data\hdf\test.h5', '/group1/sub_group1/dataset2')

In [76]: x.shape
Out[76]: (9, 2)
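
The session above can be condensed into a standalone script. This is only a minimal sketch: it assumes PyTables (the tables package) is installed, and the temporary file path and the contents of df are placeholders invented for illustration, not taken from the original post. It uses store.keys() to list the full paths that pd.read_hdf accepts as a key.

```python
import os
import tempfile

import pandas as pd

# Illustrative frame: 9 rows, 2 columns (the contents are an assumption).
df = pd.DataFrame({"a": range(9), "b": range(9)})

# Write to a throwaway location instead of a hard-coded path.
path = os.path.join(tempfile.mkdtemp(), "test.h5")

with pd.HDFStore(path, mode="w") as store:
    # A key without any group prefix lands at the root of the file.
    store.append("dataset1", df)
    # A key with a group prefix creates the intermediate groups.
    store.append("/group1/sub_group1/dataset2", df)
    # keys() returns the full path of every pandas object in the store.
    keys = store.keys()

# The key passed to read_hdf is the full path, with or without groups.
top = pd.read_hdf(path, "dataset1")
nested = pd.read_hdf(path, "/group1/sub_group1/dataset2")

print(keys)
print(top.shape, nested.shape)
```

Note that this only covers files written through pandas. A file created directly with h5py may not follow the layout that pd.read_hdf expects, in which case the same keys can still fail to load.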
