pandas: Deleting a key/table in an HDF Store with Python

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/33488659/

Time: 2020-09-14 00:09:13  Source: igfitidea

Deleting a key/table in an HDF Store with Python

Tags: python, pandas, hdf5

Asked by KidMcC

Is there a pyTables method similar to the following:

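    import pandas as pd

    # my_store is the path to the HDF5 file
    with pd.HDFStore(my_store) as store:  # pd.get_store was removed in pandas 1.0
        keys = store.keys()
        rem_key = min(keys)  # e.g. the oldest key when keys are dates
        store.remove(rem_key)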

I am essentially trying to access the HDF5 store's list of keys, find the one that is no longer desired (in this case it is the min(), if the store keys were dates for example), and then remove that key from the store while preserving the others.

Pandas does not seem to have anything for this, and I have looked over the pyTables methods to no avail, having read that they impact HDF functionality in Python.

Thanks!

Answered by 0_0

Pandas does precisely what you want. The remove function (HDFStore.remove) is part of pandas/io/pytables.py (the original answer linked the v0.19.1 source), and it will remove a node by key, or rows within a node by a condition.
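
As a minimal sketch of both uses of remove, assuming pandas with PyTables installed: the store path, the key names (d2015_11_01, d2015_11_02) and the sample frame are made up for illustration and do not come from the question. Removing rows by condition assumes the node was written in table format.

```python
import os
import tempfile

import pandas as pd

# Hypothetical store path and keys, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example_store.h5")

df = pd.DataFrame({"value": range(5)})

with pd.HDFStore(path) as store:
    # format="table" is needed for row-wise removal with a where condition.
    store.put("d2015_11_01", df, format="table")
    store.put("d2015_11_02", df, format="table")

    # Remove a whole node by key; keys() returns names with a leading "/".
    oldest = min(store.keys())
    store.remove(oldest)
    remaining_keys = store.keys()

    # Remove only the rows of a node that match a condition.
    store.remove("d2015_11_02", where="index >= 3")
    remaining_rows = len(store["d2015_11_02"])
```

Here min() picks the oldest key only because the illustrative key names sort chronologically as strings; other naming schemes would need their own selection rule.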

HDF5 does not reclaim the freed space, so the file does not shrink after a removal (see the linked SO answer). It is therefore advisable to re-compress/restructure your store every now and then. You can do this from the command line (command taken from another SO answer):

    ptrepack --chunkshape=auto --propindexes --complib=blosc test.h5 out.h5