How to delete multiple pandas (Python) DataFrames from memory to save RAM?
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/32247643/
Asked by GeorgeOfTheRF
I have a lot of DataFrames created as part of preprocessing. Since I have only 6 GB of RAM, I want to delete all the unnecessary DataFrames from memory to avoid running out of memory when running GridSearchCV in scikit-learn.
1) Is there a function that lists only the DataFrames currently loaded in memory?
I tried dir(), but it returns many other objects besides DataFrames.
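As far as I know there is no built-in that lists only DataFrames, but filtering a namespace by type gets close. The sketch below is a minimal illustration under assumptions: `names_bound_to` is a hypothetical helper, and `Frame` is a stand-in class used so the example runs without pandas.

```python
def names_bound_to(namespace, cls):
    """Return the sorted public names in `namespace` whose values are instances of `cls`."""
    return sorted(
        name
        for name, value in namespace.items()
        if isinstance(value, cls) and not name.startswith("_")
    )


class Frame:
    """Hypothetical stand-in for pd.DataFrame, used only to keep the sketch self-contained."""


# A plain dict plays the role of globals() here.
namespace = {"col": Frame(), "capsule_trans": Frame(), "n_rows": 10, "_cache": Frame()}
frame_names = names_bound_to(namespace, Frame)
print(frame_names)  # ['capsule_trans', 'col']
```

In an interactive session the call would presumably be `names_bound_to(globals(), pd.DataFrame)`. Names starting with an underscore are skipped because IPython and Jupyter keep output-history references under such names; those hidden references can also keep a DataFrame alive.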
2) I created a list of DataFrames to delete:
del_df=[Gender_dummies,
capsule_trans,
col,
concat_df_list,
coup_CAPSULE_dummies]
and ran:
for i in del_df:
    del (i)
But it's not deleting the DataFrames. Deleting DataFrames individually, as below, does remove them from memory.
del Gender_dummies
del col
Answered by pacholik
The del statement does not delete an instance; it merely deletes a name.
When you do del i, you are deleting just the name i, but the instance is still bound to some other name, so it won't be garbage-collected.
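The effect can be checked with a weak reference, which observes an object without keeping it alive. This is a sketch under assumptions: `Frame` is a hypothetical stand-in for a DataFrame (`del` treats every Python object the same way), and the variable names are illustrative.

```python
import gc
import weakref


class Frame:
    """Hypothetical stand-in for pd.DataFrame."""


gender_dummies, col = Frame(), Frame()
probe = weakref.ref(gender_dummies)  # weak reference: does not keep the object alive

del_df = [gender_dummies, col]
for i in del_df:
    del i  # unbinds only the loop name `i`, nothing else

# Still referenced by the name `gender_dummies` and by the list `del_df`.
alive_after_loop = probe() is not None

del gender_dummies
# The list still holds a reference, so the object survives.
alive_after_del_name = probe() is not None

del del_df
gc.collect()  # not needed on CPython (reference counting frees it at once); harmless elsewhere
alive_after_del_list = probe() is not None

print(alive_after_loop, alive_after_del_name, alive_after_del_list)  # True True False
```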
If you want to release memory, your DataFrames have to be garbage-collected, i.e. you must delete all references to them.
If you created your DataFrames dynamically in a list, then deleting that list removes the last references to them, so they can be garbage-collected.
>>> lst = [pd.DataFrame(), pd.DataFrame(), pd.DataFrame()]
>>> del lst # memory is released
If you also bound them to individual variables, you have to delete all of those as well.
>>> a, b, c = pd.DataFrame(), pd.DataFrame(), pd.DataFrame()
>>> lst = [a, b, c]
>>> del a, b, c # dfs still in list
>>> del lst # memory is released now
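To remove many names in a loop, one option is to loop over the names as strings and remove them from the namespace, rather than looping over the objects. A minimal sketch, assuming the DataFrames are module-level (global) variables; `drop_names` and `Frame` are hypothetical names introduced for illustration.

```python
import weakref


class Frame:
    """Hypothetical stand-in for pd.DataFrame."""


def drop_names(namespace, names):
    """Unbind each name in `names` from `namespace`; names that are not bound are ignored."""
    for name in names:
        namespace.pop(name, None)


# A plain dict plays the role of globals() here.
namespace = {"Gender_dummies": Frame(), "col": Frame(), "keep_me": Frame()}
probe = weakref.ref(namespace["col"])  # observes the object without keeping it alive

drop_names(namespace, ["Gender_dummies", "col", "not_defined"])

remaining = sorted(namespace)    # ['keep_me']
col_alive = probe() is not None  # False on CPython: the last reference is gone
print(remaining, col_alive)
```

With real variables the call would be `drop_names(globals(), ["Gender_dummies", "col"])`. This does not work for local variables inside a function, because changing the dict returned by `locals()` does not unbind function locals.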
Answered by shanmuga
In Python, automatic garbage collection deallocates objects that are no longer referenced (a pandas DataFrame is just another object as far as Python is concerned). The garbage collection strategy can be tuned, but doing so requires significant learning.
You can manually trigger garbage collection using:
import gc
gc.collect()
But frequent calls to garbage collection are discouraged, as it is a costly operation and may affect performance.
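In CPython most objects are freed by reference counting as soon as their last reference disappears, so `gc.collect()` mainly helps with objects caught in reference cycles. The sketch below illustrates that case; `Node` is a hypothetical class, and automatic collection is paused only to make the demonstration deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another object."""

    def __init__(self):
        self.other = None


gc.disable()  # pause automatic cyclic collection for a deterministic demonstration

a, b = Node(), Node()
a.other, b.other = b, a   # build a reference cycle: a -> b -> a
probe = weakref.ref(a)    # observes `a` without keeping it alive

del a, b
# The names are gone, but each object is still referenced by the other.
alive_before = probe() is not None

found = gc.collect()      # returns the number of unreachable objects found
alive_after = probe() is not None

gc.enable()               # restore automatic collection
print(alive_before, alive_after)  # True False
```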
Answered by hardi
This will delete the DataFrames and release the RAM/memory:
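import gc
import pandas as pd

del [[df_1, df_2]]     # unbind both names (same as: del df_1, df_2)
gc.collect()           # ask the collector to reclaim unreachable objects now
df_1 = pd.DataFrame()  # rebind the names to empty DataFrames
df_2 = pd.DataFrame()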
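The nested brackets in `del [[df_1, df_2]]` are just a target list, so the statement unbinds both names, the same as `del df_1, df_2`. A small check of that reading, using plain `object()` instances as hypothetical stand-ins for DataFrames:

```python
df_1, df_2 = object(), object()  # stand-ins for two DataFrames

del [[df_1, df_2]]               # unbinds both names, like: del df_1, df_2

try:
    df_1
    df_1_bound = True
except NameError:                # raised because the name no longer exists
    df_1_bound = False

try:
    df_2
    df_2_bound = True
except NameError:
    df_2_bound = False

print(df_1_bound, df_2_bound)    # False False
```

As with any `del`, the underlying objects are released only if no other name or container still refers to them.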