How to uncache an RDD in Scala?
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/25938567/
Asked by Rubbic
I used cache() to cache the data in memory, but I realized that to measure the performance without cached data I need to uncache it and remove the data from memory:
rdd.cache();
//doing some computation
...
rdd.uncache()
but I got an error saying:
value uncache is not a member of org.apache.spark.rdd.RDD[(Int, Array[Float])]
I don't know how to do the uncache then!
Answered by eliasah
Answered by Sankar
If you want to remove all the cached RDDs, use this:
for ((k, v) <- sc.getPersistentRDDs) {
  v.unpersist()
}
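As a rough, self-contained sketch of the loop above: the snippet below wraps it in a helper and runs it against a local SparkContext. The object name `UnpersistAll`, the helper `unpersistAll`, the app name, and the `local[*]` master are assumptions made for illustration and are not part of the original answer. It relies only on `SparkContext.getPersistentRDDs` and `RDD.unpersist`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnpersistAll {
  // Hypothetical helper: unpersists every RDD the context currently tracks
  // as persistent and returns how many there were.
  def unpersistAll(sc: SparkContext): Int = {
    val cached = sc.getPersistentRDDs.values.toList
    // blocking = true waits until the cached blocks have actually been removed.
    cached.foreach(_.unpersist(blocking = true))
    cached.size
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local context, created only for this example.
    val conf = new SparkConf().setAppName("unpersist-all-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val numbers = sc.parallelize(1 to 1000).cache()
      val squares = numbers.map(n => n.toLong * n).cache()
      squares.count() // an action, so both RDDs are materialized in the cache

      println(s"Cached RDDs before: ${sc.getPersistentRDDs.size}")
      val removed = unpersistAll(sc)
      println(s"Unpersisted $removed RDDs, still tracked: ${sc.getPersistentRDDs.size}")
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell the `sc` value already exists, so only the body of `unpersistAll` would be needed there.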
Answered by Anupam Mahapatra
If you cache the source data in an RDD using .cache(), and you have declared only a small amount of memory (or the default memory is used, which is about 500 MB for me), and you run the code again and again, then this error occurs.
Try clearing all RDDs at the end of the code, so that each time the code runs, the RDD is created and also cleared from memory.
Do this by using: RDD_Name.unpersist()
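A minimal sketch of the cache / compute / unpersist cycle from the question, assuming a local SparkContext. The object name `UnpersistExample`, the sample data, and the `local[*]` master are made up for illustration. `RDD` has no `uncache()` method; `unpersist()` is the counterpart of `cache()`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object UnpersistExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local context, created only for this example.
    val conf = new SparkConf().setAppName("unpersist-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical data with the same shape as the question's RDD[(Int, Array[Float])].
      val rdd = sc.parallelize(1 to 100).map(i => (i, Array.fill(4)(i.toFloat)))

      rdd.cache()
      rdd.count() // the first action computes the RDD and fills the cache
      println(s"Storage level while cached: ${rdd.getStorageLevel.description}")

      // Drops the cached blocks; blocking = true waits until they are removed.
      rdd.unpersist(blocking = true)
      println(s"Still cached? ${rdd.getStorageLevel != StorageLevel.NONE}")

      rdd.count() // recomputed from the lineage, without any cached data
    } finally {
      sc.stop()
    }
  }
}
```

Timing the two `count()` calls separately is one way to compare performance with and without the cache, which is what the question set out to do.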

