How to pickle or store Jupyter (IPython) notebook session for later
Original source: http://stackoverflow.com/questions/34342155/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Robin Nemeth
Let's say I am doing a large data analysis in a Jupyter/IPython notebook, with lots of time-consuming computations already done. Then, for some reason, I have to shut down the local Jupyter server, but I would like to return to the analysis later without having to go through all the time-consuming computations again.
What I would love to do is pickle or store the whole Jupyter session (all pandas DataFrames, NumPy arrays, variables, ...) so I can safely shut down the server, knowing I can return to my session in exactly the same state as before.
Is it even technically possible? Is there a built-in functionality I overlooked?
EDIT: based on this answer, there is a %store magic, which should be a "lightweight pickle". However, you have to store the variables manually, like so:
# inside an IPython/notebook session
foo = "A dummy string"
%store foo
# after closing the session and restarting the kernel
%store -r foo  # r for refresh
print(foo)  # "A dummy string"
which is fairly close to what I would want, but having to do it manually and being unable to distinguish between different sessions makes it less useful.
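One possible way around both limitations, without %store, is to pickle a chosen set of variables into a file named after the session. The following is only a minimal sketch using the standard library. The helper names save_session and load_session, the session label, and the file naming scheme are hypothetical and not part of IPython. Variables that are missing or cannot be pickled are simply skipped.

```python
import pickle
from pathlib import Path


def save_session(namespace, names, session, directory="."):
    """Pickle the named variables from `namespace` into <directory>/<session>.pkl.

    Returns the sorted names that were actually saved. Names that are missing
    from the namespace or whose values cannot be pickled are skipped.
    """
    saved = {}
    for name in names:
        try:
            pickle.dumps(namespace[name])  # probe: can this object be pickled?
        except Exception:
            continue
        saved[name] = namespace[name]
    path = Path(directory) / f"{session}.pkl"
    with open(path, "wb") as fh:
        pickle.dump(saved, fh)
    return sorted(saved)


def load_session(namespace, session, directory="."):
    """Load variables written by save_session back into `namespace`."""
    path = Path(directory) / f"{session}.pkl"
    with open(path, "rb") as fh:
        namespace.update(pickle.load(fh))
```

In a notebook this might be called as save_session(globals(), ["foo"], "analysis_a") before shutting down and load_session(globals(), "analysis_a") after restarting, with a different session label for each analysis.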
Accepted answer by MetalloyD
Answer by Anh Huynh
(I'd rather comment than offer this as an actual answer, but I need more reputation to comment.)
You can store most data-like variables in a systematic way. What I usually do is store all dataframes, arrays, etc. in pandas.HDFStore. At the beginning of the notebook, declare
backup = pd.HDFStore('backup.h5')
and then store any new variables as you produce them
backup['var1'] = var1
At the end, it is probably a good idea to do
backup.close()
before turning off the server. The next time you want to continue with the notebook:
backup = pd.HDFStore('backup.h5')
var1 = backup['var1']
Truth be told, I'd prefer built-in functionality in the IPython notebook, too. You can't save everything this way (e.g. arbitrary objects, connections), and it's hard to keep the notebook organized with so much boilerplate code.
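The store-as-you-go pattern can be wrapped in two small helpers to cut down on the boilerplate. This is a rough sketch that swaps in the standard library's shelve module in place of pandas.HDFStore, so it stores any picklable object and needs no extra packages. The helper names store_vars and fetch_vars are made up for illustration. Opening the shelf in a with block closes it automatically, so no separate close() call is needed.

```python
import shelve


def store_vars(path, **variables):
    """Write each keyword argument to the shelf at `path` under its own name."""
    with shelve.open(path) as backup:  # closed on exit, even if an error occurs
        for name, value in variables.items():
            backup[name] = value


def fetch_vars(path, *names):
    """Read the named entries back from the shelf, in the order requested."""
    with shelve.open(path) as backup:
        return [backup[name] for name in names]
```

For example, store_vars("backup", var1=var1) right after var1 is computed, and var1, = fetch_vars("backup", "var1") in the next session.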
Answer by Vasco
This question is related to: How to cache in IPython Notebook?
To save the results of individual cells, the caching magic comes in handy.
%%cache longcalc.pkl var1 var2 var3
var1 = longcalculation()
....
When rerunning the notebook, the contents of this cell are loaded from the cache.
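The idea behind such a cell cache can be sketched in plain Python: if the pickle file already exists, load it, otherwise run the computation and save its result. This is only an illustration of the load-or-compute pattern, not the magic's actual implementation. The function name cached and its parameters are assumptions made for this example.

```python
import os
import pickle


def cached(path, compute):
    """Return the pickled result stored at `path`, or call `compute()` and save it."""
    if os.path.exists(path):
        # Cache hit: skip the computation and load the earlier result.
        with open(path, "rb") as fh:
            return pickle.load(fh)
    # Cache miss: run the computation once and store its result.
    result = compute()
    with open(path, "wb") as fh:
        pickle.dump(result, fh)
    return result
```

A call such as var1 = cached("longcalc.pkl", longcalculation) would then run longcalculation only the first time. Deleting the file forces a recomputation.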
This does not exactly answer your question, but it might be enough when the results of all the lengthy calculations can be recovered quickly. This, in combination with hitting the run-all button at the top of the notebook, is a workable solution for me.
The cache magic cannot save the state of a whole notebook yet. To my knowledge, there is no other system yet that can resume a "notebook". This would require saving the entire history of the Python kernel. After loading the notebook and connecting to a kernel, this information would then have to be loaded.