Python: download error when using nltk.download()
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/27658409/
downloading error using nltk.download()
Asked by user288609
I am experimenting with the NLTK package in Python. I tried to download NLTK data using nltk.download(), but I got an error message. How can I solve this problem? Thanks.
The system I used is Ubuntu installed under VMware. The IDE is Spyder.
After running nltk.download('all'), it downloads some packages, but it hits an error when downloading oanc_masc.
Accepted answer by alvas
To download a particular dataset/model, use the nltk.download() function. For example, to download the punkt sentence tokenizer:
$ python3
>>> import nltk
>>> nltk.download('punkt')
If you're unsure which data/models you need, you can start with the basic list of data + models:
>>> import nltk
>>> nltk.download('popular')
It will download a list of "popular" resources.
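If you want to check whether a particular resource is already installed before downloading again, one option is a small helper around nltk.data.find(), which raises LookupError when a resource is missing. This is a sketch, not part of the original answer; the path string "tokenizers/punkt" follows NLTK's resource layout:

```python
try:
    import nltk
except ImportError:
    nltk = None  # NLTK not installed in this environment


def have_resource(path):
    """Return True if the NLTK resource at `path` is installed locally.

    nltk.data.find() raises LookupError for resources that
    nltk.download() has not yet fetched.
    """
    if nltk is None:
        return False
    try:
        nltk.data.find(path)
        return True
    except LookupError:
        return False


print(have_resource("tokenizers/punkt"))
```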
Make sure you have the latest version of NLTK, because it is always improving and constantly maintained:
$ pip install --upgrade nltk
EDITED
In case anyone is hitting errors when downloading larger datasets from nltk, from https://stackoverflow.com/a/38135306/610569:
$ rm /Users/<your_username>/nltk_data/corpora/panlex_lite.zip
$ rm -r /Users/<your_username>/nltk_data/corpora/panlex_lite
$ python
>>> import nltk
>>> dler = nltk.downloader.Downloader()
>>> dler._update_index()
>>> dler._status_cache['panlex_lite'] = 'installed' # Trick the index into treating panlex_lite as if it were already installed.
>>> dler.download('popular')
And if anyone wants to find the nltk_data directory, see https://stackoverflow.com/a/36383314/610569
And to configure the nltk_data path, see https://stackoverflow.com/a/22987374/610569
Answered by tolgayilmaz
From the command line, after importing nltk, try:
nltk.download('popular', halt_on_error=False)
After an error it will ask whether to retry the broken package; just decline with n and it will continue with the remaining packages.
Answered by HaticeKübraKılınç
I had this error:
Resource punkt not found. Please use the NLTK Downloader to obtain the resource: import nltk nltk.download('punkt')
When I tried to solve it by running:
import nltk
nltk.download()
my computer suddenly shut down and Anaconda also closed. When I tried to reopen it, it always showed an error.
I solved the problem by running:
import nltk
nltk.download('punkt')
Answered by Alexandre
a) On OSX, either run:
sudo /Applications/Python\ 3.6/Install\ Certificates.command
b) or switch to an admin user (the one you have set up with administrator privileges) and type at the command line:
/Applications/Python\ 3.6/Install\ Certificates.command
Notes:
- The "\" characters are necessary because they escape the blanks in the file names.
- This procedure works if you have Python 3.6 installed; otherwise change the path to match your installed Python version. To find it, execute:
ls /Applications
and look at the Python directory name you have there.
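The Install Certificates.command script fixes the root-certificate store that python.org macOS installers ship without; the underlying failure is an SSL certificate verification error when nltk.download() fetches data over HTTPS. If running that script is not an option, a commonly used workaround (not from this answer, and less secure because it disables certificate verification) is to swap in Python's unverified SSL context before downloading:

```python
import ssl

# Disable HTTPS certificate verification for subsequent urllib-based
# fetches (which nltk.download() uses). Less secure: prefer fixing
# the certificate store when possible.
try:
    _unverified_context = ssl._create_unverified_context
except AttributeError:
    pass  # very old Python versions lack this private helper
else:
    ssl._create_default_https_context = _unverified_context

# import nltk
# nltk.download('punkt')
```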