Corpora/stopwords not found when importing the nltk library
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/41610543/
Asked by Frits Verstraten
I am trying to import the nltk package in Python 2.7:
import nltk
stopwords = nltk.corpus.stopwords.words('english')
print(stopwords[:10])
Running this gives me the following error:
LookupError:
**********************************************************************
Resource 'corpora/stopwords' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
So I opened my Python terminal and did the following:
import nltk
nltk.download()
Which gives me:
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
However, this does not seem to finish, and running my script again still gives me the same error. Any thoughts on where this goes wrong?
Answered by Kurt Bourbaki
You are currently trying to download every item in the NLTK data collection, so this can take a long time. You can try downloading only the stopwords that you need:
import nltk
nltk.download('stopwords')
Or from the command line (thanks to Rafael Valero's answer):
python -m nltk.downloader stopwords
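To avoid triggering a download on every run, the call can be guarded by a lookup first. This is only a minimal sketch: it assumes the resource path `corpora/stopwords` from the error message above, and `ensure_resource` with its `find`/`download` parameters are hypothetical names used for illustration. By default they fall back to `nltk.data.find` and `nltk.download`.

```python
def ensure_resource(resource_path, package, find=None, download=None):
    """Download `package` only when `resource_path` cannot be found locally.

    Returns True if a download was triggered, False if the data was present.
    """
    if find is None or download is None:
        import nltk  # imported lazily; only needed for the default callables
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)   # raises LookupError when the data is missing
        return False
    except LookupError:
        download(package)     # fetches just this one package
        return True
```

Calling `ensure_resource('corpora/stopwords', 'stopwords')` before `nltk.corpus.stopwords.words('english')` should let the script run on a fresh machine, provided the download itself succeeds.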
Answered by Rafael Valero
The same as mentioned here by Kurt Bourbaki, but in the command line:
python -m nltk.downloader stopwords
Answered by Umesh
You can do this separately in a console.
It will give you a result.
import nltk
nltk.download('stopwords')
I used the Jupyter console when I faced this problem.
Answered by Ehsan
You can enter this in the command line for Python 3:
python3 -m nltk.downloader stopwords
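If several Python versions are installed, `python` and `python3` may point at different interpreters, and the data is only useful to the one that has NLTK installed. One way around this, sketched here under the assumption that you start the download from a script, is to build the command from `sys.executable`. The helper name `downloader_command` is made up for this example.

```python
import sys

def downloader_command(package, python=sys.executable):
    """Build the argument list for `python -m nltk.downloader <package>`."""
    return [python, "-m", "nltk.downloader", package]

# To actually run it (needs network access and NLTK installed):
#   import subprocess
#   subprocess.run(downloader_command("stopwords"), check=True)
```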
Answered by R Kumar
If your PC uses a proxy for connectivity, then try this:
import nltk
nltk.set_proxy('http://proxy.example.com:3128', 'USERNAME', 'PASSWORD')  # user and password are separate arguments
nltk.download('stopwords')
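If the proxy is already configured through environment variables, the URL can be read from there instead of being hard-coded. This is a rough sketch: the variable names checked below are the common convention rather than something this answer specifies, and `proxy_from_env` is a hypothetical helper.

```python
import os

def proxy_from_env(environ=None):
    """Return the first proxy URL found in the usual environment variables, or None."""
    environ = os.environ if environ is None else environ
    for name in ("HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"):
        value = environ.get(name)
        if value:
            return value
    return None

# Sketch of how it might be used (requires NLTK and network access):
#   import nltk
#   proxy = proxy_from_env()
#   if proxy:
#       nltk.set_proxy(proxy)
#   nltk.download('stopwords')
```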
Answered by Gaurav Sharma
Just run this command in your IPython notebook (or any other text editor/IDE you are using):
import nltk
nltk.download('stopwords')
It will automatically download the stopwords file and unzip it into the required directory.
Answered by Srushti
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
If you are running this command in a Jupyter notebook, it opens another window titled 'NLTK Downloader'. In that window, you can select the items you want to download and then click the download button to start downloading.
The Jupyter cell keeps running until you close the NLTK Downloader window.
Answered by yogeshwaran
If you have installed Python 3, type the following in your command prompt:
>>python
>>import nltk
This checks whether you have nltk installed. If it is not, install it with:
>>pip install nltk
Then, if you want to install only the stopwords directory, use:
>>python -m nltk.downloader stopwords
This takes less time than installing the whole package. Then:
>> import nltk
>> nltk.download('punkt')
After this you are ready to use stopwords in your code.
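For illustration, here is roughly what using the stopwords might look like once the data is in place. This is a sketch, not part of the original answer: `remove_stopwords` is a hypothetical helper, and it takes the stopword list as a parameter so that it can be tried without NLTK data installed.

```python
def remove_stopwords(tokens, stop_words):
    """Return the tokens that are not stopwords (case-insensitive comparison)."""
    stop_set = set(word.lower() for word in stop_words)
    return [tok for tok in tokens if tok.lower() not in stop_set]

# With the data installed, the real list and tokenizer could be plugged in:
#   from nltk.corpus import stopwords
#   from nltk.tokenize import word_tokenize
#   tokens = word_tokenize("This is a simple example")
#   print(remove_stopwords(tokens, stopwords.words('english')))
```

Converting the list to a set first keeps each membership check cheap, which matters once the text is longer than a few sentences.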