Python: Resource u'tokenizers/punkt/english.pickle' not found

Question

Asked by Supreeth Meka

My code:

import nltk.data
tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')

Error message:

[ec2-user@ip-172-31-31-31 sentiment]$ python mapper_local_v1.0.py
Traceback (most recent call last):
  File "mapper_local_v1.0.py", line 16, in <module>
    tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 774, in load
    opened_resource = _open(resource_url)
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 888, in _open
    return find(path_, path + ['']).open()
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 618, in find
    raise LookupError(resource_not_found)
LookupError:
  Resource u'tokenizers/punkt/english.pickle' not found.  Please
  use the NLTK Downloader to obtain the resource:

    >>> nltk.download()

  Searched in:
    - '/home/ec2-user/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - u''

I'm trying to run this program on a Unix machine. As the error message suggested, I opened a Python shell on that machine and ran the following commands:

import nltk
nltk.download()

I then downloaded everything available using the downloader's d (Download) and l (List) options, but the problem persists.

I searched the internet for a solution, but everything I found suggested the same steps I had already tried.

Answer 1

Accepted answer by Supreeth Meka

I found the solution:

import nltk
nltk.download()

Once the NLTK Downloader starts, choose d (Download) and enter punkt as the identifier:

d) Download l) List u) Update c) Config h) Help q) Quit
Downloader> d
Download which package (l=list; x=cancel)?
  Identifier> punkt

Answer 2

Answered by eeelnico

The same thing happened to me recently. You just need to download the "punkt" package and it should work.

When you run "list" (l) after having "downloaded all the available things", is everything marked like the following line?

[*] punkt............... Punkt Tokenizer Models

If you see this line with the star, you have the package, and NLTK should be able to load it.
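One way to check by hand whether the punkt model really sits in a place NLTK looks is to test each directory from the "Searched in" list of the error message for the file. The following is a minimal sketch using only the standard library. `find_resource` is a hypothetical helper written for illustration, not part of NLTK, and it only approximates the lookup (it ignores zipped packages, for instance).

```python
import os


def find_resource(relative_path, search_dirs):
    """Return the first existing path to relative_path under search_dirs, or None."""
    for base in search_dirs:
        if not base:
            # Skip empty entries such as the u'' at the end of the list.
            continue
        candidate = os.path.join(base, *relative_path.split("/"))
        if os.path.exists(candidate):
            return candidate
    return None


# Directories copied from the "Searched in" list of the error message.
SEARCH_DIRS = [
    "/home/ec2-user/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
]

if __name__ == "__main__":
    found = find_resource("tokenizers/punkt/english.pickle", SEARCH_DIRS)
    print(found if found else "punkt model not found in any search directory")
```

If this reports nothing while the downloader shows the star, the data was probably downloaded to a directory outside the searched list, for example another user's home directory.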

Answer 3

Answered by alvas

If you only want to download the punkt model:

import nltk
nltk.download('punkt')

If you're unsure which data or models you need, you can install the popular datasets, models and taggers from NLTK:

import nltk
nltk.download('popular')

With the above commands, there is no need to use the GUI to download the datasets.
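To avoid triggering a download on every run, a script can first check whether the resource is already present and download it only when the lookup fails. The sketch below assumes the lookup function raises LookupError for a missing resource, as `nltk.data.find` does in the traceback of the question. `ensure_resource` is a hypothetical helper, and the lookup and download functions are passed in as arguments so the logic can be tried without NLTK installed.

```python
def ensure_resource(find, download, resource_path, package_id):
    """Download package_id only if find(resource_path) raises LookupError.

    Returns True if a download was triggered, False if the resource
    was already available.
    """
    try:
        find(resource_path)
        return False
    except LookupError:
        download(package_id)
        # Look again so that a download which landed in an unsearched
        # directory fails here rather than later in the program.
        find(resource_path)
        return True
```

With NLTK installed, this would be called as ensure_resource(nltk.data.find, nltk.download, 'tokenizers/punkt/english.pickle', 'punkt').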

Answer 4

Answered by yprez

To add to alvas' answer, you can download only the punkt corpus:

nltk.download('punkt')

Downloading all sounds like overkill to me, unless that's what you want.

Answer 5

Answered by Raj

My issue was that I called nltk.download('all') as the root user, but the process that eventually used NLTK ran as another user, who didn't have access to /root/nltk_data, where the content had been downloaded.

So I simply copied everything recursively from the download location to one of the paths NLTK was searching, like this:

cp -R /root/nltk_data/ /home/ubuntu/nltk_data
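
The same copy can be done from Python, which also makes it easy to confirm afterwards that the copied files are readable. This is a minimal sketch using only the standard library. `copy_nltk_data` is a hypothetical helper, and the paths mentioned in the note below are the ones from the command above.

```python
import os
import shutil


def copy_nltk_data(source_dir, target_dir):
    """Recursively copy source_dir to target_dir and return unreadable files.

    target_dir must not exist yet (the default behavior of shutil.copytree).
    """
    shutil.copytree(source_dir, target_dir)
    unreadable = []
    for dirpath, _dirnames, filenames in os.walk(target_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # os.access checks permissions for the user running this script.
            if not os.access(path, os.R_OK):
                unreadable.append(path)
    return unreadable
```

Called as copy_nltk_data('/root/nltk_data', '/home/ubuntu/nltk_data'), it has to run as a user allowed to read /root. The readability check only reflects the user running the script, so it is most meaningful when repeated as the user that will load the data.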

Answer 6

Answered by Deepthi Karnam

A simple nltk.download() will not solve this issue. I tried the following and it worked for me:

In the nltk folder, create a tokenizers folder and copy your punkt folder into it.

This will work! The folder structure needs to be as shown in the picture.

Answer 7

Answered by alily

You need to rearrange your folders: move your tokenizers folder into the nltk_data folder. It doesn't work if your nltk_data folder contains a corpora folder that in turn contains the tokenizers folder.
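
The rearrangement described above can be sketched as follows. `fix_tokenizers_layout` is a hypothetical helper, and the assumption that the misplaced folder sits at nltk_data/corpora/tokenizers is taken from the description in this answer.

```python
import os
import shutil


def fix_tokenizers_layout(nltk_data_dir):
    """Move nltk_data/corpora/tokenizers up to nltk_data/tokenizers.

    Returns True if the folder was moved, False if there was nothing to do.
    """
    misplaced = os.path.join(nltk_data_dir, "corpora", "tokenizers")
    expected = os.path.join(nltk_data_dir, "tokenizers")
    # Only move when the misplaced folder exists and nothing would be overwritten.
    if os.path.isdir(misplaced) and not os.path.exists(expected):
        shutil.move(misplaced, expected)
        return True
    return False
```

The function leaves an existing nltk_data/tokenizers folder untouched, so merging two partially filled folders would still have to be done by hand.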

Answer 8

Answered by Franck Dernoncourt

From the shell you can execute:

sudo python -m nltk.downloader punkt

If you want to install the popular NLTK corpora/models:

sudo python -m nltk.downloader popular

If you want to install all NLTK corpora/models:

sudo python -m nltk.downloader all

To list the resources you have downloaded:

python -c 'import os; import nltk; print(os.listdir(nltk.data.find("corpora")))'
python -c 'import os; import nltk; print(os.listdir(nltk.data.find("tokenizers")))'

Answer 9

Answered by Dharani Manne

Open a Python console by typing

$ python

in your terminal. Then, after importing nltk, run the following two commands in the Python shell to install the respective packages:

>>> import nltk
>>> nltk.download('punkt')
>>> nltk.download('averaged_perceptron_tagger')

This solved the issue for me.

Answer 10

Answered by Camille

None of the above worked for me, so I downloaded all the files by hand from the website http://www.nltk.org/nltk_data/ and placed them, also by hand, in a "tokenizers" folder inside the "nltk_data" folder. Not a pretty solution, but still a solution.
