Python: Install Tesseract for Windows 7
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/42831662/
Asked by Plug4
My objective is to run OCR in Python 2.7 with Tesseract on a Windows 7 machine, but I am running into issues with the installation. I tried following the instructions here, but the links to "tesseract-core-yyyymmdd.exe" and "tesseract-langs-yyyymmdd.exe" no longer exist, and I can't find these .exe files elsewhere online. Here's what I have done so far:
- Installed tesseract using the executable from the official tesseract-ocr page.
- Installed the packages "wand", "PIL", and "pyocr" via pip.
Now, if I run the following in Python:
from wand.image import Image
from PIL import Image as PI
import pyocr
import pyocr.builders
import io
These packages load without any problem, but pyocr.get_available_tools() gives me an empty list. I am sure this has to do with the missing installer .exe files mentioned above. Where can I find them? Or is there something else that I am missing?
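An empty list from pyocr.get_available_tools() usually suggests that the tesseract executable cannot be located, and as far as I know pyocr looks for it through the PATH environment variable. The following is a minimal diagnostic sketch under that assumption. The helper name find_on_path and its parameters are hypothetical and are not part of pyocr.

```python
import os


def find_on_path(name, path=None, exts=("", ".exe")):
    """Return the first file called name (plus an extension) found on PATH, else None.

    Hypothetical helper for illustration only; it is not part of pyocr.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for ext in exts:
            # Windows PATH entries are sometimes wrapped in double quotes
            candidate = os.path.join(directory.strip('"'), name + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


# Prints the full path of the tesseract executable, or None if it is not on PATH
print(find_on_path("tesseract"))
```

If this prints None, adding the Tesseract installation directory to PATH and opening a new Python session is a reasonable first thing to try.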
Accepted answer by Asha Magenta
I just tried setting up pytesseract and it works! I have Windows 10 and Python 2.7 installed.
All you need to do:
- Download the Microsoft Visual C++ Compiler for Python 2.7 from http://aka.ms/vcpython27 and install it (a common installation step).
- Download pytesseract from PyPI via this link: https://pypi.python.org/pypi/pytesseract
- Unzip the file.
- Go to the directory that contains the unzipped files.
- Run this command: "python setup.py install"
- (Optional) To test whether it is installed, open your Python shell and run this command: "import pytesseract"
I hope it works! Note that pytesseract is a Python wrapper around Google's Tesseract OCR engine, so the Tesseract executable itself must also be installed for OCR calls to work.
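The optional import check in the last step can also be scripted. This is a minimal sketch; the function name check_modules is hypothetical, and the module names passed to it are only the ones mentioned in this thread.

```python
import importlib


def check_modules(names=("pytesseract", "PIL")):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


# Shows which of the wrapper packages are importable in this interpreter
print(check_modules())
```

A False entry only means the Python package is missing from the current interpreter. It says nothing about whether the Tesseract executable is installed.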
Answer by Shashank Singh
Step [1] To install tesseract, visit
https://github.com/UB-Mannheim/tesseract/wiki
The latest installers can be downloaded from there, e.g., tesseract-ocr-setup-3.05.02-20180621.exe, tesseract-ocr-w32-setup-v4.0.0-beta.1.20180608.exe, tesseract-ocr-w64-setup-v4.0.0-beta.1.20180608.exe (64 bit)
Step [2] Download the Microsoft Visual C++ Compiler for Python 2.7 from the link below: https://download.microsoft.com/download/7/9/6/796EF2E4-801B-4FC4-AB28-B59FBF6D907B/VCForPython27.msi
Step [3] Install pytesseract, the Python binding for tesseract, using pip:
pip install pytesseract
Step [4] Furthermore, you can install an image processing library for Python, e.g., pillow:
pip install pillow
Congratulations! You are done! :)
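Once the four steps are complete, a first OCR call might look like the sketch below. It assumes pytesseract and pillow are installed. The function name ocr_image and its parameters are hypothetical. Setting pytesseract.pytesseract.tesseract_cmd is only needed when the tesseract executable is not on PATH, and any path passed for it depends on where the installer actually put Tesseract.

```python
def ocr_image(image_path, tesseract_cmd=None):
    """Return the text that Tesseract recognizes in the image at image_path.

    Hypothetical helper for illustration only. The imports sit inside the
    function so that this file still loads when the packages are missing.
    """
    import pytesseract
    from PIL import Image

    if tesseract_cmd:
        # Point the wrapper at the executable explicitly (assumed install location)
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    return pytesseract.image_to_string(Image.open(image_path))
```

For example, ocr_image("scan.png") would return the recognized text as a string, where "scan.png" stands in for any image file of your own.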
Answer by Shurima
pip is a package manager for Python packages.
Answer by Abhishek
Install both and you are done:
Binaries from: https://github.com/UB-Mannheim/tesseract/wiki
Python wrapper from: https://pypi.python.org/pypi/pytesseract