Image to text in Python

Question

Asked by muazfaiz

I am using Python 3.x and the following code to convert an image into text:

from PIL import Image
from pytesseract import image_to_string

image = Image.open('image.png', mode='r')
print(image_to_string(image))

I am getting the following error:

Traceback (most recent call last):
  File "C:/Users/hp/Desktop/GII/Image_to_text.py", line 12, in <module>
    print(image_to_string(image))
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
    config=config)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
    stderr=subprocess.PIPE)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 950, in __init__
    restore_signals, start_new_session)
  File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 1220, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Please note that I have put the image in the same directory where my Python is located. Also, no error is raised on image = Image.open('image.png', mode='r'), but one is raised on the line print(image_to_string(image)).

Any idea what might be wrong here? Thanks

知道这里可能有什么问题吗？谢谢

Answer 1

Answered by Łukasz Rogalski

You have to have tesseract installed and accessible in your path.

According to the source, pytesseract is merely a wrapper around subprocess.Popen with the tesseract binary as the program to run. It does not perform any kind of OCR itself.
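
To illustrate why a missing binary surfaces as FileNotFoundError, here is a minimal sketch of a thin subprocess.Popen wrapper. It is not pytesseract's actual code; the helper name run_external and the deliberately nonexistent program name are assumptions made up for this example.

```python
import subprocess


def run_external(program, args):
    """Hypothetical helper: launch an external program the way a thin wrapper might.

    Returns (returncode, stdout, stderr), or None if the program cannot be found.
    """
    try:
        proc = subprocess.Popen(
            [program, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except FileNotFoundError:
        # Raised by Popen itself when the executable is not on PATH,
        # before any work on the input has started.
        return None
    out, err = proc.communicate()
    return proc.returncode, out, err


# A program name that is assumed not to exist on any machine.
result = run_external("no-such-ocr-binary-xyz-123", ["--version"])
if result is None:
    print("Executable not found: install it or add it to PATH")
```

The failure happens at the point where the external program is launched, which matches the traceback ending inside subprocess.py rather than in the image-loading code.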

Relevant part of sources:

Answer 2

Answered by AnkurJangra

You need to download the Tesseract OCR installer as well. Use this link to download it: http://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.01.exe

Then, after importing pytesseract, include this line in your code to point it at the tesseract executable. Use a raw string, because in a normal string the \t in \tesseract is read as a tab character: pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

This is the default location where tesseract will be installed.
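
As a rough sketch of a more defensive variant of this step, the executable can be looked up on PATH first, with the install location only as a fallback. The helper name locate_executable is hypothetical, and the fallback path (the location given above with ".exe" appended) is an assumption about a typical install.

```python
import os
import shutil


def locate_executable(name, fallback_paths):
    """Hypothetical helper: return a usable path to `name`, or None."""
    on_path = shutil.which(name)  # searches the directories listed in PATH
    if on_path:
        return on_path
    for candidate in fallback_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


# Assumed install location: the path from the answer with ".exe" appended.
DEFAULT_WINDOWS_PATH = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

tesseract_path = locate_executable("tesseract", [DEFAULT_WINDOWS_PATH])
if tesseract_path is None:
    print("tesseract not found; install it or add it to PATH")
else:
    print("tesseract found at:", tesseract_path)
    # With pytesseract installed, the result could then be assigned:
    # import pytesseract
    # pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

Checking for the file up front gives a clearer message than the WinError 2 raised from inside subprocess.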

That's it. I followed these same steps to run the code on my end.

Hope this will help.

Answer 3

Answered by thrinadhn

Please install the packages below to extract text from PNG/JPEG images:

pip install pytesseract

pip install Pillow

OCR (Optical Character Recognition) is the process of electronically extracting text from images; pytesseract makes it available from Python.

PIL is used for anything from simply reading and writing image files to scientific image processing, geographical information systems, remote sensing, and more.

from PIL import Image
from pytesseract import image_to_string

# Replace the path with the location of your own image; lang='eng' selects English
print(image_to_string(Image.open('/home/ABCD/Downloads/imageABC.png'), lang='eng'))

Answer 4

Answered by prabhakar267

You can try this Python library: https://github.com/prabhakar267/ocr-convert-image-to-text

As mentioned in the package's README, usage is very straightforward.

usage: python main.py [-h] input_dir [output_dir]

positional arguments:
  input_dir
  output_dir

optional arguments:
  -h, --help  show this help message and exit

Answer 5

Answered by stonebig

Your "current" directory is not where you think it is.

==> You can specify the full path to the image, for example: image = Image.open(r'C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\image.png', mode='r')
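
One way to check this suggestion is to print the working directory and build the image path relative to the script instead of relying on it. This is only a sketch: the helper name path_beside is hypothetical, and the file names are the ones from the question.

```python
import os
from pathlib import Path


def path_beside(script_file, filename):
    """Hypothetical helper: absolute path of `filename` in the folder holding `script_file`."""
    return Path(script_file).resolve().parent / filename


# Relative paths such as 'image.png' are resolved against this directory.
print("Working directory:", os.getcwd())

# In a real script, pass __file__ as the first argument; a literal name
# is used here only so that the sketch runs anywhere.
image_path = path_beside("Image_to_text.py", "image.png")
print("Would open:", image_path, "| exists:", image_path.exists())
```

The resulting path can be passed to Image.open in place of the bare file name.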
