Open source OCR on Linux

Question

Asked by Chris

I'm looking for an open source OCR library that runs on Linux. It needs to work with PNGs and PDFs, and I would mostly like to call it from Java or Ruby. Is anything like this available?

Regards.

Answer 1

Answered by Ben Hymanson

Cuneiform is free and does a decent job. You could invoke it as a subprogram, but there's no language binding that I know of. It won't read PDFs directly, but PDFs that are sequences of scanned images are easy to take apart so you can feed the images to Cuneiform. There are also scripts to reassemble the images and text back into a searchable PDF.

Answer 2

Answered by olivierlemasle

Tesseract is a very good OCR engine: https://github.com/tesseract-ocr/tesseract

The project was started at HP Labs and is now continued and sponsored by Google (for Google Books!). It is released under the Apache license, and it runs on Linux. It reads TIFF or PNG files; for PDFs, you will need to convert to one of these formats first. I suppose there is no binding, so you should invoke it as a subprogram.
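
Invoking Tesseract as a subprogram from Java could look roughly like the sketch below. It assumes a `tesseract` binary is on the `PATH` and uses the usual `tesseract <image> <outputbase> -l <lang>` command form, which writes the recognized text to `<outputbase>.txt`. The class and method names (`TesseractCli`, `buildCommand`, `recognize`) are made up for this illustration and are not part of Tesseract.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: wraps the tesseract command-line tool in a subprocess.
final class TesseractCli {

    // Builds the argument list: tesseract <image> <outputBase> -l <lang>
    static List<String> buildCommand(Path image, Path outputBase, String lang) {
        return List.of("tesseract", image.toString(), outputBase.toString(), "-l", lang);
    }

    // Runs tesseract on one image and returns the recognized text.
    // Assumes the tesseract binary is installed and on the PATH.
    static String recognize(Path image, String lang) throws IOException, InterruptedException {
        // tesseract appends ".txt" to the output base name it is given.
        Path outputBase = Files.createTempFile("ocr", "");
        Path textFile = Path.of(outputBase.toString() + ".txt");
        try {
            Process process = new ProcessBuilder(buildCommand(image, outputBase, lang))
                    .inheritIO() // let tesseract's messages show on the console
                    .start();
            int status = process.waitFor();
            if (status != 0) {
                throw new IOException("tesseract exited with status " + status);
            }
            return Files.readString(textFile);
        } finally {
            // Clean up the temporary files whether or not OCR succeeded.
            Files.deleteIfExists(textFile);
            Files.deleteIfExists(outputBase);
        }
    }
}
```

A call such as `TesseractCli.recognize(Path.of("page.png"), "eng")` would then return the text of one page, provided the `eng` language data is installed.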

Answer 3

Answered by nguyenq

Try tesjeract, which uses JNI to call Tesseract OCR API.

For PDFs, you'll need to convert them to images first, using Ghostscript, for instance.
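
The PDF-to-image step could be sketched from Java as follows, assuming the Ghostscript executable is installed as `gs` and that 24-bit PNG output (`png16m` device) at a caller-chosen resolution is acceptable. The class and method names (`GhostscriptCli`, `buildCommand`, `rasterize`) and the `page-%03d.png` naming pattern are choices made for this example only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: rasterizes a PDF into one PNG per page via Ghostscript.
final class GhostscriptCli {

    // Builds the gs argument list. %03d in the output name is expanded by
    // Ghostscript to the page number (page-001.png, page-002.png, ...).
    static List<String> buildCommand(Path pdf, Path outputDir, int dpi) {
        String pattern = outputDir.resolve("page-%03d.png").toString();
        return List.of(
                "gs",
                "-dNOPAUSE", "-dBATCH",  // run non-interactively and exit when done
                "-dSAFER",               // restrict file access from the PDF
                "-sDEVICE=png16m",       // 24-bit colour PNG output
                "-r" + dpi,              // resolution in dots per inch
                "-sOutputFile=" + pattern,
                pdf.toString());
    }

    // Runs Ghostscript; assumes the gs binary is on the PATH.
    static void rasterize(Path pdf, Path outputDir, int dpi)
            throws IOException, InterruptedException {
        Files.createDirectories(outputDir);
        Process process = new ProcessBuilder(buildCommand(pdf, outputDir, dpi))
                .inheritIO()
                .start();
        int status = process.waitFor();
        if (status != 0) {
            throw new IOException("gs exited with status " + status);
        }
    }
}
```

Something like `GhostscriptCli.rasterize(Path.of("scan.pdf"), Path.of("pages"), 300)` would leave one PNG per page in `pages/`, ready to be passed to the OCR engine. 300 dpi is a common choice for scanned text, not a requirement.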
