Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1062578/

Time: 2020-08-11 22:59:12  Source: igfitidea

What kind of OCR Java library should I use in Android?

Tags: java, android, ocr

Asked by systempuntoout

I would like to build an Android application that uses an OCR library to scan a picture and extract the text from it.

What Java library should I use?

Accepted answer by Thilo

Don't know how good it is (it definitely needs to be trained first), but there is Ron Cemer's Java OCR library.

Answer by davetapley

If you are looking for a very extensible option or have a specific problem domain, you could consider rolling your own using the Java Object Oriented Neural Engine (Joone).

I used it successfully in a personal project to identify the letter in an image such as this. You can find all the source for the OCR component of my application on GitHub, here.
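
To give a feel for what "rolling your own" means when the problem domain is as narrow as single letters, here is a minimal sketch. It is not Joone and not a neural network: it swaps in a much simpler nearest-template comparison (Hamming distance over small binary pixel grids). The class name, the '#'/'.' grid encoding, and the fixed grid size are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy single-letter recognizer (hypothetical, for illustration only).
 * It picks the stored template with the fewest differing pixels.
 */
final class TemplateLetterClassifier {
    private final Map<Character, boolean[]> templates = new LinkedHashMap<>();

    /** Registers a glyph drawn as equal-width rows of '#' (ink) and '.' (background). */
    void addTemplate(char letter, String... rows) {
        templates.put(letter, toPixels(rows));
    }

    /** Returns the letter whose template is closest to the input, or '?' if none is comparable. */
    char classify(String... rows) {
        boolean[] input = toPixels(rows);
        char best = '?';
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<Character, boolean[]> entry : templates.entrySet()) {
            boolean[] template = entry.getValue();
            if (template.length != input.length) {
                continue; // only grids of the same size can be compared
            }
            int distance = 0;
            for (int i = 0; i < template.length; i++) {
                if (template[i] != input[i]) {
                    distance++; // count pixels that differ (Hamming distance)
                }
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    /** Flattens the rows into one boolean array, true meaning ink. Assumes all rows share one width. */
    private static boolean[] toPixels(String[] rows) {
        int width = rows[0].length();
        boolean[] pixels = new boolean[rows.length * width];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < width; c++) {
                pixels[r * width + c] = rows[r].charAt(c) == '#';
            }
        }
        return pixels;
    }
}
```

A real recognizer would first have to binarize the picture, segment it into characters, and scale each glyph to the grid size. A trained network, as in the Joone approach, would be expected to cope with noise and font variation far better than fixed templates.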

Answer by raudi

Try Tesseract. Check out this article: http://www.itwizard.ro/interfacing-cc-libraries-via-jni-example-tesseract-163.html and this example: http://code.google.com/p/mezzofanti/

Edit: some more facts:

- Tesseract is one of the best open source OCR engines and is used by Google.
- Training data is available for many languages.
- Mezzofanti is an Android app that uses Tesseract.
- Beware: OCR uses a lot of CPU power. Trying to OCR an A4 page on a T-Mobile G1 will take a long time, and the result may not impress you ;-)
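
Because recognition is CPU-heavy, it usually makes sense to keep it off the thread that drives the UI. The sketch below shows one way to do that with a plain `ExecutorService`. `OcrEngine` and `BackgroundOcr` are hypothetical names invented for this example; `OcrEngine` stands in for whatever backend is actually used (for instance a JNI wrapper around Tesseract) and is not Tesseract's real API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical abstraction over the real OCR backend; not part of any library. */
interface OcrEngine {
    String recognize(byte[] imageBytes);
}

/** Runs recognition on a single worker thread so the calling thread is not blocked. */
final class BackgroundOcr {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final OcrEngine engine;

    BackgroundOcr(OcrEngine engine) {
        this.engine = engine;
    }

    /** Queues one image and returns immediately; the Future completes with the recognized text. */
    Future<String> submit(final byte[] imageBytes) {
        Callable<String> task = () -> engine.recognize(imageBytes);
        return worker.submit(task);
    }

    /** Stops accepting new work; already queued images are still processed. */
    void shutdown() {
        worker.shutdown();
    }
}
```

In an Android app the caller would typically hand the result back to the UI thread once the `Future` completes rather than blocking on `get()`. Shrinking or cropping the picture before recognition is another common way to cut processing time on slow devices.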

Answer by yeradis

You can use the OCR feature from Google Docs. Check the Documents List Data API: http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#OCR
