Using tesseract to recognize license plates on iOS
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/19268648/
Asked by unicorn80
I'm developing an app which can recognize license plates (ANPR). The first step is to extract the license plates from the image. I am using OpenCV to detect the plates based on their width/height ratio, and this works pretty well.
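For readers who want to see what a width/height-ratio check might look like, here is a minimal sketch. It is not the asker's code: the `Box` type, the function names, and the tolerance handling are hypothetical, and the target ratio of roughly 4.7 used in the usage note is an assumed value for a 520 mm x 110 mm plate.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Candidate bounding box in pixels (hypothetical type for this sketch).
struct Box {
    int width;
    int height;
};

// True when width/height lies within a relative `tolerance` of `targetRatio`.
bool isPlateLike(const Box& b, double targetRatio, double tolerance) {
    assert(targetRatio > 0.0 && tolerance >= 0.0);
    if (b.width <= 0 || b.height <= 0) return false;
    const double ratio = static_cast<double>(b.width) / b.height;
    return std::fabs(ratio - targetRatio) <= tolerance * targetRatio;
}

// Keep only the candidates whose aspect ratio looks like a plate.
std::vector<Box> filterPlateCandidates(const std::vector<Box>& boxes,
                                       double targetRatio, double tolerance) {
    std::vector<Box> kept;
    for (const Box& b : boxes) {
        if (isPlateLike(b, targetRatio, tolerance)) kept.push_back(b);
    }
    return kept;
}
```

For example, `filterPlateCandidates(boxes, 4.7, 0.2)` would keep boxes whose ratio is within 20% of 4.7. Both numbers are assumptions and should be replaced with the real dimensions of the plates you expect.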
But as you can see, the OCR results are pretty bad.
I am using tesseract in my Objective C (iOS) environment. These are my init variables when starting the engine:
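// Init the tesseract engine.
tesseract = new tesseract::TessBaseAPI();
int initRet = tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], [language UTF8String]);
// Init() returns 0 on success and -1 on failure.
if (initRet != 0) { NSLog(@"tesseract Init failed: %d", initRet); }
// Restrict recognition to the characters that can appear on a plate.
tesseract->SetVariable("tessedit_char_whitelist", "BCDFGHJKLMNPQRSTVWXYZ0123456789-");
// Language-model penalties for words that are not in the dictionary.
tesseract->SetVariable("language_model_penalty_non_freq_dict_word", "1");
tesseract->SetVariable("language_model_penalty_non_dict_word", "1");
// Do not load the system dictionary.
tesseract->SetVariable("load_system_dawg", "0");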
How can I improve the results? Do I need to let OpenCV do more image manipulation? Or is there something I can improve with tesseract?
Answered by Alex I
Two things will fix this completely:
1. Remove everything which is not text from the image. You need to use some CV to find the plate area (for example by color, etc.) and then mask out all of the background. You want the input to tesseract to be black and white, where text is black and everything else is white.
2. Remove skew (as mentioned by FrankPI above). tesseract is actually supposed to work okay with skew (see the "Tesseract OCR Engine" overview by R. Smith), but it doesn't always work, especially if you have a single line as opposed to a few paragraphs. So removing skew manually first is always good, if you can do it reliably. You will probably know the exact shape of the bounding trapezoid of the plate from step 1, so this should not be too hard. In the process of removing skew, you can also remove perspective: all license plates (usually) have the same font, and if you scale them to the same (perspective-free) shape, the letter shapes will be exactly the same, which helps text recognition.
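As a rough illustration of step 1, here is a minimal sketch that turns a grayscale plate crop into pure black and white. It uses Otsu's method to pick the threshold, which is my choice for the sketch and not something the answer prescribes. It assumes the crop already contains only the plate, that the text is darker than the plate background, and that pixels arrive as a flat vector of 8-bit gray values; the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Otsu's method: choose the threshold that maximizes between-class variance.
// Pixels <= threshold form the dark class, the rest the light class.
int otsuThreshold(const std::vector<std::uint8_t>& gray) {
    assert(!gray.empty());
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t v : gray) ++hist[v];

    const double total = static_cast<double>(gray.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += i * static_cast<double>(hist[i]);

    double sumDark = 0.0, weightDark = 0.0, bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightDark += static_cast<double>(hist[t]);
        sumDark += t * static_cast<double>(hist[t]);
        if (weightDark == 0.0) continue;      // no dark pixels yet
        const double weightLight = total - weightDark;
        if (weightLight == 0.0) break;        // no light pixels left
        const double meanDark = sumDark / weightDark;
        const double meanLight = (sumAll - sumDark) / weightLight;
        const double diff = meanDark - meanLight;
        const double var = weightDark * weightLight * diff * diff;
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}

// Dark pixels (assumed to be text) become 0 (black), everything else 255 (white).
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray) {
    const int t = otsuThreshold(gray);
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = gray[i] <= t ? 0 : 255;
    }
    return out;
}
```

A global threshold like this tends to break down under uneven lighting or shadows, so treat it as a starting point. If the plate has light text on a dark background, the output would need to be inverted before OCR.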
Some further pointers...
Don't try to code this at first: take a picture of a plate that is really easy to OCR (i.e. from directly in front, no perspective), edit it in Photoshop (or GIMP) and run it through tesseract on the command line. Keep editing in different ways until this works. For example: select by color (or flood-select the letter shapes), fill with black, invert the selection, fill with white, apply a perspective transform so the corners of the plate form a rectangle, etc. Then take a bunch of pictures, some of them harder (maybe from odd angles, etc.), and do this with all of them. Once this works completely, think about how to make a CV algorithm that does the same thing you did in Photoshop :)
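Once the manual experiment works, the perspective-transform step can be sketched in code. The following is a minimal, library-free illustration of computing the 3x3 homography that maps the four detected plate corners onto a rectangle. The `Point` and `Homography` types and both function names are hypothetical, and the corner coordinates are assumed to come from your own detection step. In practice OpenCV's `getPerspectiveTransform` and `warpPerspective` do this job.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

struct Point { double x, y; };
using Homography = std::array<double, 9>;  // row-major 3x3 with h[8] fixed to 1

// Solve for the homography that maps src[i] onto dst[i] for four point pairs.
Homography homographyFromCorners(const std::array<Point, 4>& src,
                                 const std::array<Point, 4>& dst) {
    // Two linear equations per point pair, eight unknowns h[0]..h[7].
    double a[8][9] = {};
    for (int i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y, u = dst[i].x, v = dst[i].y;
        const double r0[9] = {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
        const double r1[9] = {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        for (int j = 0; j < 9; ++j) { a[2 * i][j] = r0[j]; a[2 * i + 1][j] = r1[j]; }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r) {
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        }
        assert(std::fabs(a[pivot][col]) > 1e-12);  // fails for degenerate corners
        for (int j = 0; j < 9; ++j) std::swap(a[col][j], a[pivot][j]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int j = col; j < 9; ++j) a[r][j] -= f * a[col][j];
        }
    }
    Homography h{};
    for (int i = 0; i < 8; ++i) h[i] = a[i][8] / a[i][i];
    h[8] = 1.0;
    return h;
}

// Apply the homography to one point (projective division included).
Point applyHomography(const Homography& h, Point p) {
    const double w = h[6] * p.x + h[7] * p.y + h[8];
    return {(h[0] * p.x + h[1] * p.y + h[2]) / w,
            (h[3] * p.x + h[4] * p.y + h[5]) / w};
}
```

To actually warp the image you would normally compute the mapping in the other direction (rectangle to trapezoid) and sample the source image for every output pixel, so that the result has no holes.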
P.S. Also, it is better to start with a higher-resolution image if possible. It looks like the text in your example is around 14 pixels tall. tesseract works pretty well with 12 point text at 300 dpi, which is about 50 pixels tall, and it works much better at 600 dpi. Try to make your letters at least 50, preferably 100, pixels tall.
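The arithmetic behind those numbers is simple enough to sketch. One point is 1/72 inch, so the nominal pixel height is points x dpi / 72. The function names below are hypothetical, and the point size is the nominal font size, so actual glyphs are somewhat shorter.

```cpp
#include <cassert>

// Nominal height in pixels of text set at `points` and scanned at `dpi`
// (1 point = 1/72 inch).
double pointsToPixels(double points, double dpi) {
    assert(dpi > 0.0);
    return points * dpi / 72.0;
}

// Factor by which an image must be enlarged so that letters of height
// `currentPx` reach `targetPx`.
double upscaleFactor(double currentPx, double targetPx) {
    assert(currentPx > 0.0);
    return targetPx / currentPx;
}
```

So `pointsToPixels(12, 300)` gives 50 and `pointsToPixels(12, 600)` gives 100, matching the figures above, while `upscaleFactor(14, 50)` shows that a 14-pixel-tall plate text would need to be enlarged roughly 3.6 times. Upscaling a small crop does not add detail, so a higher-resolution source image is still preferable.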
P.P.S. Are you doing anything to train tesseract? I think you have to; the font here is different enough to be a problem. You probably also need something to recognize (and not penalize) dashes, which will be very common in your texts. In the second example it looks like "T-" is recognized as H.
Answered by guneykayim
I don't know tesseract very well, but I do know a bit about OCR. Here we go.
- In an OCR task you need to be sure that your training data has the same font as the text you are trying to recognize. If you are trying to recognize multiple fonts, make sure all of them are in your training data to get the best performance.
- As far as I know, tesseract can apply OCR in a few different ways. In one, you give it an image that contains multiple letters and let tesseract do the segmentation. In another, you give tesseract already-segmented letters and only expect it to recognize each letter. Maybe you can try switching from the one you are using to the other.
- If you are training the recognizer yourself, make sure your training data contains enough samples of each letter, in roughly equal amounts.
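The last point is easy to check mechanically. Here is a minimal sketch that counts how often each character of the expected alphabet occurs in the training labels and reports the characters that fall below a minimum. The function names, the alphabet, and the threshold are all hypothetical and not tied to any tesseract training tool.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Count occurrences of every alphabet character across all training labels.
// Characters that never occur are still present, with a count of 0.
std::map<char, std::size_t> countLabels(const std::vector<std::string>& labels,
                                        const std::string& alphabet) {
    assert(!alphabet.empty());
    std::map<char, std::size_t> counts;
    for (char c : alphabet) counts[c] = 0;
    for (const std::string& label : labels) {
        for (char c : label) {
            if (counts.count(c) != 0) ++counts[c];  // ignore characters outside the alphabet
        }
    }
    return counts;
}

// Characters with fewer than `minSamples` occurrences, in ascending char order.
std::string underrepresented(const std::map<char, std::size_t>& counts,
                             std::size_t minSamples) {
    std::string out;
    for (const auto& entry : counts) {
        if (entry.second < minSamples) out += entry.first;
    }
    return out;
}
```

With the whitelist from the question as the alphabet, the returned string tells you which characters need more training samples before the data can be considered balanced.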
Hope this helps.
Answered by chroman
I've been working on an iOS app. If you need to improve the results, you should train tesseract OCR; this improved results by 90% for me. Before training, the OCR results were pretty bad.
So, I used this gist in the past to train tesseract OCR with a licence plate font.
If you are interested, I open-sourced this project some weeks ago on GitHub.
Answered by valentt
Here is my real-world example of trying out OCR on my old power meter. I would like to use your OpenCV code so that OpenCV crops the image automatically, and I'll write the image-cleaning scripts.
- The first image is the original image (cropped power meter numbers)
- The second image is slightly cleaned up in GIMP, with around 50% OCR accuracy in tesseract
- The third image is completely cleaned - 100% recognized by OCR without any training!
Answered by Prabhjot Singh Gogana
License plates can now be recognized easily with an mlmodel. I have created a Core ML model; you can find it here. You just need to split the plate into characters at 28x28 resolution with the Vision framework and pass each image to a VNImageRequestHandler, as shown below:
let handler = VNImageRequestHandler(cgImage: imageUI.cgImage!, options: [:])
You will get the desired results by using my Core ML model. Use this link for a more detailed explanation, but use my model for better results in license plate recognition. I have also created the mlmodel for License Plate Recognition.