How to read time from recorded surveillance camera video?

Source: Stack Overflow, http://stackoverflow.com/questions/4503475/ (CC BY-SA 4.0; attribute the original authors)

Tags: java, image-processing, computer-vision, ocr, video-processing

Asked by stressed_geek

I have a problem where I have to read the time of recording from the video recorded by a surveillance camera.

The time shows up in the top-left area of the video. Below is a link to a screen grab of the area that shows the time. Also, the digit color (white/black) keeps changing over the course of the video.

Screen grab: http://i55.tinypic.com/2j5gca8.png

Please guide me in the direction to approach this problem. I am a Java programmer so would prefer an approach through Java.

EDIT: Thanks, unhillbilly, for the comment. I had looked at the Ron Cemer OCR library, and its performance is well below our requirement.

Since the OCR performance is lower than we need, I was planning to build a character set from screen grabs of all the digits, and then use an image/pixel comparison library to compare the time shown in each frame against that character set, giving a probabilistic result for each comparison.

So I am looking for a good image comparison library (I would be OK with a non-Java library that I can run from the command line). Any advice on the above approach would also be really helpful.

Accepted answer by Adi Shavit

It doesn't seem like you need a full-blown OCR here.
I presume that the numbers are always in the same position in the image. You only expect digits 0-9 at each of the known positions (in either black or white).
Simple template matching at each position against each of the digits (you'll have 20 templates: the 10 digits in each color) is very fast (real-time) and should give you very accurate results.
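
A minimal Java sketch of this per-position template matching, assuming the frame and the 20 digit templates are already loaded as BufferedImage objects and the top-left corner of each digit position is known. The names DigitTemplateMatcher, difference and matchDigit are hypothetical, and the sum-of-absolute-differences score is just one simple choice of similarity measure, not something prescribed by the answer:

```java
import java.awt.image.BufferedImage;

class DigitTemplateMatcher {

    /** Gray level (0-255) of the pixel at (x, y), as the mean of r, g and b. */
    static int gray(BufferedImage img, int x, int y) {
        int rgb = img.getRGB(x, y);
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r + g + b) / 3;
    }

    /** Sum of absolute gray differences between the template and the frame patch at (x0, y0). */
    static long difference(BufferedImage frame, int x0, int y0, BufferedImage template) {
        long sum = 0;
        for (int y = 0; y < template.getHeight(); y++) {
            for (int x = 0; x < template.getWidth(); x++) {
                sum += Math.abs(gray(frame, x0 + x, y0 + y) - gray(template, x, y));
            }
        }
        return sum;
    }

    /**
     * Returns the digit (0-9) whose template differs least from the patch at (x0, y0).
     * Assumption: templates[i] depicts digit i % 10, so an array of 20 can hold
     * the white set (indices 0-9) followed by the black set (indices 10-19).
     */
    static int matchDigit(BufferedImage frame, int x0, int y0, BufferedImage[] templates) {
        int best = -1;
        long bestDiff = Long.MAX_VALUE;
        for (int i = 0; i < templates.length; i++) {
            long d = difference(frame, x0, y0, templates[i]);
            if (d < bestDiff) {
                bestDiff = d;
                best = i % 10;
            }
        }
        return best;
    }
}
```

Calling matchDigit once per digit position and concatenating the results gives the timestamp for a frame. The lowest difference score could also be compared against a threshold to flag frames where no template matches well.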

Answer by Ron Cemer

Java OCR will work perfectly for your situation (Ron Cemer here). All you need to do is remove the background image, or make it always be less than 50% white, so that the white characters will be white and the background will be black when the image is converted to monochrome.

Train JavaOCR on the font, extract that rectangular region from the image, remove the background and you're off and running.

I suggest an algorithm which looks at r,g,b and sets everything to black where r,g,b are not exactly the same values. That will leave only pixels which are perfect shades of gray. Since the image is color and the digits are monochrome, that will leave the digits and some dust.

JavaOCR wants to see black characters on a white background, so once you've done the above, you'll also need to invert the monochrome image (white = black and vice-versa). Then run that through the JavaOCR library, passing it reference samples of all of the characters you expect it to recognize, and your problem should be (at least mostly) solved.
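
The preprocessing described above (drop colored pixels, convert to monochrome, invert) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class and method names (GrayDigitFilter, isolateDigits) are hypothetical, the 50% threshold is taken as a gray level of 128, and the result would still have to be passed to the OCR library separately. Like the description, it handles the white-digit case:

```java
import java.awt.image.BufferedImage;

class GrayDigitFilter {

    /**
     * Returns a black-on-white image in which only bright, perfectly gray
     * pixels of the source (r == g == b, level >= 128) are drawn in black.
     */
    static BufferedImage isolateDigits(BufferedImage src) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Step 1: pixels whose r, g, b are not exactly equal become black.
                int level = (r == g && g == b) ? r : 0;
                // Step 2: convert to monochrome with a 50% threshold.
                boolean white = level >= 128;
                // Step 3: invert, so white digits end up black on a white background.
                out.setRGB(x, y, white ? 0x000000 : 0xFFFFFF);
            }
        }
        return out;
    }
}
```

With compressed video the digit pixels may not be exactly gray, so the equality test in step 1 might need a small tolerance in practice; that would be a deviation from the exact-match rule described above.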

Answer by 3Dave

What format is the source in (VHS, DVD, stills)? It's possible that the time stamp is encoded in the data.

Update with more detail

While I completely understand the desire to have an automated end-to-end process (especially if you're selling this app as opposed to creating an in-house tool), it'd be more efficient to have someone manually enter the start time for each video (even if there are hundreds of them) than to spend weeks of coding getting this to work automatically.

What I'd do (failing a simple, very-fast-to-implement, super-accurate OCR solution which I don't believe exists):

Create a couple of database tables, like

video           video_group
-------         -----------
id              id
filename        title
start_time      date_created
group_id        date_modified
date_created    date_deleted
date_modified
date_deleted

video_group might contain

id| title
-----------
1 | Unassigned
2 | 711 Mockingbird @ 75
3 | Kroger storage room

video would be prepopulated with the video filenames by an import script. Initially assign everything a group_id of 1 (Unassigned).
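
As a rough illustration of that import step, here is a small in-memory Java sketch. It is not tied to any particular database: the VideoImport and Video classes and the importVideos method are hypothetical, only a few of the columns from the video table are mirrored, and start_time is left empty (null) until someone enters it:

```java
import java.util.ArrayList;
import java.util.List;

class VideoImport {

    /** id of the "Unassigned" row in video_group. */
    static final int UNASSIGNED_GROUP_ID = 1;

    /** One row of the video table (a subset of its columns). */
    static final class Video {
        final int id;
        final String filename;
        String startTime;   // null until entered by the user
        int groupId;

        Video(int id, String filename, int groupId) {
            this.id = id;
            this.filename = filename;
            this.groupId = groupId;
        }
    }

    /** Creates one Video row per filename, all assigned to the Unassigned group. */
    static List<Video> importVideos(List<String> filenames) {
        List<Video> rows = new ArrayList<>();
        int nextId = 1;
        for (String name : filenames) {
            rows.add(new Video(nextId++, name, UNASSIGNED_GROUP_ID));
        }
        return rows;
    }
}
```

A real import script would write these rows to the video table instead of returning a list.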

Create a simple WinForms or WPF app (pardon my ASCII art):

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
|  Group: [=========]\/ [New group...]                            |
|                                                                 |
|  File:  [=========]\/                                           |
|                                                                 |
|  Preview                                                        |
|  |--------------------------------------| [Next Video]          |
|  | (first frame of selected video here) | [Prev]                |
|  |                                      |                       |
|  |                                      |                       |
|  |                                      |                       |
|  |--------------------------------------|                       |
|  Start Time                                                     |
|  [(enter start time value here as displayed on preview frame)]  |
|                                                                 |
|  [Update]                                                       |
-------------------------------------------------------------------

Anybody could do this (a secretary, a janitor, even a recent CS graduate). All the user has to do is read the time from the preview frame, type it into the Start Time field, and click "Update" or "Next Video" to update the database and move on to the next one. Keep the Group selection from one video to the next unless the user changes it.

Assuming it takes the user 30 seconds to read, type, and click Next, they could complete 100-150 videos in an hour (call it 75 for a more realistic estimate). And interns are a lot cheaper than developer time.

If you really have "hundreds" of videos, it'll still be faster to do it this way than to screw around with OCR. Even if the OCR works for the most part, you'll most likely need someone to manually inspect everything to check that the results are correct, which raises the question: why bother with the OCR?

Answer by Steve-o

Try Tesseract from Google; there are a couple of JNI wrappers available. Be sure to read the FAQ on how to restrict recognition to digits only.