Source: Stack Overflow, http://stackoverflow.com/questions/1449139/. This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and cite the original URL.

Simple object recognition

Tags: python, image-processing, computer-vision, pattern-recognition

Asked by Gökhan Sever

===SOLVED===

Thanks for your suggestions and comments. By working on the flood_fill algorithm given in the Beginning Python Visualization book (Chapter 9 - Image Processing), I have implemented what I wanted. I can count the objects, get the enclosing rectangle for each object (and therefore its height and width), and finally construct a NumPy array or matrix for each of them.

Although it is not an optimized approach, it does what I want. The source code (lab2.py) and the PNG file (lab2-particles.png) that I use are available at http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450.

You need NumPy and PIL installed, plus matplotlib to see the histogram. The core of the code is the objfind function, where the main recursive object search takes place.

One further update:

SciPy's ndimage.label() does exactly what I want, too.
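
For readers who want to try this route, here is a minimal sketch of labelling with scipy.ndimage and cutting each object out as its own array. The small test image is made up for illustration (it is not the particle image from the question), and the variable names are assumptions, not taken from lab2.py.

```python
import numpy as np
from scipy import ndimage

# Made-up binary image: 1 marks a shadowed pixel, 0 is background.
image = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
])

# label() gives every connected group of non-zero pixels its own integer.
# The default structuring element joins pixels that share an edge.
labeled, num_objects = ndimage.label(image)

# find_objects() returns one tuple of slices (a bounding box) per label.
boxes = ndimage.find_objects(labeled)

# Cut each object out of its bounding box as a separate 0/1 array.
objects = [(labeled[box] == i + 1).astype(int) for i, box in enumerate(boxes)]

print(num_objects)
for obj in objects:
    print(obj.shape)
```

The height and width of each object are simply the shape of its cut-out array, so the enclosing rectangles come for free.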

Cheers to David Warde-Farley and Zachary Pincus from the NumPy and SciPy mailing lists for pointing this right into my eyes :)

=============

Hello,

I have an image that contains the shadows of ice particles measured by a particle spectrometer. I want to be able to identify each object, so that I can later classify and use them further in my calculations.

In essence, what I want to do is implement a simple fuzzy selection tool with which I can select each entity.

How could I easily solve this problem? (Preferably using Python)

Thanks.

NOTE: In my question I refer to each group of connected pixels as an object or entity. My intention is to extract them and create NumPy array representations as shown below. (Here I am using the top-left object: 1 where a pixel exists, 0 where it does not. This object's shape is 3 by 3, which corresponds to 3 pixels high by 3 pixels wide. These are projections of real ice particles onto a 2D domain, under the assumption that they are spherical with an equivalent radius of (height+width)/2; some scaling from pixels to actual sizes and volume calculations will follow later.)

import numpy as np

np.array([[1,1,1], [1,1,1], [0,0,1]])

array([[1, 1, 1],
       [1, 1, 1],
       [0, 0, 1]])
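
As a small illustration of the note above, the bounding-box size and the equivalent radius can be read straight off such an array. This is only a sketch in pixel units; the variable names are assumptions, and no pixel-to-size scaling is applied because the question does not give one.

```python
import numpy as np

# The top-left object from the question, as a 0/1 array.
particle = np.array([[1, 1, 1],
                     [1, 1, 1],
                     [0, 0, 1]])

height, width = particle.shape            # bounding-box size in pixels
equivalent_radius = (height + width) / 2  # the definition used in the question
pixel_count = int(particle.sum())         # number of shadowed pixels

print(height, width, equivalent_radius, pixel_count)
```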

Here is a section from the image which I am going to use.

screenshot http://img43.imageshack.us/img43/2327/particles.png

Answer by ChrisW

  1. Scan every square (e.g. from the top-left, left-to-right, top-to-bottom)

  2. When you hit a blue square then:

    a. Record this square as a location of a new object

    b. Find all the other contiguous blue squares (e.g. by looking at the neighbours of this square, and the neighbours of those neighbours, etc.) and mark them as being part of the same object

  3. Continue to scan

  4. When you find another blue square, test to see whether it's part of a known object before going to step 2; alternatively in step 2b, erase any square after you've associated it with an object
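
The steps above can be sketched in plain Python roughly as follows. This is one possible reading, not ChrisW's own code: it assumes "contiguous" means pixels sharing an edge, treats any non-zero value as a "blue square", and the name find_blobs is made up for the example.

```python
def find_blobs(grid):
    """Return one bounding box (top, left, bottom, right) per connected object.

    `grid` is a list of rows of 0/1 values. Pixels are joined through shared
    edges (4-connectivity), which is an assumption about "contiguous".
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):                      # step 1: scan top-to-bottom,
        for c in range(cols):                  # left-to-right
            if not grid[r][c] or seen[r][c]:   # step 4: skip known pixels
                continue
            seen[r][c] = True                  # step 2a: a new object starts here
            stack = [(r, c)]
            top, left, bottom, right = r, c, r, c
            while stack:                       # step 2b: collect its neighbours
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


grid = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
]
boxes = find_blobs(grid)
print(len(boxes), boxes)
```

An explicit stack is used instead of recursion so that a large object cannot run into Python's recursion limit.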

Answer by Amro

Looking at the image you provided, all you need to do next is to apply a simple region growing algorithm.

If I were using MATLAB, I would use the bwlabel/bwboundaries functions. I believe there's an equivalent function somewhere in NumPy, or you could use OpenCV with its Python wrappers, as suggested by @kwatford.

Answer by behindthefall

I used to do this kind of analysis on micrographs and eventually put everything I needed into an image processing and analysis package written in C, driven via Tcl. (It worked with 512 x 512 images only, which explains why 512 crops up so often. There were images with pixels of various sizes allocated, but most of the work was done with 8-bit pixels, which explains why there is that business of 0xff and maximum meaningful count of 254 on an image.)

Briefly, the 'zz' at the beginning of the Tcl commands sends the remainder of the line to the package's parser, which calls the appropriate C routine with the given arguments. Right after the 'zz' is an argument that indicates the input and output of the command. (There can be multiple inputs but only a single output.) 'r' indicates a 512 x 512 x 8-bit image. The third word is the name of the command to be invoked; 'graphs' marks up an image as described in the text below. So, 'zz rr graphs' means 'Call the ZZ parser; input an r image to the graphs command and get back an r image.' The rest of the Tcl command line specifies which of the pre-allocated images to use. (The 'g' image is an ROI, i.e., region-of-interest, image; almost all ZZ ops are done under ROI control.) So, 'r1 r1 g8' means 'Use r1 as input, use r1 as output (that is, mark up the input image itself), and do the operation wherever the corresponding pixel on image g8 (that is, r8, used as an ROI) is >0.'

I don't think it is available online anywhere, but if you want to pick through the source code or even compile the whole shebang, I'll be happy to send it to you. Here's an excerpt from the manual (but I think I see some errors in the manual at this late date --- that's embarrassing ...):

Example 6. Counting features.

Problem

Counting is a common task. The items counted are called “features”, and it is usually necessary to prepare images carefully so that features correspond in a one-to-one way with things that are the real objects to be counted. Here, however, we ignore image preparation and consider, instead, the mechanics of counting. The first counting exercise is to find out how many features are on the images in the directory ./cells.

Approach

First, let us define “feature”. A feature is the largest group of “set” (non-zero) pixels all of which can be reached by travelling from one set pixel to another along north-south-east-west (up-down-right-left) routes, starting from a given set pixel. The zz command that detects and marks such features on an image is “zz rr graphs R:src R:dest G:ROI”, so called because the mathematical term for such a feature is a “graph”. If all the pixels on an image are set, then there is only a single graph on the image, but it contains 262144 pixels (512 * 512). If pixels are set and clear (equal to zero) in a checkerboard pattern, then there will be 131072 (512 * 512 / 2) graphs, but each will contain only one pixel.

Briefly explained, “zz rr graphs” starts in the upper-left corner of an image and scans each succeeding row left to right until it finds a set pixel, then finds all the set pixels attached to that through north, south, east, or west borders (“4-connected”). It then sets all pixels in that graph to 1 (0x01). After finding and marking graph 1, it starts scanning again at the pixel after the one where it first discovered graph 1, this time ignoring any pixels that already belong to a graph.

The first 254 graphs that it finds will be marked uniquely; all graphs found after that, however, will be marked with the value 255 (0xff) and so cannot be distinguished from each other. The key to being able to count any number of graphs accurately is to process each image in stages, that is, find the number of graphs on an image and, if the number is greater than 254, erase the 254 graphs just found, repeating the process until 254 or fewer graphs are found. The Tcl language provides the means to set up control of this operation.
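
The two extreme cases in the definition (every pixel set, and a checkerboard) can be checked on a small image. The sketch below uses scipy.ndimage.label as a stand-in for “zz rr graphs”, which is an assumption since the ZZ package is not available, and an 8 x 8 image in place of 512 x 512.

```python
import numpy as np
from scipy import ndimage

N = 8  # small stand-in for the manual's 512

all_set = np.ones((N, N), dtype=int)
checkerboard = np.indices((N, N)).sum(axis=0) % 2  # 1 where row + col is odd

# Default structure: pixels are joined only through shared edges (4-connected).
_, n_all = ndimage.label(all_set)
_, n_checker = ndimage.label(checkerboard)

# For contrast, a full 3 x 3 structure also joins diagonal neighbours (8-connected).
_, n_checker8 = ndimage.label(checkerboard, structure=np.ones((3, 3)))

print(n_all, n_checker, n_checker8)
```

Under 4-connectivity the checkerboard falls apart into N * N / 2 single-pixel features, while allowing diagonal neighbours would merge them into one, which is why the choice of connectivity matters for the count.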

Let us begin to build the commands needed by reading a ZZ image file into an R image and detecting and marking the graphs. Before the processing loop, we declare and zero a variable to hold the total number of features in the image series. Within the processing loop, we begin by reading the image file into an R image and detecting and marking the graphs.

zz ur to $inDir/$img r1
zz rr graphs r1 r1 g8

Next, we zero some variables to keep track of the counts, then use the “ra max” command to find out whether more than 254 graphs were detected.

set nGraphs [ zz ra max r1 a1 g1 ]

If nGraphs does equal 255, then the 254 accurately counted graphs should be added to the total, the graphs from 1 through 254 should be erased, and the count repeated for as many times as it takes to reduce the number of graphs below 255.

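while {$nGraphs == 255} {
  incr sumGraphs 254
  # Erase graphs 1 through 254 (threshold assumed to be 255), keeping the overflow pixels.
  zz rbr lt r1 255 r1 g1 0 255
  # Reset the per-pass count; sumGraphs keeps the running total.
  set nGraphs 0
  zz rr graphs r1 r1 g8
  set nGraphs [ zz ra max r1 a1 g8 ]
}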

When the “while” loop exits, the variable nGraphs must hold a number less than 255, that is, a number of accurately counted graphs; this is added to the rising total of the number of features in the image series.

incr sumGraphs $nGraphs
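
For comparison, the staged counting can be emulated in Python. This is a rough sketch under stated assumptions, not a port of the ZZ package: scipy.ndimage.label stands in for “zz rr graphs”, the 8-bit limit is imitated by capping labels at 255, and the function names label_8bit and count_in_stages are made up for the example.

```python
import numpy as np
from scipy import ndimage


def label_8bit(binary):
    """Label features, saturating at 255 the way an 8-bit image would."""
    labeled, _ = ndimage.label(binary)
    return np.minimum(labeled, 255)


def count_in_stages(binary):
    """Count features 254 at a time, erasing the ones already counted."""
    total = 0
    labels = label_8bit(binary)
    n = int(labels.max())
    while n == 255:
        total += 254              # labels 1..254 were counted accurately
        binary = labels == 255    # erase them, keep only the overflow pixels
        labels = label_8bit(binary)
        n = int(labels.max())
    return total + n              # the last pass found fewer than 255


# 400 isolated pixels: every other pixel on every other row of a 40 x 40 image.
image = np.zeros((40, 40), dtype=int)
image[::2, ::2] = 1
print(count_in_stages(image))
```

With 400 features, the first pass counts 254 and leaves 146 marked 255; the second pass labels those 146 individually, so the loop ends after one iteration.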

After the processing loop, print out the total number of features found in the series.

puts "Total number of features in $inDir \
images $beginImg through $endImg is $sumGraphs."

Answer by Dima

Connected component analysis may be what you are looking for.

Answer by kwatford

OpenCV has a Python interface that you might find useful.