Python - Calculate histogram of image

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/22159160/

Date: 2020-08-19 00:23:37  Source: igfitidea

Tags: python, opencv, numpy, histogram

Asked by Jason

I'm working on teaching myself the basics of computerized image processing, and I am teaching myself Python at the same time.

Given an image x of dimensions 2048x1354 with 3 channels, efficiently calculate the histogram of the pixel intensities.

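import numpy as np, cv2 as cv

img = cv.imread("image.jpg")
bins = np.zeros(256, np.int32)

for i in range(0, img.shape[0]):
    for j in range(0, img.shape[1]):

        intensity = 0
        for k in range(0, len(img[i][j])):
            # convert to a Python int so the uint8 sum cannot overflow
            intensity += int(img[i][j][k])

        # integer division: the bin index must be an integer in Python 3
        bins[intensity // 3] += 1

print(bins)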

My issue is that this code runs pretty slowly, taking about 30 seconds. How can I speed this up and make it more Pythonic?

Answered by Wesley Bowman

Take a look at MatPlotLib. This should take you through everything you want to do, and without the for loops.

Answered by amaurea

If you just want to count the number of occurrences of each value in an array, numpy can do that for you using numpy.bincount. In your case:

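import numpy as np
arr  = np.asarray(img)
flat = arr.reshape(np.prod(arr.shape[:2]), -1)  # one row per pixel
# integer mean per pixel, summed as int64 so uint8 values cannot overflow
mean = flat.sum(axis=1, dtype=np.int64) // flat.shape[1]
bins = np.bincount(mean, minlength=256)  # the keyword is minlength, not minsize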

I'm using numpy.asarray here to make sure that img is a numpy array, so I can flatten it to the one-dimensional array bincount needs. If img is already an array, you can skip that step. The counting itself will be very fast. Most of the time here will probably be spent in converting the cv matrix to an array.

Edit: According to this answer, you may need to use numpy.asarray(img[:,:]) (or possibly img[:,:,:]) in order to successfully convert the image to an array. On the other hand, according to this, what you get out from newer versions of OpenCV is already a numpy array. So in that case you can skip the asarray call completely.
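
To check the bincount approach without an image file, here is a minimal sketch on a small synthetic array. The random image, its 4x5 size, and the variable names are illustrative assumptions, not part of the original answer. It only verifies that the vectorized counts match an explicit per-pixel loop like the one in the question.

```python
import numpy as np

# Synthetic stand-in for an image: 4x5 pixels, 3 channels, dtype uint8 (assumption).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Vectorized: one row per pixel, integer mean of the channels, then count.
flat = img.reshape(-1, img.shape[-1])
mean = flat.sum(axis=1, dtype=np.int64) // flat.shape[1]
bins_fast = np.bincount(mean, minlength=256)

# Reference: the explicit per-pixel loop, written for Python 3.
bins_slow = np.zeros(256, dtype=np.int64)
for row in img:
    for pixel in row:
        bins_slow[sum(int(c) for c in pixel) // 3] += 1

# Both approaches should produce identical counts.
print(np.array_equal(bins_fast, bins_slow))
```

Passing minlength=256 keeps the result at 256 entries even when the brightest values do not occur in the image.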

Answered by Zaw Lin

It's impossible to do this (i.e., speed it up without removing the for loop) in pure Python. Python's for loop construct has too much going on to be fast. If you really want to keep the for loop, the only solution is numba or cython, but these have their own set of issues. Normally, such loops are written in C/C++ (most straightforward in my opinion) and then called from Python, whose main role is that of a scripting language.

Having said that, opencv+numpy provides enough useful routines so that in 90% of cases, it's possible to simply use built-in functions without having to resort to writing your own pixel-level code.

Here's a solution in numba without changing your looping code. On my computer it's about 150 times faster than pure Python.

import numpy as np, cv2 as cv

from time import time
from numba import njit

# njit compiles the function in nopython mode; argument types are inferred on the first call
@njit
def numba(img, bins):
    for i in range(0, img.shape[0]):
        for j in range(0, img.shape[1]):
            intensity = 0
            for k in range(0, len(img[i][j])):
                intensity += int(img[i][j][k])
            bins[intensity // 3] += 1


def python(img, bins):
    for i in range(0, img.shape[0]):
        for j in range(0, img.shape[1]):
            intensity = 0
            for k in range(0, len(img[i][j])):
                # convert to a Python int so the uint8 sum cannot overflow
                intensity += int(img[i][j][k])
            bins[intensity // 3] += 1

img = cv.imread("image.jpg")
bins = np.zeros(256, np.int32)

# warm-up call so that JIT compilation time is not included in the timing
numba(img, bins)
bins[...] = 0

t0 = time()
numba(img, bins)
t1 = time()
#print(bins)
print(t1 - t0)

bins[...] = 0
t0 = time()
python(img, bins)
t1 = time()
#print(bins)
print(t1 - t0)
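
The same kind of comparison can be run without numba or an image file. The sketch below assumes a small synthetic 64x64 image and uses hypothetical helper names (hist_loop, hist_vectorized) that are not from the original answer. It times the per-pixel loop against a NumPy-only version with time.perf_counter; actual timings depend on the machine, so none are quoted here.

```python
from time import perf_counter
import numpy as np

def hist_loop(img):
    """Per-pixel Python loop, as in the question (Python 3 version)."""
    bins = np.zeros(256, dtype=np.int64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            intensity = 0
            for k in range(img.shape[2]):
                intensity += int(img[i, j, k])
            bins[intensity // 3] += 1
    return bins

def hist_vectorized(img):
    """Same histogram computed with whole-array NumPy operations."""
    mean = img.sum(axis=2, dtype=np.int64) // img.shape[2]
    return np.bincount(mean.ravel(), minlength=256)

# Synthetic 64x64 image with 3 uint8 channels (assumption for illustration).
rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

t0 = perf_counter()
slow = hist_loop(img)
t1 = perf_counter()
fast = hist_vectorized(img)
t2 = perf_counter()

print("loop:       %.6f s" % (t1 - t0))
print("vectorized: %.6f s" % (t2 - t1))
print("same result:", np.array_equal(slow, fast))
```

Comparing the two result arrays before reading the timings guards against measuring a fast version that computes something different.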

Answered by Ondro

You can use the newer OpenCV Python interface, which natively uses numpy arrays, and plot the histogram of the pixel intensities using matplotlib's hist. It takes less than a second on my computer.

import matplotlib.pyplot as plt
import cv2

im = cv2.imread('image.jpg')
# calculate mean value from RGB channels and flatten to 1D array
vals = im.mean(axis=2).flatten()
# plot histogram with 255 bins
b, bins, patches = plt.hist(vals, 255)
plt.xlim([0,255])
plt.show()

UPDATE: The number of bins specified above does not always provide the desired result, because min and max are calculated from the actual values. Moreover, counts for values 254 and 255 are summed in the last bin. Here is updated code which always plots the histogram correctly, with bars centered on values 0..255:

import numpy as np
import matplotlib.pyplot as plt
import cv2

# read image
im = cv2.imread('image.jpg')
# calculate mean value from RGB channels and flatten to 1D array
vals = im.mean(axis=2).flatten()
# calculate histogram
counts, bins = np.histogram(vals, range(257))
# plot histogram centered on values 0..255
plt.bar(bins[:-1], counts, width=1, align='center', edgecolor='none')
plt.xlim([-0.5, 255.5])
plt.show()
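
The binning behaviour described in the update can be checked numerically without plotting. This is a small sketch with made-up sample values (an assumption for illustration). It relies on np.histogram deriving the bin edges from the data's minimum and maximum when given an integer bin count, which is also what plt.hist does with an integer bins argument.

```python
import numpy as np

# Made-up intensities (assumption): one pixel at 0, one at 254, two at 255.
vals = np.array([0.0, 254.0, 255.0, 255.0])

# An integer bin count spans [min, max]: 255 bins of width 1 over [0, 255].
counts_a, edges_a = np.histogram(vals, 255)
# The last bin is closed, [254, 255], so 254 and 255 are counted together.
print(counts_a[-1])

# Explicit edges 0, 1, ..., 256 give one bin per integer value.
counts_b, edges_b = np.histogram(vals, np.arange(257))
print(counts_b[254], counts_b[255])

# With a narrower data range the automatic edges no longer start at 0.
_, edges_c = np.histogram(np.array([100.0, 150.0]), 255)
print(edges_c[0], edges_c[-1])
```

With explicit edges, the bins stay aligned to the values 0..255 regardless of which intensities actually occur in the image.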
