Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/38476359/

Time: 2020-08-19 20:49:46  Source: igfitidea

FFT on image with Python

Tags: python, image, fft, dft

Asked by Tatarinho

I have a problem with my FFT implementation in Python: the results are completely strange. I want to open an image, get the RGB value of every pixel, apply an FFT to it, and convert the result back to an image.

My steps:

1) I open the image with the PIL library in Python like this:

from PIL import Image
im = Image.open("test.png")

2) I get the pixels:

pixels = list(im.getdata())

3) I separate every pixel into r, g, b values:

for x in range(width):
    for y in range(height):
        r,g,b = pixels[x*width+y]
        red[x][y] = r
        green[x][y] = g
        blue[x][y] = b

4) Let's assume that I have one pixel (111,111,111). I use an FFT on all the red values like this:

red = np.fft.fft(red)

And then:

print (red[0][0], green[0][0], blue[0][0])

My output is:

(53866+0j) 111 111

I think it's completely wrong. My image is 64x64, and the FFT from GIMP is completely different. Actually, my FFT gives me only arrays with huge values, which is why my output image is black.

Do you have any idea where the problem is?

[EDIT]

I've changed it, as suggested, to:

red= np.fft.fft2(red)

After that I scale it:

scale = 1/(width*height)
red= abs(red* scale)

Still, I get only a black image.

[EDIT2]

OK, so let's take one image, test.png.

Assume that I want to open it and save it as a greyscale image. So I'm doing it like this:

def getGray(pixel):
    r,g,b = pixel
    return (r+g+b)/3  

im = Image.open("test.png")
im.load()

pixels = list(im.getdata())
width, height = im.size
for x in range(width):
    for y in range(height):
        greyscale[x][y] = getGray(pixels[x*width+y])  

data = []
for x in range(width):
     for y in range(height):
         pix = greyscale[x][y]
         data.append(pix)

img = Image.new("L", (width,height), "white")
img.putdata(data)
img.save('out.png')

After this, I get a greyscale image, which is OK. Now I want to apply an FFT to my image before saving it to a new file, so I do this

scale = 1/(width*height)
greyscale = np.fft.fft2(greyscale)
greyscale = abs(greyscale * scale)

after loading it. After saving it to a file, I have a bad FFT. So let's now open test.png with GIMP and use the FFT filter plugin. I get this image, which is the correct FFT.

How can I handle it?

Answered by Ahmed Fasih

Great question. I've never heard of it but the Gimp Fourier plugin seems really neat:

A simple plug-in to do fourier transform on you image. The major advantage of this plugin is to be able to work with the transformed image inside GIMP. You can so draw or apply filters in fourier space, and get the modified image with an inverse FFT.

This idea—of doing Gimp-style manipulation on frequency-domain data and transforming back to an image—is very cool! Despite years of working with FFTs, I've never thought about doing this. Instead of messing with Gimp plugins and C executables and ugliness, let's do this in Python!

Caveat. I experimented with a number of ways to do this, attempting to get something close to the output Gimp Fourier image (gray with moiré pattern) from the original input image, but I simply couldn't. The Gimp image appears to be somewhat symmetric around the middle of the image, but it's not flipped vertically or horizontally, nor is it transpose-symmetric. I'd expect the plugin to be using a real 2D FFT to transform an H×W image into an H×W array of real-valued data in the frequency domain, in which case there would be no symmetry (it's just the to-complex FFT that's conjugate-symmetric for real-valued inputs like images). So I gave up trying to reverse-engineer what the Gimp plugin is doing and looked at how I'd do this from scratch.

The code. Very simple: read an image, apply scipy.fftpack.rfft in the leading two dimensions to get the “frequency-image”, rescale to 0–255, and save.

Note how this is different from the other answers! No grayscaling: the 2D real-to-real FFT happens independently on all three channels. No abs needed: the frequency-domain image can legitimately have negative values, and if you make them positive, you can't recover your original image. (Also a nice feature: no compromises on image size. The size of the array remains the same before and after the FFT, whether the width/height is even or odd.)
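
To see why taking the absolute value loses information, here is a minimal sketch. It swaps in NumPy's complex `np.fft.fft2`/`np.fft.ifft2` instead of the `scipy.fftpack.rfft` used in this answer (an assumption made only to keep the example small), and the random `image` array is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 8x8 grayscale image with values in 0..255.
image = rng.integers(0, 256, size=(8, 8)).astype(float)

freq = np.fft.fft2(image)

# Inverting the untouched frequency data recovers the image.
roundtrip_ok = np.allclose(np.fft.ifft2(freq).real, image)

# Inverting after abs() has thrown away sign/phase does not.
roundtrip_abs = np.allclose(np.fft.ifft2(np.abs(freq)).real, image)

print(roundtrip_ok, roundtrip_abs)
```

The same reasoning applies to the real-valued frequency-image here: its negative entries carry information that the inverse transform needs.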

from PIL import Image
import numpy as np
import scipy.fftpack as fp

## Functions to go from image to frequency-image and back
im2freq = lambda data: fp.rfft(fp.rfft(data, axis=0),
                               axis=1)
freq2im = lambda f: fp.irfft(fp.irfft(f, axis=1),
                             axis=0)

## Read in data file and transform
data = np.array(Image.open('test.png'))

freq = im2freq(data)
back = freq2im(freq)
# Make sure the forward and backward transforms work!
assert(np.allclose(data, back))

## Helper functions to rescale a frequency-image to [0, 255] and save
remmax = lambda x: x/x.max()
remmin = lambda x: x - np.amin(x, axis=(0,1), keepdims=True)
touint8 = lambda x: (remmax(remmin(x))*(256-1e-4)).astype(int)

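def arr2im(data, fname):
    # touint8 already limits values to 0..255; Image.fromarray wants uint8.
    # (putdata needs a sequence, not the iterator that map returns on Python 3.)
    out = Image.fromarray(data.astype(np.uint8))
    out.save(fname)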

arr2im(touint8(freq), 'freq.png')

(Aside: FFT-lover geek note. Look at the documentation for rfft for details, but I used Scipy's FFTPACK module because its rfft interleaves real and imaginary components of a single pixel as two adjacent real values, guaranteeing that the size of the output for any-sized 2D image (even vs odd, width vs height) will be preserved. This is in contrast to Numpy's numpy.fft.rfft2 which, because it returns complex data of size height by width/2+1, forces you to deal with a differently sized array and with deinterleaving complex-to-real yourself. Who needs that hassle for this application?)
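
The shape difference can be checked directly. A minimal sketch with hypothetical all-zero arrays; note that NumPy halves only the last axis, and that the inverse needs the original shape passed in when the width is odd:

```python
import numpy as np

for height, width in [(4, 6), (5, 7)]:
    image = np.zeros((height, width))
    spectrum = np.fft.rfft2(image)
    # (4, 6) -> (4, 4) and (5, 7) -> (5, 4): complex, last axis is width//2 + 1
    print(image.shape, "->", spectrum.shape, spectrum.dtype)

# Without s=..., irfft2 assumes an even width and would return (5, 6) here.
odd = np.zeros((5, 7))
restored = np.fft.irfft2(np.fft.rfft2(odd), s=odd.shape)
print(restored.shape)
```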

Results. Given input named test.png:

test input

this snippet produces the following output (global min/max have been rescaled and quantized to 0-255):

test output, frequency domain

And upscaled:

frequency, upscaled

In this frequency-image, the DC (0 Hz frequency) component is in the top-left corner, and frequencies increase as you move right and down.
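
As a small illustration of what the DC component is, the sketch below uses NumPy's complex `np.fft.fft2` (an assumption; it is not the `scipy.fftpack.rfft` layout used above, although both put 0 Hz at index `[0, 0]`) on a hypothetical image in which every pixel is 111:

```python
import numpy as np

flat = np.full((4, 4), 111.0)   # hypothetical constant image
freq = np.fft.fft2(flat)

dc = freq[0, 0]                 # the 0 Hz term sits at index [0, 0]
print(dc.real, flat.sum())      # both are 16 * 111 = 1776.0

# A constant image has no content at any other frequency.
rest = freq.copy()
rest[0, 0] = 0
print(np.allclose(rest, 0))     # True
```

Because the unnormalized DC term is the sum of all pixels, raw FFT output is far outside the 0..255 range, which is why it has to be rescaled before being saved as an image.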

Now, let's see what happens when you manipulate this image in a couple of ways. Instead of this test image, let's use a cat photo.

original cat

I made a few mask images in Gimp, loaded them into Python, and multiplied the frequency-image by each one to see what effect the mask has on the image.

Here's the code:

# Make frequency-image of cat photo
freq = im2freq(np.array(Image.open('cat.jpg')))

# Load three frequency-domain masks (DSP "filters")
bpfMask = np.array(Image.open('cat-mask-bpfcorner.png')).astype(float) / 255
hpfMask = np.array(Image.open('cat-mask-hpfcorner.png')).astype(float) / 255
lpfMask = np.array(Image.open('cat-mask-corner.png')).astype(float) / 255

# Apply each filter and save the output
arr2im(touint8(freq2im(freq * bpfMask)), 'cat-bpf.png')
arr2im(touint8(freq2im(freq * hpfMask)), 'cat-hpf.png')
arr2im(touint8(freq2im(freq * lpfMask)), 'cat-lpf.png')

Here's a low-pass filter mask on the left, and on the right, the result:

low-passed cat

In the mask, black = 0.0, white = 1.0. So the lowest frequencies are kept here (white), while the high ones are blocked (black). This blurs the image by attenuating high frequencies. Low-pass filters are used all over the place, including when decimating (“downsampling”) an image (though they will be shaped much more carefully than my drawing in Gimp).
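
A mask like this can also be computed instead of drawn. The following is a minimal sketch, not the code used for the cat images: it assumes NumPy's complex `np.fft.fft2` (where low frequencies sit at all four corners, so `np.fft.fftfreq` is used to locate them), and `low_pass`, `cutoff`, and the checkerboard test pattern are hypothetical:

```python
import numpy as np

def low_pass(image, cutoff):
    """Zero every frequency above `cutoff` (in cycles per pixel)."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(image.shape[1])[None, :]   # horizontal frequencies
    mask = (np.hypot(fy, fx) <= cutoff).astype(float)  # 1.0 = keep, 0.0 = block
    return np.fft.ifft2(np.fft.fft2(image) * mask).real

# A checkerboard alternates every pixel: the highest frequency an image can hold.
y, x = np.indices((8, 8))
checker = ((x + y) % 2) * 255.0

blurred = low_pass(checker, cutoff=0.25)
print(blurred.round(1))   # every pixel collapses to the mean, 127.5
```

The hard 0/1 cutoff is the crudest possible choice; smoother masks are normally preferred.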

Here's a band-pass filter (strictly speaking a band-stop filter), where the lowest frequencies (see that bit of white in the top-left corner?) and high frequencies are kept, but the middling frequencies are blocked. Quite bizarre!

band-passed cat

Here's a high-pass filter, where the top-left corner that was left white in the above mask is blacked out:

high-passed filter

This is, in essence, how edge detection works: a high-pass filter keeps the rapid intensity changes that mark edges.
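
As a rough illustration of that idea, the sketch below high-passes a hypothetical step image and compares the response next to the edge with the response inside a flat region. It assumes NumPy's complex `np.fft.fft2` and a computed mask rather than the drawn masks used above; `high_pass` and `cutoff` are illustrative names:

```python
import numpy as np

def high_pass(image, cutoff):
    """Zero every frequency at or below `cutoff` (in cycles per pixel)."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    mask = (np.hypot(fy, fx) > cutoff).astype(float)  # block the low frequencies
    return np.fft.ifft2(np.fft.fft2(image) * mask).real

# Hypothetical image: dark left half, bright right half (edge between columns 7 and 8).
step = np.zeros((16, 16))
step[:, 8:] = 255.0

edges = np.abs(high_pass(step, cutoff=0.2))
print(edges[0].round(1))
# The FFT treats the image as periodic, so the wrap-around between
# column 15 and column 0 shows up as a second edge.
```

The response is strongest in the columns next to the step and much weaker in the middle of the flat halves.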

Postscript. Someone, make a webapp using this technique that lets you draw masks and apply them to an image in real time!

Answered by u354356007

There are several issues here.

1) Manual conversion to grayscale isn't good. Use Image.open("test.png").convert('L')

2) Most likely there is an issue with types. You shouldn't pass an np.ndarray from fft2 to a PIL image without being sure their types are compatible. abs(np.fft.fft2(something)) will return an array of type np.float64 or something like this, whereas a PIL image expects something like an array of type np.uint8.

3) The scaling suggested in the comments looks wrong. You actually need your values to fit into the 0..255 range.
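
Points 2 and 3 can be checked on a tiny array. A minimal sketch, assuming a hypothetical 4x4 single-channel image in which every pixel is 111:

```python
import numpy as np

channel = np.full((4, 4), 111, dtype=np.uint8)   # hypothetical 8-bit channel

magnitude = np.abs(np.fft.fft2(channel))
print(magnitude.dtype, magnitude.max())          # float64, and far above 255

# Rescale into 0..255 first, then round and convert to the 8-bit type.
scaled = np.rint(magnitude * (255.0 / magnitude.max())).astype(np.uint8)
print(scaled.dtype, scaled.max())
```

Converting to uint8 without rescaling first would wrap or clip the large values instead of mapping them into the displayable range.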

Here's my code that addresses these 3 points:

import numpy as np
from PIL import Image

def fft(channel):
    magnitude = np.absolute(np.fft.fft2(channel))  # complex -> real magnitudes
    magnitude *= 255.0 / magnitude.max()  # scale magnitudes into 0..255 range
    return magnitude

input_image = Image.open("test.png")
channels = input_image.split()  # splits an image into R, G, B channels
result_array = np.zeros_like(input_image)  # make sure data types,
# sizes and numbers of channels of input and output numpy arrays are the same

if len(channels) > 1:  # grayscale images have only one channel
    for i, channel in enumerate(channels):
        result_array[..., i] = fft(channel)
else:
    result_array[...] = fft(channels[0])

result_image = Image.fromarray(result_array)
result_image.save('out.png')

I must admit I haven't managed to get results identical to the GIMP FFT plugin. As far as I can see, it does some post-processing. My results are all a very low-contrast mess, and GIMP seems to overcome this by tuning contrast and scaling down non-informative channels (in your case all channels except Red are just empty). Refer to the image:

enter image description here

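One common way to make a low-contrast spectrum easier to view is to move the DC term to the center with `np.fft.fftshift` and compress the range with a logarithm before scaling to 0..255. This is a sketch of that general technique, not a reconstruction of what the GIMP plugin does (that remains unknown here); `spectrum_for_display` and the random `channel` are hypothetical:

```python
import numpy as np

def spectrum_for_display(channel):
    """Return a uint8 log-magnitude spectrum with 0 Hz at the center."""
    shifted = np.fft.fftshift(np.fft.fft2(channel))   # move 0 Hz to the center
    log_mag = np.log1p(np.abs(shifted))               # compress the dynamic range
    peak = log_mag.max()
    if peak == 0:                                     # all-zero input
        return np.zeros(log_mag.shape, dtype=np.uint8)
    return np.rint(255.0 * log_mag / peak).astype(np.uint8)

rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(64, 64))         # hypothetical 64x64 channel
display = spectrum_for_display(channel)
print(display.dtype, display.shape, display.max())
```

The result can be handed to `Image.fromarray` as a single-channel image, since it is already a 2D uint8 array.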