Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/6620544/

Date: 2020-10-30 16:39:17  Source: igfitidea

Fast Fourier Transform (FFT) input and output to analyse the frequency of audio files in Java?

Tags: java, audio, java-me, fft

Asked by thongcaoloi

I have to use an FFT to analyse the frequency content of an audio file, but I don't know what the input and output are.

Do I have to use a 1-dimensional, 2-dimensional or 3-dimensional array if I want to draw the spectrum of the audio file? And can someone suggest an FFT library for J2ME?

Answered by Ernest Barkowski

@thongcaoloi,

The simple answer regarding the dimensionality of your input data is: you need 1D data. Now I'll explain what that means.

Because you want to analyze audio data, your input to the discrete Fourier transform (DFT or FFT) is a 1-dimensional sequence of real numbers. That sequence represents the changing voltage of the audio signal over time, and your audio file is a digital representation of that changing voltage.

Your audio file was produced by sampling the voltage of a continuous audio signal at a fixed sampling rate (also known as the sampling frequency), typically 44.1 kHz for CD-quality audio.

But your data file could have been sampled at a much lower frequency, so find out the sampling frequency of your data before you do an FFT on it. You need that value to convert positions in the FFT output into frequencies in Hz.
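
As a rough illustration of finding the sampling frequency, the sketch below reads it from the file header with `javax.sound.sampled`. This is a Java SE API and, as far as I know, it is not available on J2ME, so treat it as a desktop-side example only. The class name `SampleRateProbe` and the method `sampleRateOf` are made up for this example.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

// Hypothetical helper class, not part of any library.
final class SampleRateProbe {

    /** Returns the sampling rate in Hz as declared in the audio file header. */
    static float sampleRateOf(File audioFile)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(audioFile)) {
            AudioFormat format = in.getFormat();
            return format.getSampleRate();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = new File(args[0]); // path supplied on the command line
        System.out.println("Sampling rate: " + sampleRateOf(file) + " Hz");
    }
}
```

The same `AudioFormat` object also reports the channel count and sample size, which you need for extracting the samples.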

Next, you have to extract the individual samples from your audio file. If your file is stereo, it will have two separate sample sequences, one for the right channel and one for the left channel. If the file is mono, it will have only one sample sequence.
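
A minimal sketch of that extraction step, assuming the raw audio data is uncompressed 16-bit signed little-endian PCM with the channels interleaved frame by frame (the usual layout in a plain WAV file, where channel 0 is left and channel 1 is right). Other encodings need different decoding. `PcmDecoder` and `deinterleave` are names invented for this example.

```java
// Hypothetical helper class, not part of any library.
final class PcmDecoder {

    /**
     * Splits interleaved 16-bit signed little-endian PCM bytes into one
     * array of samples per channel, scaled to the range [-1.0, 1.0).
     */
    static double[][] deinterleave(byte[] pcm, int channels) {
        int bytesPerFrame = 2 * channels;
        int frames = pcm.length / bytesPerFrame; // a trailing partial frame is ignored
        double[][] out = new double[channels][frames];
        for (int f = 0; f < frames; f++) {
            for (int c = 0; c < channels; c++) {
                int i = f * bytesPerFrame + 2 * c;
                int lo = pcm[i] & 0xFF; // low byte, treated as unsigned
                int hi = pcm[i + 1];    // high byte keeps the sign
                out[c][f] = ((hi << 8) | lo) / 32768.0;
            }
        }
        return out;
    }
}
```

For a stereo file, `deinterleave(bytes, 2)[0]` would then be the 1D sequence for the left channel, ready to be passed to an FFT.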

If your file is stereo, or any other multi-channel audio format such as 5.1 or 7.1, you could run an FFT on each channel separately, or you could combine any number of channels using voltage addition, that is, by adding the channels together sample by sample. That's up to you, and depends on what you're trying to do with your FFT results.
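
Combining channels by addition can be sketched as follows. Dividing the sum by the number of channels is an extra choice made here to keep the result in the original value range; the answer only mentions addition. The sketch also assumes all channels have the same length. `ChannelMixer` and `mixDown` are illustrative names.

```java
// Hypothetical helper class, not part of any library.
final class ChannelMixer {

    /** Adds all channels sample by sample, then averages (an assumption, to avoid clipping). */
    static double[] mixDown(double[][] channels) {
        int frames = channels[0].length;
        double[] mono = new double[frames];
        for (double[] channel : channels) {
            for (int f = 0; f < frames; f++) {
                mono[f] += channel[f];
            }
        }
        for (int f = 0; f < frames; f++) {
            mono[f] /= channels.length;
        }
        return mono;
    }
}
```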

The output of the DFT or FFT is a sequence of complex numbers. Each complex number is a pair consisting of a real-part and an imaginary-part, typically shown as a pair (re,im).

If you want to graph the power spectrum of your audio file (often loosely called the power spectral density), which is what most people want from the FFT, plot 20*log10( sqrt( re^2 + im^2 ) ), a value in decibels, for each of the first N/2 complex numbers of the FFT output, where N is the number of input samples to the FFT.
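
To make the formula concrete, here is a small self-contained sketch: a textbook iterative radix-2 FFT followed by the 20*log10 magnitude for the first N/2 outputs. It is not the code of any particular library. It assumes N is a power of two, and the 1e-12 floor inside the logarithm is an arbitrary choice made here to avoid log10(0). All names (`SpectrumSketch`, `fft`, `magnitudeDb`, `binFrequency`) are invented for the example.

```java
// Hypothetical helper class, not part of any library.
final class SpectrumSketch {

    /** In-place iterative radix-2 FFT; the array length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input into bit-reversed index order.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine sub-transforms of length 2, 4, 8, ... with butterflies.
        for (int len = 2; len <= n; len <<= 1) {
            double step = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(step * k), wi = Math.sin(step * k);
                    int a = start + k, b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr; im[b] = im[a] - ti;
                    re[a] += tr;        im[a] += ti;
                }
            }
        }
    }

    /** Returns 20*log10(sqrt(re^2 + im^2)) for the first N/2 FFT outputs. */
    static double[] magnitudeDb(double[] samples) {
        int n = samples.length;
        double[] re = samples.clone();
        double[] im = new double[n]; // real input: imaginary parts are zero
        fft(re, im);
        double[] db = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double mag = Math.sqrt(re[k] * re[k] + im[k] * im[k]);
            db[k] = 20 * Math.log10(Math.max(mag, 1e-12)); // floor is an assumption
        }
        return db;
    }

    /** Frequency in Hz of output index k for an N-point FFT. */
    static double binFrequency(int k, int n, double sampleRate) {
        return k * sampleRate / n;
    }
}
```

The x-axis of the graph comes from `binFrequency`: output index k corresponds to k * sampleRate / N Hz, which is why the sampling frequency has to be known.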

You can try to build your own spectrum analyzer software program, but I suggest using something that's already built and tested.

These two FFT spectrum analyzers give results instantly, and have built-in IFFT synthesis, meaning that you can inverse Fourier transform the frequency-domain spectral data to reconstruct the original signal in the time-domain.

http://www.mathworks.com/help/techdoc/ref/fft.html

http://www.sooeet.com/math/fft.php

There's a lot more to this topic, and to the subject of digital signal processing in general, but this brief introduction should get you started.

Answered by Jeremy Salwen

In the theoretical sense, an FFT maps complex[N] => complex[N]. However, if your data is just an audio file, your input will simply be complex numbers with no imaginary component, so you will map real[N] => complex[N]. With a little math, you can see that the output then always satisfies output[i] == complex_conjugate(output[N-i]), so you really only need to look at the first N/2+1 samples.

Additionally, the complex output of the FFT gives you information about both phase and magnitude. If all you care about is how much of a certain frequency is in your audio, you only need to look at the magnitude, which can be calculated as square_root(real^2 + imaginary^2) for each element of the output.
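
The conjugate-symmetry property can be checked numerically. The sketch below uses a naive O(N^2) DFT rather than a fast algorithm, purely to keep the check short; `ConjugateSymmetryDemo`, `dft` and `symmetryError` are names made up for this example.

```java
// Hypothetical helper class, not part of any library.
final class ConjugateSymmetryDemo {

    /** Naive O(N^2) DFT of a real signal; returns {re[], im[]}. */
    static double[][] dft(double[] x) {
        int n = x.length;
        double[] re = new double[n];
        double[] im = new double[n];
        for (int k = 0; k < n; k++) {
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re[k] += x[t] * Math.cos(angle);
                im[k] += x[t] * Math.sin(angle);
            }
        }
        return new double[][] {re, im};
    }

    /** Largest deviation from output[i] == conj(output[N-i]) over i = 1..N-1. */
    static double symmetryError(double[] x) {
        double[][] out = dft(x);
        double[] re = out[0];
        double[] im = out[1];
        int n = x.length;
        double worst = 0;
        for (int i = 1; i < n; i++) {
            worst = Math.max(worst, Math.abs(re[i] - re[n - i])); // real parts match
            worst = Math.max(worst, Math.abs(im[i] + im[n - i])); // imaginary parts negate
        }
        return worst;
    }
}
```

For any real input, `symmetryError` should come out at the level of floating-point rounding, which is why the upper half of the output carries no extra information.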

Of course, you'll need to look at the documentation of whatever library you use to understand which array element corresponds to the real part of the Nth complex output, and which to its imaginary part.
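
As an illustration only, suppose a library returned its result as a single array laid out as [re0, im0, re1, im1, ...]. That layout is an assumption for this sketch, not a statement about any specific library, and some libraries pack real-input results differently. Under that assumption, the magnitudes of the first N/2+1 outputs could be read like this (`InterleavedSpectrum` and `magnitudes` are invented names):

```java
// Hypothetical helper class; the interleaved layout is an assumption.
final class InterleavedSpectrum {

    /** Magnitudes of the first N/2+1 complex values in an interleaved array. */
    static double[] magnitudes(double[] interleaved) {
        int n = interleaved.length / 2; // number of complex values
        int bins = n / 2 + 1;
        double[] mag = new double[bins];
        for (int k = 0; k < bins; k++) {
            double re = interleaved[2 * k];     // real part of output k
            double im = interleaved[2 * k + 1]; // imaginary part of output k
            mag[k] = Math.hypot(re, im);        // sqrt(re^2 + im^2)
        }
        return mag;
    }
}
```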

Answered by Jeremy Salwen

As I remember, the FFT algorithm is not that complicated; I once wrote an FFT calculation class for my thesis. The input was a 1D array of values read from *.WAV files. Before the FFT, though, some filtering and normalization were performed.
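
The answer does not say which filtering or normalization was used. One simple possibility, shown here purely as an assumption, is to subtract the mean (removing any DC offset) and then scale so that the largest absolute sample is 1. `Preprocess` and `removeDcAndNormalize` are illustrative names.

```java
// Hypothetical helper class; the specific pre-processing steps are an assumption.
final class Preprocess {

    /** Subtracts the mean, then scales so the peak absolute value is 1. */
    static double[] removeDcAndNormalize(double[] samples) {
        double mean = 0;
        for (double s : samples) mean += s;
        mean /= samples.length;

        double[] out = new double[samples.length];
        double peak = 0;
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] - mean;
            peak = Math.max(peak, Math.abs(out[i]));
        }
        if (peak > 0) { // leave an all-constant signal as zeros
            for (int i = 0; i < out.length; i++) out[i] /= peak;
        }
        return out;
    }
}
```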