C#: Beats per minute from real-time audio input

Notice: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original: http://stackoverflow.com/questions/79445/

Time: 2020-08-03 11:17:28  Source: igfitidea

Beats per minute from real-time audio input

Asked by Karl

I'd like to write a simple C# application to monitor the line-in audio and give me the current (well, the rolling average) beats per minute.

I've seen this gamedev article, and that was absolutely no help. I went through and tried to implement what he was doing but it just wasn't working.

I know there have to be tons of solutions for this, because lots of DJ software does it, but I'm not having any luck in finding any open-source library or instructions on doing it myself.

Answered by Thomas

This is by no means an easy problem. I'll try to give you an overview only.

What you could do is something like the following:

  1. Compute the average (root-mean-square) loudness of the signal over blocks of, say, 5 milliseconds. (Having never done this before, I don't know what a good block size would be.)
  2. Take the Fourier transform of the "blocked" signal, using the FFT algorithm.
  3. Find the component in the transformed signal that has the largest magnitude.

A Fourier transform is basically a way of computing the strength of all frequencies present in the signal. If you do that over the "blocked" signal, the frequency of the beat will hopefully be the strongest one.
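
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the class and method names, the 60 to 180 BPM search range, and the 0.5 BPM step are all assumptions made for the example. Instead of a full FFT it evaluates the Fourier transform of the loudness envelope directly at each candidate tempo, which is slower but keeps the sketch self-contained.

```csharp
using System;

public static class EnvelopeTempo
{
    // Step 1: RMS loudness of consecutive blocks of blockSize samples.
    public static double[] RmsEnvelope(double[] samples, int blockSize)
    {
        int blocks = samples.Length / blockSize;
        var envelope = new double[blocks];
        for (int b = 0; b < blocks; b++)
        {
            double sumSquares = 0;
            for (int i = 0; i < blockSize; i++)
            {
                double s = samples[b * blockSize + i];
                sumSquares += s * s;
            }
            envelope[b] = Math.Sqrt(sumSquares / blockSize);
        }
        return envelope;
    }

    // Steps 2 and 3: evaluate the Fourier transform of the envelope at each
    // candidate tempo and return the tempo with the largest magnitude.
    // The search range and step size are assumptions for this sketch.
    public static double EstimateBpm(double[] envelope, double blocksPerSecond,
                                     double minBpm = 60, double maxBpm = 180)
    {
        // Remove the mean so the constant (0 Hz) component does not dominate.
        double mean = 0;
        foreach (double v in envelope) mean += v;
        mean /= envelope.Length;

        double bestBpm = minBpm, bestPower = -1;
        for (double bpm = minBpm; bpm <= maxBpm; bpm += 0.5)
        {
            // Angular frequency of this tempo in radians per envelope block.
            double omega = 2 * Math.PI * (bpm / 60.0) / blocksPerSecond;
            double re = 0, im = 0;
            for (int n = 0; n < envelope.Length; n++)
            {
                re += (envelope[n] - mean) * Math.Cos(omega * n);
                im -= (envelope[n] - mean) * Math.Sin(omega * n);
            }
            double power = re * re + im * im;
            if (power > bestPower) { bestPower = power; bestBpm = bpm; }
        }
        return bestBpm;
    }
}
```

With 5 ms blocks, blocksPerSecond is 200. Restricting the search to a plausible tempo range is one way to avoid picking a harmonic of the beat frequency.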

Maybe you need to apply a filter first, to focus on specific frequencies (like the bass) that usually contain the most information about the BPM.

Answered by Dan Harper

Not that I have a clue how to implement this, but from an audio engineering perspective you'd need to filter first. Bass drum hits would be the first thing to check. A low-pass filter that passes only content below about 200 Hz should give you a pretty clear picture of the bass drum. A gate might also be necessary to clean up any clutter from other instruments with harmonics that low.
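
As a rough illustration of that filtering stage, here is a minimal sketch of a one-pole low-pass filter followed by a very simple gate. The names, the choice of a one-pole filter, and the per-sample gate are assumptions for the example; a real gate would normally have attack and release times, and a steeper filter may be needed in practice.

```csharp
using System;

public static class BassIsolation
{
    // One-pole (RC-style) low-pass filter. cutoffHz defaults to the
    // roughly 200 Hz figure mentioned in the answer.
    public static double[] LowPass(double[] input, double sampleRate, double cutoffHz = 200.0)
    {
        double rc = 1.0 / (2 * Math.PI * cutoffHz);
        double dt = 1.0 / sampleRate;
        double alpha = dt / (rc + dt);

        var output = new double[input.Length];
        double previous = 0;
        for (int i = 0; i < input.Length; i++)
        {
            // Move the output a fraction alpha of the way toward the input.
            previous += alpha * (input[i] - previous);
            output[i] = previous;
        }
        return output;
    }

    // Simplistic gate: samples whose magnitude is below the threshold are
    // silenced, everything else passes through unchanged.
    public static double[] Gate(double[] input, double threshold)
    {
        var output = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
            output[i] = Math.Abs(input[i]) >= threshold ? input[i] : 0.0;
        return output;
    }
}
```

A typical use would be to run LowPass on the raw samples and then Gate on the result, with the threshold tuned by ear or from the signal's noise floor.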

The next thing to check would be snare hits. You'd have to EQ this one: from memory, the "crack" of a snare sits around 1.5 kHz, and you'd definitely need to gate it.

The next challenge would be to work out an algorithm for funky beats. How would you programmatically find beat 1? I guess you'd keep track of previous beats and use a pattern-matching something-or-other, so you'd probably need a few bars to accurately find the beat. Then there are time signature issues like 4/4, 3/4 and 6/8. Wow, I can't imagine what would be required to do this accurately! I'm sure it'd be worth some serious money to audio hardware/software companies.

Answered by Nick Johnson

There's an excellent project called Dancing Monkeys, which procedurally generates DDR dance steps from music. A large part of what it does is based on (necessarily very accurate) beat analysis, and their project paper goes into much detail describing the various beat detection algorithms and their suitability to the task. It includes references to the original papers for each of the algorithms. They've also published the MATLAB code for their solution. I'm sure that between those you can find what you need.

It's all available here: http://monket.net/dancing-monkeys-v2/Main_Page

Answered by Hallgrim

Calculate a power spectrum with a sliding-window FFT. Take 1024 samples:

// Assumes stream is an IEnumerable<double> of samples (needs using System.Linq).
double[] signal = stream.Take(1024).ToArray();

Feed it to an FFT algorithm:

// FFT stands for whichever FFT routine you use; it fills the real and imaginary parts.
double[] real = new double[signal.Length];
double[] imag = new double[signal.Length];
FFT(signal, out real, out imag);

You will get a real part and an imaginary part. Do NOT throw away the imaginary part. Process the imaginary part the same way as the real part. While it is true that the imaginary part is pi / 2 out of phase with the real, it still contains 50% of the spectrum information.

EDIT:

Calculate the power as opposed to the amplitude so that you have a high number when it is loud and close to zero when it is quiet:

for (int i = 0; i < real.Length; i++) real[i] = real[i] * real[i];

Similarly for the imaginary part.

for (int i = 0; i < imag.Length; i++) imag[i] = imag[i] * imag[i];

Now you have a power spectrum for the last 1024 samples, where the first part of the spectrum holds the low frequencies and the last part holds the high frequencies.

If you want to find BPM in popular music you should probably focus on the bass. You can pick up the bass intensity by summing the lower part of the power spectrum. Which numbers to use depends on the sampling frequency:

double bassIntensity = 0;
// Power per bin is real^2 + imag^2, so add both squared parts.
for (int i = 8; i < 96; i++) bassIntensity += real[i] + imag[i];
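
To see how the bin numbers relate to the sampling frequency, a small helper like the following may be useful. The class and method names are made up for this sketch; the relationship itself is the standard one, where bin k of an N-point FFT is centred on k * sampleRate / N Hz.

```csharp
using System;

public static class FftBins
{
    // Centre frequency in Hz of FFT bin number bin.
    public static double BinToFrequency(int bin, double sampleRate, int fftSize)
        => bin * sampleRate / fftSize;

    // Nearest FFT bin for a frequency in Hz.
    public static int FrequencyToBin(double frequencyHz, double sampleRate, int fftSize)
        => (int)Math.Round(frequencyHz * fftSize / sampleRate);
}
```

For example, assuming a 44.1 kHz sampling rate and 1024 samples, each bin is about 43 Hz wide, so bins 8 to 95 span roughly 345 Hz to 4.1 kHz, while 40 Hz to 200 Hz corresponds to bins 1 to 5. At a lower sampling rate the same bin numbers land on lower frequencies.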

Now do the same again, but move the window 256 samples along before you calculate a new spectrum. You end up with one bassIntensity value for every 256 samples.
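
Put together, the sliding-window procedure might look like the sketch below. It is only an illustration under stated assumptions: the names are hypothetical, and because the answer does not say which FFT routine to use, the sketch evaluates just the needed bins with a direct DFT, which is slower than an FFT but self-contained. The window size, hop, and bin range default to the numbers used in the answer.

```csharp
using System;
using System.Collections.Generic;

public static class BassTracker
{
    // Power (re^2 + im^2) of one DFT bin over one window, evaluated directly.
    static double BinPower(double[] samples, int start, int windowSize, int bin)
    {
        double re = 0, im = 0;
        for (int n = 0; n < windowSize; n++)
        {
            double angle = 2 * Math.PI * bin * n / windowSize;
            re += samples[start + n] * Math.Cos(angle);
            im -= samples[start + n] * Math.Sin(angle);
        }
        return re * re + im * im;
    }

    // One bass intensity value per hop: the summed power of bins
    // firstBin..lastBinExclusive-1 in each window of windowSize samples.
    public static List<double> BassIntensities(double[] samples, int windowSize = 1024,
        int hop = 256, int firstBin = 8, int lastBinExclusive = 96)
    {
        var result = new List<double>();
        for (int start = 0; start + windowSize <= samples.Length; start += hop)
        {
            double bass = 0;
            for (int bin = firstBin; bin < lastBinExclusive; bin++)
                bass += BinPower(samples, start, windowSize, bin);
            result.Add(bass);
        }
        return result;
    }
}
```

The returned list is the time series that the later BPM analysis would work on, with one value every hop samples.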

This is a good input for your BPM analysis. When the bass is quiet you do not have a beat and when it is loud you have a beat.

Good luck!

Answered by Hallgrim

The easy way to do it is to have the user tap a button in rhythm with the beat, and count the number of taps divided by the time.
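
A minimal sketch of such a tap counter is shown below. The class name and the choice to pass timestamps in seconds are assumptions for the example. Note that n taps mark n - 1 beat intervals, so the sketch divides the number of intervals, not the number of taps, by the elapsed time.

```csharp
using System;
using System.Collections.Generic;

public sealed class TapTempo
{
    private readonly List<double> tapTimesSeconds = new List<double>();

    // Record a tap at the given time (in seconds, from any fixed origin).
    public void Tap(double timeSeconds) => tapTimesSeconds.Add(timeSeconds);

    // Beats per minute: intervals between taps divided by elapsed minutes.
    // Returns null until at least two taps at different times are recorded.
    public double? Bpm
    {
        get
        {
            if (tapTimesSeconds.Count < 2) return null;
            double elapsed = tapTimesSeconds[tapTimesSeconds.Count - 1] - tapTimesSeconds[0];
            if (elapsed <= 0) return null;
            return (tapTimesSeconds.Count - 1) * 60.0 / elapsed;
        }
    }
}
```

In a UI, the button's click handler would call Tap with the current time, for example from a Stopwatch, and the display would read Bpm after each tap.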

Answered by pete

First of all, what Hallgrim is producing is not the power spectral density function. Statistical periodicities in any signal can be brought out through an autocorrelation function. The Fourier transform of the autocorrelation function is the power spectral density (PSD). Dominant peaks in the PSD other than at 0 Hz will correspond to the effective periodicity in the signal (in Hz)...
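
One way to use the autocorrelation idea is sketched below. It is a simplification, not pete's method as stated: rather than taking the Fourier transform of the autocorrelation and looking for PSD peaks, it looks for the lag with the largest autocorrelation directly, which identifies the same periodicity in the time domain. The names, the input (a loudness or bass-intensity series sampled at envelopeRateHz), and the 60 to 180 BPM range are assumptions for the example.

```csharp
using System;

public static class AutocorrelationTempo
{
    public static double EstimateBpm(double[] envelope, double envelopeRateHz,
                                     double minBpm = 60, double maxBpm = 180)
    {
        // Remove the mean so the autocorrelation reflects fluctuations only.
        double mean = 0;
        foreach (double v in envelope) mean += v;
        mean /= envelope.Length;

        // A tempo in BPM corresponds to a beat period of 60 / bpm seconds,
        // so the fastest tempo gives the shortest lag to test.
        int minLag = Math.Max(1, (int)Math.Ceiling(envelopeRateHz * 60.0 / maxBpm));
        int maxLag = Math.Min((int)Math.Floor(envelopeRateHz * 60.0 / minBpm),
                              envelope.Length - 1);

        int bestLag = minLag;
        double best = double.NegativeInfinity;
        for (int lag = minLag; lag <= maxLag; lag++)
        {
            // Unnormalised autocorrelation: longer lags sum fewer terms,
            // which mildly favours the shorter of two equally good periods.
            double sum = 0;
            for (int n = 0; n + lag < envelope.Length; n++)
                sum += (envelope[n] - mean) * (envelope[n + lag] - mean);
            if (sum > best) { best = sum; bestLag = lag; }
        }
        return 60.0 * envelopeRateHz / bestLag;
    }
}
```

The tempo resolution is limited by the envelope rate, since only whole-sample lags are tested; interpolating around the best lag is a common refinement.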

Answered by eandersson

I found this library, which seems to have a pretty solid implementation for detecting beats per minute: http://soundtouchdotnet.codeplex.com/

It's based on http://www.surina.net/soundtouch/index.html, which is used in quite a few DJ projects: http://www.surina.net/soundtouch/applications.html

Answered by Matt Williams

I'd recommend checking out the BASS audio library and the BASS.NET wrapper. It has a built-in BPMCounter class.

Details for this specific function can be found at http://bass.radio42.com/help/html/0833aa5a-3be9-037c-66f2-9adfd42a8512.htm.
