Python: detecting taps with pyaudio from a live microphone

Question

Asked by a sandwhich

How would I use pyaudio to detect a sudden tapping noise from a live microphone?

Answer 1

Accepted answer by Russell Borogove

One way I've done it:

read a block of samples at a time, say 0.05 seconds worth
compute the RMS amplitude of the block (square root of the average of the squares of the individual samples)
if the block's RMS amplitude is greater than a threshold, it's a "noisy block" else it's a "quiet block"
a sudden tap would be a quiet block followed by a small number of noisy blocks followed by a quiet block
if you never get a quiet block, your threshold is too low
if you never get a noisy block, your threshold is too high
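
The steps above can be sketched without any audio hardware. This is a minimal illustration, not the code from this answer: the names rms, find_taps and max_tap_blocks are made up for the example, and each block is assumed to be a list of samples already normalized to +/- 1.0.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of normalized samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def find_taps(blocks, threshold, max_tap_blocks=3):
    """Return the indices of the quiet blocks that end a tap.

    A tap is a run of 1..max_tap_blocks noisy blocks that is both
    preceded and followed by a quiet block.
    """
    taps = []
    noisy_run = None  # stays None until the first quiet block is seen
    for i, block in enumerate(blocks):
        if rms(block) > threshold:
            # noisy block: only count it once a quiet block has come first
            if noisy_run is not None:
                noisy_run += 1
        else:
            # quiet block: a short noisy run just ended, so it was a tap
            if noisy_run is not None and 1 <= noisy_run <= max_tap_blocks:
                taps.append(i)
            noisy_run = 0
    return taps

# Synthetic data: a 2-block burst (a tap) and a 5-block burst (too long).
quiet = [0.001] * 4
loud = [0.5] * 4
blocks = [quiet, loud, loud, quiet, loud, loud, loud, loud, loud, quiet]
print(find_taps(blocks, threshold=0.01))
```

Only the first burst is reported, because the second one lasts longer than max_tap_blocks.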

My application was recording "interesting" noises unattended, so it would record as long as there were noisy blocks. It would multiply the threshold by 1.1 if there was a 15-second noisy period ("covering its ears") and multiply the threshold by 0.9 if there was a 15-minute quiet period ("listening harder"). Your application will have different needs.
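
One possible way to express that self-adjusting threshold is sketched below. The class name AdaptiveThreshold and its parameters are hypothetical, and resetting the counter after each adjustment (so the threshold moves once per period rather than on every block) is an assumption of this sketch, not something stated above.

```python
class AdaptiveThreshold:
    """Tracks noisy/quiet runs and nudges the threshold accordingly."""

    def __init__(self, threshold, block_time=0.05,
                 noisy_limit_s=15.0, quiet_limit_s=15 * 60.0):
        self.threshold = threshold
        # convert the time limits from seconds into numbers of blocks
        self.noisy_limit = noisy_limit_s / block_time
        self.quiet_limit = quiet_limit_s / block_time
        self.noisy_blocks = 0
        self.quiet_blocks = 0

    def update(self, amplitude):
        """Feed one block's RMS amplitude; return True if it was noisy."""
        noisy = amplitude > self.threshold
        if noisy:
            self.quiet_blocks = 0
            self.noisy_blocks += 1
            if self.noisy_blocks > self.noisy_limit:
                self.threshold *= 1.1  # "covering its ears"
                self.noisy_blocks = 0  # assumption: adjust once per period
        else:
            self.noisy_blocks = 0
            self.quiet_blocks += 1
            if self.quiet_blocks > self.quiet_limit:
                self.threshold *= 0.9  # "listening harder"
                self.quiet_blocks = 0  # assumption: adjust once per period
        return noisy
```

Call update() once per block with that block's RMS amplitude; the return value tells you whether to treat the block as noisy.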

Also, I just noticed some comments in my code regarding observed RMS values. On the built-in mic of a MacBook Pro, with a +/- 1.0 normalized audio data range and input volume set to max, some data points:

0.003-0.006 (-50dB to -44dB) an obnoxiously loud central heating fan in my house
0.010-0.40 (-40dB to -8dB) typing on the same laptop
0.10 (-20dB) snapping fingers softly at 1' distance
0.60 (-4.4dB) snapping fingers loudly at 1'
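
The dB figures in parentheses follow from the usual amplitude-to-decibel conversion, 20 * log10(rms), with full scale (1.0) at 0 dB. A small sketch to reproduce them; the function name rms_to_db is only illustrative:

```python
import math

def rms_to_db(rms):
    """Convert a normalized RMS amplitude (full scale = 1.0) to decibels."""
    return 20.0 * math.log10(rms)

# Print the conversion for the RMS values listed above.
for value in (0.003, 0.006, 0.010, 0.40, 0.10, 0.60):
    print("%.3f -> %.1f dB" % (value, rms_to_db(value)))
```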

Update: here's a sample to get you started.

#!/usr/bin/python

# open a microphone in pyAudio and listen for taps

import pyaudio
import struct
import math

INITIAL_TAP_THRESHOLD = 0.010
FORMAT = pyaudio.paInt16 
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100  
INPUT_BLOCK_TIME = 0.05
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
# if we get this many noisy blocks in a row, increase the threshold
OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME                    
# if we get this many quiet blocks in a row, decrease the threshold
UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME 
# if the noise was longer than this many blocks, it's not a 'tap'
MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME

def get_rms( block ):
    # RMS amplitude is defined as the square root of the 
    # mean over time of the square of the amplitude.
    # so we need to convert this string of bytes into 
    # a string of 16-bit samples...

    # we will get one short out for each 
    # two chars in the string.
    count = len(block) // 2  # integer division: two bytes per 16-bit sample
    fmt = "%dh" % count      # avoid shadowing the built-in format()
    shorts = struct.unpack( fmt, block )

    # iterate over the block.
    sum_squares = 0.0
    for sample in shorts:
        # sample is a signed short in +/- 32768. 
        # normalize it to 1.0
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n

    return math.sqrt( sum_squares / count )

class TapTester(object):
    def __init__(self):
        self.pa = pyaudio.PyAudio()
        self.stream = self.open_mic_stream()
        self.tap_threshold = INITIAL_TAP_THRESHOLD
        self.noisycount = MAX_TAP_BLOCKS+1 
        self.quietcount = 0 
        self.errorcount = 0

    def stop(self):
        self.stream.close()
        self.pa.terminate()  # release the PortAudio resources as well

    def find_input_device(self):
        device_index = None            
        for i in range( self.pa.get_device_count() ):     
            devinfo = self.pa.get_device_info_by_index(i)   
            print( "Device %d: %s"%(i,devinfo["name"]) )

            for keyword in ["mic","input"]:
                if keyword in devinfo["name"].lower():
                    print( "Found an input: device %d - %s"%(i,devinfo["name"]) )
                    device_index = i
                    return device_index

        if device_index is None:
            print( "No preferred input found; using default input device." )

        return device_index

    def open_mic_stream( self ):
        device_index = self.find_input_device()

        stream = self.pa.open(   format = FORMAT,
                                 channels = CHANNELS,
                                 rate = RATE,
                                 input = True,
                                 input_device_index = device_index,
                                 frames_per_buffer = INPUT_FRAMES_PER_BLOCK)

        return stream

    def tapDetected(self):
        print("Tap!")

    def listen(self):
        try:
            block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
        except IOError as e:
            # dammit. 
            self.errorcount += 1
            print( "(%d) Error recording: %s"%(self.errorcount,e) )
            self.noisycount = 1
            return

        amplitude = get_rms( block )
        if amplitude > self.tap_threshold:
            # noisy block
            self.quietcount = 0
            self.noisycount += 1
            if self.noisycount > OVERSENSITIVE:
                # turn down the sensitivity
                self.tap_threshold *= 1.1
        else:            
            # quiet block.

            if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                self.tapDetected()
            self.noisycount = 0
            self.quietcount += 1
            if self.quietcount > UNDERSENSITIVE:
                # turn up the sensitivity
                self.tap_threshold *= 0.9

if __name__ == "__main__":
    tt = TapTester()

    for i in range(1000):
        tt.listen()
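
The RMS calculation can be sanity-checked without a microphone by feeding it a synthetic block. In the sketch below, get_rms is restated compactly so the example runs on its own, and sine_block is a hypothetical helper (not part of pyaudio) that packs a mono sine wave as 16-bit samples. A sine wave of peak amplitude A should give an RMS close to A / sqrt(2).

```python
import math
import struct

SHORT_NORMALIZE = 1.0 / 32768.0

def get_rms(block):
    """RMS amplitude of a bytes block of native-endian 16-bit samples."""
    count = len(block) // 2  # two bytes per sample
    shorts = struct.unpack("%dh" % count, block)
    sum_squares = sum((s * SHORT_NORMALIZE) ** 2 for s in shorts)
    return math.sqrt(sum_squares / count)

def sine_block(amplitude, frequency=441.0, rate=44100, frames=2200):
    """Hypothetical helper: a sine wave packed as 16-bit samples.

    The defaults give a whole number of cycles (22 cycles of 100 samples).
    """
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * frequency * n / rate))
        for n in range(frames)
    ]
    return struct.pack("%dh" % frames, *samples)

half_scale = get_rms(sine_block(0.5))
print("measured RMS: %.4f, expected about %.4f" % (half_scale, 0.5 / math.sqrt(2)))
```

The same trick is handy for choosing a starting threshold: generate or record a few blocks, print their RMS values, and pick a value between the quiet and noisy ones.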

Answer 2

Answered by user1405612

A simplified version of the above code:

import pyaudio
import struct
import math

INITIAL_TAP_THRESHOLD = 0.010
FORMAT = pyaudio.paInt16 
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100  
INPUT_BLOCK_TIME = 0.05
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)

OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME # if we get this many noisy blocks in a row, increase the threshold

UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME # if we get this many quiet blocks in a row, decrease the threshold

MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME # if the noise was longer than this many blocks, it's not a 'tap'

def get_rms(block):

    # RMS amplitude is defined as the square root of the 
    # mean over time of the square of the amplitude.
    # so we need to convert this string of bytes into 
    # a string of 16-bit samples...

    # we will get one short out for each 
    # two chars in the string.
    count = len(block) // 2  # integer division: two bytes per 16-bit sample
    fmt = "%dh" % count      # avoid shadowing the built-in format()
    shorts = struct.unpack( fmt, block )

    # iterate over the block.
    sum_squares = 0.0
    for sample in shorts:
        # sample is a signed short in +/- 32768. 
        # normalize it to 1.0
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n

    return math.sqrt( sum_squares / count )

pa = pyaudio.PyAudio()                                 #]
                                                       #|
stream = pa.open(format = FORMAT,                      #|
         channels = CHANNELS,                          #|---- You always use this in pyaudio...
         rate = RATE,                                  #|
         input = True,                                 #|
         frames_per_buffer = INPUT_FRAMES_PER_BLOCK)   #]

tap_threshold = INITIAL_TAP_THRESHOLD                  #]
noisycount = MAX_TAP_BLOCKS+1                          #|---- Variables for noise detector...
quietcount = 0                                         #|
errorcount = 0                                         #]         

for i in range(1000):
    try:                                                    #]
        block = stream.read(INPUT_FRAMES_PER_BLOCK)         #|
    except IOError as e:                                    #|---- just in case there is an error!
        errorcount += 1                                     #|
        print( "(%d) Error recording: %s"%(errorcount,e) )  #|
        noisycount = 1                                      #|
        continue  # no new block was read, skip analysis    #]

    amplitude = get_rms(block)
    if amplitude > tap_threshold: # if it's too loud...
        quietcount = 0
        noisycount += 1
        if noisycount > OVERSENSITIVE:
            tap_threshold *= 1.1 # turn down the sensitivity

    else: # if it's too quiet...

        if 1 <= noisycount <= MAX_TAP_BLOCKS:
            print('tap!')
        noisycount = 0
        quietcount += 1
        if quietcount > UNDERSENSITIVE:
            tap_threshold *= 0.9 # turn up the sensitivity
