Python Voice Recognition Library - Always Listen?

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/25394329/

Time: 2020-08-18 20:09:47  Source: igfitidea

Tags: python, loops, speech-recognition

Asked by Matthew Sansom

I've recently been working on using a speech recognition library in Python to launch applications. I intend to ultimately use the library for voice-activated home automation using the Raspberry Pi GPIO.

I have this working: it detects my voice and launches the application. The problem is that it seems to hang on the one word I say (for example, I say "internet" and it launches Chrome an infinite number of times).

This is unusual behavior from what I have seen of while loops. I can't figure out how to stop it looping. Do I need to do something outside the loop to make it work properly? Please see the code below.

http://pastebin.com/auquf1bR

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
        audio = r.listen(source)

def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()

Answered by dano

The problem is that you only actually listen for speech once, at the beginning of the program, and then just repeatedly call recognize on the same bit of saved audio. Move the code that actually listens for speech into the while loop:

import pyaudio,os
import speech_recognition as sr


def excel():
    os.system("start excel.exe")

def internet():
    os.system("start chrome.exe")

def media():
    os.system("start wmplayer.exe")

def mainfunction(source):
    # Capture a fresh phrase from the microphone on every call
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()

if __name__ == "__main__":
    r = sr.Recognizer()
    # Open the microphone once and keep it open for the whole loop
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)
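
A small aside on the if/elif chain above: it compares the recognized text against exact strings, so a transcription of "excel" would not match "Excel". Below is a minimal sketch of one way around that, assuming a lookup table keyed by lower-cased phrases. The names launch, COMMANDS, normalize and dispatch are hypothetical and not part of speech_recognition.

```python
# Sketch: map recognized phrases to actions with a dict instead of if/elif.
# launch, COMMANDS, normalize and dispatch are illustrative names only.
import os


def launch(executable):
    """Return a function that starts the given Windows executable."""
    def action():
        os.system("start " + executable)
    return action


COMMANDS = {
    "excel": launch("excel.exe"),
    "internet": launch("chrome.exe"),
    "music": launch("wmplayer.exe"),
}


def normalize(text):
    """Lower-case and trim the recognizer output so "Excel" matches "excel"."""
    return text.strip().lower()


def dispatch(text, commands=COMMANDS):
    """Run the action for the recognized text; return True if one matched."""
    action = commands.get(normalize(text))
    if action is None:
        return False
    action()
    return True
```

With this in place, the body of mainfunction could call dispatch(user) after print(user) instead of spelling out each comparison.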

Answered by Nikolay Shmyrev

Just in case, here is an example of how to listen continuously for a keyword with pocketsphinx. This is going to be way easier than sending audio to Google continuously, and you could have a far more flexible solution.

import sys, os, pyaudio
from pocketsphinx import *

modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)

decoder = Decoder(config)
decoder.start_utt('spotting')

# Open a 16 kHz, mono, 16-bit input stream from the default microphone
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    hyp = decoder.hyp()
    if hyp is not None and hyp.hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')

Answered by Connor

I've spent a lot of time working on this subject.

Currently I'm developing a Python 3 open-source cross-platform virtual assistant program called Athena Voice: https://github.com/athena-voice/athena-voice-client

Users can use it much like Siri, Cortana, or Amazon Echo.

It also uses a very simple "module" system where users can easily write their own modules to enhance its functionality. Let me know if that could be of use.

Otherwise, I recommend looking into Pocketsphinx and Google's Python speech-to-text/text-to-speech packages.

On Python 3.4, Pocketsphinx can be installed with:

pip install pocketsphinx

However, you must install the PyAudio dependency separately (unofficial download): http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

Both Google packages can be installed by using the command:

pip install SpeechRecognition gTTS

Google STT: https://pypi.python.org/pypi/SpeechRecognition/

Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2

Pocketsphinx should be used for offline wake-up-word recognition, and Google STT should be used for active listening.
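
To make that split concrete, here is a rough sketch of the control flow only, with no real audio involved. It assumes two callables that you would supply yourself: wake_word_heard (standing in for an offline Pocketsphinx keyword check) and transcribe (standing in for an online Google STT call). These names, along with run_assistant and max_commands, are hypothetical and do not come from either library.

```python
# Sketch of the two-stage design: a cheap offline wake-word check gates
# the more expensive online transcription. All names are hypothetical.

def run_assistant(wake_word_heard, transcribe, handle, max_commands=None):
    """Wait for the wake word, then transcribe and handle one command.

    wake_word_heard: callable returning True once the wake word is detected.
    transcribe: callable returning the recognized text, or None on failure.
    handle: callable invoked with each recognized command.
    max_commands: stop after this many commands (None means run forever).
    """
    handled = 0
    while max_commands is None or handled < max_commands:
        if not wake_word_heard():
            continue  # still idle: keep spotting offline
        text = transcribe()  # active listening only after the wake word
        if text is not None:
            handle(text)
            handled += 1
    return handled
```

The point of the design is that nothing is sent to the online service until the wake word has been spotted locally.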

Answered by Muhammad Naufil

It's sad, but you have to initialise the microphone on every loop iteration, because this module always needs r.adjust_for_ambient_noise(source), which makes sure that it understands your voice in a noisy room too. Setting the threshold takes time, so it can skip some of your words if you are giving commands continuously.

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()


def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source)
            audio = r.listen(source)
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()
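
If the skipped words become a problem, one variation that may be worth trying is to calibrate once and then keep the same microphone open, as in dano's answer. This is only a sketch under that assumption, not something this answer claims. The names listen_loop, handle, max_commands and main are hypothetical. As far as I know, recognize_google is the method name in recent SpeechRecognition releases, whereas the code above uses the older r.recognize.

```python
# Sketch: calibrate for ambient noise once, then reuse the open microphone.
# listen_loop, handle and max_commands are illustrative names only.

def listen_loop(recognizer, source, handle, max_commands=None):
    """Listen repeatedly on an already-open source and pass text to handle."""
    recognizer.adjust_for_ambient_noise(source)  # one-time calibration
    count = 0
    while max_commands is None or count < max_commands:
        audio = recognizer.listen(source)  # fresh audio every iteration
        try:
            text = recognizer.recognize_google(audio)
        except Exception as error:
            # Broad catch for the sketch; narrow it to the library's own
            # errors (speech not understood, request failed) in real code.
            print("Could not recognize speech:", error)
            continue
        handle(text)
        count += 1


def main():
    """Run against a real microphone (needs SpeechRecognition and PyAudio)."""
    import speech_recognition as sr

    with sr.Microphone() as source:
        listen_loop(sr.Recognizer(), source, print)
```

Calling main() starts the loop with print as the handler; swap in your own function to launch applications.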