Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license, cite the original URL, and attribute it to the original authors (not this site). Original Stack Overflow URL: http://stackoverflow.com/questions/18821566/

Time: 2020-09-17 15:05:05  Source: igfitidea

Accuracy of MS System.Speech.Recognizer and the SpeechRecognitionEngine

c# .net vb.net speech-recognition

Asked by darbid

I am currently testing the SpeechRecognitionEngine by loading a pretty simple rule from an XML file. In fact it is a simple choice between ("decrypt the email", "remove encryption") and ("encrypt the email", "add encryption").

I have trained my Windows 7 PC and additionally added the words "encrypt" and "decrypt", as I realize they are very similar. The recognizer already has trouble telling these two apart.

The issue I am having is that it recognizes things too often. I have set the confidence threshold to 0.93 because, with my voice in a quiet room, saying the exact words sometimes only reaches 0.93. But if I turn on the radio, the announcer's voice or a song can make the recognizer think it has heard the words "decrypt the email" with over 0.93 confidence.

Maybe Lady Gaga is backmasking Applause to secretly decrypt emails :-)

Can anyone help me work out how to make this recognizer workable?

In fact the recognizer is also picking up keyboard noise as "decrypt the email". I don't understand how this is possible.

Further to my editing buddy's point: there are at least two managed namespaces for MS Speech, Microsoft.Speech and System.Speech. It is important for this question to know that I am using System.Speech.

Answered by Eric Brown

If the only thing the System.Speech recognizer is listening for is "encrypt the email", then the recognizer will generate lots of false positives (particularly in a noisy environment). If you add a DictationGrammar (particularly a pronunciation grammar) in parallel, the DictationGrammar will pick up the noise, and you can check (for example) the name of the grammar in the event handler to discard the bogus recognitions.

A (subset) example:

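    using System;
    using System.Globalization;
    using System.Speech.Recognition; // requires a reference to System.Speech.dll

    class Program
    {
        static void Main(string[] args)
        {
            // The commands we actually care about.
            Choices gb = new Choices();
            gb.Add("encrypt the document");
            gb.Add("decrypt the document");
            Grammar commands = new Grammar(gb);
            commands.Name = "commands";

            // Pronunciation dictation grammar loaded in parallel; it absorbs
            // noise and speech that does not match a command.
            DictationGrammar dg = new DictationGrammar("grammar:dictation#pronunciation");
            dg.Name = "Random";

            using (SpeechRecognitionEngine recoEngine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
            {
                recoEngine.SetInputToDefaultAudioDevice();
                recoEngine.LoadGrammar(commands);
                recoEngine.LoadGrammar(dg);
                recoEngine.RecognizeCompleted += recoEngine_RecognizeCompleted;
                // No RecognizeMode argument: performs a single recognition operation.
                recoEngine.RecognizeAsync();

                Console.ReadKey(true);
                recoEngine.RecognizeAsyncStop();
            }
        }

        static void recoEngine_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
        {
            // Result can be null (for example, on cancellation), so guard before use.
            // Only report phrases matched by the command grammar, not the dictation grammar.
            if (e.Result != null && e.Result.Grammar.Name != "Random")
            {
                Console.WriteLine(e.Result.Text);
            }
        }
    }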
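
The grammar-name check above and the 0.93 confidence threshold from the question can be combined into a single acceptance test. The following is only a minimal sketch and not part of the original answer: `RecognitionFilter` and `ShouldAccept` are hypothetical names, the grammar name "commands" is taken from the example above, and 0.93 is the asker's value. It deliberately avoids System.Speech types so the decision logic can be tried without a microphone.

```csharp
// Hypothetical helper (not part of System.Speech): decides whether a
// recognition result should be acted on.
public static class RecognitionFilter
{
    // Assumed values: the grammar name comes from the example above,
    // the threshold from the question.
    public const string CommandGrammarName = "commands";
    public const float DefaultMinConfidence = 0.93f;

    public static bool ShouldAccept(string grammarName, float confidence,
                                    float minConfidence = DefaultMinConfidence)
    {
        // Reject anything matched by another grammar (such as the dictation
        // grammar), then apply the confidence threshold to real commands.
        return grammarName == CommandGrammarName && confidence >= minConfidence;
    }
}
```

In a recognition handler this could be called as `RecognitionFilter.ShouldAccept(e.Result.Grammar.Name, e.Result.Confidence)`, since the recognition result exposes both `Grammar` and `Confidence`. With the dictation grammar absorbing noise, a lower threshold than 0.93 may turn out to be workable, but that would need testing in the target environment.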