What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition in .NET?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/2977338/

Date: 2020-09-03 14:21:51  Source: igfitidea

Tags: .net, speech-recognition, speech, ucma2.0, ucs

Asked by Michael Levy

There are two similar namespaces and assemblies for speech recognition in .NET. I'm trying to understand the differences and when it is appropriate to use one or the other.

There is System.Speech.Recognition from the assembly System.Speech (in System.Speech.dll). System.Speech.dll is a core DLL in the .NET Framework class library, version 3.0 and later.

There is also Microsoft.Speech.Recognition from the assembly Microsoft.Speech (in Microsoft.Speech.dll). Microsoft.Speech.dll is part of the UCMA 2.0 SDK.

I find the docs confusing and I have the following questions:

The System.Speech.Recognition documentation says it is for "The Windows Desktop Speech Technology". Does this mean it cannot be used on a server OS, or cannot be used for high-scale applications?

The UCMA 2.0 Speech SDK ( http://msdn.microsoft.com/en-us/library/dd266409%28v=office.13%29.aspx) says that it requires Microsoft Office Communications Server 2007 R2 as a prerequisite. However, I've been told at conferences and meetings that if I do not require OCS features like presence and workflow I can use the UCMA 2.0 Speech API without OCS. Is this true?

If I'm building a simple recognition app for a server application (say I wanted to automatically transcribe voice mails) and I don't need features of OCS, what are the differences between the two APIs?

Answered by Eric Brown

The short answer is that Microsoft.Speech.Recognition uses the Server version of SAPI, while System.Speech.Recognition uses the Desktop version of SAPI.

The APIs are mostly the same, but the underlying engines are different. Typically, the Server engine is designed to accept telephone-quality audio for command & control applications; the Desktop engine is designed to accept higher-quality audio for both command & control and dictation applications.

You can use System.Speech.Recognition on a server OS, but it's not designed to scale nearly as well as Microsoft.Speech.Recognition.

The differences are that the Server engine won't need training, and will work with lower-quality audio, but will have a lower recognition quality than the Desktop engine.

Answered by Michael Levy

I found Eric's answer really helpful; I just wanted to add some more details that I found.

System.Speech.Recognition can be used to program the desktop recognizers. SAPI and desktop recognizers have shipped in the following products:

  • Windows XP: SAPI v5.1 and no recognizer
  • Windows XP Tablet Edition: SAPI v5.1 and Recognizer v6.1
  • Windows Vista: SAPI v5.3 and Recognizer v8.0
  • Windows 7: SAPI v5.4 and Recognizer v8.0?

Servers come with SAPI, but no recognizer:

  • Windows Server 2003: SAPI v5.1 and no recognizer
  • Windows Server 2008 and 2008 R2: SAPI v5.3? and no recognizer

Desktop recognizers have also shipped in products like Office:

  • Microsoft Office 2003: Recognizer v6.1

Microsoft.Speech.Recognition can be used to program the server recognizers. Server recognizers have shipped in the following products:

  • Speech Server (various versions)
  • Office Communications Server (OCS) (various versions)
  • UCMA – which is a managed API for OCS that (I believe) included a redistributable recognizer
  • Microsoft Server Speech Platform – recognizer v10.2

The complete SDK for the Microsoft Server Speech Platform 10.2 version is available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. The speech engine is a free download. Version 11 is now available at http://www.microsoft.com/download/en/details.aspx?id=27226.

For Microsoft Speech Platform SDK 11 info and downloads, see:

Desktop recognizers are designed to run inproc or shared. Shared recognizers are useful on the desktop where voice commands are used to control any open applications. Server recognizers can only run inproc. Inproc recognizers are used when a single application uses the recognizer or when wav files or audio streams need to be recognized (shared recognizers can't process audio files, just audio from input devices).

Only desktop speech recognizers include a dictation grammar (a system-provided grammar used for free-text dictation). The class System.Speech.Recognition.DictationGrammar has no counterpart in the Microsoft.Speech namespace.
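As a rough illustration of the inproc and dictation points above, here is a minimal sketch of transcribing a WAV file with the desktop engine. It assumes a project reference to System.Speech.dll and an installed desktop recognizer; the class name WavTranscriber and the method Transcribe are made up for this example. The loop stops at the first null result, which can mean the end of the audio but may also be a silence timeout, so treat it as a starting point rather than a robust transcriber.

```csharp
using System;
using System.IO;
using System.Speech.Recognition; // desktop API; requires a reference to System.Speech.dll
using System.Text;

// Hypothetical helper, not part of any Microsoft API.
public static class WavTranscriber
{
    public static string Transcribe(string wavPath)
    {
        // Fail early with a clear error before touching the speech engine.
        if (!File.Exists(wavPath))
            throw new FileNotFoundException("WAV file not found.", wavPath);

        // The parameterless constructor creates an inproc engine on the
        // default installed desktop recognizer.
        using (var engine = new SpeechRecognitionEngine())
        {
            // DictationGrammar exists only in System.Speech.Recognition.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            var text = new StringBuilder();
            RecognitionResult result;
            // Recognize() performs one synchronous recognition per call.
            // Assumption: a null result is treated as "no more speech".
            while ((result = engine.Recognize()) != null)
            {
                text.Append(result.Text).Append(' ');
            }
            return text.ToString().Trim();
        }
    }
}
```

A call such as WavTranscriber.Transcribe(@"C:\temp\voicemail.wav") (the path is only an example) would return whatever text the dictation grammar recognized, possibly an empty string.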

You can use the APIs to query your installed recognizers:

  • Desktop: System.Speech.Recognition.SpeechRecognitionEngine.InstalledRecognizers()
  • Server: Microsoft.Speech.Recognition.SpeechRecognitionEngine.InstalledRecognizers()
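The calls listed above can be exercised with a few lines of code. This sketch uses the desktop namespace and assumes a reference to System.Speech.dll; for the server recognizers, the same code should work after swapping the using directive to Microsoft.Speech.Recognition and referencing Microsoft.Speech.dll, since the two APIs are described as mostly the same. The class name RecognizerLister is invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Speech.Recognition; // swap for Microsoft.Speech.Recognition to list server recognizers

// Hypothetical helper, not part of any Microsoft API.
public static class RecognizerLister
{
    // Returns one descriptive line per installed recognizer.
    public static List<string> Describe()
    {
        var lines = new List<string>();
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            // Name, Culture and Id are properties of RecognizerInfo.
            lines.Add(string.Format("{0} | {1} | {2}", info.Name, info.Culture.Name, info.Id));
        }
        return lines;
    }
}
```

Printing the result with foreach (string line in RecognizerLister.Describe()) Console.WriteLine(line); gives a quick view of what is available; an empty list means no recognizer of that kind is installed.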

I found that I can also see what recognizers are installed by looking at the registry keys:

  • Desktop recognizers: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens
  • Server recognizers: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech Server\v10.0\Recognizers\Tokens
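The same registry check can be done programmatically. The sketch below only lists the subkey names under the two keys quoted above; the class name RecognizerRegistry and its members are invented for the example. Note that on 64-bit Windows a 32-bit process may see a redirected view of HKEY_LOCAL_MACHINE\SOFTWARE, so the results can differ by process bitness.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper, not part of any Microsoft API.
public static class RecognizerRegistry
{
    // Key paths taken from the list above, relative to HKEY_LOCAL_MACHINE.
    public const string DesktopTokens = @"SOFTWARE\Microsoft\Speech\Recognizers\Tokens";
    public const string ServerTokens = @"SOFTWARE\Microsoft\Speech Server\v10.0\Recognizers\Tokens";

    // Returns the token subkey names, or an empty array if the key is absent.
    public static string[] ListTokens(string subKeyPath)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(subKeyPath))
        {
            return key == null ? new string[0] : key.GetSubKeyNames();
        }
    }
}
```

For example, RecognizerRegistry.ListTokens(RecognizerRegistry.ServerTokens) returns an empty array on a machine without the server speech platform installed.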

--- Update ---

As discussed in the question "Microsoft Speech Recognition - what reference do I have to add?", Microsoft.Speech is also the API used for the Kinect recognizer. This is documented in the MSDN article http://msdn.microsoft.com/en-us/library/hh855387.aspx

Answered by Switch Commerce

Here is the link for the Speech Library (MS Server Speech Platform):

Microsoft Server Speech Platform 10.1 Released (SR and TTS in 26 languages)

Answered by George Birbilis

It seems Microsoft wrote an article that clears up the differences between the Microsoft Speech Platform and Windows SAPI: https://msdn.microsoft.com/en-us/library/jj127858.aspx. One difference I found myself while converting speech recognition code for Kinect from Microsoft.Speech to System.Speech (see http://github.com/birbilis/Hotspotizer) is that the former supports SRGS grammars with tag-format=semantics/1.0-literals, while the latter doesn't, so you have to convert to semantics/1.0 by changing x to out="x"; in each tag.
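The tag conversion described above can be automated for simple grammars. The following is a minimal sketch, not the code used in Hotspotizer: the class name SrgsTagConverter is invented, and it assumes every tag element holds a plain literal, that surrounding whitespace in a literal is not significant, and that escaping backslashes and double quotes is enough to produce a valid script string.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any Microsoft API.
public static class SrgsTagConverter
{
    // Namespace of SRGS XML grammars.
    private static readonly XNamespace Srgs = "http://www.w3.org/2001/06/grammar";

    // Rewrites a semantics/1.0-literals grammar so each <tag>x</tag>
    // becomes <tag>out="x";</tag> under tag-format="semantics/1.0".
    public static string ConvertLiteralsToScript(string srgsXml)
    {
        XDocument doc = XDocument.Parse(srgsXml, LoadOptions.PreserveWhitespace);
        XAttribute format = doc.Root.Attribute("tag-format");

        // Leave grammars in any other tag format untouched.
        if (format == null || format.Value != "semantics/1.0-literals")
            return srgsXml;

        format.Value = "semantics/1.0";
        foreach (XElement tag in doc.Root.Descendants(Srgs + "tag").ToList())
        {
            // Assumption: trimming the literal does not change its meaning.
            string literal = tag.Value.Trim();
            string escaped = literal.Replace("\\", "\\\\").Replace("\"", "\\\"");
            tag.Value = "out=\"" + escaped + "\";";
        }
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

Grammars whose tags already contain script, or that rely on rule references returning structured values, would need manual review after a conversion like this.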
