
Note: this page is based on a Chinese-English parallel translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/837582/


Mac OS X speech to text API. Howto?

objective-c cocoa macos audio speech-recognition

Asked by Roy Chan

I have a program that receives a mono audio stream over TCP/IP. I am wondering whether the speech-recognition API in Mac OS X could do the speech-to-text conversion for me.

(I don't mind saving the audio to a .wav file first and reading it, as opposed to doing the conversion on the fly.)

I have read the official docs online, but they are a bit confusing, and I couldn't find any good examples on this topic.

Also, should I do it in Cocoa/Carbon/Java or Objective-C?

Can someone please shed some light?

Thanks.

Accepted answer by diciu

There are a number of examples that get copied to /Developer/Examples/Speech/Recognition when you install Xcode.

The Cocoa class for speech recognition is NSSpeechRecognizer. I've not used it, but as far as I know speech recognition requires you to build a grammar that helps the engine choose among a fixed set of options, rather than allowing you to pass free-form input. This is all explained in the examples referred to above.
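
To make the "fixed set of choices" point concrete, here is a minimal sketch of how NSSpeechRecognizer is typically set up. The class name CommandListener, the lastCommand property, and the command phrases are hypothetical and chosen only for illustration; none of them come from the answer. As far as I know, NSSpeechRecognizer listens to the system audio input and has no API for feeding it a file or a network stream, so treat this as an illustration of the command-list model rather than a solution for the TCP/IP case.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical helper (not from the answer): listens for a fixed list of
// spoken commands and remembers the last one that was recognized.
@interface CommandListener : NSObject <NSSpeechRecognizerDelegate>
@property (strong) NSSpeechRecognizer *recognizer;
@property (copy) NSString *lastCommand;
- (void)start;
@end

@implementation CommandListener

- (instancetype)init {
    self = [super init];
    if (self) {
        // init can return nil if speech recognition is unavailable.
        _recognizer = [[NSSpeechRecognizer alloc] init];
        // The recognizer only matches phrases from this list;
        // it does not transcribe free-form speech.
        _recognizer.commands = @[@"start", @"stop", @"next track"];
        // Keep listening even when the app is not frontmost.
        _recognizer.listensInForegroundOnly = NO;
        _recognizer.delegate = self;
    }
    return self;
}

- (void)start {
    [self.recognizer startListening];
}

// Delegate callback: invoked with one of the strings from `commands`.
- (void)speechRecognizer:(NSSpeechRecognizer *)sender
     didRecognizeCommand:(NSString *)command {
    self.lastCommand = command;
    NSLog(@"Recognized command: %@", command);
}

@end
```

To use it, create a CommandListener, call start, and keep a run loop alive (for example inside a normal Cocoa application) so the delegate callback can be delivered.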

Answer by Latrokles

This comes a bit late perhaps, but I'll chime in anyway.

The speech recognition facilities in OS X (on both the Carbon and Cocoa side of things) are for speech command recognition, which means that they will recognize words (or phrases, commands) that have been loaded into the speech system's language model. I've done some work with small dictionaries and it works pretty well, but if you want to recognize arbitrary speech, things may get hairier.

Something else to keep in mind is that the Carbon and Cocoa speech APIs in OS X do not map one to one. The Carbon API provides functionality that has not made it into NSSpeechRecognizer (the docs make some mention of this).

I don't know about Cocoa, but the Carbon Speech Recognition Manager does allow you to specify inputs other than a microphone, so a sound stream would work just fine.

Answer by valexa

You can use either ApplicationServices's SpeechSynthesis (10.0+):

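// SpeakString takes a Pascal string, so convert the C string first.
CFStringRef cfstr = CFStringCreateWithCString(NULL, "Hello World!", kCFStringEncodingMacRoman);
Str255 pstr;
CFStringGetPascalString(cfstr, pstr, sizeof(pstr), kCFStringEncodingMacRoman);
CFRelease(cfstr); // balance the Create call above
SpeakString(pstr);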

or AppKit's NSSpeechSynthesizer (10.3+):

NSSpeechSynthesizer *synth = [[NSSpeechSynthesizer alloc] initWithVoice:@"com.apple.speech.synthesis.voice.Alex"];
[synth startSpeakingString:@"Hello world!"];

Answer by Charlie Martin

Here's a good O'Reilly article to get you started.