Original question: http://stackoverflow.com/questions/2319735/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Voice recognition on android with recorded sound clip?
Asked by CodeFusionMobile
I've used the voice recognition feature on Android and I love it. It's one of my customers' most praised features. However, the format is somewhat restrictive. You have to call the recognizer intent, have it send the recording to Google for transcription, and wait for the text to come back.
Some of my ideas would require recording the audio within my app and then sending the clip to Google for transcription.
Is there any way I can send an audio clip to be processed with speech to text?
Answered by lsantsan
I have a solution that works well for combining speech recognition with audio recording. Here is the link to a simple Android project I created to show the solution working. I also put some screenshots inside the project to illustrate the app.
I'll try to briefly explain the approach I used. I combined two features in that project: the Google Speech API and FLAC recording.
The Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:
"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."
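The two-connection pattern described in the quote can be sketched in plain Java with `HttpURLConnection`. This is only an illustration of the pattern, not the project's actual code: the class and method names are hypothetical, the endpoint URLs are left to the caller, and the `audio/x-flac; rate=...` content type is an assumption about how a FLAC upload would be declared.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch of the two-connection pattern; endpoints are supplied by the caller. */
final class DuplexSpeechSketch {

    /** Content type for a FLAC upload (assumed header format). */
    static String flacContentType(int sampleRateHz) {
        return "audio/x-flac; rate=" + sampleRateHz;
    }

    /** Prepares (but does not open) the POST connection that streams audio up. */
    static HttpURLConnection prepareUpload(URL uploadUrl, int sampleRateHz) throws IOException {
        HttpURLConnection up = (HttpURLConnection) uploadUrl.openConnection();
        up.setRequestMethod("POST");
        up.setDoOutput(true);
        // 0 selects the default chunk size; the body is sent as it is written
        up.setChunkedStreamingMode(0);
        up.setRequestProperty("Content-Type", flacContentType(sampleRateHz));
        return up;
    }

    /** Prepares (but does not open) the GET connection that reads results back. */
    static HttpURLConnection prepareDownload(URL resultUrl) throws IOException {
        HttpURLConnection down = (HttpURLConnection) resultUrl.openConnection();
        down.setRequestMethod("GET");
        return down;
    }
}
```

A caller would write the recorded audio to `prepareUpload(...).getOutputStream()` on one thread while reading `prepareDownload(...).getInputStream()` on another. On Android both must run off the main thread.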
However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.
I implemented FLAC recording in that project by extracting and adapting some pieces of code and libraries from an open source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.
Thus, it's possible to record a FLAC sound, send it to the Google Speech API, get the text, and play the sound that was just recorded.
The project I created has the basic principles to make it work and can be improved for specific situations. To make it work in a different scenario, you need to get a Google Speech API key, which is obtained by joining the Google Chromium-dev group. I left one key in that project just to show it's working, but I'll remove it eventually. If someone needs more information about it, let me know, because I'm not able to put more than 2 links in this post.
Answered by Trevor Johns
Unfortunately not at this time. The only interface currently supported by Android's voice recognition service is the RecognizerIntent, which doesn't allow you to provide your own sound data.
If this is something you'd like to see, file a feature request at http://b.android.com. This is also tangentially related to existing issue 4541.
Answered by zen_of_kermit
As far as I know there is still no way to directly send an audio clip to Google for transcription. However, Froyo (API level 8) introduced the SpeechRecognizer class, which provides direct access to the speech recognition service. So, for example, you can start playback of an audio clip and have your Activity start the speech recognizer listening in the background. When recognition completes, the results are returned to a user-defined listener callback method.
The following sample code should be defined within an Activity, since SpeechRecognizer's methods must be run in the main application thread. You will also need to add the RECORD_AUDIO permission to your AndroidManifest.xml.
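// Requires: android.content.Intent, android.os.Bundle, and
// android.speech.{RecognitionListener, RecognizerIntent, SpeechRecognizer}
boolean available = SpeechRecognizer.isRecognitionAvailable(this);
if (available) {
    SpeechRecognizer sr = SpeechRecognizer.createSpeechRecognizer(this);
    sr.setRecognitionListener(new RecognitionListener() {
        @Override
        public void onResults(Bundle results) {
            // process results here
        }
        // the remaining listener methods must also be overridden; empty stubs shown
        @Override public void onReadyForSpeech(Bundle params) {}
        @Override public void onBeginningOfSpeech() {}
        @Override public void onRmsChanged(float rmsdB) {}
        @Override public void onBufferReceived(byte[] buffer) {}
        @Override public void onEndOfSpeech() {}
        @Override public void onError(int error) {}
        @Override public void onPartialResults(Bundle partialResults) {}
        @Override public void onEvent(int eventType, Bundle params) {}
    });
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    // the following appears to be a requirement, but can be a "dummy" value
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "com.dummy");
    // define any other intent extras you want
    // start playback of audio clip here
    // this will start the speech recognizer service in the background
    // without starting a separate activity
    sr.startListening(intent);
}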
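As a rough sketch of what "process results here" might look like: inside onResults, the hypotheses can be read with `results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)`, and on newer API levels the optional scores with `results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES)`, which may be null. The helper below is hypothetical (not part of the Android SDK) and is written in plain Java so it can be tried outside an Activity. It picks the hypothesis with the highest score and falls back to the first entry when no usable scores are available.

```java
import java.util.List;

/** Hypothetical helper for choosing one transcription from recognizer output. */
final class RecognitionResultSketch {

    /**
     * Returns the hypothesis with the highest confidence score, or the first
     * hypothesis when scores are missing or mismatched; null if there are none.
     */
    static String bestHypothesis(List<String> hypotheses, float[] scores) {
        if (hypotheses == null || hypotheses.isEmpty()) {
            return null;
        }
        if (scores == null || scores.length != hypotheses.size()) {
            return hypotheses.get(0);
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return hypotheses.get(best);
    }
}
```

Falling back to the first entry assumes the recognizer orders its hypotheses from most to least likely, which is worth verifying against the service you actually use.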
You can also define your own speech recognition service by extending RecognitionService, but that is beyond the scope of this answer :)