Original question: http://stackoverflow.com/questions/23984369/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Real time speech recognition using WebRTC, Node.js and speech recognition engine
Asked by jpen
A. What I am trying to implement.
A web application allowing real-time speech recognition inside a web browser (like this).
B. Technologies I am currently thinking of using to achieve A.
- JavaScript
- Node.js
- WebRTC
- Microsoft Speech API or Pocketsphinx.js or something else (cannot use Web Speech API)
C. Very basic workflow
1. The web browser establishes a connection to the Node server (the server acts as a signaling server and also serves static files)
2. The web browser acquires an audio stream using getUserMedia() and sends the user's voice to the Node server
3. The Node server passes the audio stream it receives to the speech recognition engine for analysis
4. The speech recognition engine returns the result to the Node server
5. The Node server sends the text result back to the initiating web browser
6. (The Node server performs steps 1 to 5 to process requests from other browsers)
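As a rough illustration of the second step in the workflow above: audio captured in the browser through the Web Audio API arrives as floating-point samples in the range [-1, 1], while many recognition engines expect 16-bit signed PCM. The sketch below shows one way that conversion might look. The function name `floatTo16BitPCM` is made up for this example, and whether your engine actually needs 16-bit PCM is an assumption to check against its documentation.

```javascript
// Hypothetical helper: convert float samples in [-1, 1] to 16-bit signed PCM.
// Assumes the recognition engine wants 16-bit PCM; verify this for your engine.
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around when stored.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative samples scale to -32768, positive samples to 32767.
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}
```

The `buffer` of the returned `Int16Array` could then be sent to the Node server over whichever transport you choose. Resampling to the rate the engine expects is left out of this sketch.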
D. Questions
- Would Node.js be suitable to achieve C?
- How could I pass received audio streams from my Node server to a speech recognition engine running separately from the server?
- Could my speech recognition engine be running as another Node application (if I use Pocketsphinx)? So my Node server communicates to my Node speech recognition server.
Accepted answer by Nikolay Shmyrev
Would Node.js be suitable to achieve C?
Yes, though there is no hard requirement for it. Some people run servers with GStreamer; for example, see
http://kaljurand.github.io/dictate.js/
Node should be fine too.
How could I pass received audio streams from my Node server to a speech recognition engine running separately from the server?
There are many ways to do Node-to-Node communication. One of them is http://socket.io; plain sockets also work. Which framework you pick depends on your requirements for fault tolerance and scalability.
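To make the "plain sockets" option a little more concrete, here is a minimal sketch using only Node's built-in `net` module. Everything about the protocol is an assumption made for illustration: one utterance per TCP connection, the client half-closes the socket to mark the end of the audio, and the reply is a single line of JSON. `fakeRecognize`, `startRecognizer` and `recognizeRemotely` are hypothetical names, and `fakeRecognize` only reports the byte count instead of calling a real engine.

```javascript
const net = require('net');

// Hypothetical stand-in for a real engine: reports how many bytes arrived.
function fakeRecognize(audio) {
  return { text: `<received ${audio.length} bytes of audio>` };
}

// Recognition service: buffers audio until the client half-closes the
// connection, then replies with one JSON line and closes its side.
function startRecognizer(port = 0) {
  return new Promise((resolve, reject) => {
    const server = net.createServer({ allowHalfOpen: true }, (socket) => {
      const chunks = [];
      socket.on('data', (chunk) => chunks.push(chunk));
      socket.on('end', () => {
        const result = fakeRecognize(Buffer.concat(chunks));
        socket.end(JSON.stringify(result) + '\n');
      });
      socket.on('error', () => socket.destroy());
    });
    server.once('error', reject);
    server.listen(port, '127.0.0.1', () => resolve(server));
  });
}

// Called from the web-facing Node server: forwards the audio and resolves
// with the parsed result once the recognition service closes the connection.
function recognizeRemotely(port, audio) {
  return new Promise((resolve, reject) => {
    const socket = net.connect(port, '127.0.0.1');
    let reply = '';
    socket.setEncoding('utf8');
    socket.on('data', (text) => { reply += text; });
    socket.on('end', () => {
      try { resolve(JSON.parse(reply)); } catch (err) { reject(err); }
    });
    socket.on('error', reject);
    socket.end(audio); // write the audio, then signal end of utterance
  });
}
```

This buffers the whole utterance before recognizing, so it is not truly real-time. A streaming variant would feed each `data` chunk to the engine as it arrives and push partial results back on the same socket.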
Could my speech recognition engine be running as another Node application (if I use Pocketsphinx)? So my Node server communicates to my Node speech recognition server.
Yes, sure. You can create a Node module to wrap the Pocketsphinx API.
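Wrapping the Pocketsphinx API directly would normally mean writing a native binding, which is beyond a short example. As a lighter alternative, and not what the answer itself describes, the sketch below wraps a recognizer that runs as a separate process. It assumes the process reads raw audio on stdin and prints the recognized text on stdout. `createRecognizer`, `command` and `args` are placeholders, so substitute whatever actually launches your engine.

```javascript
const { spawn } = require('child_process');

// Hypothetical wrapper around an external recognizer process.
// Assumption: the process reads raw audio on stdin and writes the
// recognized text to stdout, exiting with code 0 on success.
function createRecognizer(command, args = []) {
  return {
    recognize(audio) {
      return new Promise((resolve, reject) => {
        const child = spawn(command, args, { stdio: ['pipe', 'pipe', 'inherit'] });
        let text = '';
        child.stdout.setEncoding('utf8');
        child.stdout.on('data', (chunk) => { text += chunk; });
        child.on('error', reject); // for example, the command was not found
        child.on('close', (code) => {
          if (code === 0) resolve(text.trim());
          else reject(new Error(`recognizer exited with code ${code}`));
        });
        child.stdin.on('error', () => {}); // ignore EPIPE if the child exits early
        child.stdin.end(audio); // send the audio, then close stdin
      });
    },
  };
}
```

Starting a process per utterance keeps the wrapper simple but adds startup latency. A long-running process, or a native binding as the answer suggests, would fit real-time use better.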
UPDATE: check this, it should be similar to what you need:
Answer by jesup
You should contact Andre Natal, who showed demos similar to this at last fall's Firefox Summit and is now working on a Google Summer of Code project implementing offline speech recognition in Firefox/FxOS: http://cmusphinx.sourceforge.net/2014/04/speech-projects-on-gsoc-2014/