Java speech recognition

Question

Asked by guyumu

Does anyone have experience with an open source or relatively cheap voice recognition API for Java? I'm looking for something that will turn spoken words into text.

Judging from the Java speech recognition page on Sun's site, that API seems rather dead. My requirement is something that at least runs on Linux.

Can anyone recommend something? Pure Java would be a bonus; otherwise a Linux-based solution could be considered. And since this is a home project... the cheaper the better.

Edit

CMU Sphinx
As Amit pointed out, there is CMU Sphinx (http://cmusphinx.sourceforge.net/html/cmusphinx.php). My problem is a massive word error rate. Training seems like a project all in itself; I'm hoping to gather some strength to try it this weekend.

IBM ViaVoice
News announcements from 2004 are floating around about ViaVoice being made open source. It seems the news release was premature and that it never happened. ViaVoice was released for Linux at some point, but it seems they stopped. All that seems to be left on IBM's website is ViaVoice embedded.

IBM Websphere Voice
I imagine this is why ViaVoice (desktop) seems discontinued. IBM created this commercial solution, which will cost a lot more than an arm and a leg. And just using it will take the ones you have left, at least going by my experience with WebSphere and their IDE.

Nuance
It seems they might still create products for Linux, but I think they got lost and followed IBM into the server market. I'm not that sure about this one; their website is not that friendly when it comes to finding useful information.

Open Mind / Free Speech
These guys keep changing their project name. Probably some money-hungry company keeps threatening them, but I don't know. The project looks a bit dead.

I might try training Sphinx this weekend to see if it wants to be friends. Otherwise, worst case, I'll be looking at using Microsoft's speech solution. It has worked well for me in the past, but it's not a great Linux solution. I could probably use it through Wine, but then I'll have two separate servers... messy messy.

Oh, and SpeechTechMag seems a good place to visit for voice/speech. They have an 'Annual Reference' that lists companies that somehow relate themselves to voice/speech.

Answer 1

Accepted answer by guyumu

Mostly Java: http://cmusphinx.sourceforge.net/html/cmusphinx.php

Answer 2

Answered by si28719e

Sphinx is by far the best option available if you are on a budget. However, it also makes a huge difference which models you use, how you tune them, and how you tune your audio source. Absolutely everything has to match, otherwise it just won't work. Given the problem you described, I'd be willing to bet a substantial sum that you've got your models mixed up and your mic is not correctly calibrated. Also, if you have an accent it probably will not work. This is not an issue with the decoder but with the acoustic models: if no one with a voice or accent similar to yours was included in the training data, you'll get poor results.
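
One mismatch that is easy to rule out is the audio format. The sketch below uses only the standard javax.sound.sampled API to check whether a recording matches the format an acoustic model was trained on. The expected values (16 kHz, 16-bit, mono, signed PCM) and the class name AudioFormatCheck are assumptions for illustration; use whatever the documentation of your own models states.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import java.io.File;

final class AudioFormatCheck {

    // Assumed model requirements; replace with the values documented for your models.
    static final float EXPECTED_SAMPLE_RATE = 16000f;
    static final int EXPECTED_BITS = 16;
    static final int EXPECTED_CHANNELS = 1;

    /** Returns true if the format is signed PCM with the expected rate, sample size and channel count. */
    static boolean matchesModel(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleRate() == EXPECTED_SAMPLE_RATE
                && format.getSampleSizeInBits() == EXPECTED_BITS
                && format.getChannels() == EXPECTED_CHANNELS;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java AudioFormatCheck <recording.wav>");
            return;
        }
        // Reads only the file header; the audio data itself is not decoded.
        AudioFormat format = AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        System.out.println(format);
        System.out.println(matchesModel(format)
                ? "Format matches the assumed model requirements."
                : "Format does NOT match; resample or re-record before decoding.");
    }
}
```

A check like this only covers the file format. It says nothing about microphone gain, background noise, or accent coverage, which have to be addressed separately.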

That said, have you looked at their open source models page?

http://www.speech.cs.cmu.edu/sphinx/models/

Depending on what you are trying to do, you should be able to obtain about 90% accuracy on free speech with the 16kHz WSJ models and the gigaword LMs NVP. I caution, however, that ASR is a massive undertaking and hasn't yet reached commodity status.
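
To see where a setup stands relative to a figure like that, it helps to measure it. A common convention is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words, with word accuracy taken as roughly 1 - WER. The sketch below is a minimal, self-contained version of that calculation. The class name WordErrorRate and the sample sentences are made up for illustration, and tokenization is simplified to lower-casing and splitting on whitespace.

```java
import java.util.Locale;

final class WordErrorRate {

    /** Lower-cases and splits on whitespace; punctuation handling is deliberately left out. */
    static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Word-level Levenshtein distance: minimum substitutions + deletions + insertions. */
    static int editDistance(String[] ref, String[] hyp) {
        int[] prev = new int[hyp.length + 1];
        int[] curr = new int[hyp.length + 1];
        for (int j = 0; j <= hyp.length; j++) {
            prev[j] = j; // turning an empty reference into j hypothesis words takes j insertions
        }
        for (int i = 1; i <= ref.length; i++) {
            curr[0] = i; // i reference words against an empty hypothesis: i deletions
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = prev[j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[hyp.length];
    }

    /** WER = edit distance / number of reference words. Can exceed 1.0 if there are many insertions. */
    static double wer(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) editDistance(ref, tokenize(hypothesis)) / ref.length;
    }

    public static void main(String[] args) {
        // One substitution (kitchen -> chicken) and one deletion (on) over 5 reference words.
        double wer = wer("turn the kitchen light on", "turn the chicken light");
        System.out.printf(Locale.ROOT, "WER = %.2f%n", wer); // prints "WER = 0.40"
    }
}
```

Running this over a few dozen recorded sentences before and after changing models or microphone settings gives a rough but repeatable way to tell whether a change actually helped.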

Answer 3

Answered by Andreas

You can download vPass (voice password) from http://www.basic-signalprocessing.com.

For voice to text (vText), I can send the vText.jar file to your email. Please notify [email protected]

The components are designed for Java and .NET. The recognition period is 5 seconds. vPass is well tested; vText is not, as it is still new, which is why it is not packaged yet.

Regards, Andreas

Answer 4

Answered by Kiet Tran

My group finished a mini program in Java to recognize spoken digits using Sphinx.

Answer 5

Answered by user74339

I have been looking for the same thing for a few days now. So far I have found Sphinx4 and FreeTTS. Both are Java implementations, and Sphinx, unlike FreeTTS, seems to be updated rather frequently. The only problem I am having is that Sphinx has trouble understanding me in an office environment, and I need a solution for a warehouse environment.
