Convert audio to text
Source: Stack Overflow, http://stackoverflow.com/questions/3958342/
Licensed under CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors.
Asked by Amira Elsayed Ismail
I just want to know if there are any built-in or external libraries in Java or C# that let me take an audio file, parse it, and extract the text from it.
I need to build an application that does this, but I don't know where to start.
Answered by Ohad Schneider
Here are some of your options:
Answered by bulltorious
Here is a complete example using C# and System.Speech
The code can be divided into 2 main parts:
1. Configuring the SpeechRecognitionEngine object (and its required elements)
2. Handling the SpeechRecognized and SpeechHypothesized events
Step 1: Configuring the SpeechRecognitionEngine
_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
At this point your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.
Step 2: Handling the SpeechRecognitionEngine Events
// Remove any existing subscription before adding, so each handler is attached only once.
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // real-time results from the engine
    string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // final answer from the engine
    string finalAnswer = e.Result.Text;
}
That's it. If you want to use a pre-recorded .wav file instead of a microphone, you would use
_speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile);
instead of
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
There are a bunch of different options in these classes and they are worth exploring in more detail.
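For the .wav case, the pieces above could be put together into a small console program roughly like this. It is only a sketch under some assumptions: it needs Windows with a reference to the System.Speech assembly, the class name WavTranscriber and the fallback path input.wav are made up for illustration, and waiting on RecognizeCompleted with a ManualResetEvent is my own addition so the program does not exit before the file has been processed.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

class WavTranscriber
{
    static void Main(string[] args)
    {
        // Hypothetical fallback path; pass a real .wav file as the first argument.
        string pathToTargetWavFile = args.Length > 0 ? args[0] : "input.wav";

        using (var engine = new SpeechRecognitionEngine())
        using (var done = new ManualResetEvent(false))
        {
            // Read audio from the file instead of the default microphone.
            engine.SetInputToWaveFile(pathToTargetWavFile);
            engine.LoadGrammar(new DictationGrammar());

            // Print each final recognition result as it arrives.
            engine.SpeechRecognized += (sender, e) => Console.WriteLine(e.Result.Text);

            // Signal the main thread once recognition has finished.
            engine.RecognizeCompleted += (sender, e) =>
            {
                if (e.Error != null)
                {
                    Console.Error.WriteLine(e.Error.Message);
                }
                done.Set();
            };

            engine.RecognizeAsync(RecognizeMode.Multiple);
            done.WaitOne();
        }
    }
}
```

Run it with the path to a .wav file as the only argument. Recognition quality with the dictation grammar depends heavily on the recording, so treat the output as a starting point.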
Answered by jassuncao
You might check the Microsoft Speech API. I think they provide an SDK that you can use for this.
Answered by Grant Peters
For Java, it seems there is a solution from Sun: javax.speech.recognition
Answered by Ivelin
You can use SoX (the Swiss Army knife of sound processing programs) to convert an audio file to a text file with numeric values corresponding to sound frequency/volume.
I did this for a previous project but don't remember the exact command options.
Here is a link to the project: http://sox.sourceforge.net/Main/HomePage