C++ API for "Text to Speech" and "Voice to Text"
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/16296447/
Asked by Soldier
I would like to know whether there is a good API for "voice recognition" and "text to speech" in C++. I have gone through Festival, whose output is so realistic that you can't even tell it is a computer talking, and voce as well.
Unfortunately, Festival does not seem to support voice recognition (I mean "voice to text"), and voce is built in Java, which makes it a mess to use from C++ because of JNI.
The API should support both "text to voice" and "voice to text", and it should have a good set of examples, at least beyond the owner's website. It would be perfect if it could also identify a given set of voices, but that is optional, so no worries.
What I am going to do with the API is turn the robot device left, right, etc. when a set of voice commands is given, and also have it speak to me, saying "Good morning", "Good night", etc. These words will be hard-coded in the program.
Please help me find a good C++ voice API for this purpose. If you have access to a tutorial/installation guide, please be kind enough to share it with me as well.
Accepted answer by Cyril Leroux
If you develop on Windows you can use the Microsoft Speech API (SAPI), which allows you to perform voice recognition (ASR) and text-to-speech (TTS).
You can find some examples on this page, and a very basic example of voice recognition in this post.
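A minimal TTS sketch with SAPI might look like the following (Windows-only, error handling kept to a minimum; the voice used is whatever the system default is):

```cpp
// Windows-only sketch using SAPI's ISpVoice COM interface.
// Speaks a greeting through the default system voice.
#include <sapi.h>

int main() {
    if (FAILED(::CoInitialize(NULL)))
        return 1;

    ISpVoice* voice = NULL;
    HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                    IID_ISpVoice, (void**)&voice);
    if (SUCCEEDED(hr)) {
        voice->Speak(L"Good morning", SPF_DEFAULT, NULL);  // blocks until done
        voice->Release();
    }
    ::CoUninitialize();
    return 0;
}
```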
Answered by bobweaver
I found that if I make an audio recording (I used Qt Multimedia for this), it has to be FLAC. Read more here.
I can then upload it to Google and have it send me back some JSON.
I then wrote some C++/Qt to turn this into a QML plugin.
Here is that (alpha) code. Note: make sure that you replace
< YOUR FLAC FILE.flac >
with your real FLAC file.
speechrecognition.cpp
#include <QDebug>
#include <QFile>
#include <QJsonDocument>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QUrl>

#include "speechrecognition.h"

const char* SpeechRecognition::kContentType = "audio/x-flac; rate=8000";
const char* SpeechRecognition::kUrl = "http://www.google.com/speech-api/v1/recognize?xjerr=1&client=directions&lang=en";

SpeechRecognition::SpeechRecognition(QObject* parent)
    : QObject(parent)
{
    network_ = new QNetworkAccessManager(this);
    connect(network_, SIGNAL(finished(QNetworkReply*)),
            this, SLOT(replyFinished(QNetworkReply*)));
}

void SpeechRecognition::start()
{
    const QUrl url(kUrl);
    QNetworkRequest req(url);
    req.setHeader(QNetworkRequest::ContentTypeHeader, kContentType);
    req.setAttribute(QNetworkRequest::DoNotBufferUploadDataAttribute, false);
    req.setAttribute(QNetworkRequest::CacheLoadControlAttribute,
                     QNetworkRequest::AlwaysNetwork);

    QFile* compressedFile = new QFile("<YOUR FLAC FILE.flac>");
    compressedFile->open(QIODevice::ReadOnly);
    reply_ = network_->post(req, compressedFile);
    // Parent the file to the reply so it is deleted along with it.
    compressedFile->setParent(reply_);
}

void SpeechRecognition::replyFinished(QNetworkReply* reply)
{
    Result result = Result_ErrorNetwork;
    Hypotheses hypotheses;
    if (reply->error() != QNetworkReply::NoError) {
        qDebug() << "ERROR\n" << reply->errorString();
    } else {
        qDebug() << "Running ParseResponse for\n" << reply << result;
        ParseResponse(reply, &result, &hypotheses);
    }
    emit Finished(result, hypotheses);
    reply_->deleteLater();
    reply_ = NULL;
}

void SpeechRecognition::ParseResponse(QIODevice* reply, Result* result,
                                      Hypotheses* hypotheses)
{
    QString response = reply->readAll();
    qDebug() << "The reply" << response;
    QJsonDocument jsonDoc = QJsonDocument::fromJson(response.toUtf8());
    QVariantMap data = jsonDoc.toVariant().toMap();

    const int status = data.value("status", Result_ErrorNetwork).toInt();
    *result = static_cast<Result>(status);
    if (status != Result_Success)
        return;

    QVariantList list = data.value("hypotheses", QVariantList()).toList();
    foreach (const QVariant& variant, list) {
        QVariantMap map = variant.toMap();
        if (!map.contains("utterance") || !map.contains("confidence"))
            continue;
        Hypothesis hypothesis;
        hypothesis.utterance = map.value("utterance", QString()).toString();
        hypothesis.confidence = map.value("confidence", 0.0).toReal();
        *hypotheses << hypothesis;
        qDebug() << "confidence =" << hypothesis.confidence
                 << "\n Your results =" << hypothesis.utterance;
        setResults(hypothesis.utterance);
    }
}

void SpeechRecognition::setResults(const QString& results)
{
    if (m_results == results)
        return;
    m_results = results;
    emit resultsChanged();
}

QString SpeechRecognition::results() const
{
    return m_results;
}
speechrecognition.h
#ifndef SPEECHRECOGNITION_H
#define SPEECHRECOGNITION_H

#include <QList>
#include <QObject>

class QIODevice;
class QNetworkAccessManager;
class QNetworkReply;

class SpeechRecognition : public QObject {
    Q_OBJECT
    Q_PROPERTY(QString results READ results NOTIFY resultsChanged)

public:
    SpeechRecognition(QObject* parent = 0);

    static const char* kUrl;
    static const char* kContentType;

    struct Hypothesis {
        QString utterance;
        qreal confidence;
    };
    typedef QList<Hypothesis> Hypotheses;

    // This enumeration follows the values described here:
    // http://www.w3.org/2005/Incubator/htmlspeech/2010/10/google-api-draft.html#speech-input-error
    enum Result {
        Result_Success = 0,
        Result_ErrorAborted,
        Result_ErrorAudio,
        Result_ErrorNetwork,
        Result_NoSpeech,
        Result_NoMatch,
        Result_BadGrammar
    };

    Q_INVOKABLE void start();
    void Cancel();

    QString results() const;
    void setResults(const QString& results);

signals:
    void Finished(Result result, const Hypotheses& hypotheses);
    void resultsChanged();

private slots:
    void replyFinished(QNetworkReply* reply);

private:
    void ParseResponse(QIODevice* reply, Result* result, Hypotheses* hypotheses);

    QNetworkAccessManager* network_;
    QNetworkReply* reply_;
    QByteArray buffered_raw_data_;
    int num_samples_recorded_;
    QString m_results;
};

#endif // SPEECHRECOGNITION_H
Answered by Rod Burns
You could theoretically use Twilio, if you have an internet connection on the robot and are willing to pay for the service. They have libraries and examples for a bunch of different languages and platforms: http://www.twilio.com/docs/libraries
Also, check out this blog post explaining how to build and control an Arduino-based robot using Twilio: http://www.twilio.com/blog/2012/06/build-a-phone-controlled-robot-using-node-js-arduino-rn-xv-wifly-arduinoand-twilio.html