Convert mp4 sound to text in Python

Question

Asked by Stergios

I want to convert a sound recording from Facebook Messenger to text. Here is an example of an .mp4 file sent using Facebook's API: https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833

So this file includes only audio (not video) and I want to convert it to text.

Moreover, I want to do it as fast as possible since I'll use the generated text in an almost real-time application (i.e. user sends the .mp4 file, the script translates it to text and shows it back).

I've found this example: https://github.com/Uberi/speech_recognition/blob/master/examples/audio_transcribe.py and here is the code I use:

import requests
import speech_recognition as sr

url = 'https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833'
r = requests.get(url)

with open("test.mp4", "wb") as handle:
    for data in r.iter_content():
        handle.write(data)

r = sr.Recognizer()
with sr.AudioFile('test.mp4') as source:
    audio = r.record(source)

command = r.recognize_google(audio)
print(command)

But I'm getting this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Asterios\Anaconda2\lib\site-packages\speech_recognition\__init__.py", line 200, in __enter__
    self.audio_reader = aifc.open(aiff_file, "rb")
  File "C:\Users\Asterios\Anaconda2\lib\aifc.py", line 952, in open
    return Aifc_read(f)
  File "C:\Users\Asterios\Anaconda2\lib\aifc.py", line 347, in __init__
    self.initfp(f)
  File "C:\Users\Asterios\Anaconda2\lib\aifc.py", line 298, in initfp
    chunk = Chunk(file)
  File "C:\Users\Asterios\Anaconda2\lib\chunk.py", line 63, in __init__
    raise EOFError
EOFError

Any ideas?

EDIT: I want to run the script on the free-plan of pythonanywhere.com, so I'm not sure how I can install tools like ffmpeg there.

EDIT 2: If you run the above script with the url replaced by "http://www.wavsource.com/snds_2017-01-08_2348563217987237/people/men/about_time.wav" and 'mp4' changed to 'wav', then it works fine. So the problem is definitely the file format.
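
The traceback fits that diagnosis: as far as I know, sr.AudioFile only reads WAV, AIFF and FLAC files, and here the MP4 file ended up in the AIFF reader, which failed. A minimal sketch for checking what a downloaded file really is before passing it to the recognizer, based only on its leading bytes. The helper name sniff_audio_container is made up for illustration and is not part of speech_recognition, and the MP4 check assumes the file starts with an "ftyp" box, which is usual but not guaranteed:

```python
def sniff_audio_container(path):
    """Guess the container format from the first 12 bytes of a file."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    # WAV: "RIFF" <size> "WAVE"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # AIFF / AIFF-C: "FORM" <size> "AIFF" or "AIFC"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    # FLAC: "fLaC" marker at the very start
    if header[:4] == b"fLaC":
        return "flac"
    # MP4 family: <box size> "ftyp" (assumed to be the first box)
    if header[4:8] == b"ftyp":
        return "mp4"
    return "unknown"
```

Calling sniff_audio_container("test.mp4") on the Messenger clip should report something other than "wav", "aiff" or "flac", which means the file has to be converted first.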

Answer 1

Answered by Stergios

I finally found a solution. I'm posting it here in case it helps someone in the future.

Fortunately, pythonanywhere.com comes with avconv pre-installed (avconv is similar to ffmpeg).

So here is some code that works:

import urllib2
import speech_recognition as sr
import subprocess
import os

url = 'https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833'

# Download the clip and save it locally.
mp4file = urllib2.urlopen(url)
with open("test.mp4", "wb") as handle:
    handle.write(mp4file.read())

# Convert the MP4 audio to WAV, a format sr.AudioFile can read.
cmdline = ['avconv',
           '-i',
           'test.mp4',
           '-vn',       # ignore any video stream
           '-f',
           'wav',
           'test.wav']
subprocess.call(cmdline)

# Transcribe the WAV file with the Google Web Speech API.
r = sr.Recognizer()
with sr.AudioFile('test.wav') as source:
    audio = r.record(source)

command = r.recognize_google(audio)
print(command)

# Clean up the temporary files.
os.remove("test.mp4")
os.remove("test.wav")
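
On machines where avconv is missing, ffmpeg accepts the same flags used above. A rough Python 3 sketch of the conversion step that picks whichever tool is on the PATH. The function names are my own and only illustrative, and the added -y flag (overwrite the output without asking) is not in the code above:

```python
import shutil
import subprocess


def build_convert_cmd(src, dst, tool="avconv"):
    """Return the argument list that converts src to a WAV file at dst."""
    return [tool,
            "-y",          # overwrite dst without prompting
            "-i", src,
            "-vn",         # ignore any video stream
            "-f", "wav",
            dst]


def pick_converter(candidates=("avconv", "ffmpeg")):
    """Return the first converter found on PATH, or None if none is found."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


def convert_to_wav(src, dst):
    """Convert src to WAV at dst; raises if no converter or conversion fails."""
    tool = pick_converter()
    if tool is None:
        raise RuntimeError("neither avconv nor ffmpeg was found on PATH")
    # check_call raises CalledProcessError on a non-zero exit status.
    subprocess.check_call(build_convert_cmd(src, dst, tool))
```

With this, convert_to_wav("test.mp4", "test.wav") would replace the cmdline and subprocess.call lines, and a failed conversion stops the script instead of passing a missing or broken WAV file to the recognizer.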

On the free plan, cdn.fbsbx.com was not on pythonanywhere's whitelist of sites, so I could not download the content with urllib2. I contacted them and they added the domain to the whitelist within 1-2 hours!

So a huge thanks and congrats to them for the excellent service even though I'm using the free tier.

Answer 2

Answered by Robert Smith

Use Python Video Converter https://github.com/senko/python-video-converter

import requests
import speech_recognition as sr
from converter import Converter

url = 'https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833'
r = requests.get(url)
c = Converter()

# Save the downloaded clip to a temporary file, a few KB at a time.
with open("/tmp/test.mp4", "wb") as handle:
    for data in r.iter_content(chunk_size=8192):
        handle.write(data)

# Convert the MP4 audio to a stereo 44.1 kHz WAV file.
conv = c.convert('/tmp/test.mp4', '/tmp/test.wav', {
    'format': 'wav',
    'audio': {
        'codec': 'pcm',
        'samplerate': 44100,
        'channels': 2
    },
})

# convert() returns a generator; iterating over it runs the conversion.
for timecode in conv:
    pass

r = sr.Recognizer()
with sr.AudioFile('/tmp/test.wav') as source:
    audio = r.record(source)

command = r.recognize_google(audio)
print(command)
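
If the recognizer still rejects the converted file, it may help to confirm that the output really is a PCM WAV with the requested parameters. A small sketch using the standard-library wave module. The helper name describe_wav is hypothetical, not part of either library used above:

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, sample_width_bytes, duration_s)."""
    # wave.open raises wave.Error if the file is not a PCM WAV file
    # (or EOFError if it is truncated).
    reader = wave.open(path, "rb")
    try:
        frames = reader.getnframes()
        rate = reader.getframerate()
        return (reader.getnchannels(), rate, reader.getsampwidth(),
                frames / float(rate))
    finally:
        reader.close()
```

For the options shown above, describe_wav('/tmp/test.wav') should report 2 channels at 44100 Hz if the conversion did what was asked.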
