Notice: this page is based on a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/4468157/

Time: 2020-08-05 00:46:04  Source: igfitidea

How can I create a waveform image of an MP3 in Linux?

Tags: linux, audio, mp3, packages

Asked by Prakash Raman

Given an MP3, I would like to extract the waveform from the file into an image (.png).

Is there a package that can do what I need?

Answered by Lifeguard

If you have a GUI environment, you can use the Audacity audio editor to load the MP3 and then use the print command to generate a PDF of the waveform. Then convert the PDF to PNG.

Answered by shodanex

I would do something like this:

  • Find a tool to convert the MP3 to PCM, i.e. binary data with one 8- or 16-bit value per sample. I guess mplayer can do that.

  • Pipe the result to a utility that converts binary data to an ASCII representation of the numbers in decimal format.

  • Use gnuplot to transform this list of values into a PNG graph.

And voilà, the power of piping between Unix tools. Step 2 in this list might be optional if gnuplot is able to read its data from a binary format.
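The three steps above can be sketched as a pipeline of small shell functions. This is only one possible arrangement, under stated assumptions: sox (built with MP3 support) stands in for the decoder instead of the mplayer guess in the answer, GNU od does the binary-to-text conversion, and the function names and image size are made up for illustration.

```shell
#!/bin/sh
# Sketch of the pipeline: decode -> ASCII -> plot.
# Assumptions: sox with MP3 support, GNU od, and gnuplot are installed.
# The function names below are hypothetical helpers, not existing tools.

# Step 1: decode an audio file to raw signed 16-bit mono PCM on stdout.
decode_to_pcm() {
    sox "$1" -t raw -e signed-integer -b 16 -c 1 -
}

# Step 2: turn raw 16-bit samples on stdin into one decimal number per line.
# od reads the samples in host byte order, which is also what sox writes
# for raw output unless told otherwise.
pcm_to_ascii() {
    od -An -v -td2 -w2
}

# Step 3: plot the numbers arriving on stdin into the PNG file named by $1.
plot_to_png() {
    gnuplot -e "set term png size 640,240; set output '$1'; unset key; plot '/dev/stdin' with lines"
}

# Usage:
#   decode_to_pcm audio.mp3 | pcm_to_ascii | plot_to_png audio.png
```

A full-length track produces millions of samples, so plotting can be slow; resampling to a lower rate in step 1 (for example with sox's `-r` output option) is one way to reduce the amount of data.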

Answered by pforret

This is a standard function in SoX (a command-line tool for sound, available for Windows and Linux). Check the 'spectrogram' function at http://sox.sourceforge.net/sox.html

"The spectrogram is rendered in a Portable Network Graphic (PNG) file, and shows time in the X-axis, frequency in the Y-axis, and audio signal magnitude in the Z-axis. Z-axis values are represented by the colour (or optionally the intensity) of the pixels in the X-Y plane. If the audio signal contains multiple channels then these are shown from top to bottom starting from channel 1 (which is the left channel for stereo audio)."
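
A minimal invocation of the spectrogram effect might look like the sketch below. It assumes a sox build with MP3 and PNG support; the wrapper function name is made up for illustration. Note that, as the quoted description says, the result shows frequency content over time rather than an amplitude waveform.

```shell
#!/bin/sh
# Sketch: render a spectrogram PNG with SoX's "spectrogram" effect.
# Assumption: sox is installed with MP3 and PNG support.
# make_spectrogram is a hypothetical wrapper name, not part of SoX.

# make_spectrogram INPUT_AUDIO OUTPUT_PNG
make_spectrogram() {
    # -n means "no output audio file"; the effect writes the PNG itself
    # to the file given with -o.
    sox "$1" -n spectrogram -o "$2"
}

# Usage:
#   make_spectrogram audio.mp3 audio.png
```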

Answered by qubodup

Using sox and gnuplot you can create basic waveform images:

sox audio.mp3 audio.dat #create plaintext file of amplitude values
tail -n+3 audio.dat > audio_only.dat #remove comments

# write script file for gnuplot
echo set term png size 320,180 > audio.gpi #set output format
echo set output \"audio.png\" >> audio.gpi #set output file
echo plot \"audio_only.dat\" with lines >> audio.gpi #plot data

gnuplot audio.gpi #run script

To create something simpler/prettier, use the following gnuplot script as a template (save it as audio.gpi):

#set output format and size
set term png size 320,180

#set output file
set output "audio.png"

# set y range
set yr [-1:1]

# we want just the data
unset key
unset tics
unset border
set lmargin 0
set rmargin 0
set tmargin 0
set bmargin 0

# draw rectangle to change background color
set obj 1 rectangle behind from screen 0,0 to screen 1,1
set obj 1 fillstyle solid 1.0 fillcolor rgbcolor "#222222"

# draw data with foreground color
plot "audio_only.dat" with lines lt rgb 'white'

Then just run:

sox audio.mp3 audio.dat #create plaintext file of amplitude values
tail -n+3 audio.dat > audio_only.dat #remove comments

gnuplot audio.gpi #run script

Based on this answer to a similar question, which is more general regarding file format but less general regarding the software used.
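As a variation on the template above, the gnuplot script can be generated from parameters instead of being edited by hand. The sketch below only prints the script; `write_gpi` is a hypothetical helper name, and the `-c 1` mono mix-down in the usage comment is an added assumption for stereo input rather than part of the original answer.

```shell
#!/bin/sh
# Sketch: print the gnuplot template with the data file, output file and
# image size filled in. write_gpi is a hypothetical helper name.

# write_gpi DATA_FILE PNG_FILE [WIDTH] [HEIGHT]
write_gpi() {
    cat <<EOF
set term png size ${3:-320},${4:-180}
set output "$2"
set yr [-1:1]
unset key
unset tics
unset border
set lmargin 0
set rmargin 0
set tmargin 0
set bmargin 0
set obj 1 rectangle behind from screen 0,0 to screen 1,1
set obj 1 fillstyle solid 1.0 fillcolor rgbcolor "#222222"
plot "$1" with lines lt rgb 'white'
EOF
}

# Usage (assumes sox with MP3 support and gnuplot are installed):
#   sox audio.mp3 -c 1 audio.dat            # -c 1 mixes down to mono
#   tail -n +3 audio.dat > audio_only.dat   # drop the two header lines
#   write_gpi audio_only.dat audio.png 640 120 | gnuplot
```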

Answered by Ken Fallon

You might want to consider audiowaveform from the BBC.

audiowaveform is a C++ command-line application that generates waveform data from either MP3, WAV, or FLAC format audio files. Waveform data can be used to produce a visual rendering of the audio, similar in appearance to audio editing applications.

Waveform data files are saved in either binary format (.dat) or JSON (.json). Given an input waveform data file, audiowaveform can also render the audio waveform as a PNG image at a given time offset and zoom level.

The waveform data is produced from an input stereo audio signal by first combining the left and right channels to produce a mono signal. The next stage is to compute the minimum and maximum sample values over groups of N input samples (where N is controlled by the --zoom command-line option), such that each N input samples produces one pair of minimum and maximum points in the output.
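To make the min/max reduction concrete, here is a small sketch of the idea using awk. It is not audiowaveform's own code: it assumes the samples are already mono and arrive as one decimal value per line, and `minmax_groups` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch of the min/max reduction described above, not audiowaveform's code.
# Reads one sample value per line on stdin and prints "min max" for each
# group of N consecutive samples. minmax_groups is a hypothetical helper.

# minmax_groups N
minmax_groups() {
    awk -v n="$1" '
        # Force a numeric value for the current sample.
        { v = $1 + 0 }
        # The first sample of a group initialises both extremes.
        (NR - 1) % n == 0 { min = v; max = v }
        v < min { min = v }
        v > max { max = v }
        # The last sample of a group emits the pair.
        NR % n == 0 { print min, max }
        # Emit a final, shorter group if the input is not a multiple of n.
        END { if (NR > 0 && NR % n != 0) print min, max }
    '
}

# Usage:
#   printf "1\n5\n-3\n2\n" | minmax_groups 2
```

A larger N gives fewer output pairs and therefore a more zoomed-out waveform. For the tool itself, the project README shows invocations along the lines of `audiowaveform -i test.mp3 -o test.dat -z 256 -b 8` to produce waveform data and `audiowaveform -i test.dat -o test.png -z 512` to render it; check `audiowaveform --help` for the options your version supports.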

https://github.com/bbcrd/audiowaveform
