Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original: StackOverflow, http://stackoverflow.com/questions/16075233/

Time: 2020-08-27 20:03:35  Source: igfitidea

Reading and processing WAV file data in C/C++

Tags: c++, c, voice, voice-recognition

Asked by Luxk

I'm currently doing a very, very important school project. I need to extract the information from a WAVE file in C/C++ and use it to obtain the LPC of a voice signal. To do that, I first need to pre-process the signal with zero-crossing and energy analysis, among other things, which means I need each sample's sign and real value. The problem is that I don't know how to obtain useful information in the correct format for that. I have already read every single field in the file, but I'm not sure I am doing it right. Suggestions, please?

This is the way I read the file at the moment:

readI = fread(&bps, 1, 2, audio); printf("bits per sample = %d \n", bps);

Thanks in advance.

Answered by

My first recommendation would be to use some kind of library to help you out. Most audio libraries seem like overkill for this, so a simple one (like libsndfile, which was recommended in the comments on your question) should do the trick.

If you just want to know how to read WAV files so you can write your own reader (since your school might turn its nose up at having you use a library like any other regular person), a quick Google search will give you all the info you need, plus tutorials from people who have already written about reading the .wav format.

If you still don't get it, here's some of my own code where I read the header and all other chunks of the WAV/RIFF data file until I get to the data chunk. It's based exclusively on the WAV Format Specification. Extracting the actual sound data is not very hard: you can either read it raw and use it raw, or convert it to a format you're more comfortable with internally (32-bit uncompressed PCM data or something similar).
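As a rough illustration of the conversion step mentioned above (this is not the answerer's code), the sketch below decodes raw sample bytes into signed floats and computes the zero-crossing rate and energy the question asks about. It assumes 16-bit little-endian mono PCM; the names DecodePcm16, ZeroCrossingRate, and FrameEnergy are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decode 16-bit little-endian PCM bytes into floats in [-1, 1).
// Assumes mono data; with several channels the samples would be interleaved.
std::vector<float> DecodePcm16(const std::vector<std::uint8_t>& bytes) {
    std::vector<float> samples;
    samples.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        // Low byte first, then map the unsigned 16-bit pattern to its signed value.
        const int raw = bytes[i] | (bytes[i + 1] << 8);
        const int value = raw >= 0x8000 ? raw - 0x10000 : raw;
        samples.push_back(static_cast<float>(value) / 32768.0f);
    }
    return samples;
}

// Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative).
double ZeroCrossingRate(const std::vector<float>& frame) {
    if (frame.size() < 2) return 0.0;
    std::size_t crossings = 0;
    for (std::size_t i = 1; i < frame.size(); ++i) {
        if ((frame[i - 1] < 0.0f) != (frame[i] < 0.0f)) ++crossings;
    }
    return static_cast<double>(crossings) / static_cast<double>(frame.size() - 1);
}

// Short-time energy: the sum of squared samples in the frame.
double FrameEnergy(const std::vector<float>& frame) {
    double energy = 0.0;
    for (float s : frame) energy += static_cast<double>(s) * s;
    return energy;
}
```

In practice these two measures are usually computed per short frame (a few hundred samples) rather than over the whole file, so the caller would slice the decoded vector before calling them.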

When looking at the code below, replace reader.Read...( ... ) with equivalent fread calls for integer values and byte sizes of the indicated type. WavChunks is an enum holding the little-endian values of the IDs inside a WAV file chunk, and the format variable is one of the Wav Format Types that can be contained in the WAV file format:

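// Chunk IDs as they appear when the four ASCII bytes are read as a
// little-endian 32-bit integer (first character in the lowest byte).
enum class WavChunks {
    RiffHeader = 0x46464952,      // "RIFF"
    WavRiff = 0x45564157,         // "WAVE"
    Format = 0x20746D66,          // "fmt "
    LabeledText = 0x7478746C,     // "ltxt"
    Instrumentation = 0x74736E69, // "inst"
    Sample = 0x6C706D73,          // "smpl"
    Fact = 0x74636166,            // "fact"
    Data = 0x61746164,            // "data"
    Junk = 0x4B4E554A,            // "JUNK"
};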
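To show where values such as RiffHeader and Data come from, here is a small sketch of a hypothetical FourCC helper (not part of the answer's code). It packs a four-character chunk ID into the integer you get by reading those four bytes as a little-endian 32-bit value.

```cpp
#include <cstdint>

// Hypothetical helper: the first character ends up in the lowest byte,
// which is what a little-endian 32-bit read of the ID bytes produces.
constexpr std::uint32_t FourCC(const char (&id)[5]) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(id[0]))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(id[1])) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(id[2])) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(id[3])) << 24);
}

// Compile-time checks against IDs used in the enum.
static_assert(FourCC("RIFF") == 0x46464952u, "RIFF header ID");
static_assert(FourCC("fmt ") == 0x20746D66u, "format ID ends with a space");
static_assert(FourCC("data") == 0x61746164u, "data chunk ID");
```

Deriving the constants this way, instead of typing the hex digits by hand, makes transposed digits much less likely.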

enum class WavFormat {
    PulseCodeModulation = 0x01,
    IEEEFloatingPoint = 0x03,
    ALaw = 0x06,
    MuLaw = 0x07,
    IMAADPCM = 0x11,
    YamahaITUG723ADPCM = 0x14,
    GSM610 = 0x31,
    ITUG721ADPCM = 0x40,
    MPEG = 0x50,
    Extensible = 0xFFFE
};

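int32 chunkid = 0;
bool datachunk = false;
// Walk the chunks in file order until the data chunk is found.
// Note: a file without a data chunk never leaves this loop; real code should also stop at end-of-file.
while ( !datachunk ) {
    chunkid = reader.ReadInt32( );
    switch ( (WavChunks)chunkid ) {
    case WavChunks::Format:
        formatsize = reader.ReadInt32( );          // size of the format chunk body
        format = (WavFormat)reader.ReadInt16( );
        channels = (Channels)reader.ReadInt16( );
        channelcount = (int)channels;
        samplerate = reader.ReadInt32( );
        bitspersecond = reader.ReadInt32( );       // despite the name, this field is the average byte rate
        formatblockalign = reader.ReadInt16( );    // bytes per sample frame, all channels together
        bitdepth = reader.ReadInt16( );            // bits per sample
        if ( formatsize > 16 ) {
            // Skip everything after the 16 basic bytes (extension size field and extension data).
            reader.Seek( formatsize - 16, SeekOrigin::Current );
        }
        break;
    case WavChunks::RiffHeader:
        headerid = chunkid;
        memsize = reader.ReadInt32( );             // size of the rest of the file
        riffstyle = reader.ReadInt32( );           // form type, WavChunks::WavRiff for WAV files
        break;
    case WavChunks::Data:
        datachunk = true;
        datasize = reader.ReadInt32( );            // sample bytes start right after this field
        break;
    default:
        int32 skipsize = reader.ReadInt32( );
        // RIFF chunks are padded to an even length, so skip the pad byte too.
        reader.Seek( skipsize + ( skipsize & 1 ), SeekOrigin::Current );
        break;
    }
}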
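For readers without the reader class, the loop above can be approximated with plain fread calls. The following is a minimal sketch under stated assumptions, not the answerer's implementation: ReadU16LE, ReadU32LE, WavInfo, and FindDataChunk are hypothetical names, and the file is assumed to start with a RIFF header followed by the WAVE form type.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for reader.ReadInt16/ReadInt32: assemble little-endian
// integers byte by byte so the result does not depend on host endianness.
static bool ReadU16LE(std::FILE* f, std::uint16_t& out) {
    unsigned char b[2];
    if (std::fread(b, 1, 2, f) != 2) return false;
    out = static_cast<std::uint16_t>(b[0] | (b[1] << 8));
    return true;
}

static bool ReadU32LE(std::FILE* f, std::uint32_t& out) {
    unsigned char b[4];
    if (std::fread(b, 1, 4, f) != 4) return false;
    out = static_cast<std::uint32_t>(b[0])
        | (static_cast<std::uint32_t>(b[1]) << 8)
        | (static_cast<std::uint32_t>(b[2]) << 16)
        | (static_cast<std::uint32_t>(b[3]) << 24);
    return true;
}

struct WavInfo {  // hypothetical container for the parsed fields
    std::uint16_t format = 0, channels = 0, blockAlign = 0, bitsPerSample = 0;
    std::uint32_t sampleRate = 0, byteRate = 0, dataSize = 0;
};

// Walks the chunks until "data". On success the file position is at the
// first sample byte and info.dataSize holds the number of sample bytes.
bool FindDataChunk(std::FILE* f, WavInfo& info) {
    std::uint32_t id = 0, size = 0, formType = 0;
    if (!ReadU32LE(f, id) || id != 0x46464952u) return false;         // "RIFF"
    if (!ReadU32LE(f, size) || !ReadU32LE(f, formType)) return false;
    if (formType != 0x45564157u) return false;                        // "WAVE"
    while (ReadU32LE(f, id) && ReadU32LE(f, size)) {
        if (id == 0x61746164u) {                                       // "data"
            info.dataSize = size;
            return true;
        }
        long skip = static_cast<long>(size + (size & 1u));             // chunks are padded to even length
        if (id == 0x20746D66u && size >= 16) {                         // "fmt "
            if (!ReadU16LE(f, info.format) || !ReadU16LE(f, info.channels) ||
                !ReadU32LE(f, info.sampleRate) || !ReadU32LE(f, info.byteRate) ||
                !ReadU16LE(f, info.blockAlign) || !ReadU16LE(f, info.bitsPerSample))
                return false;
            skip -= 16;                                                // only extension bytes remain
        }
        if (skip > 0 && std::fseek(f, skip, SEEK_CUR) != 0) return false;
    }
    return false;  // end of file reached without a data chunk
}
```

After a successful call, something like fread(buffer, 1, info.dataSize, f) would pull in the raw samples; how to interpret them depends on info.format and info.bitsPerSample.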