Objective-C: How to use VideoToolbox to decompress an H.264 video stream

Note: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you use or share it, you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/29525000/


How to use VideoToolbox to decompress H.264 video stream

Tags: objective-c, ios8, h.264, video-toolbox

Asked by Olivia Stork

I had a lot of trouble figuring out how to use Apple's Hardware accelerated video framework to decompress an H.264 video stream. After a few weeks I figured it out and wanted to share an extensive example since I couldn't find one.

My goal is to give a thorough, instructive example of Video Toolbox, introduced in WWDC '14 session 513. My code will not compile or run since it needs to be integrated with an elementary H.264 stream (like a video read from a file or streamed online, etc.) and needs to be tweaked depending on the specific case.

I should mention that I have very little experience with video en/decoding except what I learned while googling the subject. I don't know all the details about video formats, parameter structure etc. so I've only included what I think you need to know.

I am using XCode 6.2 and have deployed to iOS devices that are running iOS 8.1 and 8.2.

Answered by Olivia Stork

Concepts:

NALUs: NALUs are simply chunks of data of varying length that have a NALU start code header 0x00 00 00 01 YY, where the first 5 bits of YY tell you what type of NALU this is and therefore what type of data follows the header. (Since you only need the first 5 bits, I use YY & 0x1F to just get the relevant bits.) I list what all these types are in the array NSString * const naluTypesStrings[], but you don't need to know what they all are.

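For example (a minimal sketch, not from the original answer), pulling the type out of a NALU that begins with a 4-byte start code could look like this; frame and naluTypesStrings are the same names used in the code further down:

// assumes frame points at 0x00 00 00 01 YY ...
uint8_t naluHeaderByte = frame[4];        // the "YY" byte right after the start code
int naluType = naluHeaderByte & 0x1F;     // the lower 5 bits are the NALU type
NSLog(@"NALU type %d: %@", naluType, naluTypesStrings[naluType]);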

Parameters: Your decoder needs parameters so it knows how the H.264 video data is stored. The 2 you need to set are the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS), and they each have their own NALU type number. You don't need to know what the parameters mean; the decoder knows what to do with them.

H.264 Stream Format: In most H.264 streams, you will receive an initial set of SPS and PPS parameters followed by an i frame (aka IDR frame or flush frame) NALU. Then you will receive several P frame NALUs (maybe a few dozen or so), then another set of parameters (which may be the same as the initial parameters) and an i frame, more P frames, etc. i frames are much bigger than P frames. Conceptually you can think of the i frame as an entire image of the video, and the P frames are just the changes that have been made to that i frame, until you receive the next i frame.

Procedure:

  1. Generate individual NALUs from your H.264 stream. I cannot show code for this step since it depends a lot on what video source you're using. I made this graphic to show what I was working with ("data" in the graphic is "frame" in my following code), but your case may and probably will differ. (A hedged sketch of one way to split a raw Annex B buffer into NALUs appears after this list.) My method receivedRawVideoFrame: is called every time I receive a frame (uint8_t *frame) which is one of 2 types. In the diagram, those 2 frame types are the 2 big purple boxes.

  2. Create a CMVideoFormatDescriptionRef from your SPS and PPS NALUs with CMVideoFormatDescriptionCreateFromH264ParameterSets(). You cannot display any frames without doing this first. The SPS and PPS may look like a jumble of numbers, but VTD knows what to do with them. All you need to know is that CMVideoFormatDescriptionRef is a description of video data, like width/height, format type (kCMPixelFormat_32BGRA, kCMVideoCodecType_H264, etc.), aspect ratio, color space, etc. Your decoder will hold onto the parameters until a new set arrives (sometimes parameters are resent regularly even when they haven't changed).

  3. Re-package your IDR and non-IDR frame NALUs according to the "AVCC" format. This means removing the NALU start codes and replacing them with a 4-byte header that states the length of the NALU. You don't need to do this for the SPS and PPS NALUs. (Note that the 4-byte NALU length header is in big-endian, so if you have a UInt32 value it must be byte-swapped before copying to the CMBlockBuffer using CFSwapInt32. I do this in my code with the htonl function call.)

  4. Package the IDR and non-IDR NALU frames into a CMBlockBuffer. Do not do this with the SPS PPS parameter NALUs. All you need to know about CMBlockBuffers is that they are a way to wrap arbitrary blocks of data in Core Media. (Any compressed video data in a video pipeline is wrapped in this.)

  5. Package the CMBlockBuffer into a CMSampleBuffer. All you need to know about CMSampleBuffers is that they wrap up our CMBlockBuffers with other information (here it would be the CMVideoFormatDescription and CMTime, if CMTime is used).

  6. Create a VTDecompressionSessionRef and feed the sample buffers into VTDecompressionSessionDecodeFrame(). Alternatively, you can use AVSampleBufferDisplayLayer and its enqueueSampleBuffer: method and you won't need to use VTDecompSession. It's simpler to set up, but will not throw errors if something goes wrong like VTD will.

  7. In the VTDecompSession callback, use the resultant CVImageBufferRef to display the video frame. If you need to convert your CVImageBuffer to a UIImage, see my StackOverflow answer here.

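Since step 1 depends entirely on your video source, the following is only a rough sketch (an assumption on my part, not code from the original answer) of one way to split a raw Annex B buffer into start-code-prefixed NALUs. handleNALU:length: is a hypothetical method that would hand each unit to something like receivedRawVideoFrame: below; it only looks for 4-byte start codes, so streams that use 3-byte start codes would need an extra check.

// hedged sketch: walk the buffer and cut it at every 0x00 00 00 01 start code.
// each NALU passed on still includes its own start code, matching the layout
// that receivedRawVideoFrame: below expects.
- (void)splitAnnexBBuffer:(uint8_t *)buffer length:(size_t)length
{
    size_t naluStart = 0;
    for (size_t i = 4; i + 3 < length; i++)
    {
        if (buffer[i] == 0x00 && buffer[i+1] == 0x00 &&
            buffer[i+2] == 0x00 && buffer[i+3] == 0x01)
        {
            // everything from the previous start code up to here is one NALU
            [self handleNALU:&buffer[naluStart] length:(i - naluStart)];   // hypothetical method
            naluStart = i;
        }
    }
    // the last NALU runs to the end of the buffer
    [self handleNALU:&buffer[naluStart] length:(length - naluStart)];
}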

Other notes:

  • H.264 streams can vary a lot. From what I learned, NALU start code headers are sometimes 3 bytes (0x00 00 01) and sometimes 4 (0x00 00 00 01). My code works for 4 bytes; you will need to change a few things around if you're working with 3.

  • If you want to know more about NALUs, I found this answer to be very helpful. In my case, I found that I didn't need to ignore the "emulation prevention" bytes as described, so I personally skipped that step, but you may need to know about that. (A hedged sketch of stripping those bytes follows these notes.)

  • If your VTDecompressionSession outputs an error number (like -12909), look up the error code in your XCode project. Find the VideoToolbox framework in your project navigator, open it and find the header VTErrors.h. If you can't find it, I've also included all the error codes below in another answer.

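If it turns out you do need to deal with emulation prevention bytes (I didn't, and this is not part of the original answer), the idea is that the encoder inserts 0x03 after every 0x00 0x00 pair inside a NALU payload, and a parser that reads the raw payload (for example to inspect the SPS yourself) has to remove it again. A minimal sketch, assuming dst has at least length bytes of room:

// hedged sketch: copy src to dst while dropping the 0x03 that follows any 0x00 0x00 pair
static size_t stripEmulationPreventionBytes(const uint8_t *src, size_t length, uint8_t *dst)
{
    size_t zeroCount = 0;
    size_t outLength = 0;
    for (size_t i = 0; i < length; i++)
    {
        if (zeroCount >= 2 && src[i] == 0x03)
        {
            zeroCount = 0;          // skip the emulation prevention byte
            continue;
        }
        zeroCount = (src[i] == 0x00) ? zeroCount + 1 : 0;
        dst[outLength++] = src[i];
    }
    return outLength;
}

As far as I know, you only need this if you parse NALU payloads yourself; the NALUs handed to VTD in the code below keep their emulation prevention bytes as-is.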

Code Example:

So let's start by declaring some global variables and including the VT framework (VT = Video Toolbox).

#import <VideoToolbox/VideoToolbox.h>

@property (nonatomic, assign) CMVideoFormatDescriptionRef formatDesc;
@property (nonatomic, assign) VTDecompressionSessionRef decompressionSession;
@property (nonatomic, retain) AVSampleBufferDisplayLayer *videoLayer;
@property (nonatomic, assign) int spsSize;
@property (nonatomic, assign) int ppsSize;

The following array is only used so that you can print out what type of NALU frame you are receiving. If you know what all these types mean, good for you, you know more about H.264 than me :) My code only handles types 1, 5, 7 and 8.

NSString * const naluTypesStrings[] =
{
    @"0: Unspecified (non-VCL)",
    @"1: Coded slice of a non-IDR picture (VCL)",    // P frame
    @"2: Coded slice data partition A (VCL)",
    @"3: Coded slice data partition B (VCL)",
    @"4: Coded slice data partition C (VCL)",
    @"5: Coded slice of an IDR picture (VCL)",      // I frame
    @"6: Supplemental enhancement information (SEI) (non-VCL)",
    @"7: Sequence parameter set (non-VCL)",         // SPS parameter
    @"8: Picture parameter set (non-VCL)",          // PPS parameter
    @"9: Access unit delimiter (non-VCL)",
    @"10: End of sequence (non-VCL)",
    @"11: End of stream (non-VCL)",
    @"12: Filler data (non-VCL)",
    @"13: Sequence parameter set extension (non-VCL)",
    @"14: Prefix NAL unit (non-VCL)",
    @"15: Subset sequence parameter set (non-VCL)",
    @"16: Reserved (non-VCL)",
    @"17: Reserved (non-VCL)",
    @"18: Reserved (non-VCL)",
    @"19: Coded slice of an auxiliary coded picture without partitioning (non-VCL)",
    @"20: Coded slice extension (non-VCL)",
    @"21: Coded slice extension for depth view components (non-VCL)",
    @"22: Reserved (non-VCL)",
    @"23: Reserved (non-VCL)",
    @"24: STAP-A Single-time aggregation packet (non-VCL)",
    @"25: STAP-B Single-time aggregation packet (non-VCL)",
    @"26: MTAP16 Multi-time aggregation packet (non-VCL)",
    @"27: MTAP24 Multi-time aggregation packet (non-VCL)",
    @"28: FU-A Fragmentation unit (non-VCL)",
    @"29: FU-B Fragmentation unit (non-VCL)",
    @"30: Unspecified (non-VCL)",
    @"31: Unspecified (non-VCL)",
};

Now this is where all the magic happens.

-(void) receivedRawVideoFrame:(uint8_t *)frame withSize:(uint32_t)frameSize isIFrame:(int)isIFrame
{
    OSStatus status = noErr;

    uint8_t *data = NULL;
    uint8_t *pps = NULL;
    uint8_t *sps = NULL;

    // I know what my H.264 data source's NALUs look like so I know start code index is always 0.
    // if you don't know where it starts, you can use a for loop similar to how I find the 2nd and 3rd start codes
    int startCodeIndex = 0;
    int secondStartCodeIndex = 0;
    int thirdStartCodeIndex = 0;

    long blockLength = 0;

    CMSampleBufferRef sampleBuffer = NULL;
    CMBlockBufferRef blockBuffer = NULL;

    int nalu_type = (frame[startCodeIndex + 4] & 0x1F);
    NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);

    // if we haven't already set up our format description with our SPS PPS parameters, we
    // can't process any frames except type 7 that has our parameters
    if (nalu_type != 7 && _formatDesc == NULL)
    {
        NSLog(@"Video error: Frame is not an I Frame and format description is null");
        return;
    }

    // NALU type 7 is the SPS parameter NALU
    if (nalu_type == 7)
    {
        // find where the second PPS start code begins, (the 0x00 00 00 01 code)
        // from which we also get the length of the first SPS code
        for (int i = startCodeIndex + 4; i < startCodeIndex + 40; i++)
        {
            if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
            {
                secondStartCodeIndex = i;
                _spsSize = secondStartCodeIndex;   // includes the header in the size
                break;
            }
        }

        // find what the second NALU type is
        nalu_type = (frame[secondStartCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    }

    // type 8 is the PPS parameter NALU
    if(nalu_type == 8)
    {
        // find where the NALU after this one starts so we know how long the PPS parameter is
        for (int i = _spsSize + 4; i < _spsSize + 30; i++)
        {
            if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
            {
                thirdStartCodeIndex = i;
                _ppsSize = thirdStartCodeIndex - _spsSize;
                break;
            }
        }

        // allocate enough data to fit the SPS and PPS parameters into our data objects.
        // VTD doesn't want you to include the start code header (4 bytes long) so we add the - 4 here
        sps = malloc(_spsSize - 4);
        pps = malloc(_ppsSize - 4);

        // copy in the actual sps and pps values, again ignoring the 4 byte header
        memcpy (sps, &frame[4], _spsSize-4);
        memcpy (pps, &frame[_spsSize+4], _ppsSize-4);

        // now we set our H264 parameters
        uint8_t*  parameterSetPointers[2] = {sps, pps};
        size_t parameterSetSizes[2] = {_spsSize-4, _ppsSize-4};

        // suggestion from @Kris Dude's answer below
        if (_formatDesc) 
        {
            CFRelease(_formatDesc);
            _formatDesc = NULL;
        }

        status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, 
                                                (const uint8_t *const*)parameterSetPointers, 
                                                parameterSetSizes, 4, 
                                                &_formatDesc);

        NSLog(@"\t\t Creation of CMVideoFormatDescription: %@", (status == noErr) ? @"successful!" : @"failed...");
        if(status != noErr) NSLog(@"\t\t Format Description ERROR type: %d", (int)status);

        // See if decomp session can convert from previous format description 
        // to the new one, if not we need to remake the decomp session.
        // This snippet was not necessary for my applications but it could be for yours
        /*BOOL needNewDecompSession = (VTDecompressionSessionCanAcceptFormatDescription(_decompressionSession, _formatDesc) == NO);
         if(needNewDecompSession)
         {
             [self createDecompSession];
         }*/

        // now lets handle the IDR frame that (should) come after the parameter sets
        // I say "should" because that's how I expect my H264 stream to work, YMMV
        nalu_type = (frame[thirdStartCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    }

    // create our VTDecompressionSession.  This isn't necessary if you choose to use AVSampleBufferDisplayLayer
    if((status == noErr) && (_decompressionSession == NULL))
    {
        [self createDecompSession];
    }

    // type 5 is an IDR frame NALU.  The SPS and PPS NALUs should always be followed by an IDR (or IFrame) NALU, as far as I know
    if(nalu_type == 5)
    {
        // find the offset, or where the SPS and PPS NALUs end and the IDR frame NALU begins
        int offset = _spsSize + _ppsSize;
        blockLength = frameSize - offset;
        data = malloc(blockLength);
        data = memcpy(data, &frame[offset], blockLength);

        // replace the start code header on this NALU with its size.
        // AVCC format requires that you do this.  
        // htonl converts the unsigned int from host to network byte order
        uint32_t dataLength32 = htonl (blockLength - 4);
        memcpy (data, &dataLength32, sizeof (uint32_t));

        // create a block buffer from the IDR NALU
        status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold buffered data
                                                    blockLength,  // block length of the mem block in bytes.
                                                    kCFAllocatorNull, NULL,
                                                    0, // offsetToData
                                                    blockLength,   // dataLength of relevant bytes, starting at offsetToData
                                                    0, &blockBuffer);

        NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
    }

    // NALU type 1 is non-IDR (or PFrame) picture
    if (nalu_type == 1)
    {
        // non-IDR frames do not have an offset due to SPS and PPS, so the approach
        // is similar to the IDR frames just without the offset
        blockLength = frameSize;
        data = malloc(blockLength);
        data = memcpy(data, &frame[0], blockLength);

        // again, replace the start header with the size of the NALU
        uint32_t dataLength32 = htonl (blockLength - 4);
        memcpy (data, &dataLength32, sizeof (uint32_t));

        status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold data. If NULL, block will be alloc when needed
                                                    blockLength,  // overall length of the mem block in bytes
                                                    kCFAllocatorNull, NULL,
                                                    0,     // offsetToData
                                                    blockLength,  // dataLength of relevant data bytes, starting at offsetToData
                                                    0, &blockBuffer);

        NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
    }

    // now create our sample buffer from the block buffer,
    if(status == noErr)
    {
        // here I'm not bothering with any timing specifics since in my case we displayed all frames immediately
        const size_t sampleSize = blockLength;
        status = CMSampleBufferCreate(kCFAllocatorDefault,
                                      blockBuffer, true, NULL, NULL,
                                      _formatDesc, 1, 0, NULL, 1,
                                      &sampleSize, &sampleBuffer);

        NSLog(@"\t\t SampleBufferCreate: \t %@", (status == noErr) ? @"successful!" : @"failed...");
    }

    if(status == noErr)
    {
        // set some values of the sample buffer's attachments
        CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
        CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);

        // either send the samplebuffer to a VTDecompressionSession or to an AVSampleBufferDisplayLayer
        [self render:sampleBuffer];
    }

    // free memory to avoid a memory leak; the sps, pps and blockBuffer allocations need releasing too
    if (NULL != data)
    {
        free (data);
        data = NULL;
    }

    if (NULL != sps)
    {
        free (sps);
        sps = NULL;
    }

    if (NULL != pps)
    {
        free (pps);
        pps = NULL;
    }

    if (NULL != blockBuffer)
    {
        CFRelease(blockBuffer);
        blockBuffer = NULL;
    }
}

The following method creates your VTD session. Recreate it whenever you receive new parameters. (You don't have to recreate it every time you receive parameters, pretty sure.)

If you want to set attributes for the destination CVPixelBuffer, read up on CoreVideo PixelBufferAttributes values and put them in NSDictionary *destinationImageBufferAttributes.

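For example, one possible dictionary (an assumption on my part; pick whatever keys your rendering path actually needs) asks for NV12 pixel buffers that are OpenGL ES compatible:

// hypothetical example of destinationImageBufferAttributes
NSDictionary *destinationImageBufferAttributes = @{
    (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange),
    (id)kCVPixelBufferOpenGLESCompatibilityKey : @YES
};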

-(void) createDecompSession
{
    // make sure to destroy the old VTD session before creating a new one
    if (_decompressionSession != NULL)
    {
        VTDecompressionSessionInvalidate(_decompressionSession);
        CFRelease(_decompressionSession);
    }
    _decompressionSession = NULL;
    VTDecompressionOutputCallbackRecord callBackRecord;
    callBackRecord.decompressionOutputCallback = decompressionSessionDecodeFrameCallback;

    // this is necessary if you need to make calls to Objective C "self" from within the callback method.
    callBackRecord.decompressionOutputRefCon = (__bridge void *)self;

    // you can set some desired attributes for the destination pixel buffer.  I didn't use this but you may
    // if you need to set some attributes, be sure to uncomment the dictionary in VTDecompressionSessionCreate
    NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                                      [NSNumber numberWithBool:YES],
                                                      (id)kCVPixelBufferOpenGLESCompatibilityKey,
                                                      nil];

    OSStatus status =  VTDecompressionSessionCreate(NULL, _formatDesc, NULL,
                                                    NULL, // (__bridge CFDictionaryRef)(destinationImageBufferAttributes)
                                                    &callBackRecord, &_decompressionSession);
    NSLog(@"Video Decompression Session Create: \t %@", (status == noErr) ? @"successful!" : @"failed...");
    if(status != noErr) NSLog(@"\t\t VTD ERROR type: %d", (int)status);
}

Now this method gets called every time VTD is done decompressing any frame you sent to it. This method gets called even if there's an error or if the frame is dropped.

void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon,
                                             void *sourceFrameRefCon,
                                             OSStatus status,
                                             VTDecodeInfoFlags infoFlags,
                                             CVImageBufferRef imageBuffer,
                                             CMTime presentationTimeStamp,
                                             CMTime presentationDuration)
{
    THISCLASSNAME *streamManager = (__bridge THISCLASSNAME *)decompressionOutputRefCon;

    if (status != noErr)
    {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"Decompressed error: %@", error);
    }
    else
    {
        NSLog(@"Decompressed sucessfully");

        // do something with your resulting CVImageBufferRef that is your decompressed frame
        [streamManager displayDecodedFrame:imageBuffer];
    }
}
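
displayDecodedFrame: is not defined anywhere in this answer. A minimal sketch of it (my assumption, not the original author's code) could go through Core Image, assuming the decoded pixel buffer is in a format CIImage can read and that the class has a hypothetical imageView property; you may need to #import <CoreImage/CoreImage.h>:

-(void) displayDecodedFrame:(CVImageBufferRef)imageBuffer
{
    // wrap the decoded pixel buffer in a CIImage, render it to a CGImage,
    // and hand the resulting UIImage to UIKit on the main thread
    CIImage *ciImage = [CIImage imageWithCVPixelBuffer:imageBuffer];
    CIContext *context = [CIContext contextWithOptions:nil];
    CGImageRef cgImage = [context createCGImage:ciImage
                                       fromRect:CGRectMake(0, 0,
                                                           CVPixelBufferGetWidth(imageBuffer),
                                                           CVPixelBufferGetHeight(imageBuffer))];
    UIImage *uiImage = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);

    dispatch_async(dispatch_get_main_queue(), ^{
        self.imageView.image = uiImage;   // hypothetical UIImageView property
    });
}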

This is where we actually send the sampleBuffer off to the VTD to be decoded.

- (void) render:(CMSampleBufferRef)sampleBuffer
{
    VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
    VTDecodeInfoFlags flagOut;
    NSDate* currentTime = [NSDate date];
    VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
                                      (void*)CFBridgingRetain(currentTime), &flagOut);

    CFRelease(sampleBuffer);

    // if you're using AVSampleBufferDisplayLayer, you only need to use this line of code
    // [videoLayer enqueueSampleBuffer:sampleBuffer];
}

If you're using AVSampleBufferDisplayLayer, be sure to init the layer like this, in viewDidLoad or inside some other init method.

-(void) viewDidLoad
{
    // create our AVSampleBufferDisplayLayer and add it to the view
    videoLayer = [[AVSampleBufferDisplayLayer alloc] init];
    videoLayer.frame = self.view.frame;
    videoLayer.bounds = self.view.bounds;
    videoLayer.videoGravity = AVLayerVideoGravityResizeAspect;

    // set Timebase, you may need this if you need to display frames at specific times
    // I didn't need it so I haven't verified that the timebase is working
    CMTimebaseRef controlTimebase;
    CMTimebaseCreateWithMasterClock(CFAllocatorGetDefault(), CMClockGetHostTimeClock(), &controlTimebase);

    //videoLayer.controlTimebase = controlTimebase;
    CMTimebaseSetTime(self.videoLayer.controlTimebase, kCMTimeZero);
    CMTimebaseSetRate(self.videoLayer.controlTimebase, 1.0);

    [[self.view layer] addSublayer:videoLayer];
}

Answered by Olivia Stork

If you can't find the VTD error codes in the framework, I decided to just include them here. (Again, all these errors and more can be found inside the VideoToolbox.framework itself in the project navigator, in the file VTErrors.h.)

You will get one of these error codes either in the VTD decode frame callback or when you create your VTD session, if you did something incorrectly.

kVTPropertyNotSupportedErr              = -12900,
kVTPropertyReadOnlyErr                  = -12901,
kVTParameterErr                         = -12902,
kVTInvalidSessionErr                    = -12903,
kVTAllocationFailedErr                  = -12904,
kVTPixelTransferNotSupportedErr         = -12905, // c.f. -8961
kVTCouldNotFindVideoDecoderErr          = -12906,
kVTCouldNotCreateInstanceErr            = -12907,
kVTCouldNotFindVideoEncoderErr          = -12908,
kVTVideoDecoderBadDataErr               = -12909, // c.f. -8969
kVTVideoDecoderUnsupportedDataFormatErr = -12910, // c.f. -8970
kVTVideoDecoderMalfunctionErr           = -12911, // c.f. -8960
kVTVideoEncoderMalfunctionErr           = -12912,
kVTVideoDecoderNotAvailableNowErr       = -12913,
kVTImageRotationNotSupportedErr         = -12914,
kVTVideoEncoderNotAvailableNowErr       = -12915,
kVTFormatDescriptionChangeNotSupportedErr   = -12916,
kVTInsufficientSourceColorDataErr       = -12917,
kVTCouldNotCreateColorCorrectionDataErr = -12918,
kVTColorSyncTransformConvertFailedErr   = -12919,
kVTVideoDecoderAuthorizationErr         = -12210,
kVTVideoEncoderAuthorizationErr         = -12211,
kVTColorCorrectionPixelTransferFailedErr    = -12212,
kVTMultiPassStorageIdentifierMismatchErr    = -12213,
kVTMultiPassStorageInvalidErr           = -12214,
kVTFrameSiloInvalidTimeStampErr         = -12215,
kVTFrameSiloInvalidTimeRangeErr         = -12216,
kVTCouldNotFindTemporalFilterErr        = -12217,
kVTPixelTransferNotPermittedErr         = -12218,

Answered by leppert

A good Swift example of much of this can be found in Josh Baker's Avios library: https://github.com/tidwall/Avios

Note that Avios currently expects the user to handle chunking data at NAL start codes, but does handle decoding the data from that point forward.

Also worth a look is the Swift based RTMP library HaishinKit (formerly "LF"), which has its own decoding implementation, including more robust NALU parsing: https://github.com/shogo4405/lf.swift

Answered by Jetdog

In addition to the VTErrors above, I thought it worth adding the CMFormatDescription, CMBlockBuffer, and CMSampleBuffer errors that you may encounter while trying Livy's example.

kCMFormatDescriptionError_InvalidParameter  = -12710,
kCMFormatDescriptionError_AllocationFailed  = -12711,
kCMFormatDescriptionError_ValueNotAvailable = -12718,

kCMBlockBufferNoErr                             = 0,
kCMBlockBufferStructureAllocationFailedErr      = -12700,
kCMBlockBufferBlockAllocationFailedErr          = -12701,
kCMBlockBufferBadCustomBlockSourceErr           = -12702,
kCMBlockBufferBadOffsetParameterErr             = -12703,
kCMBlockBufferBadLengthParameterErr             = -12704,
kCMBlockBufferBadPointerParameterErr            = -12705,
kCMBlockBufferEmptyBBufErr                      = -12706,
kCMBlockBufferUnallocatedBlockErr               = -12707,
kCMBlockBufferInsufficientSpaceErr              = -12708,

kCMSampleBufferError_AllocationFailed             = -12730,
kCMSampleBufferError_RequiredParameterMissing     = -12731,
kCMSampleBufferError_AlreadyHasDataBuffer         = -12732,
kCMSampleBufferError_BufferNotReady               = -12733,
kCMSampleBufferError_SampleIndexOutOfRange        = -12734,
kCMSampleBufferError_BufferHasNoSampleSizes       = -12735,
kCMSampleBufferError_BufferHasNoSampleTimingInfo  = -12736,
kCMSampleBufferError_ArrayTooSmall                = -12737,
kCMSampleBufferError_InvalidEntryCount            = -12738,
kCMSampleBufferError_CannotSubdivide              = -12739,
kCMSampleBufferError_SampleTimingInfoInvalid      = -12740,
kCMSampleBufferError_InvalidMediaTypeForOperation = -12741,
kCMSampleBufferError_InvalidSampleData            = -12742,
kCMSampleBufferError_InvalidMediaFormat           = -12743,
kCMSampleBufferError_Invalidated                  = -12744,
kCMSampleBufferError_DataFailed                   = -16750,
kCMSampleBufferError_DataCanceled                 = -16751,

Answered by Kris Dude

@Livy, to remove memory leaks before CMVideoFormatDescriptionCreateFromH264ParameterSets you should add the following:

if (_formatDesc) {
    CFRelease(_formatDesc);
    _formatDesc = NULL;
}