C# HttpWebRequest and Native GZip Compression

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/839888/

Time: 2020-08-05 03:38:49  Source: igfitidea

HttpWebRequest & Native GZip Compression

Tags: c#, .net, stream, gzip, http-compression

Asked by Pat

When requesting a page with Gzip compression I am getting a lot of the following errors:

System.IO.InvalidDataException: The CRC in GZip footer does not match the CRC calculated from the decompressed data

I am using the native GZipStream to decompress and am looking at how to address this. With that in mind, is there a workaround, or another GZip library (free?) that handles this issue properly?

I am verifying that the webResponse ContentEncoding is GZIP.

Update 5/11: A simplified snippet

//Caller
public void SOSampleGet(string url) 
{
    // Initialize the WebRequest.
    webRequest = (HttpWebRequest)WebRequest.Create(url);
    webRequest.Method = WebRequestMethods.Http.Get;
    webRequest.KeepAlive = true;
    webRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    webRequest.Headers.Add("Accept-Encoding", "gzip,deflate");
    webRequest.Referer = WebUtil.GetDomain(url);

    HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse();    

    using (Stream stream = GetStreamForResponse(webResponse, READTIMEOUT_CONST))
    {
        //use stream
    }
}

//Method
private static Stream GetStreamForResponse(HttpWebResponse webResponse, int readTimeOut)
{
    Stream stream;
    switch (webResponse.ContentEncoding.ToUpperInvariant())
    {
        case "GZIP":
            stream = new GZipStream(webResponse.GetResponseStream(), CompressionMode.Decompress);
            break;
        case "DEFLATE":
            stream = new DeflateStream(webResponse.GetResponseStream(), CompressionMode.Decompress);
            break;

        default:
            stream = webResponse.GetResponseStream();
            stream.ReadTimeout = readTimeOut;
            break;
    }
    return stream;
}
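
A minimal sketch of the same selection logic, written against a plain Stream so it can be tried without a network call. The class name `ResponseStreams` and the method `WrapForEncoding` are hypothetical and not part of the question. In the snippet above the read timeout is only applied in the default branch; in real code one would presumably set `ReadTimeout` on the stream returned by `GetResponseStream()` before wrapping it, so that all three branches are covered.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ResponseStreams
{
    // Hypothetical helper (not from the original post): wraps an already-open
    // body stream according to a Content-Encoding value such as "gzip".
    public static Stream WrapForEncoding(Stream raw, string contentEncoding)
    {
        if (raw == null) throw new ArgumentNullException(nameof(raw));

        // A missing header is treated like "no compression".
        switch ((contentEncoding ?? string.Empty).Trim().ToUpperInvariant())
        {
            case "GZIP":
                return new GZipStream(raw, CompressionMode.Decompress);
            case "DEFLATE":
                return new DeflateStream(raw, CompressionMode.Decompress);
            default:
                // Unknown or empty encoding: hand back the stream unchanged.
                return raw;
        }
    }
}
```

With a real response this would be called as `ResponseStreams.WrapForEncoding(webResponse.GetResponseStream(), webResponse.ContentEncoding)`, inside a using block so the wrapper and the underlying stream are both disposed.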

Answered by Andomar

The native GZipStream can read a compressed GZIP (RFC 1952) stream, but it can't handle the ZIP file format.

From http://www.geekpedia.com/tutorial190_Zipping-files-using-GZipStream.html:

The disadvantage of using the GZipStream class over a 3rd party product is that it has limited capabilities. One of the limitations is that you cannot give a name to the file that you place in the archive. When GZipStream compresses the file into a ZIP archive, it takes the sequence of bytes from that file and uses compression algorithms that create a smaller sequence of bytes. The new sequence of bytes is put into the new ZIP file. When you open the ZIP file you will open the archived file itself; most popular ZIP extractors (WinZip, WinRar, etc.) will show you the content of the ZIP as a file that has the same name as the archive itself.

EDIT: The above note is incorrect. GZipStream does not produce a ZIP file. It is not a "Single file ZIP stream". It is a GZIP Stream. They are different things. There's no guarantee that tools that handle ZIP archives will handle a .gz file.
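
To make the GZIP/ZIP distinction concrete, here is a small sketch that tells the two formats apart by their leading bytes. The class `ArchiveSniffer` and its method are hypothetical names for illustration. The GZIP magic bytes (0x1F 0x8B) come from RFC 1952; the ZIP check only covers the common local-file-header signature ("PK" 0x03 0x04), so for example an empty ZIP archive would be reported as unknown.

```csharp
using System;

public static class ArchiveSniffer
{
    // Hypothetical helper: classifies a buffer by its first bytes.
    // Returns "gzip", "zip", or "unknown".
    public static string Identify(byte[] header)
    {
        if (header == null) return "unknown";

        // GZIP member header: ID1 = 0x1F, ID2 = 0x8B (RFC 1952).
        if (header.Length >= 2 && header[0] == 0x1F && header[1] == 0x8B)
            return "gzip";

        // ZIP local file header signature: 'P' 'K' 0x03 0x04.
        if (header.Length >= 4 && header[0] == 0x50 && header[1] == 0x4B
            && header[2] == 0x03 && header[3] == 0x04)
            return "zip";

        return "unknown";
    }
}
```

A check like this can help rule out the case where a server actually sends a ZIP archive while the client tries to read it with GZipStream.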

For an implementation that can read ZIP archives, as opposed to GZIP streams, try #ziplib (SharpZipLib, formerly NZipLib).

Answered by MichaelICE

See my comment above, but this usually is a symptom of a corrupted file. If the site is your own, replace the file you are trying to access.

Answered by Matthew Whited

Are you flushing and closing the stream? Try wrapping your GZipStream in a using statement.
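
As a rough illustration of that advice, the sketch below compresses and decompresses an in-memory buffer with every stream wrapped in a using block. The class and method names are made up for this example. The point it tries to show: the compressing GZipStream only finishes its output, including the GZIP footer with the CRC, when it is closed, so the bytes are read from the MemoryStream after the inner using block ends.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GzipRoundTrip
{
    // Compresses data into a complete GZIP stream held in memory.
    public static byte[] Compress(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            // Disposing the GZipStream flushes pending data and writes the footer.
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return buffer.ToArray();
        }
    }

    // Decompresses a complete GZIP stream held in memory.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Calling `GzipRoundTrip.Decompress(GzipRoundTrip.Compress(bytes))` should give back the original bytes. Taking the buffer contents before the compressing stream is disposed is one way to end up with a truncated or invalid stream.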

Answered by Mike L

I found some sample code that shows the entire request/response for GZip encoded pages. It uses GZipStream.

http://www.know24.net/blog/Decompress+GZip+Deflate+HTTP+Responses.aspx

Answered by Eugene

What about the HttpWebRequest AutomaticDecompression property, available since .NET 2.0? Simply add:

webRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

It also adds gzip,deflate to the Accept-Encoding header.

See http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.automaticdecompression.aspx
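
Put together, a request using this property might look like the sketch below. The class `AutoDecompressSample` and its methods are illustrative names, and the URL is supplied by the caller. With the property set, the stream returned by GetResponseStream() is already decompressed, so it should not be wrapped in a GZipStream again. Note that on recent .NET versions WebRequest is marked obsolete and compiles with a warning.

```csharp
using System;
using System.IO;
using System.Net;

public static class AutoDecompressSample
{
    // Builds a GET request that asks the framework to handle gzip/deflate itself.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = WebRequestMethods.Http.Get;
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Sends the request and returns the (already decompressed) body as text.
    public static string Fetch(string url)
    {
        var request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```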

Answered by pimbrouwers

For .NET Core, things are a little more involved. A GZipStream is needed, as there isn't a property (as of writing) for AutomaticDecompression. See my answer here: https://stackoverflow.com/a/44508724/2421277

Code from answer:

// Assumes this runs inside an async method and that `uri` is defined by the caller.
var req = WebRequest.CreateHttp(uri);

/*
 * Headers
 */
req.Headers[HttpRequestHeader.AcceptEncoding] = "gzip, deflate";

/*
 * Execute
 */
try
{
    using (var resp = await req.GetResponseAsync())
    {
        // Note: this assumes the server answered with gzip; a deflate or
        // uncompressed body would need a different (or no) wrapper stream.
        using (var str = resp.GetResponseStream())
        using (var gsr = new GZipStream(str, CompressionMode.Decompress))
        using (var sr = new StreamReader(gsr))
        {
            string s = await sr.ReadToEndAsync();
        }
    }
}
catch (WebException ex)
{
    // ex.Response is null when no response was received (e.g. a connection failure).
    if (ex.Response == null) throw;

    using (HttpWebResponse response = (HttpWebResponse)ex.Response)
    {
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            string respStr = sr.ReadToEnd();
            int statusCode = (int)response.StatusCode;

            string errorMsg = $"Request ({uri}) failed ({statusCode}) with error: {respStr}";
        }
    }
}
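
As an alternative to wrapping the stream by hand, and a different technique from the WebRequest code in this answer, HttpClientHandler also exposes an AutomaticDecompression property. The sketch below assumes that switching to HttpClient is acceptable; the class `HttpClientGzipSample` and its method names are illustrative only.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class HttpClientGzipSample
{
    // Creates a handler that requests and transparently decodes gzip/deflate.
    public static HttpClientHandler CreateHandler()
    {
        return new HttpClientHandler
        {
            AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
    }

    // Fetches the body as text; the content arrives already decompressed.
    public static async Task<string> FetchAsync(string url)
    {
        using (var handler = CreateHandler())
        using (var client = new HttpClient(handler))
        {
            return await client.GetStringAsync(url);
        }
    }
}
```

In a long-running application one would normally keep a single HttpClient around rather than creating one per call; the per-call client here is only to keep the sketch short.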