Extracting files from a Zip archive programmatically using C# and System.IO.Packaging

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/507751/

Time: 2020-08-04 06:08:40  Source: igfitidea

Tags: c#, zip, system.io.packaging

Asked by Craig

I have a bunch of ZIP files that are in desperate need of some hierarchical reorganization and extraction. What I can do, currently, is create the directory structure and move the zip files to the proper location. The mystic cheese that I am missing is the part that extracts the files from the ZIP archive.

I have seen the MSDN articles on the ZipArchive class and understand them reasonably well. I have also seen the VBScript ways to extract. This is not a complex class, so extracting stuff should be pretty simple. In fact, it works "mostly". I have included my current code below for reference.

using (ZipPackage package = (ZipPackage)Package.Open(@"..\..\test.zip", FileMode.Open, FileAccess.Read))
{
    PackagePartCollection packageParts = package.GetParts();
    // GetParts() yields PackagePart objects, not PackageRelationship
    foreach (PackagePart part in packageParts)
    {
        // Do stuff, but execution never gets here since packageParts is empty.
    }
}

The problem seems to be somewhere in GetParts (or GetAnything, for that matter). It seems that the package, while open, is empty. Digging deeper, the debugger shows that the private member _zipArchive actually has parts. Parts with the right names and everything. Why won't the GetParts function retrieve them? I've tried casting the opened package to a ZipArchive and that didn't help. Grrr.

Accepted answer by Cheeso

If you are manipulating ZIP files, you may want to look into a 3rd-party library to help you.

For example, DotNetZip, which has been recently updated. The current version is now v1.8. Here's an example to create a zip:

using (ZipFile zip = new ZipFile())
{
  // verbatim strings (@"...") keep the backslashes from being read as escape sequences
  zip.AddFile(@"c:\photos\personal\7440-N49th.png");
  zip.AddFile(@"c:\Desktop\2005_Annual_Report.pdf");
  zip.AddFile("ReadMe.txt");

  zip.Save("Archive.zip");
}

Here's an example to update an existing zip; you don't need to extract the files to do it:

using (ZipFile zip = ZipFile.Read("ExistingArchive.zip"))
{
  // 1. remove an entry, given the name
  zip.RemoveEntry("README.txt");

  // 2. Update an existing entry, with content from the filesystem
  zip.UpdateItem("Portfolio.doc");

  // 3. modify the filename of an existing entry 
  // (rename it and move it to a sub directory)
  ZipEntry e = zip["Table1.jpg"];
  e.FileName = "images/Figure1.jpg";

  // 4. insert or modify the comment on the zip archive
  zip.Comment = "This zip archive was updated " + System.DateTime.Now.ToString("G");

  // 5. finally, save the modified archive
  zip.Save();
}

Here's an example that extracts entries:

using (ZipFile zip = ZipFile.Read("ExistingZipFile.zip"))
{
  foreach (ZipEntry e in zip)
  {
    e.Extract(TargetDirectory, true);  // true => overwrite existing files
  }
}

DotNetZip supports multi-byte chars in filenames, Zip encryption, AES encryption, streams, Unicode, self-extracting archives. Also does ZIP64, for file lengths greater than 0xFFFFFFFF, or for archives with more than 65535 entries.

Free. Open source.

Get it at CodePlex or by direct download from windows.net - CodePlex has been discontinued and archived.

Answer by jro

From MSDN,

In this sample, the Package class is used (as opposed to the ZipPackage.) Having worked with both, I've only seen flakiness happen when there's corruption in the zip file. Not necessarily corruption that throws the Windows extractor or Winzip, but something that the Packaging components have trouble handling.

Hope this helps, maybe it can provide you an alternative to debugging the issue.

using System;
using System.IO;
using System.IO.Packaging;
using System.Text;

class ExtractPackagedImages
{
    static void Main(string[] paths)
    {
        foreach (string path in paths)
        {
            using (Package package = Package.Open(
                path, FileMode.Open, FileAccess.Read))
            {
                DirectoryInfo dir = Directory.CreateDirectory(path + " Images");
                foreach (PackagePart part in package.GetParts())
                {
                    if (part.ContentType.ToLowerInvariant().StartsWith("image/"))
                    {
                        string target = Path.Combine(
                            dir.FullName, CreateFilenameFromUri(part.Uri));
                        using (Stream source = part.GetStream(
                            FileMode.Open, FileAccess.Read))
                        using (Stream destination = File.OpenWrite(target))
                        {
                            byte[] buffer = new byte[0x1000];
                            int read;
                            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
                            {
                                destination.Write(buffer, 0, read);
                            }
                        }
                        Console.WriteLine("Extracted {0}", target);
                    }
                }
            }
        }
        Console.WriteLine("Done");
    }

    private static string CreateFilenameFromUri(Uri uri)
    {
        char [] invalidChars = Path.GetInvalidFileNameChars();
        StringBuilder sb = new StringBuilder(uri.OriginalString.Length);
        foreach (char c in uri.OriginalString)
        {
            sb.Append(Array.IndexOf(invalidChars, c) < 0 ? c : '_');
        }
        return sb.ToString();
    }
}

Answer by Luke

From "ZipPackage Class" (MSDN):

While Packages are stored as Zip files* through the ZipPackage class, all Zip files are not ZipPackages. A ZipPackage has special requirements such as URI-compliant file (part) names and a "[Content_Types].xml" file that defines the MIME types for all the files contained in the Package. The ZipPackage class cannot be used to open arbitrary Zip files that do not conform to the Open Packaging Conventions standard.

For further details see Section 9.2 "Mapping to a ZIP Archive" of the ECMA International "Open Packaging Conventions" standard, http://www.ecma-international.org/publications/files/ECMA-ST/Office%20Open%20XML%20Part%202%20(DOCX).zip (342Kb) or http://www.ecma-international.org/publications/files/ECMA-ST/Office%20Open%20XML%20Part%202%20(PDF).zip (1.3Mb)

*You can simply add ".zip" to the extension of any ZipPackage-based file (.docx, .xlsx, .pptx, etc.) to open it in your favorite Zip utility.
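
The requirement quoted above suggests a quick way to tell whether a given ZIP file is likely to work with ZipPackage at all: look for a root-level "[Content_Types].xml" entry. The following is only a rough sketch of that check. It assumes .NET Framework 4.5 or later (for System.IO.Compression.ZipArchive, which is a different class from anything in System.IO.Packaging), and the names OpcCheck and LooksLikeOpcPackage are made up for this illustration. Finding the entry does not prove the archive is a fully valid package.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

static class OpcCheck
{
    // Hypothetical helper: returns true if the archive contains a root-level
    // "[Content_Types].xml" entry. This is a heuristic, not a full OPC validation.
    public static bool LooksLikeOpcPackage(Stream zipStream)
    {
        // leaveOpen: true so the caller stays responsible for the stream
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, true))
        {
            return archive.Entries.Any(e =>
                string.Equals(e.FullName, "[Content_Types].xml",
                              StringComparison.OrdinalIgnoreCase));
        }
    }

    static void Main(string[] args)
    {
        // Each command-line argument is treated as the path of a zip file to inspect
        foreach (string path in args)
        {
            using (FileStream fs = File.OpenRead(path))
            {
                Console.WriteLine("{0}: {1}", path,
                    LooksLikeOpcPackage(fs) ? "has [Content_Types].xml" : "plain ZIP");
            }
        }
    }
}
```

If the check reports a plain ZIP, an empty GetParts() result like the one in the question is the expected outcome rather than a sign of corruption.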

Answer by Rad

I agree with Cheeso. System.IO.Packaging is awkward when handling generic zip files, seeing as it was designed for Office Open XML documents. I'd suggest using DotNetZip or SharpZipLib.

Answer by Joshua

I was having the exact same problem! To get the GetParts() method to return something, I had to add the [Content_Types].xml file to the root of the archive with a "Default" node for every file extension included. Once I added this (just using Windows Explorer), my code was able to read and extract the archived contents.

More information on the [Content_Types].xml file can be found here:

http://msdn.microsoft.com/en-us/magazine/cc163372.aspx - There is an example file below Figure 13 of the article.

// verbatim strings (@"...") keep the backslashes from being read as escape sequences
var zipFilePath = @"c:\myfile.zip";
var tempFolderPath = @"c:\unzipped";

using (Package package = ZipPackage.Open(zipFilePath, FileMode.Open, FileAccess.Read))
{
    foreach (PackagePart part in package.GetParts())
    {
        var target = Path.GetFullPath(Path.Combine(tempFolderPath, part.Uri.OriginalString.TrimStart('/')));
        var targetDir = Path.GetDirectoryName(target);

        // CreateDirectory does nothing if the directory already exists
        Directory.CreateDirectory(targetDir);

        using (Stream source = part.GetStream(FileMode.Open, FileAccess.Read))
        using (FileStream targetFile = File.Create(target)) // File.Create truncates any existing file
        {
            source.CopyTo(targetFile);
        }
    }
}

Note: this code uses the Stream.CopyTo method in .NET 4.0
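
For the [Content_Types].xml file described in this answer, a minimal sketch of generating one with a "Default" node per file extension might look like the following. The class name ContentTypesSketch, the Build method, and the extension-to-MIME mapping are all assumptions made for illustration; the namespace URI is the one the Open Packaging Conventions use for content types. The answer added the file by hand in Windows Explorer, so this is an alternative, not what the author did.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

static class ContentTypesSketch
{
    // Namespace used by OPC content-type streams
    static readonly XNamespace Ns =
        "http://schemas.openxmlformats.org/package/2006/content-types";

    // Hypothetical helper: builds a <Types> document with one <Default> node
    // per (extension, MIME type) pair supplied by the caller.
    public static XDocument Build(IDictionary<string, string> extensionToMime)
    {
        var root = new XElement(Ns + "Types");
        foreach (KeyValuePair<string, string> pair in extensionToMime)
        {
            root.Add(new XElement(Ns + "Default",
                new XAttribute("Extension", pair.Key),
                new XAttribute("ContentType", pair.Value)));
        }
        return new XDocument(new XDeclaration("1.0", "utf-8", null), root);
    }

    static void Main()
    {
        // Example mapping only; list whichever extensions the archive actually contains
        var map = new Dictionary<string, string>
        {
            { "txt", "text/plain" },
            { "png", "image/png" }
        };
        Console.WriteLine(Build(map));
    }
}
```

Calling Save("[Content_Types].xml") on the returned document writes the file to disk, after which it still has to be added to the root of the archive.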

Answer by sharptooth

(This is basically a rephrasing of this answer)

It turns out that System.IO.Packaging.ZipPackage doesn't support PKZIP, which is why no "parts" are returned when you open a "generic" ZIP file. This class only supports a specific flavor of ZIP files (see the comments at the bottom of the MSDN description), used among other things for Windows Azure service packages up to SDK 1.6. That's why, if you unpack a service package and then repack it using, say, the Info-ZIP packer, it becomes invalid.
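
For "generic" ZIP files like the ones in the question, one option that avoids System.IO.Packaging entirely is the System.IO.Compression API. This is a different technique from the ones shown in the answers here: it is a sketch that assumes .NET Framework 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem, and the names PlainZipExtractor and ExtractAll are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class PlainZipExtractor
{
    // Hypothetical helper: extracts every file entry of a plain ZIP archive
    // into targetDirectory and returns the number of files written.
    public static int ExtractAll(string zipPath, string targetDirectory)
    {
        string root = Path.GetFullPath(targetDirectory);
        string sep = Path.DirectorySeparatorChar.ToString();
        string prefix = root.EndsWith(sep, StringComparison.Ordinal) ? root : root + sep;
        int count = 0;

        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                string destination = Path.GetFullPath(Path.Combine(root, entry.FullName));

                // Skip entries whose path would land outside the target directory
                if (!destination.StartsWith(prefix, StringComparison.Ordinal))
                    continue;

                // An empty Name means the entry is a directory, not a file
                if (entry.Name.Length == 0)
                {
                    Directory.CreateDirectory(destination);
                    continue;
                }

                Directory.CreateDirectory(Path.GetDirectoryName(destination));
                entry.ExtractToFile(destination, true); // true => overwrite existing files
                count++;
            }
        }
        return count;
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("usage: <zip file> <target directory>");
            return;
        }
        Console.WriteLine("Extracted {0} file(s)", ExtractAll(args[0], args[1]));
    }
}
```

When no per-entry handling is needed, ZipFile.ExtractToDirectory(zipPath, targetDirectory) does the same job in a single call.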
