C# iTextSharp exception: PDF header signature not found

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/10621936/

Date: 2020-08-09 14:32:42  Source: igfitidea

Tags: c#, .net, pdf, itext

Asked by broke

I'm using iTextSharp to read the contents of PDF documents:

// Requires: using System.IO; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(pdfPath);

using (StringWriter output = new StringWriter())
{
    // iTextSharp page numbers are 1-based.
    for (int i = 1; i <= reader.NumberOfPages; i++)
    {
        output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
    }

    reader.Close();
    pdfText = output.ToString();
}

99% of the time it works just fine. However, there is one PDF file that will sometimes throw this exception:

PDF header signature not found.
StackTrace:
   at iTextSharp.text.pdf.PRTokeniser.CheckPdfHeader()
   at iTextSharp.text.pdf.PdfReader.ReadPdf()
   at iTextSharp.text.pdf.PdfReader..ctor(String filename, Byte[] ownerPassword)
   at Reader.PDF.DownloadPdf(String url) in C:\Documents\Visual Studio

What's annoying is that I can't always reproduce the error. Sometimes it works, sometimes it doesn't. Has anyone encountered this problem?

Accepted answer by Anonymous coward

After some research, I've found that this problem relates either to the file being corrupted during PDF generation, or to an object in the document that doesn't conform to the PDF standard as implemented in iTextSharp. It also seems to happen only when you read a PDF file from disk.

I have not found a complete solution to the problem, only a workaround. What I've done is open the PDF document with the iTextSharp PdfReader object first and check whether an error or exception occurs, before reading the file in normal operation.

So I run something similar to this:

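private bool IsValidPdf(string filepath)
{
    PdfReader reader = null;

    try
    {
        // The constructor parses the file and throws if it cannot be read as a PDF.
        reader = new PdfReader(filepath);
        return true;
    }
    catch
    {
        return false;
    }
    finally
    {
        // Release the file whether or not parsing succeeded.
        if (reader != null)
        {
            reader.Close();
        }
    }
}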
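
A lighter diagnostic that does not depend on iTextSharp is to look at the first bytes of the file directly. The sketch below assumes that the header check amounts to finding the text "%PDF-" near the start of the file (the 1024-byte window is an assumption about how the library searches, not something confirmed in this thread). The class and method names are hypothetical and are not part of iTextSharp.

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfHeaderCheck
{
    // Hypothetical helper, not part of iTextSharp.
    // Returns true if "%PDF-" appears within the first 1024 bytes read
    // from the stream's current position (window size is an assumption).
    public static bool HasPdfHeader(Stream stream)
    {
        byte[] buffer = new byte[1024];
        int total = 0;
        int read;

        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (total < buffer.Length &&
               (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
        {
            total += read;
        }

        // ASCII decoding yields one char per byte; non-ASCII bytes become '?'.
        string head = Encoding.ASCII.GetString(buffer, 0, total);
        return head.IndexOf("%PDF-", StringComparison.Ordinal) >= 0;
    }

    // Convenience overload for a file on disk.
    public static bool HasPdfHeader(string filepath)
    {
        using (FileStream stream = File.OpenRead(filepath))
        {
            return HasPdfHeader(stream);
        }
    }
}
```

If this returns false for a file that fails, it may be worth inspecting what the file actually contains. Since the stack trace in the question goes through a DownloadPdf(String url) method, an empty or partially downloaded file, or an HTML error page saved under a .pdf name, are plausible candidates, though the thread does not confirm either.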

Answer by Bern

I found it was because I was calling new PdfReader(pdf) with the PDF stream positioned at the end of the file. Setting the position to zero resolved the issue.

Before:

// Throws: InvalidPdfException: PDF header signature not found.
var pdfReader = new PdfReader(pdf);

After:

// Works correctly.
pdf.Position = 0;
var pdfReader = new PdfReader(pdf);
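
The effect of the stream position can be seen without iTextSharp at all. The following is a minimal sketch using only System.IO; StreamRewindDemo, Rewind and CountReadableBytes are hypothetical names introduced here for illustration. It shows that a stream which has just been written to offers no bytes to a reader until it is rewound.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamRewindDemo
{
    // Hypothetical helper: moves a seekable stream back to its start.
    // Non-seekable streams (e.g. some network streams) are left untouched.
    public static void Rewind(Stream stream)
    {
        if (stream.CanSeek)
        {
            stream.Position = 0;
        }
    }

    // Returns how many bytes can be read from the stream's current position.
    public static int CountReadableBytes(Stream stream)
    {
        byte[] buffer = new byte[256];
        int total = 0;
        int read;

        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }

        return total;
    }
}
```

For a MemoryStream that has just had the 9 bytes of "%PDF-1.4\n" written to it, CountReadableBytes returns 0, because the position sits at the end of the data. After Rewind, it returns 9. A reader handed the stream in the first state never sees the header, which is consistent with the exception described above.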