C#: Determine the number of pages in a PDF file
Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/320281/
Determine number of pages in a PDF file
Asked by MagicAndi
I need to determine the number of pages in a specified PDF file using C# code (.NET 2.0). The PDF file will be read from the file system, and not from a URL. Does anyone have any pointers on how this could be done? Note: Adobe Acrobat Reader is installed on the PC where this check will be carried out.
Accepted answer by darkdog
You'll need a PDF API for C#. iTextSharp is one possible API, though better ones might exist.
iTextSharp Example
You must add iTextSharp.dll as a reference. Download iTextSharp from SourceForge.net. This is a complete working program written as a console application.
using System;
using iTextSharp.text.pdf;

namespace GetPages_PDF
{
    class Program
    {
        static void Main(string[] args)
        {
            // Path to YOUR PDF file. The verbatim string (@"...") keeps the
            // backslashes from being read as escape sequences.
            string ppath = @"C:\aworking\Hawkins.pdf";
            PdfReader pdfReader = new PdfReader(ppath);
            int numberOfPages = pdfReader.NumberOfPages;
            pdfReader.Close(); // release the file
            Console.WriteLine(numberOfPages);
            Console.ReadLine(); // keep the console window open
        }
    }
}
Answer by Peter Gfader
Answer by Paul Lefebvre
I have had good success using CeTe Dynamic PDF products. They're not free, but they are well documented. They did the job for me.
Answer by Paul Lefebvre
Found a way at http://www.dotnetspider.com/resources/21866-Count-pages-PDF-file.aspx. This does not require purchasing a PDF library.
Answer by Barrett
This should do the trick:
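using System.IO;
using System.Text.RegularExpressions;

public int getNumberOfPdfPages(string fileName)
{
    using (StreamReader sr = new StreamReader(File.OpenRead(fileName)))
    {
        // Count "/Type /Page" entries; [^s] skips the "/Type /Pages" tree nodes.
        Regex regex = new Regex(@"/Type\s*/Page[^s]");
        MatchCollection matches = regex.Matches(sr.ReadToEnd());
        return matches.Count;
    }
}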
From Rachael's answer and this one too.
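To see why the pattern ends in [^s], here is a minimal sketch that runs the same regex over an in-memory string. The class name, method name, and sample text are made up for illustration; the sample is not a real PDF. Note that this heuristic only sees what is stored as plain text in the file, so it can miscount files that keep page dictionaries inside compressed object streams (PDF 1.5 and later).

```csharp
using System;
using System.Text.RegularExpressions;

static class PageRegexDemo
{
    // Same pattern as in the answer above.
    static readonly Regex PagePattern = new Regex(@"/Type\s*/Page[^s]");

    // Counts page dictionaries in text that looks like raw PDF content.
    public static int CountPageObjects(string pdfText)
    {
        return PagePattern.Matches(pdfText).Count;
    }

    static void Main()
    {
        // Hypothetical fragment: one page tree node (/Pages) and two pages.
        string sample =
            "1 0 obj << /Type /Pages /Kids [2 0 R 3 0 R] /Count 2 >> endobj\n" +
            "2 0 obj << /Type /Page /Parent 1 0 R >> endobj\n" +
            "3 0 obj << /Type/Page\n/Parent 1 0 R >> endobj\n";

        // The /Pages node is skipped because "s" follows "/Page";
        // \s* accepts both "/Type /Page" and "/Type/Page".
        Console.WriteLine(CountPageObjects(sample)); // prints 2
    }
}
```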
Answer by Barrett
I've used the regex-based code above. It works, but it's quite slow because it reads the entire file to determine the number of pages.
I used it in a web app where a page would sometimes list 20 or 30 PDFs at a time. In that case, the page counting pushed the load time from a couple of seconds to almost a minute.
I don't know whether the third-party libraries are much faster, though I would hope so. I've used pdflib successfully in other scenarios.
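One possible mitigation for the listing scenario, independent of which counting method is used, is to remember each file's page count and recount only when the file changes. The sketch below is an assumption on my part and not something from the answers here: PageCounter and PageCountCache are hypothetical names, and the cache lives in memory only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical delegate: any page counting method (regex, iTextSharp, ...).
public delegate int PageCounter(string fileName);

// Hypothetical helper: caches page counts keyed by path and last write time.
public class PageCountCache
{
    private struct Entry
    {
        public DateTime LastWriteUtc;
        public int Pages;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly object sync = new object();
    private readonly PageCounter counter;

    public PageCountCache(PageCounter counter)
    {
        this.counter = counter;
    }

    public int GetPageCount(string fileName)
    {
        DateTime stamp = File.GetLastWriteTimeUtc(fileName);

        lock (sync)
        {
            Entry cached;
            // Reuse the stored count if the file has not been modified since.
            if (entries.TryGetValue(fileName, out cached) && cached.LastWriteUtc == stamp)
                return cached.Pages;
        }

        // Slow part: runs outside the lock so other requests are not blocked.
        int pages = counter(fileName);

        lock (sync)
        {
            Entry fresh = new Entry();
            fresh.LastWriteUtc = stamp;
            fresh.Pages = pages;
            entries[fileName] = fresh;
        }
        return pages;
    }
}
```

In a web app, a single shared instance could wrap whichever counting method is in use, so only the first request for each file pays the full cost.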
Answer by Bobrovsky
The Docotic.Pdf library may be used to accomplish the task.
Here is sample code:
PdfDocument document = new PdfDocument();
document.Open("file.pdf");
int pageCount = document.PageCount;
The library will parse as little as possible so performance should be ok.
Disclaimer: I work for Bit Miracle.
Answer by Medo Medo
One Line:
int pdfPageCount = System.IO.File.ReadAllText("example.pdf").Split(new string[] { "/Type /Page" }, StringSplitOptions.None).Length - 2;
Recommended: iTextSharp