C#: Parsing PDF files

Question

Asked by desi

I need to split a large PDF document into smaller files based on the content of the file. We use BCL easyPDF to manipulate PDF files. easyPDF can split PDF documents at a given page number, but it cannot split a document based on its content. As far as I can tell, it also has no search function for locating content (if I am wrong, please let me know).

Can someone tell me how I can find the location of text in a PDF file using .NET?

Thanks

Answer 1

Answered by Brian

Take a look at this question. It links to some libraries that may satisfy your requirements:

How to programatically search a PDF document in c#

Answer 2

Answered by Pablo Santa Cruz

You need a PDF library for .NET, such as iText.Net.

Answer 3

Answered by Bobrovsky

You might try the Docotic.Pdf library for your task.

The library can retrieve a collection of words with their bounding rectangles from PDFs. This should help you find the location of text in a file.

The library can also be used to extract text (with or without formatting).

Disclaimer: I work for the vendor of the library.
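Whichever library ends up extracting the text, the split itself can be planned without touching the PDF API. The following is a minimal sketch, assuming the text of each page has already been extracted into a list of strings and that a new output file should start on every page containing some marker phrase. `PdfSplitPlanner`, `PlanRanges`, and `marker` are hypothetical names made up for this illustration and are not part of easyPDF, iText, or Docotic.Pdf. The resulting page ranges could then be handed to a page-number-based splitter such as the one the question describes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: not part of any PDF library.
public static class PdfSplitPlanner
{
    // Returns 1-based, inclusive page ranges. A new range starts on every
    // page (after the first) whose text contains the marker, ignoring case.
    // pageTexts[i] is assumed to hold the extracted text of page i + 1.
    public static List<(int First, int Last)> PlanRanges(
        IReadOnlyList<string> pageTexts, string marker)
    {
        if (pageTexts == null) throw new ArgumentNullException(nameof(pageTexts));
        if (string.IsNullOrEmpty(marker))
            throw new ArgumentException("Marker must not be empty.", nameof(marker));

        var ranges = new List<(int First, int Last)>();
        if (pageTexts.Count == 0) return ranges;

        int start = 1;
        for (int page = 2; page <= pageTexts.Count; page++)
        {
            string text = pageTexts[page - 1] ?? string.Empty;
            if (text.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                // Close the current range just before the page with the marker.
                ranges.Add((start, page - 1));
                start = page;
            }
        }

        // The last range always runs to the final page.
        ranges.Add((start, pageTexts.Count));
        return ranges;
    }
}
```

For example, with five pages whose texts are "Invoice 1001", "continued", "Invoice 1002", "Invoice 1003", "continued" and the marker "Invoice", the planned ranges are pages 1-2, 3-3, and 4-5. Matching on plain page text is the simplest case; splitting in the middle of a page would need the word positions mentioned above.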
