Reading a PDF with iText in Java
Original question: http://stackoverflow.com/questions/1637505/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Read pdf using iText
Asked by Sunil
I am having trouble reading PDF files with iText in Java. I can read only one page; when I go to the second page, it throws an exception. I want to read all the pages of any PDF file.
PdfTextExtractor parser = new PdfTextExtractor(new PdfReader("C:/Text.pdf"));
parser.getTextFromPage(3);
I am using these lines, and the second line throws an exception.
Accepted answer by Kushal Paudyal
Try changing the file location. Sometimes the OS does not allow other applications to read files from certain system drives. Put the file somewhere on D:, for example. I ran into this problem on Vista when reading files from the desktop.
In fact, I ran the same two lines of code on one of my PDFs and it did print the text. Also make sure the PDF has enough pages (3 or more), or try parser.getTextFromPage(1) etc. to get content from other pages.
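The page-count check suggested above could be sketched as a small guard. This is only an illustration, not part of iText: the class and method names (PageRangeCheck, requireValidPage) are hypothetical, and the hard-coded page count stands in for the value that PdfReader.getNumberOfPages() would return. It assumes page numbers start at 1.

```java
// Hypothetical helper, not part of iText: validates a 1-based page number
// before it is passed to a text-extraction call.
class PageRangeCheck {

    // Returns the page number unchanged if it lies in 1..pageCount,
    // otherwise throws with a message naming the valid range.
    static int requireValidPage(int page, int pageCount) {
        if (page < 1 || page > pageCount) {
            throw new IllegalArgumentException(
                "Page " + page + " is out of range; valid pages are 1 to " + pageCount);
        }
        return page;
    }

    public static void main(String[] args) {
        // Assumed value for the demo; real code would ask the reader for it.
        int pageCount = 2;

        // Page 1 is valid for a 2-page document.
        System.out.println(requireValidPage(1, pageCount));

        // Page 3 does not exist in a 2-page document, so the guard rejects it.
        try {
            requireValidPage(3, pageCount);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Calling a guard like this before getTextFromPage would turn a request for a missing page into a message that states the valid range, which may be easier to diagnose than the exception raised from inside the library.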
Answer by Mark Redman
When you say one page, do you mean the first page? You might be indexing the pages incorrectly. Without more information, it could be anything.
Answer by Kevin Day
Are you re-constructing the parser and reader for each operation? You can do that, but it's not very efficient (there is a lot of overhead with creating a new PdfReader).
Answer by KIBOU Hassan
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

/**
 * This class is used to read an existing
 * pdf file using iText jar.
 * @author javawithease
 */
public class PDFReadExample {
    public static void main(String[] args) {
        try {
            // Create PdfReader instance. The backslash is escaped so that
            // "\t" is not interpreted as a tab character.
            PdfReader pdfReader = new PdfReader("D:\\testFile.pdf");

            // Get the number of pages in pdf.
            int pages = pdfReader.getNumberOfPages();

            // Iterate the pdf through pages (page numbers start at 1).
            for (int i = 1; i <= pages; i++) {
                // Extract the page content using PdfTextExtractor.
                String pageContent =
                        PdfTextExtractor.getTextFromPage(pdfReader, i);

                // Print the page content on console.
                System.out.println("Content on Page "
                        + i + ": " + pageContent);
            }

            // Close the PdfReader.
            pdfReader.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}