如何使用poi jar读取java api中的docx文件内容

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/16476711/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-16 07:05:04  来源:igfitidea点击:

How to read docx file content in java api using poi jar

javaapache-poidocxreadfile

提问by nagesh

I have done reading doc file now i'm trying to read docx file content. when i searched for sample code i found many, nothing worked. check the code for reference...

我已经阅读了 doc 文件,现在我正在尝试阅读 docx 文件内容。当我搜索示例代码时,我发现了很多,但没有任何效果。检查代码以供参考...

import java.io.*;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.Document;
import com.itextpdf.text.Paragraph;

public class createPdfForDocx {

public static void main(String[] args) {
InputStream fs = null;  
    Document document = new Document();
    XWPFWordExtractor extractor = null ;

try {

    fs = new FileInputStream("C:\DATASTORE\test.docx");
    //XWPFDocument hdoc=new XWPFDocument(fs);
    XWPFDocument hdoc=new XWPFDocument(OPCPackage.open(fs));
    //XWPFDocument hdoc=new XWPFDocument(fs);
    extractor = new XWPFWordExtractor(hdoc);
    OutputStream fileOutput = new FileOutputStream(new       File("C:/DATASTORE/test.pdf"));
    PdfWriter.getInstance(document, fileOutput);
    document.open();
    String fileData=extractor.getText();
    System.out.println(fileData);
    document.add(new Paragraph(fileData));
    System.out.println(" pdf document created");
        } catch(IOException e) {
            System.out.println("IO Exception");
             e.printStackTrace();
          } catch(Exception ex) {
             ex.printStackTrace();
           }finally {  
                document.close();  
           } 
 }//end of main()
}//end of class

For the above code i'm getting following Exception:

对于上面的代码,我得到以下异常:

org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
at org.apache.poi.xwpf.usermodel.XWPFFactory.createDocumentPart(XWPFFactory.java:60)
at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:277)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:186)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:107)
at pagecode.createPdfForDocx.main(createPdfForDocx.java:20)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:67)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:521)
at org.apache.poi.xwpf.usermodel.XWPFFactory.createDocumentPart(XWPFFactory.java:58)
... 4 more
Caused by: java.lang.NoSuchMethodError: org/openxmlformats/schemas/wordprocessingml/x2006/main/CTStyles.getStyleList()Ljava/util/List;
at org.apache.poi.xwpf.usermodel.XWPFStyles.onDocumentRead(XWPFStyles.java:78)
at org.apache.poi.xwpf.usermodel.XWPFStyles.<init>(XWPFStyles.java:59)
... 9 more

Please help Thank you

请帮忙谢谢

采纳答案by Gagravarr

This is covered in the Apache POI FAQ! The entry you want is I'm using the poi-ooxml-schemas jar, but my code is failing with "java.lang.NoClassDefFoundError: org/openxmlformats/schemas/something"

这在Apache POI 常见问题中有介绍!您想要的条目是我正在使用 poi-ooxml-schemas jar,但我的代码因“java.lang.NoClassDefFoundError: org/openxmlformats/schemas/ something”而失败

The short answer is to switch the poi-ooxml-schemasjar for the full ooxml-schemas-1.1jar. The full answer is given in the FAQ

简短的回答是将poi-ooxml-schemas罐子换成满ooxml-schemas-1.1罐子。完整答案在常见问题解答中给出

回答by user2618062

For reading excels or docx file if you want to solve errors you need to add all jars then you wont get any error.

对于阅读 excels 或 docx 文件,如果您想解决错误,则需要添加所有 jar,然后您将不会收到任何错误。