Java: Read Text from an RTF File

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/19830106/

Date: 2020-08-12 20:33:35  Source: igfitidea

java apache-poi

Asked by Stunner

I tried to read an RTF file using Apache POI, but it throws an Invalid Header exception. It seems that POI doesn't support RTF files. Is there any way to read .rtf files using an open source Java API? (I have heard of the Aspose API, but it's not free.)

Any solutions?

Answered by LotusUNSW

You can try RTFEditorKit (javax.swing.text.rtf.RTFEditorKit), which ships with the JDK. It supports images and text as well.
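
As a rough illustration of the RTFEditorKit suggestion, here is a minimal sketch that parses RTF into a Swing Document and reads back its plain text. The class name RtfToText, the helper extract, and the inline sample string are made up for this example and are not part of the original answer; a real program would pass a FileInputStream for the .rtf file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

public class RtfToText {

    // Hypothetical helper (the name is ours): parse RTF from a stream
    // and return the plain text held by the resulting document.
    static String extract(InputStream rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument(); // a styled document
        kit.read(rtf, doc, 0);                      // parse the RTF markup into the document
        return doc.getText(0, doc.getLength());     // formatting is dropped, text is kept
    }

    public static void main(String[] args) throws Exception {
        // Small inline RTF sample, used here instead of a file (assumption for the demo).
        String sample = "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}\\f0\\pard Hello, RTF!\\par}";
        try (InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.US_ASCII))) {
            System.out.println(extract(in));
        }
    }
}
```

This variant skips JEditorPane entirely, which may be convenient when no UI component is needed. How much of a complex RTF document survives parsing depends on the JDK's RTF support, so treat the result as best effort.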

Or look at this answer: Java API to convert RTF file to Word document (97-2003 format)

There is no free library that supports this. But it may not be that hard to create a basic compare function yourself. You can read in an rtf file and then extract the text like this:

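import java.io.FileInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.io.Writer;
import javax.swing.JEditorPane;
import javax.swing.text.EditorKit;

// read rtf from file (fileName is the path to the .rtf file)
JEditorPane p = new JEditorPane();
p.setContentType("text/rtf");
EditorKit rtfKit = p.getEditorKitForContentType("text/rtf");
// try-with-resources closes the stream even if parsing fails
try (InputStream in = new FileInputStream(fileName)) {
    rtfKit.read(in, p.getDocument(), 0);
}

// convert to text: write the parsed document back out with the plain-text kit
EditorKit txtKit = p.getEditorKitForContentType("text/plain");
Writer writer = new StringWriter();
txtKit.write(writer, p.getDocument(), 0, p.getDocument().getLength());
String documentText = writer.toString();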