Java PDFBox text encoding

Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/5306244/

Date: 2020-10-30 10:31:40  Source: igfitidea


Tags: java, pdf, unicode, character-encoding, pdfbox

Asked by javment

I am trying to export some data from my Java application to a PDF file. I decided to use the PDFBox library, but I found that Greek characters are not displayed properly in the PDF file. Is there a way to set the encoding, for example to UTF-8 or ISO-8859-7? I tried things like PdFontEncoding or Encoding, but got nowhere.


Thank you for your time.


Answered by gutch

There are two things you would need to do:


  • set the encoding, and
  • provide a font with Greek characters

The built-in fonts that most PDF readers provide (e.g. Adobe Reader, OS X Preview) only support the Latin-1 encoding, which doesn't include Greek characters. See http://libharu.sourceforge.net/fonts.html
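
To illustrate why a Latin-1 encoding cannot carry Greek text, here is a minimal sketch that uses only the Java standard library. The class name `GreekEncodingCheck` and the helper `canEncode` are made up for this example and are not part of PDFBox. It only demonstrates the character-set limitation, not how PDFBox itself selects an encoding, and it assumes the JVM ships the ISO-8859-7 charset, which is not guaranteed on every platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GreekEncodingCheck {

    /** Returns true if every character of the text can be encoded in the given charset. */
    public static boolean canEncode(String text, Charset charset) {
        // A fresh encoder is created per call because encoders are not thread-safe.
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // "Kalimera" ("good morning") in Greek letters, written with Unicode
        // escapes so the result does not depend on the source file encoding.
        String greek = "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1";

        // Latin-1 has no Greek letters, so this prints false.
        System.out.println("ISO-8859-1: " + canEncode(greek, StandardCharsets.ISO_8859_1));

        // UTF-8 covers all of Unicode, so this prints true.
        System.out.println("UTF-8:      " + canEncode(greek, StandardCharsets.UTF_8));

        // ISO-8859-7 is the Greek code page; it is an optional charset (assumption:
        // it is installed), so check availability before using it.
        if (Charset.isSupported("ISO-8859-7")) {
            System.out.println("ISO-8859-7: " + canEncode(greek, Charset.forName("ISO-8859-7")));
        }
    }
}
```

Even when the text can be encoded, the PDF still needs a font that contains glyphs for those characters, which is the point made in the rest of this answer.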


My guess is that the problem here is not the encoding but the font. You will need to obtain a font that contains Greek characters and embed it in the PDF file. Make sure you have a licence to embed the font!


See also: "Using Java PDFBox library to write Russian PDF"
