A good Java library for converting PDF to TIFF?

Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, link to the original, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/356550/

Date: 2020-08-11 13:46:31  Source: igfitidea

java pdf tiff

Asked by RedFilter

I need a Java library to convert PDFs to TIFF images. The PDFs are faxes, and I will be converting to TIFF so that I can then do barcode recognition on the image. Can anyone recommend a good free open source library for conversion from PDF to TIFF?

Accepted answer by Lou Franco

Disclaimer: I work for Atalasoft

We have an SDK that can convert PDF to TIFF. The rendering is powered by Foxit software, which makes a very powerful and efficient PDF renderer.

Answer by Alnitak

I can't recommend any code library, but it's easy to use GhostScript to convert PDF into bitmap formats. I've personally used the script below (which also uses the netpbm utilities) to convert the first page of a PDF into a JPEG thumbnail:

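#!/bin/sh
# Render the first page of the given PDF at 300 dpi and write a
# 240-pixel-wide JPEG thumbnail to stdout.

/opt/local/bin/gs -q -dLastPage=1 -dNOPAUSE -dBATCH -dSAFER -r300 \
    -sDEVICE=pnmraw -sOutputFile=- "$@" |
    pnmcrop |
    pnmscale -width 240 |
    cjpeg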

You can use -sDEVICE=tiff... to get direct TIFF output in various TIFF sub-formats from GhostScript.
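Since the question asks for something callable from Java, one option is to launch Ghostscript as an external process. The following is only a minimal sketch under stated assumptions: the class and method names are made up for illustration, a `gs` executable is assumed to be on the PATH, and `tiffg4` and 300 dpi are example choices rather than anything recommended in the answer above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of any library): shells out to Ghostscript.
class GhostscriptTiff {

    // Builds the argument list; kept separate so it can be inspected without running gs.
    static List<String> buildCommand(String gsExecutable, Path pdf, Path tiff, int dpi) {
        List<String> cmd = new ArrayList<>();
        cmd.add(gsExecutable);
        cmd.add("-q");
        cmd.add("-dNOPAUSE");
        cmd.add("-dBATCH");
        cmd.add("-dSAFER");
        cmd.add("-r" + dpi);                // render resolution in dpi
        cmd.add("-sDEVICE=tiffg4");         // assumed choice: bilevel Group 4 TIFF
        cmd.add("-sOutputFile=" + tiff);
        cmd.add(pdf.toString());
        return cmd;
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    static int convert(String gsExecutable, Path pdf, Path tiff, int dpi)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(gsExecutable, pdf, tiff, dpi))
                .inheritIO()                // forward Ghostscript's messages to our console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: GhostscriptTiff <input.pdf> <output.tif>");
            return;
        }
        int exit = convert("gs", Paths.get(args[0]), Paths.get(args[1]), 300);
        System.out.println("Ghostscript exited with code " + exit);
    }
}
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. On Windows the executable name would differ (for example `gswin32c`).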

Answer by Alnitak

Maybe it is not necessary to convert the PDF into TIFF. The fax will most likely be an embedded image in the PDF, so you could just extract these images again. That should be possible with the already mentioned iText library.

I don't know if this is easier than the other approach.

Answer by Alnitak

No, iText cannot convert PDFs to TIFF.

However, there are commercial libraries that can do that. jPDFImages is a 100% Java library that can convert PDF to images in TIFF, JPEG or PNG formats (and maybe JBIG? I am not sure). It can also do the reverse and create PDFs from images. It starts at $300 for a server.

Answer by serge_gubenko

We also convert PDFs to G3 TIFFs here, at both high and low resolution. In my experience the best tool you can have is the Adobe PDF SDK; the only problem with it is its insane price, so we don't use it.

What works fine for us is Ghostscript; recent versions are quite robust and render the majority of PDFs correctly. And we have quite a few of them coming in during the day. In production, conversion is done using gsdll32.dll, but if you want to try it, use the following command line:

gswin32c -dNOPAUSE -dBATCH -dMaxStripSize=8192 -sDEVICE=tiffg3 -r204x196 -dDITHERPPI=200 -sOutputFile=test.tif prefix.ps test.pdf

This converts your PDF into a high-res G3 TIFF. The prefix.ps code is here:

<< currentpagedevice /InputAttributes get
0 1 2 index length 1 sub {1 index exch undef } for
/InputAttributes exch dup 0 <</PageSize [0 0 612 1728]>> put
/Policies << /PageSize 3 >> >> setpagedevice
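To get a feel for what `-r204x196` means in pixels, here is a small worked calculation. PDF and PostScript lengths are measured in points (1/72 inch), so the raster size is points / 72 × dpi. The US Letter page size (612 × 792 points) is an assumption chosen for illustration, as is reading "low res" as 204x98; the class name is hypothetical, and Ghostscript's own rounding may differ by a pixel.

```java
// Hypothetical helper for illustration; not part of Ghostscript or any library.
class FaxRaster {
    static final double POINTS_PER_INCH = 72.0;

    // Converts a length in points to pixels at the given resolution.
    static int pixels(double points, double dpi) {
        return (int) Math.round(points / POINTS_PER_INCH * dpi);
    }

    public static void main(String[] args) {
        double widthPt = 612;   // assumed US Letter width  (8.5 in)
        double heightPt = 792;  // assumed US Letter height (11 in)

        // High resolution, as in the command line above: prints "1734 x 2156"
        System.out.println(pixels(widthPt, 204) + " x " + pixels(heightPt, 196));

        // Assumed low-resolution counterpart (204x98): prints "1734 x 1078"
        System.out.println(pixels(widthPt, 204) + " x " + pixels(heightPt, 98));
    }
}
```

Knowing the pixel dimensions up front helps when judging whether a barcode will still have enough pixels per bar after conversion.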

Another thing about this SDK is that it's open source; you get both the C and PS (PostScript) source code for it. Also, if you're going with another tool, check what kind of engine powers its PDF rendering; it may turn out to be using gs, as LeadTools does, for instance.

Hope this helps. Regards

Answer by Andrea Redshot

I have had some great experience with iText (I'm currently using version 5.0.6), and this is the code for converting TIFF to PDF:

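// Requires iText 5 on the classpath. `logger` is assumed to be a
// log4j/SLF4J-style logger declared in the enclosing class.
import java.io.FileOutputStream;
import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.pdf.RandomAccessFileOrArray;
import com.itextpdf.text.pdf.codec.TiffImage;

private static String convertTiff2Pdf(String tiff) {

    // Target PDF path; stays null if the conversion fails
    String pdf = null;

    try {

        // Same path as the TIFF, with the extension replaced by "pdf"
        pdf = tiff.substring(0, tiff.lastIndexOf('.') + 1) + "pdf";

        // New document, LETTER page size with zero margins
        Document document = new Document(PageSize.LETTER, 0, 0, 0, 0);

        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(pdf));
        int pages = 0;
        document.open();
        PdfContentByte cb = writer.getDirectContent();
        RandomAccessFileOrArray ra = new RandomAccessFileOrArray(tiff);
        int comps = TiffImage.getNumberOfPages(ra);

        // Conversion loop: one PDF page per TIFF page (TIFF pages are 1-based)
        for (int c = 0; c < comps; ++c) {
            Image img = TiffImage.getTiffImage(ra, c + 1);
            if (img != null) {
                System.out.println("page " + (c + 1));
                // Scale so that the image keeps its physical size (72 points per inch)
                img.scalePercent(7200f / img.getDpiX(), 7200f / img.getDpiY());
                document.setPageSize(new Rectangle(img.getScaledWidth(), img.getScaledHeight()));
                img.setAbsolutePosition(0, 0);
                cb.addImage(img);
                document.newPage();
                ++pages;
            }
        }

        ra.close();
        document.close();

        // Log success only when no exception was thrown
        logger.debug("[" + tiff + "] -> [" + pdf + "] OK");

    } catch (Exception e) {
        logger.error("Convert fail");
        logger.debug("", e);
        pdf = null;
    }

    return pdf;

}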

Answer by woggles

Here is a good article with wrapper classes for using GhostScript with C# .NET. I ended up using this in production:

http://www.codeproject.com/KB/cs/GhostScriptUseWithCSharp.aspx

Answer by agrz

You can use the icepdf library (Apache 2.0 License). They even provide this exact use case as one of their source code examples: http://wiki.icesoft.org/display/PDF/Multi-page+Tiff+Capture
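Whichever renderer produces the page images, the final "write a multi-page TIFF" step can be sketched with the JDK's own ImageIO TIFF plugin. This is not the linked example's code: it is a minimal sketch that assumes Java 9 or later (where a TIFF writer ships with the JDK), takes already-rendered pages as `BufferedImage` objects, and uses a hypothetical class name and LZW compression as an example choice.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

// Hypothetical helper: writes rendered pages into one multi-page TIFF file.
class MultiPageTiffWriter {

    static void write(List<BufferedImage> pages, File out) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            throw new IOException("No TIFF ImageWriter found (assumes Java 9+)");
        }
        ImageWriter writer = writers.next();
        // Start from an empty file; the output stream does not truncate an existing one
        Files.deleteIfExists(out.toPath());
        try (ImageOutputStream ios = ImageIO.createImageOutputStream(out)) {
            writer.setOutput(ios);
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionType("LZW");   // example choice of lossless compression
            writer.prepareWriteSequence(null);
            for (BufferedImage page : pages) {
                // Each call appends one page (one TIFF image directory)
                writer.writeToSequence(new IIOImage(page, null, null), param);
            }
            writer.endWriteSequence();
        } finally {
            writer.dispose();
        }
    }

    public static void main(String[] args) throws IOException {
        // Two blank grayscale pages stand in for pages rendered from a PDF
        List<BufferedImage> pages = Arrays.asList(
                new BufferedImage(200, 100, BufferedImage.TYPE_BYTE_GRAY),
                new BufferedImage(200, 100, BufferedImage.TYPE_BYTE_GRAY));
        File out = File.createTempFile("pages", ".tif");
        write(pages, out);
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

Keeping the TIFF writing separate from the PDF rendering means the renderer can be swapped without touching this step.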
