C#: Convert PDF file pages to images using iTextSharp

Question

Asked by Prithvi Raj Nandiwal

I want to convert PDF pages into images using the iTextSharp library.

Any idea how to convert each page into an image file?

Answer 1

Answered by Chris Haas

iText/iTextSharp can generate and/or modify existing PDFs, but they do not perform any rendering, which is what you are looking for. I would recommend checking out Ghostscript or some other library that knows how to actually render a PDF.

Answer 2

Answered by changcn

You can use ImageMagick to convert a PDF to an image:

convert -density 300 "d:\1.pdf" -scale @1500000 "d:\a.jpg"

To split the PDF into single-page files first, you can use iTextSharp.

Here is code from another source.

void SplitPDF(string filepath)
    {
        // Output files are written next to the source as <name>_<page><ext>.
        string dir = System.IO.Path.GetDirectoryName(filepath);
        string name = System.IO.Path.GetFileNameWithoutExtension(filepath);
        string ext = System.IO.Path.GetExtension(filepath);

        iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(filepath);
        try
        {
            reader.RemoveUnusedObjects();
            int pageCount = reader.NumberOfPages;
            for (int i = 1; i <= pageCount; i++)
            {
                string outfile = System.IO.Path.Combine(dir, name + "_" + i + ext);
                // Each output document takes the size and rotation of its source page.
                iTextSharp.text.Document doc = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(i));
                using (System.IO.FileStream fs = new System.IO.FileStream(outfile, System.IO.FileMode.Create))
                {
                    iTextSharp.text.pdf.PdfCopy pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, fs);
                    // Compression is configured before the document is opened.
                    pdfCpy.SetFullCompression();
                    doc.Open();
                    pdfCpy.AddPage(pdfCpy.GetImportedPage(reader, i));
                    doc.Close();
                }
            }
        }
        finally
        {
            // Close the reader once, after all pages have been copied.
            reader.Close();
        }
    }

Answer 3

Answered by Amer Sawan

You can use Ghostscript to convert PDF files into images. I used the following parameters to convert the PDF into a TIFF image with multiple frames:

gswin32c.exe   -sDEVICE=tiff12nc -dBATCH -r200 -dNOPAUSE  -sOutputFile=[Output].tiff [PDF FileName]

You can also use the -q parameter for silent mode. More information about the available output devices can be found in the Ghostscript documentation.

After that, I can easily load the TIFF frames like this:

using (FileStream stream = new FileStream(@"C:\tEMP\image_$i.tiff", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    BitmapDecoder dec = BitmapDecoder.Create(stream, BitmapCreateOptions.IgnoreImageCache, BitmapCacheOption.None);
    BitmapEncoder enc = BitmapEncoder.Create(dec.CodecInfo.ContainerFormat);
    enc.Frames.Add(dec.Frames[frameIndex]);
}
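
The Ghostscript command in this answer can also be launched from C#. The following is a minimal sketch, not the answerer's code: the class and method names (GhostscriptRenderer, BuildArguments, RenderPages) are hypothetical, and it assumes a Ghostscript console executable such as gswin32c.exe is installed and reachable. It swaps in the png16m device with a %03d output pattern so that each page lands in its own PNG file, instead of the multi-frame tiff12nc output used above.

```csharp
using System;
using System.Diagnostics;

static class GhostscriptRenderer
{
    // Builds the Ghostscript argument string for rendering each page to its own PNG.
    // Ghostscript expands %03d in the output pattern to the page number.
    public static string BuildArguments(string pdfPath, string outputPattern, int dpi)
    {
        return string.Format(
            "-q -dBATCH -dNOPAUSE -sDEVICE=png16m -r{0} \"-sOutputFile={1}\" \"{2}\"",
            dpi, outputPattern, pdfPath);
    }

    // Runs Ghostscript, waits for it to finish, and returns its exit code.
    public static int RenderPages(string gsExePath, string pdfPath, string outputPattern, int dpi)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = gsExePath, // assumption: path to gswin32c.exe or an equivalent
            Arguments = BuildArguments(pdfPath, outputPattern, dpi),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A call might look like GhostscriptRenderer.RenderPages("gswin32c.exe", @"d:\1.pdf", @"d:\page_%03d.png", 200). A non-zero exit code generally indicates that Ghostscript reported an error.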

Answer 4

Answered by Reddy

You can extract images from a PDF and save them as JPG. Here is sample code; it requires iTextSharp. Note that this extracts images embedded in the PDF rather than rendering the pages themselves.

 public IEnumerable<System.Drawing.Image> ExtractImagesFromPDF(string sourcePdf)
    {
        // NOTE:  This will only get the first image it finds per page.
        var pdf = new PdfReader(sourcePdf);
        var raf = new RandomAccessFileOrArray(sourcePdf);

        try
        {
            for (int pageNum = 1; pageNum <= pdf.NumberOfPages; pageNum++)
            {
                PdfDictionary pg = pdf.GetPageN(pageNum);

                // recursively search pages, forms and groups for images.
                PdfObject obj = ExtractImagesFromPDF_FindImageInPDFDictionary(pg);
                if (obj != null)
                {
                    int XrefIndex = Convert.ToInt32(((PRIndirectReference)obj).Number.ToString(CultureInfo.InvariantCulture));
                    PdfObject pdfObj = pdf.GetPdfObject(XrefIndex);
                    PdfStream pdfStrem = (PdfStream)pdfObj;
                    PdfImageObject pdfImage = new PdfImageObject((PRStream)pdfStrem);
                    System.Drawing.Image img = pdfImage.GetDrawingImage();
                    yield return img;
                }
            }
        }
        finally
        {
            pdf.Close();
            raf.Close();
        }
    }
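
The method above calls ExtractImagesFromPDF_FindImageInPDFDictionary, which the answer does not show. The following is a sketch of how such a helper might look, assuming iTextSharp 5.x. It is not necessarily the answerer's version, and the wrapper class name PdfImageFinder is hypothetical. To use it with the code above, the method would go in the same class. It walks the page's XObject resources and returns the first indirect reference to an image, descending into form and group XObjects.

```csharp
using iTextSharp.text.pdf;

static class PdfImageFinder
{
    // Returns the indirect reference of the first image XObject found in the
    // given dictionary's resources, or null if there is none.
    public static PdfObject ExtractImagesFromPDF_FindImageInPDFDictionary(PdfDictionary pg)
    {
        var res = (PdfDictionary)PdfReader.GetPdfObject(pg.Get(PdfName.RESOURCES));
        if (res == null) return null;

        var xobj = (PdfDictionary)PdfReader.GetPdfObject(res.Get(PdfName.XOBJECT));
        if (xobj == null) return null;

        foreach (PdfName name in xobj.Keys)
        {
            PdfObject obj = xobj.Get(name);
            if (!obj.IsIndirect()) continue;

            var tg = (PdfDictionary)PdfReader.GetPdfObject(obj);
            var subtype = (PdfName)PdfReader.GetPdfObject(tg.Get(PdfName.SUBTYPE));

            if (PdfName.IMAGE.Equals(subtype))
            {
                return obj; // found an image: hand back its indirect reference
            }
            if (PdfName.FORM.Equals(subtype) || PdfName.GROUP.Equals(subtype))
            {
                // Forms and groups may carry their own resources: search them too.
                PdfObject nested = ExtractImagesFromPDF_FindImageInPDFDictionary(tg);
                if (nested != null) return nested;
            }
        }
        return null;
    }
}
```

Because the search stops at the first hit, this matches the note in the code above that only one image per page is returned.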
