VB.Net Merge multiple pdfs into one and export
Original source: http://stackoverflow.com/questions/33043151/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow.
Asked by Vikky
I have to merge multiple PDFs into a single PDF.
I am using the iTextSharp library. I found some code (from here) and tried to use it; the original code is in C#, and I converted it to VB.NET.
Private Function MergeFiles(ByVal sourceFiles As List(Of Byte())) As Byte()
    Dim mergedPdf As Byte() = Nothing
    Using ms As New MemoryStream()
        Using document As New Document()
            Using copy As New PdfCopy(document, ms)
                document.Open()
                For i As Integer = 0 To sourceFiles.Count - 1
                    Dim reader As New PdfReader(sourceFiles(i))
                    ' loop over the pages in that document
                    Dim n As Integer = reader.NumberOfPages
                    Dim page As Integer = 0
                    While page < n
                        page = page + 1
                        copy.AddPage(copy.GetImportedPage(reader, page))
                    End While
                Next
            End Using
        End Using
        mergedPdf = ms.ToArray()
    End Using
    Return mergedPdf
End Function
I am now getting the following error:
An item with the same key has already been added.
I did some debugging and have tracked the problem down to the following line:
copy.AddPage(copy.GetImportedPage(reader, page))
Why is this error happening?
Accepted answer by Sean Wessell
I have a console application that monitors the individual subfolders of a designated folder and needs to merge all of the PDFs in each subfolder into a single PDF. I pass an array of file paths as strings, along with the path of the output file I would like.
This is the function I use.
Public Shared Function MergePdfFiles(ByVal pdfFiles() As String, ByVal outputPath As String) As Boolean
    Dim result As Boolean = False
    Dim pdfCount As Integer = 0 'total input pdf file count
    Dim f As Integer = 0 'pointer to current input pdf file
    Dim fileName As String
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim pageCount As Integer = 0
    Dim pdfDoc As iTextSharp.text.Document = Nothing 'the output pdf document
    Dim writer As PdfWriter = Nothing
    Dim cb As PdfContentByte = Nothing
    Dim page As PdfImportedPage = Nothing
    Dim rotation As Integer = 0
    Try
        pdfCount = pdfFiles.Length
        If pdfCount > 1 Then
            'Open the 1st item in the array PDFFiles
            fileName = pdfFiles(f)
            reader = New iTextSharp.text.pdf.PdfReader(fileName)
            'Get page count
            pageCount = reader.NumberOfPages
            pdfDoc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1), 18, 18, 18, 18)
            writer = PdfWriter.GetInstance(pdfDoc, New FileStream(outputPath, FileMode.OpenOrCreate))
            With pdfDoc
                .Open()
            End With
            'Instantiate a PdfContentByte object
            cb = writer.DirectContent
            'Now loop thru the input pdfs
            While f < pdfCount
                'Declare a page counter variable
                Dim i As Integer = 0
                'Loop thru the current input pdf's pages starting at page 1
                While i < pageCount
                    i += 1
                    'Get the input page size
                    pdfDoc.SetPageSize(reader.GetPageSizeWithRotation(i))
                    'Create a new page on the output document
                    pdfDoc.NewPage()
                    'If it is the 1st page, we add bookmarks to the page
                    'Now we get the imported page
                    page = writer.GetImportedPage(reader, i)
                    'Read the imported page's rotation
                    rotation = reader.GetPageRotation(i)
                    'Then add the imported page to the PdfContentByte object as a template based on the page's rotation
                    If rotation = 90 Then
                        cb.AddTemplate(page, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(i).Height)
                    ElseIf rotation = 270 Then
                        cb.AddTemplate(page, 0, 1.0F, -1.0F, 0, reader.GetPageSizeWithRotation(i).Width + 60, -30)
                    Else
                        cb.AddTemplate(page, 1.0F, 0, 0, 1.0F, 0, 0)
                    End If
                End While
                'Increment f and read the next input pdf file
                f += 1
                If f < pdfCount Then
                    fileName = pdfFiles(f)
                    reader = New iTextSharp.text.pdf.PdfReader(fileName)
                    pageCount = reader.NumberOfPages
                End If
            End While
            'When all done, we close the document so that the pdfwriter object can write it to the output file
            pdfDoc.Close()
            result = True
        End If
    Catch ex As Exception
        Return False
    End Try
    Return result
End Function
Answer by Gerald Leesmann
The code that was marked correct does not close all the file streams, so the files stay open within the app and you won't be able to delete unused PDFs within your project.
This is a better solution:
Public Sub MergePDFFiles(ByVal outPutPDF As String)
    Dim StartPath As String = FileArray(0) ' FileArray is a list declared globally
    Dim document = New Document()
    Dim outFile = Path.Combine(outPutPDF) ' The outPutPDF variable is passed in from another sub; it is the output path
    Dim writer = New PdfCopy(document, New FileStream(outFile, FileMode.Create))
    Try
        document.Open()
        For Each fileName As String In FileArray
            Dim reader = New PdfReader(Path.Combine(StartPath, fileName))
            For i As Integer = 1 To reader.NumberOfPages
                Dim page = writer.GetImportedPage(reader, i)
                writer.AddPage(page)
            Next i
            reader.Close()
        Next
        writer.Close()
        document.Close()
    Catch ex As Exception
        'catch an exception if needed
    Finally
        writer.Close()
        document.Close()
    End Try
End Sub
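The same stream-closing idea can also be written with Using blocks, as the code in the question already does for Document and PdfCopy. The following is only a minimal sketch, not the answer author's code: the module name PdfMergeSketch and the parameter names inputPaths and outputPath are hypothetical, and it assumes an iTextSharp 5.x version in which Document, PdfCopy and PdfReader can be disposed (the other code in this thread uses Using and Dispose() on them).

```vb
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

' Hypothetical module and parameter names, for illustration only.
Public Module PdfMergeSketch
    Public Sub MergePdfFiles(ByVal inputPaths As IEnumerable(Of String), ByVal outputPath As String)
        ' Each Using block disposes its object even if an exception is thrown,
        ' so the output stream and every reader are released.
        Using outStream As New FileStream(outputPath, FileMode.Create)
            Using document As New Document()
                Using copy As New PdfCopy(document, outStream)
                    document.Open()
                    For Each inputPath As String In inputPaths
                        Using reader As New PdfReader(inputPath)
                            ' iTextSharp page numbers start at 1.
                            For pageNumber As Integer = 1 To reader.NumberOfPages
                                copy.AddPage(copy.GetImportedPage(reader, pageNumber))
                            Next
                        End Using
                    Next
                End Using
            End Using
        End Using
    End Sub
End Module
```

It could be called as PdfMergeSketch.MergePdfFiles(listOfPaths, "C:\Test Data\PDF\Merged.pdf"). Unlike the code above, this sketch lets exceptions propagate to the caller instead of swallowing them in an empty Catch block.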
Answer by G_Hosa_Phat
I realize I'm pretty late to the party, but after reading the comments from @BrunoLowagie, I wanted to see if I could put something together myself that uses the examples from his linked sample chapter. It's probably overkill, but I put together some code that merges multiple PDFs into a single file and posted it on the Code Review SE site (the post, "VB.NET - Error Handling in Generic Class for PDF Merge", contains the full class code). It only merges PDF files right now, but I'm planning on adding methods for additional functionality later.
The "master" method (towards the end of the Class block in the linked post, and also posted below for reference) handles the actual merging of the PDF files, but the multiple overloads provide a number of options for how to define the list of original files. So far, I've included the following features:
- The methods return a System.IO.FileInfo object if the merge is successful.
- Provide a System.IO.DirectoryInfo object or a System.String identifying a path and it will collect all PDF files in that directory (including sub-directories if specified) to merge.
- Provide a List(Of System.String) or a List(Of System.IO.FileInfo) specifying the PDFs you want to merge.
- Identify how the PDFs should be sorted before the merge (especially useful if you use one of the MergeAll methods to get all PDF files in a directory).
- If the specified output PDF file already exists, you can specify whether or not you want to overwrite it. (I'm considering adding the "ability" to automatically adjust the output PDF file's name if it already exists).
- Warning and Error properties provide a way to get feedback in the calling method, whether or not the merge is successful.
Once the code is in place, it can be used like this:
Dim PDFDir As New IO.DirectoryInfo("C:\Test Data\PDF\")
Dim ResultFile As IO.FileInfo = Nothing
Dim Merger As New PDFManipulator
ResultFile = Merger.MergeAll(PDFDir, "C:\Test Data\PDF\Merged.pdf", True, PDFManipulator.PDFMergeSortOrder.FileName, True)
Here is the "master" method. As I said, it's probably overkill (and I'm still tweaking it some), but I wanted to do my best to try to make it work as effectively as possible. Obviously it requires a reference to itextsharp.dll for access to the library's functions.
I've commented out the references to the Error and Warning properties of the class for this post to help reduce any confusion.
Public Function Merge(ByVal PDFFiles As List(Of System.IO.FileInfo), ByVal OutputFileName As String, ByVal OverwriteExistingPDF As Boolean, ByVal SortOrder As PDFMergeSortOrder) As System.IO.FileInfo
    Dim ResultFile As System.IO.FileInfo = Nothing
    Dim ContinueMerge As Boolean = True
    If OverwriteExistingPDF Then
        If System.IO.File.Exists(OutputFileName) Then
            Try
                System.IO.File.Delete(OutputFileName)
            Catch ex As Exception
                ContinueMerge = False
                'If Errors Is Nothing Then
                '    Errors = New List(Of String)
                'End If
                'Errors.Add("Could not delete existing output file.")
                Throw
            End Try
        End If
    End If
    If ContinueMerge Then
        Dim OutputPDF As iTextSharp.text.Document = Nothing
        Dim Copier As iTextSharp.text.pdf.PdfCopy = Nothing
        Dim PDFStream As System.IO.FileStream = Nothing
        Dim SortedList As New List(Of System.IO.FileInfo)
        Try
            Select Case SortOrder
                Case PDFMergeSortOrder.Original
                    SortedList = PDFFiles
                Case PDFMergeSortOrder.FileDate
                    SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.LastWriteTime).ToList
                Case PDFMergeSortOrder.FileName
                    SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.Name).ToList
                Case PDFMergeSortOrder.FileNameWithDirectory
                    SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.FullName).ToList
            End Select
            If Not IO.Directory.Exists(New IO.FileInfo(OutputFileName).DirectoryName) Then
                Try
                    IO.Directory.CreateDirectory(New IO.FileInfo(OutputFileName).DirectoryName)
                Catch ex As Exception
                    ContinueMerge = False
                    'If Errors Is Nothing Then
                    '    Errors = New List(Of String)
                    'End If
                    'Errors.Add("Could not create output directory.")
                    Throw
                End Try
            End If
            If ContinueMerge Then
                OutputPDF = New iTextSharp.text.Document
                PDFStream = New System.IO.FileStream(OutputFileName, System.IO.FileMode.OpenOrCreate)
                Copier = New iTextSharp.text.pdf.PdfCopy(OutputPDF, PDFStream)
                OutputPDF.Open()
                For Each PDF As System.IO.FileInfo In SortedList
                    If ContinueMerge Then
                        Dim InputReader As iTextSharp.text.pdf.PdfReader = Nothing
                        Try
                            InputReader = New iTextSharp.text.pdf.PdfReader(PDF.FullName)
                            For page As Integer = 1 To InputReader.NumberOfPages
                                Copier.AddPage(Copier.GetImportedPage(InputReader, page))
                            Next page
                            If InputReader.IsRebuilt Then
                                'If Warnings Is Nothing Then
                                '    Warnings = New List(Of String)
                                'End If
                                'Warnings.Add("Damaged PDF: " & PDF.FullName & " repaired and successfully merged into output file.")
                            End If
                        Catch InvalidEx As iTextSharp.text.exceptions.InvalidPdfException
                            'Skip this file
                            'If Errors Is Nothing Then
                            '    Errors = New List(Of String)
                            'End If
                            'Errors.Add("Invalid PDF: " & PDF.FullName & " not merged into output file.")
                        Catch FormatEx As iTextSharp.text.pdf.BadPdfFormatException
                            'Skip this file
                            'If Errors Is Nothing Then
                            '    Errors = New List(Of String)
                            'End If
                            'Errors.Add("Bad PDF Format: " & PDF.FullName & " not merged into output file.")
                        Catch PassworddEx As iTextSharp.text.exceptions.BadPasswordException
                            'Skip this file
                            'If Errors Is Nothing Then
                            '    Errors = New List(Of String)
                            'End If
                            'Errors.Add("Password-protected PDF: " & PDF.FullName & " not merged into output file.")
                        Catch OtherEx As Exception
                            ContinueMerge = False
                        Finally
                            If Not InputReader Is Nothing Then
                                InputReader.Close()
                                InputReader.Dispose()
                            End If
                        End Try
                    End If
                Next PDF
            End If
        Catch ex As iTextSharp.text.pdf.PdfException
            ResultFile = Nothing
            ContinueMerge = False
            'If Errors Is Nothing Then
            '    Errors = New List(Of String)
            'End If
            'Errors.Add("iTextSharp Error: " & ex.Message)
            If System.IO.File.Exists(OutputFileName) Then
                If Not OutputPDF Is Nothing Then
                    OutputPDF.Close()
                    OutputPDF.Dispose()
                End If
                If Not PDFStream Is Nothing Then
                    PDFStream.Close()
                    PDFStream.Dispose()
                End If
                If Not Copier Is Nothing Then
                    Copier.Close()
                    Copier.Dispose()
                End If
                System.IO.File.Delete(OutputFileName)
            End If
            Throw
        Catch other As Exception
            ResultFile = Nothing
            ContinueMerge = False
            'If Errors Is Nothing Then
            '    Errors = New List(Of String)
            'End If
            'Errors.Add("General Error: " & other.Message)
            If System.IO.File.Exists(OutputFileName) Then
                If Not OutputPDF Is Nothing Then
                    OutputPDF.Close()
                    OutputPDF.Dispose()
                End If
                If Not PDFStream Is Nothing Then
                    PDFStream.Close()
                    PDFStream.Dispose()
                End If
                If Not Copier Is Nothing Then
                    Copier.Close()
                    Copier.Dispose()
                End If
                System.IO.File.Delete(OutputFileName)
            End If
            Throw
        Finally
            If Not OutputPDF Is Nothing Then
                OutputPDF.Close()
                OutputPDF.Dispose()
            End If
            If Not PDFStream Is Nothing Then
                PDFStream.Close()
                PDFStream.Dispose()
            End If
            If Not Copier Is Nothing Then
                Copier.Close()
                Copier.Dispose()
            End If
            If System.IO.File.Exists(OutputFileName) Then
                If ContinueMerge Then
                    ResultFile = New System.IO.FileInfo(OutputFileName)
                    If ResultFile.Length <= 0 Then
                        ResultFile = Nothing
                        Try
                            System.IO.File.Delete(OutputFileName)
                        Catch ex As Exception
                            Throw
                        End Try
                    End If
                Else
                    ResultFile = Nothing
                    Try
                        System.IO.File.Delete(OutputFileName)
                    Catch ex As Exception
                        Throw
                    End Try
                End If
            Else
                ResultFile = Nothing
            End If
        End Try
    End If
    Return ResultFile
End Function
Answer by Coder999
Some may have to make a change to the code at "writer = PdfWriter.GetInstance(pdfDoc, New FileStream(outputPath, FileMode.OpenOrCreate))" as iTextSharp may not support
Change to:
Dim fs As IO.FileStream = New IO.FileStream(outputPath, IO.FileMode.Create)
writer = iTextSharp.text.pdf.PdfWriter.GetInstance(pdfDoc, fs)
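The answer does not say exactly what fails with FileMode.OpenOrCreate, so the following is only an illustration of one documented difference between the two modes, not a diagnosis of that problem. FileMode.OpenOrCreate opens an existing file without truncating it, while FileMode.Create truncates it first. The sketch uses only System.IO; the module name FileModeDemo and the temporary file name are made up for the example.

```vb
Imports System
Imports System.IO

' Hypothetical demo module; it does not use iTextSharp at all.
Module FileModeDemo
    Sub Main()
        Dim demoPath As String = Path.Combine(Path.GetTempPath(), "filemode-demo.bin")

        ' Start with a 10-byte file standing in for an older, larger output file.
        File.WriteAllBytes(demoPath, New Byte(9) {})

        ' OpenOrCreate keeps the existing content: only the first 4 bytes are overwritten.
        Using fs As New FileStream(demoPath, FileMode.OpenOrCreate)
            fs.Write(New Byte() {1, 2, 3, 4}, 0, 4)
        End Using
        Console.WriteLine(New FileInfo(demoPath).Length) ' prints 10

        ' Create truncates the existing file before writing.
        Using fs As New FileStream(demoPath, FileMode.Create)
            fs.Write(New Byte() {1, 2, 3, 4}, 0, 4)
        End Using
        Console.WriteLine(New FileInfo(demoPath).Length) ' prints 4

        File.Delete(demoPath)
    End Sub
End Module
```

If the output path already holds a longer file, writing a shorter PDF through OpenOrCreate would leave the old trailing bytes in place, which is one reason FileMode.Create is usually the safer choice for a freshly generated output file.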

