C#: How to convert Word files to PDF programmatically

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/607669/

Time: 2020-08-04 10:07:53  Source: igfitidea

How do I convert Word files to PDF programmatically?

c# vb.net pdf ms-word

Asked by Shaul Behr

I have found several open-source/freeware programs that allow you to convert .doc files to .pdf files, but they're all of the application/printer driver variety, with no SDK attached.

I have found several programs that do have an SDK allowing you to convert .doc files to .pdf files, but they're all of the proprietary type, $2,000 a license or thereabouts.

Does anyone know of any clean, inexpensive (preferably free) programmatic solution to my problem, using C# or VB.NET?

Thanks!

Accepted answer by Eric Ness

Use a foreach loop instead of a for loop - it solved my problem.

int j = 0;
foreach (Microsoft.Office.Interop.Word.Page p in pane.Pages)
{
    var bits = p.EnhMetaFileBits;
    var target = path1 +j.ToString()+  "_image.doc";
    try
    {
        using (var ms = new MemoryStream((byte[])(bits)))
        {
            var image = System.Drawing.Image.FromStream(ms);
            var pngTarget = Path.ChangeExtension(target, "png");
            image.Save(pngTarget, System.Drawing.Imaging.ImageFormat.Png);
        }
    }
    catch (System.Exception ex)
    {
        MessageBox.Show(ex.Message);  
    }
    j++;
}

Here is a modification of a program that worked for me. It uses Word 2007 with the Save As PDF add-in installed. It searches a directory for .doc files, opens them in Word and then saves them as PDF. Note that you'll need to add a reference to Microsoft.Office.Interop.Word to the solution.

using Microsoft.Office.Interop.Word;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

...

// Create a new Microsoft Word application object
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();

// Before C# 4.0 there are no optional arguments, so we'll need a dummy value
object oMissing = System.Reflection.Missing.Value;

// Get list of Word files in specified directory
DirectoryInfo dirInfo = new DirectoryInfo(@"\\server\folder");
FileInfo[] wordFiles = dirInfo.GetFiles("*.doc");

word.Visible = false;
word.ScreenUpdating = false;

foreach (FileInfo wordFile in wordFiles)
{
    // Cast as Object for word Open method
    Object filename = (Object)wordFile.FullName;

    // Use the dummy value as a placeholder for optional arguments
    Document doc = word.Documents.Open(ref filename, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing);
    doc.Activate();

    object outputFileName = Path.ChangeExtension(wordFile.FullName, ".pdf");
    object fileFormat = WdSaveFormat.wdFormatPDF;

    // Save document into PDF Format
    doc.SaveAs(ref outputFileName,
        ref fileFormat, ref oMissing, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing);

    // Close the Word document, but leave the Word application open.
    // doc has to be cast to type _Document so that it will find the
    // correct Close method.                
    object saveChanges = WdSaveOptions.wdDoNotSaveChanges;
    ((_Document)doc).Close(ref saveChanges, ref oMissing, ref oMissing);
    doc = null;
}

// word has to be cast to type _Application so that it will find
// the correct Quit method.
((_Application)word).Quit(ref oMissing, ref oMissing, ref oMissing);
word = null;
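
With C# 4.0 or later, the same loop can probably be written without the oMissing placeholders, because optional COM arguments may be omitted. The following is only a sketch under assumptions: it needs Word installed and a reference to Microsoft.Office.Interop.Word, the class and method names (WordToPdf, ConvertFolder) are hypothetical, and it uses SaveAs2 as seen in other answers on this page rather than the SaveAs call above. The try/finally blocks are there so that Word is asked to quit even if a conversion throws.

```csharp
using System.IO;
using Word = Microsoft.Office.Interop.Word;

static class WordToPdf
{
    // Hypothetical helper: converts every *.doc file in a folder to PDF.
    public static void ConvertFolder(string folder)
    {
        Word.Application word = new Word.Application();
        word.Visible = false;
        try
        {
            foreach (string file in Directory.GetFiles(folder, "*.doc"))
            {
                // Optional arguments are simply omitted (C# 4.0+)
                Word.Document doc = word.Documents.Open(file, ReadOnly: true);
                try
                {
                    string pdfPath = Path.ChangeExtension(file, ".pdf");
                    doc.SaveAs2(pdfPath, Word.WdSaveFormat.wdFormatPDF);
                }
                finally
                {
                    // Cast to _Document so the Close method (not the event) is picked
                    ((Word._Document)doc).Close(Word.WdSaveOptions.wdDoNotSaveChanges);
                }
            }
        }
        finally
        {
            // Cast to _Application so the Quit method (not the event) is picked
            ((Word._Application)word).Quit();
        }
    }
}
```

A call such as WordToPdf.ConvertFolder(@"\\server\folder") would then mirror the program above.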

Answer by Todd Gamblin

Answer by MikeW

Seems to be some relevant info here:

Converting MS Word Documents to PDF in ASP.NET

Also, since Office 2007 has publish-to-PDF functionality, I guess you could use Office automation to open the *.DOC file in Word 2007 and save it as PDF. I'm not too keen on Office automation, as it's slow and prone to hanging, but just throwing that out there...

Answer by Mark Brackett

PDFCreator has a COM component, callable from .NET or VBScript (samples included in the download).

But, it seems to me that a printer is just what you need - just mix that with Word's automation, and you should be good to go.

Answer by Arvand

The Microsoft PDF add-in for Word seems to be the best solution for now, but keep in mind that it does not convert all Word documents to PDF correctly; in some cases you will see huge differences between the Word document and the output PDF. Unfortunately, I couldn't find any API that converts all Word documents correctly. The only way I found to ensure the conversion was 100% correct was to convert the documents through a printer driver. The downside is that documents are queued and converted one by one, but you can be sure the resulting PDF has exactly the same layout as the Word document. I personally preferred UDC (Universal Document Converter). I also installed Foxit Reader (free version) on the server, then printed the documents by starting a "Process" with its Verb property set to "print". You can also use FileSystemWatcher to set a signal when the conversion has completed.
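
The "print" verb plus FileSystemWatcher idea could be sketched roughly as follows. This is not the code from the answer; it assumes that the default printer is a virtual PDF printer configured to save its output automatically into outputDir, and the names PrintToPdf, CreatePrintStartInfo and PrintAndWait are made up for illustration. Note that the Created event fires when the file first appears, which may be before the printer driver has finished writing it.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

static class PrintToPdf
{
    // Builds start info that asks the shell to "print" the document
    // with whatever application is registered for its file type.
    public static ProcessStartInfo CreatePrintStartInfo(string documentPath)
    {
        return new ProcessStartInfo(documentPath)
        {
            Verb = "print",
            UseShellExecute = true, // verbs are only honored via the shell
            CreateNoWindow = true,
            WindowStyle = ProcessWindowStyle.Hidden
        };
    }

    // Starts printing and waits until a *.pdf file appears in outputDir
    // or the timeout elapses. Returns true if a PDF was created in time.
    public static bool PrintAndWait(string documentPath, string outputDir, TimeSpan timeout)
    {
        using (var done = new ManualResetEventSlim(false))
        using (var watcher = new FileSystemWatcher(outputDir, "*.pdf"))
        {
            watcher.Created += (sender, e) => done.Set();
            watcher.EnableRaisingEvents = true;

            // Process.Start may return null when the shell reuses a running process
            using (Process process = Process.Start(CreatePrintStartInfo(documentPath)))
            {
                return done.Wait(timeout);
            }
        }
    }
}
```

A caller might use PrintToPdf.PrintAndWait(path, outputDir, TimeSpan.FromMinutes(2)) and treat a false result as a failed or stuck conversion.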

Answer by Elger Mensonides

To sum it up for VB.NET users, here is the free option (Office must be installed):

Microsoft Office assemblies download:

VB.NET example:

        Dim word As Application = New Application()
        Dim doc As Document = word.Documents.Open("c:\document.docx")
        doc.Activate()
        doc.SaveAs2("c:\document.pdf", WdSaveFormat.wdFormatPDF)
        doc.Close()
        ' Quit Word so that no WINWORD.EXE process is left running
        word.Quit()

Answer by Ggalla1779

I went through the Word-to-PDF pain when someone dumped 10,000 Word files on me to convert to PDF. I did it in C# using Word interop, but it was slow and crashed if I tried to use the PC at all... very frustrating.

This led me to discover that I could drop interop and its slowness. For Excel I use EPPlus, and then I discovered a free tool called Spire that allows converting to PDF... with limitations!

http://www.e-iceblue.com/Introduce/free-doc-component.html#.VtAg4PmLRhE
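
For illustration, a conversion with Spire.Doc might look roughly like the sketch below. It assumes the free Spire.Doc assembly from the link above is referenced in the project; the class name and file paths are placeholders, and the library's own documentation should be checked for the limitations of the free edition.

```csharp
using Spire.Doc;

class SpireExample
{
    static void Main()
    {
        // Placeholder paths; adjust them to your environment
        Document document = new Document();
        document.LoadFromFile(@"C:\docs\input.docx");

        // Save the loaded document as PDF; no Word installation is involved
        document.SaveToFile(@"C:\docs\output.pdf", FileFormat.PDF);
    }
}
```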

Answer by zeta

Just wanted to add that I used the Microsoft.Interop libraries, specifically the ExportAsFixedFormat function, which I did not see used in this thread.

using Microsoft.Office.Interop.Word;
using System.Runtime.InteropServices;
using System.IO;
using Microsoft.Office.Core;

Application app;

public string CreatePDF(string path, string exportDir)
{
    Application app = new Application();
    app.DisplayAlerts = WdAlertLevel.wdAlertsNone;
    app.Visible = true;

    var objPresSet = app.Documents;
    var objPres = objPresSet.Open(path, MsoTriState.msoTrue, MsoTriState.msoTrue, MsoTriState.msoFalse);

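    // Take only the file name: Path.Combine ignores exportDir when given a rooted path
    var pdfFileName = Path.ChangeExtension(Path.GetFileName(path), ".pdf");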
    var pdfPath = Path.Combine(exportDir, pdfFileName);

    try
    {
        objPres.ExportAsFixedFormat(
            pdfPath,
            WdExportFormat.wdExportFormatPDF,
            false,
            WdExportOptimizeFor.wdExportOptimizeForPrint,
            WdExportRange.wdExportAllDocument
        );
    }
    catch
    {
        pdfPath = null;
    }
    finally
    {
        objPres.Close();
        // Quit Word so the WINWORD.EXE process does not linger
        app.Quit();
    }
    return pdfPath;
}
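
One detail worth care when building the output path: Path.Combine discards its earlier arguments when a later argument is a rooted path, so the file name should be extracted from the source path before combining it with the export directory. A minimal sketch, where PdfPaths and BuildPdfPath are hypothetical names that do not appear in the answer:

```csharp
using System;
using System.IO;

static class PdfPaths
{
    // Returns "<exportDir>/<source file name with a .pdf extension>".
    public static string BuildPdfPath(string sourcePath, string exportDir)
    {
        // Strip the directory part so that exportDir is not ignored
        string fileName = Path.GetFileName(sourcePath);
        return Path.Combine(exportDir, Path.ChangeExtension(fileName, ".pdf"));
    }
}
```

The result can then be passed as the first argument of ExportAsFixedFormat.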

Answer by daniele3004

Simple code and solution using Microsoft.Office.Interop.Word to convert Word to PDF

using Word = Microsoft.Office.Interop.Word;

private void convertDOCtoPDF()
{

  object misValue = System.Reflection.Missing.Value;
  String  PATH_APP_PDF = @"c:\..\MY_WORD_DOCUMENT.pdf";

  var WORD = new Word.Application();

  Word.Document doc   = WORD.Documents.Open(@"c:\..\MY_WORD_DOCUMENT.docx");
  doc.Activate();

  doc.SaveAs2(@PATH_APP_PDF, Word.WdSaveFormat.wdFormatPDF, misValue, misValue, misValue, 
  misValue, misValue, misValue, misValue, misValue, misValue, misValue);

  doc.Close();
  WORD.Quit();


  releaseObject(doc);
  releaseObject(WORD);

}

Add this procedure to release memory:

private void releaseObject(object obj)
{
  try
  {
      System.Runtime.InteropServices.Marshal.ReleaseComObject(obj);
      obj = null;
  }
  catch (Exception ex)
  {
      //TODO
  }
  finally
  {
     GC.Collect();
  }
}