How do you display a formatted Word Doc in HTML/PHP?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5334301/

Time: 2020-08-25 21:02:49  Source: igfitidea

How do you display a formatted Word Doc in HTML/PHP?

php ms-word openxml doc

Asked by mm-93

What is the best way to display a formatted Word Doc in HTML/PHP?

Here is the code I currently have but it doesn't format it:

$word = new COM("word.application") or die ("Could not initialise MS Word object.");
$word->Documents->Open(realpath("ACME.doc"));

// Extract content.
$content = (string) $word->ActiveDocument->Content;

echo $content;

$word->ActiveDocument->Close(false);

$word->Quit();
$word = null;
unset($word);

Accepted answer by Charles

I know nothing about COM, but poking around the Word API docs on MSDN, it looks like your best bet is to use Document.SaveAs to save as wdFormatFilteredHTML to a temporary file, then serve that HTML to the user. Be sure to pick the filtered HTML, otherwise you're going to get the soupiest tag soup ever.
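
The approach described above could look roughly like the following. This is a minimal sketch rather than a tested implementation: it assumes Windows, an installed copy of Word, and PHP's com_dotnet extension. The helper names `tempHtmlPathFor` and `saveAsFilteredHtml` are hypothetical, and 10 is the numeric value of Word's `wdFormatFilteredHTML` constant.

```php
<?php
// Sketch only: needs Windows, Microsoft Word, and PHP's com_dotnet extension.

// Numeric value of Word's wdFormatFilteredHTML save format.
const WD_FORMAT_FILTERED_HTML = 10;

// Hypothetical helper: build a unique temporary .html path for a document.
function tempHtmlPathFor(string $docPath): string
{
    $base = pathinfo($docPath, PATHINFO_FILENAME);
    return sys_get_temp_dir() . DIRECTORY_SEPARATOR . $base . '_' . uniqid() . '.html';
}

// Hypothetical helper: ask Word to save $docPath as filtered HTML at $htmlPath.
// Both paths are assumed to be absolute.
function saveAsFilteredHtml(string $docPath, string $htmlPath): void
{
    $word = new COM('word.application');
    try {
        $doc = $word->Documents->Open($docPath);
        $doc->SaveAs($htmlPath, WD_FORMAT_FILTERED_HTML);
        $doc->Close(false); // false: close without saving changes
    } finally {
        $word->Quit(); // always shut Word down, even if the export failed
    }
}

// Usage, guarded so the file still loads where COM is unavailable.
if (class_exists('COM') && is_file('ACME.doc')) {
    $htmlPath = tempHtmlPathFor('ACME.doc');
    saveAsFilteredHtml(realpath('ACME.doc'), $htmlPath);
    readfile($htmlPath); // send the converted HTML to the browser
    unlink($htmlPath);   // remove the temporary file
}
```

The `finally` block is a design choice: a Word instance left running after a failed export tends to linger as a background process, so the sketch quits Word on every path.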

Answer by mm-93

I figured this out. Check out the solution to reading a Word Doc and formatting it in HTML:

$filename = "ACME.doc";
$word = new COM("word.application") or die ("Could not initialise MS Word object.");
$word->Documents->Open(realpath($filename));

// Swap the ".doc" extension for ".html".
$new_filename = substr($filename,0,-4) . ".html";
// SaveAs needs the full path; the same path is reused to read the file back.
$output_path = "C:/a1/projects/---full path--- /".$new_filename;

// the '2' parameter specifies saving in txt format
// the '6' parameter specifies saving in rtf format
// the '8' parameter specifies saving in html format
$word->Documents[1]->SaveAs($output_path,8);
$word->Documents[1]->Close(false);
$word->Quit();
//$word->Release();
$word = NULL;
unset($word);

// Read the exported HTML back and send it to the browser.
$fh = fopen($output_path, 'r');
$contents = fread($fh, filesize($output_path));
echo $contents;
fclose($fh);
//unlink($output_path);

A couple of things... Having "charset=UTF-8" at the top of my PHP page was adding a bunch of diamonds with question marks. I deleted that and it works perfectly.
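
Diamonds with question marks are what a browser shows for bytes that are invalid in the declared encoding. One plausible explanation, which the answer does not confirm, is that Word wrote the file in a legacy encoding such as windows-1252 while the page declared UTF-8. If you would rather keep the UTF-8 declaration, a sketch along these lines converts the exported HTML first. `wordHtmlToUtf8` is a hypothetical helper, and the windows-1252 fallback is an assumption.

```php
<?php
// Assumption: the export declares its encoding in a <meta ... charset=...> tag.
// Windows-1252 is only a fallback guess when no declaration is found.
function wordHtmlToUtf8(string $html, string $fallback = 'Windows-1252'): string
{
    // Group 1: "charset=" plus an optional opening quote. Group 2: the charset name.
    $pattern = '/(charset\s*=\s*["\']?)([\w-]+)/i';
    $charset = preg_match($pattern, $html, $m) ? $m[2] : $fallback;

    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $html; // already UTF-8, nothing to convert
    }

    $utf8 = iconv($charset, 'UTF-8', $html);
    if ($utf8 === false) {
        throw new RuntimeException("Cannot convert from $charset to UTF-8");
    }

    // Rewrite the first declared charset so it matches the converted bytes.
    return preg_replace($pattern, '${1}utf-8', $utf8, 1);
}
```

With this in place, `echo wordHtmlToUtf8($contents);` could replace the plain `echo $contents;` while the page keeps its UTF-8 declaration.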

Also, SaveAs needs the full path, at least locally; I added that to get it to work.
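
A likely reason, stated here as an assumption, is that Word runs as its own process and does not resolve relative paths against the PHP script's directory. Rather than hard-coding the folder, the absolute target can be derived from the source document. `absoluteHtmlPath` below is a hypothetical helper, not part of the original answer.

```php
<?php
// Hypothetical helper: absolute path of the .html file that sits next to $docFile.
function absoluteHtmlPath(string $docFile): string
{
    // realpath() resolves relative directories against PHP's working directory.
    $dir = realpath(dirname($docFile));
    if ($dir === false) {
        throw new InvalidArgumentException("Directory of $docFile does not exist");
    }
    // PATHINFO_FILENAME drops the extension, so ".doc" and ".docx" both work.
    return $dir . DIRECTORY_SEPARATOR . pathinfo($docFile, PATHINFO_FILENAME) . '.html';
}
```

The result can be passed to SaveAs and reused when reading the file back, so both steps agree on one location.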

Thanks again for your help.

Answer by Tony Nassar

I needed correct XHTML, which Office won't give you (I do not understand that). You can use tools such as JTidy or TagSoup to fix the HTML, if you need to. Cf. http://slideguitarist.blogspot.com/2011/03/exporting-word-documents-to-html.html
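
JTidy and TagSoup are Java tools. If you want to stay in PHP, the optional tidy extension (a binding to HTML Tidy, of which JTidy is a port) can do a similar cleanup. The following is a sketch under that substitution, not the method this answer used; `repairToXhtml` is a hypothetical helper and the option set is only a starting point.

```php
<?php
// Hypothetical helper: repair HTML into well-formed XHTML with PHP's tidy extension.
function repairToXhtml(string $html): string
{
    if (!function_exists('tidy_repair_string')) {
        throw new RuntimeException('The tidy extension is not loaded.');
    }
    $config = [
        'output-xhtml' => true, // emit well-formed XHTML
        'word-2000'    => true, // strip Word-specific markup where Tidy can
        'wrap'         => 0,    // do not re-wrap long lines
    ];
    $fixed = tidy_repair_string($html, $config, 'utf8');
    if ($fixed === false) {
        throw new RuntimeException('Tidy could not repair the document.');
    }
    return $fixed;
}
```

The input is assumed to be UTF-8 already; pass a different encoding name as the third argument of `tidy_repair_string` if the Word export uses another one.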
