HTML: Convert pdf, doc, ppt to HTML5

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3199659/

Time: 2020-08-29 03:43:12  Source: igfitidea

Convert pdf, doc, ppt to html5

html scribd document-conversion

Asked by KevMo

I've googled (without any luck) for open source software that can convert doc, ppt, and pdf to HTML5 (exactly what Scribd does). Are there open source equivalents to the type of conversion Scribd does?

If anyone knows of a paid service, that would also work. Scribd has an API, but that's for use with the Flash viewer. Also, I would like to host my own content, as I need further control over the converted HTML document.

Answered by imoatama

You're unlikely to find a single offering that does all this, especially in the open source world. It's more likely that you'll end up relying on a mishmash of tools, and you may even need to chain some converters in order to get to HTML (e.g. PDF -> PS -> HTML).

OpenOffice supports conversion to HTML, and can be called from the command line.

http://pdftohtml.sourceforge.net/ looks reasonably good at converting PDF to HTML.
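
The "mishmash" of converters can be hidden behind one small dispatcher. The following is only a rough sketch, not a tested pipeline: it assumes that a `soffice` launcher (the LibreOffice-style `--headless --convert-to` syntax; older OpenOffice builds may spell the options differently) and a `pdftohtml` binary are installed and on the PATH. The function names `build_command` and `convert` are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumption: these extensions are handed to the office suite; adjust as needed.
OFFICE_EXTENSIONS = {".doc", ".docx", ".ppt", ".pptx", ".odt", ".odp"}


def build_command(source: str, out_dir: str) -> List[str]:
    """Return the command line that would convert `source` to HTML."""
    path = Path(source)
    ext = path.suffix.lower()
    if ext == ".pdf":
        # pdftohtml takes the input PDF followed by an output name.
        return ["pdftohtml", "-noframes", str(path), str(Path(out_dir) / path.stem)]
    if ext in OFFICE_EXTENSIONS:
        # LibreOffice-style headless conversion (assumed syntax, see above).
        return ["soffice", "--headless", "--convert-to", "html",
                "--outdir", out_dir, str(path)]
    raise ValueError(f"no converter configured for {ext!r}")


def convert(source: str, out_dir: str) -> None:
    """Run the chosen converter; raises CalledProcessError if the tool fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_command(source, out_dir), check=True)
```

Calling `convert("slides.ppt", "out")` would then shell out to the office suite, while a `.pdf` input goes to pdftohtml. Output quality still depends entirely on the underlying tools.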

For Word documents in WordML or OpenXML format, it's conceivable that you could use XSLT transforms, since both the input and output formats are XML. I've seen some stylesheets floating around the net that do this, but YMMV.
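
To illustrate why XML input makes this tractable, here is a minimal sketch that walks the XML directly with the Python standard library. Note that this is not XSLT (the standard library has no XSLT processor); it is a swapped-in, much cruder technique. It assumes an OpenXML `.docx` file, which is a zip archive whose main body lives in `word/document.xml`, and it keeps only paragraph text, dropping all styling, images, and tables. The function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML main namespace used by <w:p> (paragraph) and <w:t> (text).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_xml_to_html(document_xml: bytes) -> str:
    """Turn the bytes of word/document.xml into a bare-bones HTML page."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(f"<p>{escape(text)}</p>")
    body = "\n".join(paragraphs)
    return f"<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>"


def docx_to_html(path: str) -> str:
    """Read a .docx archive and convert its main document part."""
    with zipfile.ZipFile(path) as archive:
        return docx_xml_to_html(archive.read("word/document.xml"))
```

A real stylesheet or converter would also have to map runs, styles, lists, and embedded media, which is where most of the effort goes.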

Incidentally, why is there a specific requirement for open source? MS PowerPoint already supports save-as-HTML, for example.

Answered by Mark Essel

Open Office will convert PDF to HTML, but you'll take a hit in design quality.

I suggest either Crocodoc as a paid service (it provides different flavours for different platforms such as Python, Ruby, Java, and PHP; developers are allowed to work on their APIs) or waiting for an official Adobe tool (it's in the works).

Answered by amit_saxena

For PDF to HTML conversion, pdf2htmlEX seems like a pretty good tool (looking at all the examples/samples):

https://github.com/coolwanglu/pdf2htmlEX
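
If you want to drive pdf2htmlEX from a script, something like the sketch below might work. It assumes the `pdf2htmlEX` binary is installed and on the PATH, and it uses the `--zoom` and `--dest-dir` options as I understand them from the project's documentation; check `pdf2htmlEX --help` for your build. The wrapper function names are invented for this example.

```python
import shutil
import subprocess
from typing import List


def pdf2htmlex_command(pdf_path: str, dest_dir: str, zoom: float = 1.0) -> List[str]:
    """Build the pdf2htmlEX command line (options assumed, see note above)."""
    return ["pdf2htmlEX", "--zoom", str(zoom), "--dest-dir", dest_dir, pdf_path]


def convert_pdf(pdf_path: str, dest_dir: str, zoom: float = 1.0) -> None:
    """Run pdf2htmlEX, failing early with a clear error if it is missing."""
    if shutil.which("pdf2htmlEX") is None:
        raise RuntimeError("pdf2htmlEX not found on PATH")
    subprocess.run(pdf2htmlex_command(pdf_path, dest_dir, zoom), check=True)
```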

Answered by PF4Public

http://wvware.sourceforge.net/

wvHtml: convert your Word document into HTML 4.0.

Possibly: http://www.abisource.com/ but in this case it looks like you have to do "open doc" > "export html" manually; maybe plugins help. I'm not sure what you mean by "source software that can convert".

Or this: http://www.zope.org/Members/sf/NuxDocument

Also, pdftohtml will give you HTML page output. But you will have to work on its graphical interface, since it doesn't seem to be very interactive.

Answered by Doua Beri

For PDF there is an open source project started by Mozilla, and it's very good: https://github.com/mozilla/pdf.js/

You can see a hello world example: https://github.com/mozilla/pdf.js/tree/master/examples/helloworld

For the rest of the document types, I think LibreOffice said that they are planning to build something in HTML5, but so far nothing has been done.

Answered by siddhant Kumar

I know the question is a bit old; however, I have found a new open source tool called FlexPaper: http://flexpaper.devaldi.com/