Use PDF.js to statically convert a PDF to HTML

Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/16785198/

Time: 2020-10-27 05:59:11  Source: igfitidea

Tags: javascript, firefox, pdf.js

Asked by TheFlash

PDF.js is the latest library from Mozilla, a standards-based PDF renderer written entirely in JavaScript. Currently you cannot access the generated HTML, and the library can only be used as a viewer. Is it possible to use PDF.js to statically convert a PDF to its HTML equivalent? Since it renders in a browser, the output must be HTML+CSS, with the JS used only for navigation.

After converting it to HTML I plan to use our existing HTML workflow to import/index/consume the page as if it were an ordinary HTML webpage.

Answered by Asad Malik

Note: this is for the original question, as well as for others who may be visiting this for related help, as was the case with me. ;)

Answer:
You may try Poppler, or pdf2htmlEX, which is based on Poppler.

I'd recommend looking at the pdf2htmlEX documentation; it also has a very good comparison table.

Answered by Ika

PDF.js renders to a canvas, so it can't be used to statically convert a PDF to HTML.
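
The canvas output is only pixels, but PDF.js also exposes the text of a page through page.getTextContent(), whose items carry a `str` string and a six-element `transform` matrix. As a rough sketch of how far that gets you, the helper below turns items of that shape into absolutely positioned HTML spans. The names textItemsToHtml and escapeHtml are hypothetical and not part of PDF.js, the font size and vertical offset are approximations, and images, vector graphics and embedded fonts are ignored, so this is not a faithful static conversion.

```javascript
// Hypothetical helpers, not part of PDF.js. They work on plain objects shaped
// like the text items PDF.js returns from page.getTextContent().

// Escape characters that would otherwise be interpreted as HTML markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// items: array of { str, transform: [a, b, c, d, x, y] } in PDF units,
// with the origin at the bottom-left corner of the page.
// pageWidth / pageHeight: page size in the same units.
function textItemsToHtml(items, pageWidth, pageHeight) {
  const spans = items
    // Skip items that carry no visible text.
    .filter((item) => typeof item.str === "string" && item.str.trim() !== "")
    .map((item) => {
      const [, , c, d, x, y] = item.transform;
      // Approximation: take the vertical scale of the matrix as the font size.
      const fontSize = Math.hypot(c, d);
      // Flip the y axis (CSS origin is top-left) and move up from the
      // baseline by roughly one font height.
      const top = pageHeight - y - fontSize;
      const style =
        `position:absolute;left:${x.toFixed(2)}px;top:${top.toFixed(2)}px;` +
        `font-size:${fontSize.toFixed(2)}px;white-space:pre`;
      return `<span style="${style}">${escapeHtml(item.str)}</span>`;
    });
  const pageStyle = `position:relative;width:${pageWidth}px;height:${pageHeight}px`;
  return `<div style="${pageStyle}">\n${spans.join("\n")}\n</div>`;
}
```

With the real library, the items would come from something like `const { items } = await page.getTextContent()` and the page size from `page.getViewport({ scale: 1 })`. The result could then be written to a file and fed into an existing HTML workflow, with the caveat that only the text survives.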

Answered by TheFlash

DocPub is powered by PDFNet, a PDF SDK with C# support, which supports converting PDF to HTML offline.

WebViewer from the same company is an HTML5-based PDF viewer that renders documents on-the-fly within the browser.

WebViewer works with all major Web platforms; the viewer can be directly embedded and customized within any HTML5, Silverlight, or Flash application. The content can be instantly accessed from any system or device - including iPad/iPhone (iOS), Android, Windows (desktop & tablets), WP8, Linux, Mac, etc. -- demo

Answered by TheFlash

AccuSoft has an HTML5-based PDF/DOC viewer called Prizm. I don't think this can convert the PDF statically to HTML, but it looks like a functional HTML5-based viewer. I have no experience with it, but the online HTML5 demo (the link) looks pretty impressive. They claim it can be used on PC & Mobile for great rendering of such files.

Accusoft HTML5 viewing technology can display virtually any document file—DOC, PDF, PPT, CAD and dozens more—through the native browser on almost any smartphone or tablet, with no additional apps or players required on users' devices.
