PHP: Convert HTML + CSS to PDF
Original question: http://stackoverflow.com/questions/391005/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Convert HTML + CSS to PDF
Asked by cletus
I have an HTML (not XHTML) document that renders fine in Firefox 3 and IE 7. It uses fairly basic CSS to style it and renders fine in HTML.
I'm now after a way of converting it to PDF. I have tried:
- DOMPDF: it had huge problems with tables. I factored out my large nested tables and it helped (before, it was just consuming up to 128M of memory and then dying; that's my memory limit in php.ini), but it makes a complete mess of tables and doesn't seem to get images. The tables were just basic stuff with some border styles to add some lines at various points;
- HTML2PDF and HTML2PS: I actually had better luck with this. It rendered some of the images (all the images are Google Chart URLs) and the table formatting was much better but it seemed to have some complexity problem I haven't figured out yet and kept dying with unknown node_type() errors. Not sure where to go from here; and
- Htmldoc: this seems to work fine on basic HTML but has almost no support for CSS whatsoever so you have to do everything in HTML (I didn't realize it was still 2001 in Htmldoc-land...) so it's useless to me.
I tried a Windows app called Html2Pdf Pilot that actually did a pretty decent job but I need something that at a minimum runs on Linux and ideally runs on-demand via PHP on the Webserver.
What am I missing, or how can I resolve this issue?
Accepted answer by SchizoDuckie
Important: Please note that this answer was written in 2009 and might not be the most cost-effective solution today, in 2019. Online alternatives are better at this today than they were back then.
Here are some online services that you can use:
Have a look at PrinceXML.
It's definitely the best HTML/CSS to PDF converter out there, although it's not free. (But hey, your programming might not be free either, so if it saves you 10 hours of work, you're home free, since you also need to take into account that the alternative solutions will require you to set up a dedicated server with the right software.)
Oh yeah, did I mention that this is the first (and probably only) HTML2PDF solution that fully passes the Acid2 test?
Answer by Mic
Have a look at wkhtmltopdf. It is free, open source, and based on WebKit.
We wrote a small tutorial here.
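As a rough illustration of how wkhtmltopdf can be driven from PHP, the sketch below builds the command line and leaves the actual call commented out. It assumes the `wkhtmltopdf` binary is installed and on the PATH; the helper name `buildWkhtmltopdfCommand` and the file names are made up for this example.

```php
<?php
// Hypothetical helper: builds a shell command of the form
//   wkhtmltopdf <input> <output>
// Both arguments are escaped so unusual file names cannot break the command.
function buildWkhtmltopdfCommand(string $input, string $output): string
{
    return 'wkhtmltopdf ' . escapeshellarg($input) . ' ' . escapeshellarg($output);
}

$command = buildWkhtmltopdfCommand('report.html', 'report.pdf');
echo $command, PHP_EOL;

// To actually convert (assumes wkhtmltopdf is installed and on the PATH):
// exec($command, $outputLines, $exitCode);
// if ($exitCode !== 0) { /* conversion failed; inspect $outputLines */ }
```

Keeping the command construction in one small function makes it easy to log or inspect the exact command before running it on a server.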
EDIT (2017):
If I were to build something today, I wouldn't go that route anymore, but would use http://pdfkit.org/ instead, probably stripping it of all its Node.js dependencies so it can run in the browser.
Answer by cletus
After some investigation and general hair-pulling, the solution seems to be HTML2PDF. DOMPDF did a terrible job with tables, borders and even moderately complex layout, and htmldoc seems reasonably robust but is almost completely CSS-ignorant, and I don't want to go back to doing HTML layout without CSS just for that program.
HTML2PDF looked the most promising, but I kept getting a weird error about null reference arguments to node_type. I finally found the solution. Basically, PHP 5.1.x worked fine with regex replaces (preg_replace_*) on strings of any size. PHP 5.2.1 introduced a php.ini config directive called pcre.backtrack_limit. This setting caps how much backtracking the regex engine may do during a match, which in practice limits the length of string these replaces can handle. Why this was introduced I don't know. The default value was chosen as 100,000. Why such a low value? Again, no idea.
A bug was raised against PHP 5.2.1 for this, which is still open almost two years later.
What's horrifying about this is that when the limit is exceeded, the replace just silently fails. At least if an error had been raised and logged you'd have some indication of what happened, why and what to change to fix it. But no.
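The failure can at least be detected after the fact: `preg_replace()` returns `null` on error, and `preg_last_error()` reports the reason. The sketch below is only an illustration; the limit is set artificially low and the PCRE JIT is switched off (both assumptions made purely to reproduce the condition), and the pattern is a deliberately backtracking-heavy one rather than anything HTML2PDF uses.

```php
<?php
// Assumption: force a tiny limit and disable the PCRE JIT so the failure is easy to reproduce.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// A pattern that backtracks heavily on a subject that cannot match.
$result = preg_replace('/(?:\D+|<\d+>)*[!?]/', '', 'foobar foobar foobar');

if ($result === null) {
    // preg_replace() returns null on error; preg_last_error() says why.
    if (preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
        echo "Backtrack limit exhausted; consider raising pcre.backtrack_limit", PHP_EOL;
    } else {
        echo "PCRE error code: ", preg_last_error(), PHP_EOL;
    }
}
```

Checking for a `null` result after every large replace turns the silent failure into something that can be logged.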
So I have a 70k HTML file to turn into PDF. It requires the following php.ini settings:
- pcre.backtrack_limit = 2000000; # probably more than I need but that's OK
- memory_limit = 1024M; # yes, one gigabyte; and
- max_execution_time = 600; # yes, 10 minutes.
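If editing php.ini is not an option, the same limits can usually be raised at runtime for just the conversion script. A sketch, assuming the host allows these settings to be changed with `ini_set()`:

```php
<?php
// Values mirror the php.ini settings listed above.
ini_set('pcre.backtrack_limit', '2000000');
ini_set('memory_limit', '1024M');      // one gigabyte
ini_set('max_execution_time', '600');  // ten minutes

// Print what is actually in effect; ini_set() returns false if a change was refused.
foreach (['pcre.backtrack_limit', 'memory_limit', 'max_execution_time'] as $name) {
    echo $name, ' = ', ini_get($name), PHP_EOL;
}
```

Scoping the change to one script avoids giving every request on the server a one-gigabyte memory ceiling.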
Now the astute reader may have noticed that my HTML file is smaller than 100k. The only reason I can guess as to why I hit this problem is that html2pdf does a conversion into xhtml as part of the process. Perhaps that took me over (although nearly 50% bloat seems odd). Whatever the case, the above worked.
Now, html2pdf is a resource hog. My 70k file takes approximately 5 minutes and at least 500-600M of RAM to create a 35-page PDF file. Unfortunately that is not quick enough (by far) for a real-time download, and the ratio of memory used to input size is well over 1000-to-1 (600M of RAM for a 70k file works out to roughly 8,500-to-1), which is utterly ridiculous.
Unfortunately, that's the best I've come up with.
Answer by Karthick
Why don't you try mPDF version 2.0? I used it to create a PDF document. It works fine.
Meanwhile, mPDF is at version 5.7 and is actively maintained, in contrast to HTML2PS/HTML2PDF.
But keep in mind that the documentation can be really hard to handle. For example, take a look at this page: https://mpdf.github.io/.
Very basic HTML-to-PDF tasks can be done with this library, but more complex tasks will take some time spent reading and "understanding" the documentation.
Answer by T.Todua
1) Use mPDF!
a) Extract it into yourfolder.
b) Create file.php in yourfolder and insert the following code:
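<?php
// Load the mPDF class (legacy, pre-Composer layout; adjust the path to where mpdf.php was extracted).
include('../mpdf.php');
$mpdf = new mPDF();
// Render an HTML fragment with inline CSS into the PDF.
$mpdf->WriteHTML('<p style="color:red;">Hello World<br/>First sentence</p>');
// Send the PDF to the browser.
$mpdf->Output();
exit;
?>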
c) Open file.php in your browser.
2) Use pdftohtml! (This covers the reverse direction, PDF to HTML.)
1) Extract pdftohtml.exe to your root folder.
2) Inside that folder, in any PHP file (anyfile.php), put this code (assuming there is also a source file example.pdf):
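<?php
$source = "example.pdf";
$output_fold = "FinalFolder";
// Create the output folder if it does not exist yet.
if (!file_exists($output_fold)) { mkdir($output_fold, 0777, true); }
// Run pdftohtml; passthru() echoes the tool's output and stores its exit status in $status.
passthru("pdftohtml " . escapeshellarg($source) . " " . escapeshellarg("$output_fold/new_filename"), $status);
//var_dump($status);
?>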
3) Open FinalFolder, and the converted files will be there (as many pages as the source PDF had).
Answer by Darryl Hein
Answer by Filip Dupanović
Just to bump the thread, I've tried DOMPDF and it worked perfectly. I've used DIV and other block-level elements to position everything, I kept it strictly CSS 2.1, and it played very nicely.
Answer by Starkers
It's already been mentioned, but I'd just like to confirm that mpdf is the easiest, most powerful and most free HTML to pdf converter out there. The sky's really the limit. You can even generate pdf of dynamic, user-generated data.
For instance, a client wanted a CMS system so he could update the tracklist of the music he played at his club. That was no problem, but he also wanted users to be able to download a .pdf of the playlist, and so this downloadable pdf had to be updated by the cms too. Thanks to mpdf, with some simple loops and interspersed variables I could do just that. Something that I thought would take me weeks literally took me minutes.
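The "simple loops and interspersed variables" approach might look roughly like this: build an HTML string from the data, then hand it to the PDF library. The track data and the `renderPlaylistHtml` helper are invented for illustration, and the final mPDF call is left as a comment because it depends on the installed mPDF version.

```php
<?php
// Hypothetical helper: turns a list of tracks into a small HTML table.
function renderPlaylistHtml(array $tracks): string
{
    $rows = '';
    foreach ($tracks as $i => $track) {
        // Escape user-generated text before embedding it in HTML.
        $rows .= sprintf(
            "<tr><td>%d</td><td>%s</td><td>%s</td></tr>\n",
            $i + 1,
            htmlspecialchars($track['artist']),
            htmlspecialchars($track['title'])
        );
    }
    return "<table>\n<tr><th>#</th><th>Artist</th><th>Title</th></tr>\n" . $rows . "</table>";
}

// Example data (made up); in the scenario above it would come from the CMS.
$html = renderPlaylistHtml([
    ['artist' => 'Artist A', 'title' => 'First Track'],
    ['artist' => 'Artist B', 'title' => 'Rock & Roll'],
]);
echo $html, PHP_EOL;

// The HTML string would then be passed to the library, e.g. $mpdf->WriteHTML($html);
```

Escaping each value matters here because the playlist is user-generated data that ends up inside markup.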
Great article that helped me get started.
Answer by Paulo Coghi - Reinstate Monica
Good news! Snappy!!
Snappy is a very easy open source PHP5 library, allowing thumbnail, snapshot or PDF generation from a URL or an HTML page. And... it uses the excellent WebKit-based wkhtmltopdf.
Enjoy! ^_^

