Converting HTML Table to a CSV automatically using PHP?
Original question: http://stackoverflow.com/questions/10498632/
Warning: these answers are provided under the CC BY-SA 4.0 license. You are free to use and share them, but you must attribute them to the original authors (not me): StackOverFlow
Asked by Thompson
I just need to convert this HTML table to CSV automatically using PHP. Can someone suggest how to do this? Thanks.
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';
Guys, I just need to convert $table into a .csv file, generated automatically by some PHP function. We can set the path for that CSV file to /test/home/path_to_csv
Answered by Baba
You can use str_get_html from http://simplehtmldom.sourceforge.net/
include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';
$html = str_get_html($table);
// Send the CSV to the browser as a download
header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');
$fp = fopen("php://output", "w");
foreach($html->find('tr') as $element)
{
    // Collect header cells and data cells of this row separately
    $th = array();
    foreach( $element->find('th') as $row)
    {
        $th [] = $row->plaintext;
    }
    $td = array();
    foreach( $element->find('td') as $row)
    {
        $td [] = $row->plaintext;
    }
    // Write the <th> cells if the row has any, otherwise the <td> cells
    !empty($th) ? fputcsv($fp, $th) : fputcsv($fp, $td);
}
fclose($fp);
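The code above streams the CSV to the browser, whereas the question asks for a file written to a path on disk. As a rough sketch of that variant, assuming PHP's bundled DOM extension is available in place of simple_html_dom, something like the following might do. tableToRows and writeCsvFile are made-up helper names, not part of any library, and the markup is assumed to be ASCII or entity-encoded.

```php
<?php
// Hypothetical helper: returns one array of cell texts per <tr>.
function tableToRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->childNodes as $node) {
            // Keep only <th>/<td> elements; skip whitespace text nodes.
            if ($node instanceof DOMElement && in_array($node->nodeName, ['th', 'td'], true)) {
                $cells[] = trim($node->textContent);
            }
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Hypothetical helper: writes the rows to a CSV file on disk.
function writeCsvFile(string $path, array $rows): void
{
    $fp = fopen($path, 'w');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    foreach ($rows as $row) {
        // Delimiter, enclosure and escape are passed explicitly.
        fputcsv($fp, $row, ',', '"', '\\');
    }
    fclose($fp);
}
```

Calling writeCsvFile('/test/home/path_to_csv', tableToRows($table)) would then write the question's table to the path mentioned there, provided the directory exists and is writable.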
Answered by avilporwal
You can put this function in a separate JS file:
function exportTableToCSV($table, filename) {
    // Only rows containing <td> cells are selected, so <th>-only rows are skipped
    var $rows = $table.find('tr:has(td)'),
        // Temporary delimiter characters unlikely to be typed by keyboard
        // This is to avoid accidentally splitting the actual contents
        tmpColDelim = String.fromCharCode(11), // vertical tab character
        tmpRowDelim = String.fromCharCode(0), // null character
        // actual delimiter characters for CSV format
        colDelim = '","',
        rowDelim = '"\r\n"',
        // Grab text from table into CSV formatted string
        csv = '"' + $rows.map(function (i, row) {
            var $row = $(row),
                $cols = $row.find('td');
            return $cols.map(function (j, col) {
                var $col = $(col),
                    text = $col.text();
                return text.replace(/"/g, '""'); // escape every double quote
            }).get().join(tmpColDelim);
        }).get().join(tmpRowDelim)
            .split(tmpRowDelim).join(rowDelim)
            .split(tmpColDelim).join(colDelim) + '"',
        // Data URI
        csvData = 'data:application/csv;charset=utf-8,' + encodeURIComponent(csv);
    $(this)
        .attr({
            'download': filename,
            'href': csvData,
            'target': '_blank'
        });
}
Now, to initiate this function, you can use:
$('.getfile').click(
    function() {
        exportTableToCSV.apply(this, [$('#thetable'), 'filename.csv']);
    });
Here 'getfile' is the class assigned to the button that should trigger the download (clicking it brings up the download prompt), and 'thetable' is the ID of the table you want to export.
You can also change the download file name in the code.
Answered by Jacob Cruz
You can do this with arrays and regular expressions. See below:
$csv = array();
preg_match('/<table(>| [^>]*>)(.*?)<\/table( |>)/is',$table,$b);
$table = $b[2];
preg_match_all('/<tr(>| [^>]*>)(.*?)<\/tr( |>)/is',$table,$b);
$rows = $b[2];
foreach ($rows as $row) {
    //cycle through each row
    if(preg_match('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row)) {
        //match for table headers
        preg_match_all('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row,$b);
        $csv[] = strip_tags(implode(',',$b[2]));
    } elseif(preg_match('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row)) {
        //match for table cells
        preg_match_all('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row,$b);
        $csv[] = strip_tags(implode(',',$b[2]));
    }
}
$csv = implode("\n", $csv);
var_dump($csv);
Then you can use file_put_contents() to write the CSV string to a file.
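One caveat with joining cells by hand: the sample cells themselves contain commas (row 1, cell 1), so implode(',') yields more columns than the table has. A possible workaround, sketched here with a hypothetical helper name rowsToCsvString, is to let fputcsv() do the quoting into an in-memory stream and hand the result to file_put_contents(). The rows in the example are typed in by hand to stand for what the regex code above would collect before joining.

```php
<?php
// Hypothetical helper: turns an array of rows into one CSV string.
function rowsToCsvString(array $rows): string
{
    // php://temp is an in-memory stream that spills to disk if it grows large.
    $fp = fopen('php://temp', 'r+');
    foreach ($rows as $row) {
        // fputcsv quotes fields that contain the delimiter, the enclosure
        // character or whitespace, and doubles embedded quotes.
        fputcsv($fp, $row, ',', '"', '\\');
    }
    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);
    return $csv;
}

// Cell texts per row, as they would look before being joined.
$rows = [
    ['Header 1', 'Header 2'],
    ['row 1, cell 1', 'row 1, cell 2'],
];
$csv = rowsToCsvString($rows);
// file_put_contents('sample.csv', $csv); // 'sample.csv' is a placeholder path
```

Reading the string back with a CSV parser should then give two columns per row again, since the embedded commas sit inside quoted fields.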
Answered by lukeocodes
To expand on the accepted answer, I wrote the version below, which lets me ignore columns by class name and also skips blank rows and columns.
You can use str_get_html from http://simplehtmldom.sourceforge.net/. Just include it and away you go! :)
$html = str_get_html($html); // give this your HTML string
header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');
$fp = fopen("php://output", "w");
foreach($html->find('tr') as $element) {
    $td = array();
    foreach( $element->find('th') as $row) {
        if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
            $td [] = $row->plaintext;
        }
    }
    if (!empty($td)) {
        fputcsv($fp, $td);
    }
    $td = array();
    foreach( $element->find('td') as $row) {
        if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
            $td [] = $row->plaintext;
        }
    }
    if (!empty($td)) {
        fputcsv($fp, $td);
    }
}
fclose($fp);
exit;
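Note that the strpos() checks above match substrings, so a cell with class 'transactions' would also be dropped because it contains 'actions'. If whole class names should be matched instead, a variant along these lines might work. It is only a sketch: it uses PHP's bundled DOM extension rather than simple_html_dom, and filterTableRows and its $skipClasses parameter are names invented for this example.

```php
<?php
// Hypothetical helper: collects cell text per row, skipping cells whose
// class attribute lists any of the given class names as a whole word.
function filterTableRows(string $html, array $skipClasses): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->childNodes as $cell) {
            if (!($cell instanceof DOMElement) || !in_array($cell->nodeName, ['th', 'td'], true)) {
                continue;
            }
            // Split the class attribute on whitespace to compare whole names.
            $classes = preg_split('/\s+/', trim($cell->getAttribute('class')), -1, PREG_SPLIT_NO_EMPTY);
            if (array_intersect($classes, $skipClasses) !== []) {
                continue;
            }
            $cells[] = trim($cell->textContent);
        }
        // Skip blank rows: no cells left, or only empty strings.
        if (array_filter($cells, 'strlen') !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}
```

The returned rows can be passed to fputcsv() one by one, exactly as in the loop above.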
Answered by kmoney12
If anyone is using Baba's answer but scratching their head over the extra whitespace being added, this will work:
include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';
$html = str_get_html($table);
$fileName="export.csv";
header('Content-type: application/ms-excel');
header("Content-Disposition: attachment; filename=$fileName");
$fp = fopen("php://output", "w");
$csvString="";
// Parse the trimmed markup
$html = str_get_html(trim($table));
foreach($html->find('tr') as $element)
{
    $td = array();
    foreach( $element->find('th') as $row)
    {
        $row->plaintext="\"$row->plaintext\"";
        $td [] = $row->plaintext;
    }
    $td=array_filter($td);
    $csvString.=implode(",", $td);
    $td = array();
    foreach( $element->find('td') as $row)
    {
        $row->plaintext="\"$row->plaintext\"";
        $td [] = $row->plaintext;
    }
    $td=array_filter($td);
    $csvString.=implode(",", $td)."\n";
}
echo $csvString;
fclose($fp);
exit;
Answered by Sudin Manandhar
Baba's answer produces extra whitespace, so I updated the code to this:
include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';
$html = str_get_html($table);
header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');
$fp = fopen("php://output", "w");
foreach($html->find('tr') as $element)
{
    $td = array();
    foreach( $element->find('th') as $row)
    {
        $td [] = $row->plaintext;
    }
    foreach( $element->find('td') as $row)
    {
        $td [] = $row->plaintext;
    }
    fputcsv($fp, $td);
}
fclose($fp);
Answered by Ashwin Balani
Assuming that $out_str holds your HTML table data:
$csv = $out_str;
$csv = str_replace("<table class='gradienttable'>","",$csv);
$csv = str_replace("</table>","",$csv);
$csv = str_replace("</td><td>",",",$csv);
$csv = str_replace("<td>","",$csv);
$csv = str_replace("</td>","",$csv);
$csv = str_replace("</font>","",$csv);
$csv = str_replace("</tr>","",$csv);
$csv = str_replace("<tr>","\n",$csv);
Remove any CSS from the file
$csv = str_replace("<font color='yellow'>","",$csv);
$csv = str_replace("<font color='red'>","",$csv);
$csv = str_replace("<font color='green'>","",$csv);
$csv = str_replace("</th><th>",",",$csv);
$csv = str_replace("<th>","",$csv);
$csv = str_replace("</th>","",$csv);
file_put_contents('currentFile.csv',$csv);
Output the file currentFile.csv to the user
Hope it helps!
Answered by capikaw
I've adapted a simple class based on the code found in this thread that now handles colspan and rowspan. It is not heavily tested and I'm sure it could be optimized.
Usage:
require_once('table2csv.php');
$table = '<table border="1">
<tr>
<th colspan=2>Header 1</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
<tr>
<td rowspan=2>top left row</td>
<td>top right row</td>
</tr>
<tr>
<td>bottom right</td>
</tr>
</table>';
table2csv($table,"sample.csv",true);
table2csv.php:
<?php
//download @ http://simplehtmldom.sourceforge.net/
require_once('simple_html_dom.php');
$repeatContentIntoSpannedCells = false;
//--------------------------------------------------------------------------------------------------------------------
function table2csv($rawHTML,$filename,$repeatContent) {
    //get rid of sups - they mess up the wmus
    for ($i=1; $i <= 20; $i++) {
        $rawHTML = str_replace("<sup>".$i."</sup>", "", $rawHTML);
    }
    global $repeatContentIntoSpannedCells;
    $html = str_get_html(trim($rawHTML));
    $repeatContentIntoSpannedCells = $repeatContent;
    //we need to pre-initialize the array based on the size of the table (how many rows vs how many columns)
    //counting rows is easy
    $rowCount = count($html->find('tr'));
    //column counting is a bit trickier, we have to iterate through the rows and basically pull out the max found
    $colCount = 0;
    foreach ($html->find('tr') as $element) {
        $tempColCount = 0;
        foreach ($element->find('th') as $cell) {
            $tempColCount++;
        }
        if ($tempColCount == 0) {
            foreach ($element->find('td') as $cell) {
                $tempColCount++;
            }
        }
        if ($tempColCount > $colCount) $colCount = $tempColCount;
    }
    $mdTable = array();
    for ($i=0; $i < $rowCount; $i++) {
        array_push($mdTable, array_fill(0, $colCount, NULL));
    }
    //////////done predefining array
    $rowPos = 0;
    $fp = fopen($filename, "w");
    foreach ($html->find('tr') as $element) {
        $colPos = 0;
        foreach ($element->find('th') as $cell) {
            if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                parseCell($cell,$mdTable,$rowPos,$colPos);
            }
            $colPos++;
        }
        foreach ($element->find('td') as $cell) {
            if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                parseCell($cell,$mdTable,$rowPos,$colPos);
            }
            $colPos++;
        }
        $rowPos++;
    }
    foreach ($mdTable as $key => $row) {
        //clean the data
        array_walk($row, "cleanCell");
        fputcsv($fp, $row);
    }
    //close the output file
    fclose($fp);
}
function cleanCell(&$contents,$key) {
    $contents = trim($contents);
    //get rid of pesky &nbsp;'s (aka: non-breaking spaces)
    $contents = trim($contents,chr(0xC2).chr(0xA0));
    $contents = str_replace("&nbsp;", "", $contents);
}
function parseCell(&$cell,&$mdTable,&$rowPos,&$colPos) {
    global $repeatContentIntoSpannedCells;
    //if data has already been set into the cell, skip it
    while (isset($mdTable[$rowPos][$colPos])) {
        $colPos++;
    }
    $mdTable[$rowPos][$colPos] = $cell->plaintext;
    if (isset($cell->rowspan)) {
        for ($i=1; $i <= ($cell->rowspan)-1; $i++) {
            $mdTable[$rowPos+$i][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
        }
    }
    if (isset($cell->colspan)) {
        for ($i=1; $i <= ($cell->colspan)-1; $i++) {
            $colPos++;
            $mdTable[$rowPos][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
        }
    }
}
?>
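For readers who only want the span-handling idea, the core of it can be sketched in a single function. This is an illustrative rewrite on top of PHP's bundled DOM extension, not the code above; expandSpans is a hypothetical name, and $repeat plays the role of $repeatContent. It assumes well-behaved colspan/rowspan values and does not handle nested tables.

```php
<?php
// Hypothetical helper: expands colspan/rowspan into a rectangular grid.
function expandSpans(string $html, bool $repeat = false): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $grid = [];
    $r = 0;
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $c = 0;
        foreach ($tr->childNodes as $cell) {
            if (!($cell instanceof DOMElement) || !in_array($cell->nodeName, ['th', 'td'], true)) {
                continue;
            }
            // Skip positions already claimed by a rowspan from an earlier row.
            while (isset($grid[$r][$c])) {
                $c++;
            }
            $text = trim($cell->textContent);
            // A missing attribute yields '', which casts to 0, hence max(1, ...).
            $colspan = max(1, (int) $cell->getAttribute('colspan'));
            $rowspan = max(1, (int) $cell->getAttribute('rowspan'));
            for ($dr = 0; $dr < $rowspan; $dr++) {
                for ($dc = 0; $dc < $colspan; $dc++) {
                    $isOrigin = ($dr === 0 && $dc === 0);
                    $grid[$r + $dr][$c + $dc] = ($isOrigin || $repeat) ? $text : '';
                }
            }
            $c += $colspan;
        }
        $r++;
    }

    // Normalise: rows in order, every row padded to the widest row.
    ksort($grid);
    $width = 0;
    foreach ($grid as $row) {
        $width = max($width, max(array_keys($row)) + 1);
    }
    $out = [];
    foreach ($grid as $row) {
        $line = [];
        for ($i = 0; $i < $width; $i++) {
            $line[] = $row[$i] ?? '';
        }
        $out[] = $line;
    }
    return $out;
}
```

Each returned row has the same number of columns, so the rows can go straight to fputcsv(). With $repeat set to true, spanned positions carry a copy of the cell text instead of an empty string.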

