使用 PHP 自动将 HTML 表格转换为 CSV?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/10498632/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-24 22:20:49  来源:igfitidea点击:

Converting HTML Table to a CSV automatically using PHP?

phpcsv

提问by Thompson

I am just in need to convert a this html table automatically in csv using PHP. Can someone provide any idea how to do this? Thanks.

我只需要使用 PHP 在 csv 中自动转换这个 html 表。有人可以提供任何想法如何做到这一点吗?谢谢。

$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

Guys, I just need $tableto convert in only .csvfile, which could be automatically generated using some PHP function. We can define path for that csv file to /test/home/path_to_csv

伙计们,我只需要$table转换成.csv文件,它可以使用一些 PHP 函数自动生成。我们可以将该 csv 文件的路径定义为/test/home/path_to_csv

回答by Baba

You can use str_get_htmlhttp://simplehtmldom.sourceforge.net/

您可以使用str_get_htmlhttp://simplehtmldom.sourceforge.net/

include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

$html = str_get_html($table);



header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');

$fp = fopen("php://output", "w");

foreach($html->find('tr') as $element)
{
        $th = array();
        foreach( $element->find('th') as $row)  
        {
            $th [] = $row->plaintext;
        }

        $td = array();
        foreach( $element->find('td') as $row)  
        {
            $td [] = $row->plaintext;
        }
        !empty($th) ? fputcsv($fp, $th) : fputcsv($fp, $td);
}


fclose($fp);

回答by avilporwal

You can use this function in separate js file:

您可以在单独的 js 文件中使用此功能:

function exportTableToCSV($table, filename) {

        var $rows = $table.find('tr:has(td)'),

            // Temporary delimiter characters unlikely to be typed by keyboard
            // This is to avoid accidentally splitting the actual contents
            tmpColDelim = String.fromCharCode(11), // vertical tab character
            tmpRowDelim = String.fromCharCode(0), // null character

            // actual delimiter characters for CSV format
            colDelim = '","',
            rowDelim = '"\r\n"',

            // Grab text from table into CSV formatted string
            csv = '"' + $rows.map(function (i, row) {
                var $row = $(row),
                    $cols = $row.find('td');

                return $cols.map(function (j, col) {
                    var $col = $(col),
                        text = $col.text();

                    return text.replace('"', '""'); // escape double quotes

                }).get().join(tmpColDelim);

            }).get().join(tmpRowDelim)
                .split(tmpRowDelim).join(rowDelim)
                .split(tmpColDelim).join(colDelim) + '"',

            // Data URI
            csvData = 'data:application/csv;charset=utf-8,' + encodeURIComponent(csv);

        $(this)
            .attr({
            'download': filename,
                'href': csvData,
                'target': '_blank'
        });
    }

Now, to initiate this function, you can use:

现在,要启动此功能,您可以使用:

$('.getfile').click(
            function() { 
    exportTableToCSV.apply(this, [$('#thetable'), 'filename.csv']);
             });

where 'getfile' should be the class assigned to button, where you want to add call to action. (On clicking this button, the download popup will appear) and "thetable" should be the ID assigned to table you want to download.

其中“getfile”应该是分配给按钮的类,您要在其中添加号召性用语。(单击此按钮时,将出现下载弹出窗口)并且“thetable”应该是分配给要下载的表的 ID。

You can also change to the custom file name to download in code.

您还可以更改为自定义文件名以在代码中下载。

回答by Jacob Cruz

You can do this with arrays and regular expressions... See below

你可以用数组和正则表达式来做到这一点......见下文

$csv = array();
preg_match('/<table(>| [^>]*>)(.*?)<\/table( |>)/is',$table,$b);
$table = $b[2];
preg_match_all('/<tr(>| [^>]*>)(.*?)<\/tr( |>)/is',$table,$b);
$rows = $b[2];
foreach ($rows as $row) {
    //cycle through each row
    if(preg_match('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row)) {
        //match for table headers
        preg_match_all('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row,$b);
        $csv[] = strip_tags(implode(',',$b[2]));
    } elseif(preg_match('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row)) {
        //match for table cells
        preg_match_all('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row,$b);
        $csv[] = strip_tags(implode(',',$b[2]));
    }
}
$csv = implode("\n", $csv);
var_dump($csv);

Then you can use file_put_contents()to write the csv string to file..

然后您可以使用file_put_contents()将 csv 字符串写入文件..

回答by lukeocodes

To expand on the accepted answer I did this which allows me to ignore columns by class name and also deals with blank rows/columns.

为了扩展已接受的答案,我这样做了,这允许我按类名忽略列并处理空白行/列。

You can use str_get_html http://simplehtmldom.sourceforge.net/. Just include it and away you go! :)

您可以使用 str_get_html http://simplehtmldom.sourceforge.net/。只需包含它就可以了!:)

$html = str_get_html($html); // give this your HTML string

header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');

$fp = fopen("php://output", "w");

foreach($html->find('tr') as $element) {
  $td = array();
  foreach( $element->find('th') as $row) {
    if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
      $td [] = $row->plaintext;
    }
  }
  if (!empty($td)) {
    fputcsv($fp, $td);
  }

  $td = array();
  foreach( $element->find('td') as $row) {
    if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
      $td [] = $row->plaintext;
    }
  }
  if (!empty($td)) {
    fputcsv($fp, $td);
  }
}

fclose($fp);
exit;

回答by kmoney12

If anyone is using Baba's answer but scratching their head over extra white spaces being added, this will work:

如果有人在使用 Baba 的答案,但对添加的额外空格感到头疼,这将起作用:

include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

$html = str_get_html($table);   

$fileName="export.csv";
header('Content-type: application/ms-excel');
header("Content-Disposition: attachment; filename=$fileName");

$fp = fopen("php://output", "w");
$csvString="";

$html = str_get_html(trim($table));
foreach($html->find('tr') as $element)
{

    $td = array();
    foreach( $element->find('th') as $row)
    {
        $row->plaintext="\"$row->plaintext\"";
        $td [] = $row->plaintext;
    }
    $td=array_filter($td);
    $csvString.=implode(",", $td);

    $td = array();
    foreach( $element->find('td') as $row)
    {
        $row->plaintext="\"$row->plaintext\"";
        $td [] = $row->plaintext;
    }
    $td=array_filter($td);
    $csvString.=implode(",", $td)."\n";
}
echo $csvString;
fclose($fp);
exit;

}

}

回答by Sudin Manandhar

Baba's answer contains extra space. So, I updated the code to this:

爸爸的回答包含额外的空间。所以,我将代码更新为:

include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

$html = str_get_html($table);



header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');

$fp = fopen("php://output", "w");

foreach($html->find('tr') as $element)
{
    $td = array();
foreach( $element->find('th') as $row)
{
    $td [] = $row->plaintext;
}

foreach( $element->find('td') as $row)
{
    $td [] = $row->plaintext;
}
fputcsv($fp, $td);
}


fclose($fp);

回答by Ashwin Balani

Assuming that out_str has your html table data

假设 out_str 有你的 html 表格数据

$csv = $out_str;
        $csv = str_replace("<table class='gradienttable'>","",$csv);
        $csv = str_replace("</table>","",$csv);
        $csv = str_replace("</td><td>",",",$csv);
        $csv = str_replace("<td>","",$csv);
        $csv = str_replace("</td>","",$csv);
        $csv = str_replace("</font>","",$csv);
        $csv = str_replace("</tr>","",$csv);
        $csv = str_replace("<tr>","\n",$csv);

Remove any CSS from the file

从文件中删除任何 CSS

        $csv = str_replace("<font color='yellow'>","",$csv);
        $csv = str_replace("<font color='red'>","",$csv);
        $csv = str_replace("<font color='green'>","",$csv);
        $csv = str_replace("</th><th>",",",$csv);
        $csv = str_replace("<th>","",$csv);
        $csv = str_replace("</th>","",$csv);

        file_put_contents('currentFile.csv',$csv);

Output the file currentFile.csv to the user

将文件 currentFile.csv 输出给用户

Hope it helps!

希望能帮助到你!

回答by capikaw

I've adapted a simple class based on the code found on this thread that now handles colspanand rowspan. Not heavily tested and I'm sure it could be optimized.

我已经根据在这个线程上找到的代码改编了一个简单的类,现在可以处理colspanrowspan。没有经过大量测试,我相信它可以优化。

Usage:

用法:

require_once('table2csv.php');

$table = '<table border="1">
    <tr>
    <th colspan=2>Header 1</th>
    </tr>
    <tr>
    <td>row 1, cell 1</td>
    <td>row 1, cell 2</td>
    </tr>
    <tr>
    <td>row 2, cell 1</td>
    <td>row 2, cell 2</td>
    </tr>
    <tr>
    <td rowspan=2>top left row</td>
    <td>top right row</td>
    </tr>
    <tr>
    <td>bottom right</td>
    </tr>
    </table>';

table2csv($table,"sample.csv",true);

table2csv.php

table2csv.php

<?php

    //download @ http://simplehtmldom.sourceforge.net/
    require_once('simple_html_dom.php');
    $repeatContentIntoSpannedCells = false;


    //--------------------------------------------------------------------------------------------------------------------

    function table2csv($rawHTML,$filename,$repeatContent) {

        //get rid of sups - they mess up the wmus
        for ($i=1; $i <= 20; $i++) { 
            $rawHTML = str_replace("<sup>".$i."</sup>", "", $rawHTML);
        }

        global $repeatContentIntoSpannedCells;

        $html = str_get_html(trim($rawHTML));
        $repeatContentIntoSpannedCells = $repeatContent;

        //we need to pre-initialize the array based on the size of the table (how many rows vs how many columns)

        //counting rows is easy
        $rowCount = count($html->find('tr'));

        //column counting is a bit trickier, we have to iterate through the rows and basically pull out the max found
        $colCount = 0;
        foreach ($html->find('tr') as $element) {

            $tempColCount = 0;

            foreach ($element->find('th') as $cell) {
                $tempColCount++;
            }

            if ($tempColCount == 0) {
                foreach ($element->find('td') as $cell) {
                    $tempColCount++;
                }
            }

            if ($tempColCount > $colCount) $colCount = $tempColCount;
        }

        $mdTable = array();

        for ($i=0; $i < $rowCount; $i++) { 
            array_push($mdTable, array_fill(0, $colCount, NULL));
        }

        //////////done predefining array

        $rowPos = 0;
        $fp = fopen($filename, "w");

        foreach ($html->find('tr') as $element) {

            $colPos = 0;

            foreach ($element->find('th') as $cell) {
                if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                    parseCell($cell,$mdTable,$rowPos,$colPos);
                }
                $colPos++;
            }

            foreach ($element->find('td') as $cell) {
                if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                    parseCell($cell,$mdTable,$rowPos,$colPos);
                }
                $colPos++;
            }   

            $rowPos++;
        }


        foreach ($mdTable as $key => $row) {

            //clean the data
            array_walk($row, "cleanCell");
            fputcsv($fp, $row);
        }
    }


    function cleanCell(&$contents,$key) {

        $contents = trim($contents);

        //get rid of pesky &nbsp's (aka: non-breaking spaces)
        $contents = trim($contents,chr(0xC2).chr(0xA0));
        $contents = str_replace("&nbsp;", "", $contents);
    }


    function parseCell(&$cell,&$mdTable,&$rowPos,&$colPos) {

        global $repeatContentIntoSpannedCells;

        //if data has already been set into the cell, skip it
        while (isset($mdTable[$rowPos][$colPos])) {
            $colPos++;
        }

        $mdTable[$rowPos][$colPos] = $cell->plaintext;

        if (isset($cell->rowspan)) {

            for ($i=1; $i <= ($cell->rowspan)-1; $i++) {
                $mdTable[$rowPos+$i][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
            }
        }

        if (isset($cell->colspan)) {

            for ($i=1; $i <= ($cell->colspan)-1; $i++) {

                $colPos++;
                $mdTable[$rowPos][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
            }
        }
    }

?>