php-excel-reader - problem with UTF-8
Original question: http://stackoverflow.com/questions/3666412/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
php-excel-reader - problem with UTF-8
Asked by Viktor Stískala
I'm using php-excel-reader 2.21 to convert an XLS file to CSV. I wrote a simple script to do that, but I have some problems with Unicode characters: it does not return values from some cells.
For example, it has no problems with the cell content "ceník položek", but it does have problems with "nákup", "VYROBCE", "PáS", "HRUBY", "NáKLADNí" and some others. For these cells it returns an empty value ("").
Here is the code snippet I use for conversion:
<?php
set_time_limit(120);
require_once 'excel_reader2.php';
$data = new Spreadsheet_Excel_Reader("cenik.xls", false, 'UTF-8');
$f = fopen('file.csv', 'w');
for($row = 1; $row <= $data->rowcount(); $row++)
{
    $out = '';
    for($col = 1; $col <= $data->colcount(); $col++)
    {
        $val = $data->val($row,$col);
        // escape " and \ characters inside the cell
        $escaped = preg_replace(array('#”#u', '#\\#u', '#[”"]#u'), array('"', '\\\\', '\"'), $val);
        if(empty($val))
            $out .= ',';
        else
            $out .= '"' . $escaped . '",';
    }
    // remove last comma (,)
    fwrite($f, substr($out, 0, -1));
    fwrite($f, "\n");
}
fclose($f);
?>
Note that the cell and row indexes start from 1. Any suggestions?
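As an aside on the CSV-writing part of the script: PHP's built-in fputcsv() does the quoting itself, so the manual escaping step may be unnecessary. Here is a minimal sketch, assuming PHP 7.4 or later (needed for the empty escape argument). The $rows array is a hypothetical stand-in for values read from the spreadsheet, and php://memory replaces file.csv only so that the snippet runs on its own:

```php
<?php
// Hypothetical sample rows standing in for $data->val($row, $col) results.
$rows = [
    ['ceník', 'a "quoted" word', ''],
    ['plain', 'x,y', '0'],
];

// Write to an in-memory stream so the sketch is self-contained.
$f = fopen('php://memory', 'w+');
foreach ($rows as $row) {
    // Delimiter ",", enclosure '"', no escape character (PHP 7.4+):
    // fields are quoted only when needed and embedded quotes are doubled.
    fputcsv($f, $row, ',', '"', '');
}
rewind($f);
$csv = stream_get_contents($f);
fclose($f);

echo $csv;
```

Note that empty('0') is true in PHP, so the empty($val) check in the original script would also blank out a cell whose value is 0; fputcsv() has no such special case.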
Answered by cypher
I hope it's the same problem as I had: In excel_reader2.php on line 1120, replace
$retstr = ($asciiEncoding) ? $retstr : $this->_encodeUTF16($retstr);
with
$retstr = ($asciiEncoding) ? iconv('cp1250', 'utf-8', $retstr) : $this->_encodeUTF16($retstr);
That should fix it; however, I suggest you use a different Excel reader, such as PHPExcel, to avoid problems like these.
Note that you need the iconv extension enabled on the server.
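To see what the iconv() call in that replacement does, here is a minimal standalone sketch. It assumes the affected cells are stored as Windows-1250 (cp1250) bytes, which is what this answer implies for a Czech spreadsheet. The sample byte strings are illustrative and not taken from cenik.xls:

```php
<?php
// "nákup" and "položek" as cp1250 bytes (á = 0xE1, ž = 0x9E).
$nakup   = "n\xE1kup";
$polozek = "polo\x9Eek";

// Convert from cp1250 to UTF-8; iconv() returns false on failure.
$nakupUtf8   = iconv('cp1250', 'utf-8', $nakup);
$polozekUtf8 = iconv('cp1250', 'utf-8', $polozek);

// bin2hex() shows the resulting bytes regardless of terminal encoding.
echo bin2hex($nakupUtf8), "\n";   // 6ec3a16b7570
echo bin2hex($polozekUtf8), "\n"; // 706f6c6fc5be656b
```

If the workbook was saved with a different code page, the first argument to iconv() would need to change accordingly.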
Answered by thuclh
I have an answer for this problem that lets you keep using php_excel_reader as usual. Add a function to the Spreadsheet_Excel_Reader class:
function seems_utf8($str) {
    for ($i=0; $i<strlen($str); $i++) {
        if (ord($str[$i]) < 0x80) continue; # 0bbbbbbb
        elseif ((ord($str[$i]) & 0xE0) == 0xC0) $n=1; # 110bbbbb
        elseif ((ord($str[$i]) & 0xF0) == 0xE0) $n=2; # 1110bbbb
        elseif ((ord($str[$i]) & 0xF8) == 0xF0) $n=3; # 11110bbb
        elseif ((ord($str[$i]) & 0xFC) == 0xF8) $n=4; # 111110bb
        elseif ((ord($str[$i]) & 0xFE) == 0xFC) $n=5; # 1111110b
        else return false; # Does not match any model
        for ($j=0; $j<$n; $j++) { # n bytes matching 10bbbbbb follow ?
            if ((++$i == strlen($str)) || ((ord($str[$i]) & 0xC0) != 0x80))
                return false;
        }
    }
    return true;
}
Then add the following below line 1120:
$retstr = $this->seems_utf8($retstr)?$retstr:utf8_encode($retstr);
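A caveat on this approach: utf8_encode() assumes its input is ISO-8859-1, so characters that exist only in cp1250 (such as ž) would not convert correctly, and the function is deprecated as of PHP 8.2. A hedged alternative sketch follows. The helper name to_utf8 and the default source encoding cp1250 are assumptions made for illustration and are not part of php-excel-reader:

```php
<?php
/**
 * Return $str unchanged if it is already valid UTF-8; otherwise convert it
 * from $from (assumed to be cp1250 here) to UTF-8.
 */
function to_utf8(string $str, string $from = 'cp1250'): string
{
    // With the /u modifier, preg_match() returns false on invalid UTF-8.
    if (preg_match('//u', $str) === 1) {
        return $str;
    }
    $converted = iconv($from, 'utf-8', $str);
    // Fall back to the original bytes if the conversion fails.
    return $converted === false ? $str : $converted;
}

echo bin2hex(to_utf8("n\xC3\xA1kup")), "\n"; // already UTF-8: 6ec3a16b7570
echo bin2hex(to_utf8("n\xE1kup")), "\n";     // cp1250 input:  6ec3a16b7570
```

Like seems_utf8(), this is a heuristic: a short cp1250 string can occasionally happen to be valid UTF-8 and would then be passed through unchanged.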
Done!
You can also use my modified copy of the file. Download here: File excel_reader2.php. Use it the same way as the original excel reader.