PHP: UTF-8 problem when reading a CSV file with fgetcsv

Question

Asked by testing

I am trying to read a CSV file and echo its content, but the special characters are displayed incorrectly.

Mäx Müstermänn -> MÃ¤x MÃ¼stermÃ¤nn

Encoding of the CSV file is UTF-8 without BOM (checked with Notepad++).

This is the content of the CSV file:

"Mäx";"Müstermänn"

My PHP script

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<?php
$handle = fopen ("specialchars.csv","r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
while ($data = fgetcsv ($handle, 1000, ";")) {
        $num = count ($data);
        for ($c=0; $c < $num; $c++) {
            // output data
            echo "<td>$data[$c]</td>";
        }
        echo "</tr><tr>";
}
?>
</body>
</html>

I tried setlocale(LC_ALL, 'de_DE.utf8'); as suggested here, without success. The content is still displayed incorrectly.

What am I missing?

Edit:

Running echo mb_detect_encoding($data[$c], 'UTF-8'); gives me UTF-8 UTF-8.

echo file_get_contents("specialchars.csv"); gives me "MÃ¤x";"MÃ¼stermÃ¤nn".

And

print_r(str_getcsv(reset(explode("\n", file_get_contents("specialchars.csv"))), ';'))

gives me

Array ( [0] => MÃ¤x [1] => MÃ¼stermÃ¤nn )

What does it mean?

Answer 1

Accepted answer by testing

Now I got it working (after removing the header command). I think the problem was that the PHP file itself was encoded in ISO-8859-1. I set it to UTF-8 without BOM. I thought I had already done that, but perhaps I undid the change by accident.

Furthermore, I used SET NAMES 'utf8' for the database connection. Now the data is also stored correctly in the database.
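
For readers wondering what the garbled output in the question means: a sequence like "Ã¤" in place of "ä" is the typical result of UTF-8 bytes being interpreted as ISO-8859-1 somewhere along the way and then encoded to UTF-8 a second time. The snippet below is only a minimal sketch that reproduces the effect on a hard-coded string (the variable names are illustrative, and the mbstring extension is assumed to be available). Comparing hex dumps like this can help to find the step at which the bytes change.

```php
<?php
// "Mäx" as UTF-8 bytes, written with escapes so this file's own encoding does not matter.
$utf8 = "M\xC3\xA4x";

// Treat those bytes as ISO-8859-1 and encode them to UTF-8 again (the mistake).
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";          // 4dc3a478
echo bin2hex($doubleEncoded), "\n"; // 4dc383c2a478, which renders as "MÃ¤x"
```

If the hex dump of the raw file content already shows c3 a4 for "ä", the file is fine and the extra conversion happens later, for example through a mismatched charset header or a script file saved in a different encoding.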

Answer 2

Answered by robssanches

Try this:

<?php
$handle = fopen("specialchars.csv", "r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr>';
while (($data = fgetcsv($handle, 1000, ";")) !== false) {
    // Convert every field from ISO-8859-1 to UTF-8 (added)
    $data = array_map("utf8_encode", $data);
    echo "<tr>";
    foreach ($data as $field) {
        // output data
        echo "<td>$field</td>";
    }
    echo "</tr>";
}
echo "</table>";
fclose($handle);
?>
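
A word of caution on this approach: utf8_encode always assumes its input is ISO-8859-1, so applying it to data that is already UTF-8 (as in the question) double-encodes it; the function is also deprecated as of PHP 8.2. The following is a hedged sketch of a guarded conversion. The helper name toUtf8IfNeeded is hypothetical, and the fallback to ISO-8859-1 is an assumption that has to match the real source encoding.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already valid UTF-8.
// Assumption: any non-UTF-8 input is ISO-8859-1 (the same assumption utf8_encode makes).
function toUtf8IfNeeded(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

// First field is UTF-8 ("Mäx"), second field is ISO-8859-1 ("Müstermann").
$row = ["M\xC3\xA4x", "M\xFCstermann"];
$row = array_map('toUtf8IfNeeded', $row);

echo bin2hex($row[0]), "\n"; // unchanged UTF-8 bytes
echo bin2hex($row[1]), "\n"; // ü is now encoded as c3 bc
```

In the loop above, array_map('toUtf8IfNeeded', $data) could stand in for the utf8_encode call. Validity checking is a heuristic: short ISO-8859-1 strings can occasionally happen to be valid UTF-8, so knowing the source encoding is always more reliable than guessing.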

Answer 3

Answered by user2992220

I encountered a similar problem when parsing a CSV file with special characters such as é and è.

The following worked fine for me:

To display the characters correctly on the HTML page, this header was needed:

header('Content-Type: text/html; charset=UTF-8');

In order to parse every character correctly, I used:

utf8_encode(fgets($file));

Don't forget to use the 'Multibyte String Functions' in all subsequent string operations, for example:

mb_strtolower($value, 'UTF-8');
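
To illustrate why the multibyte functions matter, here is a small sketch (assuming the mbstring extension is loaded; the sample string is made up for illustration). The byte-oriented functions count and transform bytes, while the mb_ variants work on characters.

```php
<?php
// "MÄX" as UTF-8 bytes; Ä occupies two bytes (c3 84).
$name = "M\xC3\x84X";

echo strlen($name), "\n";                          // 4 (bytes)
echo mb_strlen($name, 'UTF-8'), "\n";              // 3 (characters)
echo bin2hex(mb_strtolower($name, 'UTF-8')), "\n"; // 6dc3a478, i.e. "mäx"
```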

Answer 4

Answered by Andreas Stokholm

Try putting this into the top of your file (before any other output):

<?php

header('Content-Type: text/html; charset=UTF-8');

?>

Answer 5

Answered by Manvel

The problem is that the function's output is reported as UTF-8 (you can check this with mb_detect_encoding), but no conversion actually takes place, so the bytes are simply treated as UTF-8. It is therefore necessary to convert from the original encoding (Windows-1251, also known as CP1251) with iconv. Since fgetcsv returns an array, I suggest writing a custom function: [Sorry for my English]

function customfgetcsv(&$handle, $length, $separator = ';'){
    if (($buffer = fgets($handle, $length)) !== false) {
        return explode($separator, iconv("CP1251", "UTF-8", $buffer));
    }
    return false;
}
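
An alternative that keeps fgetcsv (and therefore its handling of quoted fields) is to let PHP convert the stream while it is being read. The sketch below is not part of the answer above; it assumes the iconv extension is available, which provides the built-in convert.iconv.* stream filters. The helper name readCsvAsUtf8 is hypothetical, and the source encoding must be known in advance.

```php
<?php
// Hypothetical helper: read a whole CSV file and return its rows as UTF-8 strings.
function readCsvAsUtf8(string $path, string $sourceEncoding, string $separator = ';'): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Convert bytes from the source encoding to UTF-8 as they are read.
    stream_filter_append($handle, "convert.iconv.$sourceEncoding/UTF-8", STREAM_FILTER_READ);

    $rows = [];
    // Enclosure and escape are passed explicitly so the call does not rely on defaults.
    while (($row = fgetcsv($handle, 0, $separator, '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}
```

A call such as readCsvAsUtf8('specialchars.csv', 'CP1251') would then return rows whose fields are already UTF-8, with no per-field conversion needed.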

Answer 6

Answered by Petr Hladík

In my case the source file is encoded in windows-1250, and iconv prints tons of notices about illegal characters in the input string...

So this solution helped me a lot:

/**
 * getting CSV array with UTF-8 encoding
 *
 * @param   resource    &$handle
 * @param   integer     $length
 * @param   string      $separator
 *
 * @return  array|false
 */
private function fgetcsvUTF8(&$handle, $length, $separator = ';')
{
    if (($buffer = fgets($handle, $length)) !== false)
    {
        $buffer = $this->autoUTF($buffer);
        return str_getcsv($buffer, $separator);
    }
    return false;
}

/**
 * automatic conversion of windows-1250 and iso-8859-2 strings into UTF-8
 *
 * @param   string  $s
 *
 * @return  string
 */
private function autoUTF($s)
{
    // detect UTF-8
    if (preg_match('#[\x80-\x{1FF}\x{2000}-\x{3FFF}]#u', $s))
        return $s;

    // detect WINDOWS-1250
    if (preg_match('#[\x7F-\x9F\xBC]#', $s))
        return iconv('WINDOWS-1250', 'UTF-8', $s);

    // assume ISO-8859-2
    return iconv('ISO-8859-2', 'UTF-8', $s);
}

In response to @manvel's answer: use str_getcsv instead of explode, because of cases like this:

some;nice;value;"and;here;comes;combinated;value";and;some;others

explode splits the string into these parts:

some
nice
value
"and
here
comes
combinated
value"
and
some
others

whereas str_getcsv splits it into these parts:

some
nice
value
and;here;comes;combinated;value
and
some
others
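
The difference can be checked with a few lines of PHP. This is just a minimal sketch using the sample line from above; the enclosure and escape arguments to str_getcsv are passed explicitly rather than relying on defaults.

```php
<?php
$line = 'some;nice;value;"and;here;comes;combinated;value";and;some;others';

$naive  = explode(';', $line);                // ignores the quotes
$parsed = str_getcsv($line, ';', '"', '\\'); // honours the quoted field

echo count($naive), "\n";  // 11
echo count($parsed), "\n"; // 7
echo $parsed[3], "\n";     // and;here;comes;combinated;value
```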
