php fgetcsv - charset encoding problems

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/13298353/

Time: 2020-08-25 05:08:08  Source: igfitidea

Tags: php, csv, character-encoding, char

Asked by ElPiter

Using the PHP 5.3 fgetcsv function, I am running into problems caused by character encoding. Note that the file contains Spanish "special" Latin characters such as accented vowels (á, é, í, etc.).

I get the CSV file by exporting some structured data from a file in MS Excel 2008 for Mac.

If I open it with the Mac OS X TextEdit application, everything looks perfect.

But when I read the CSV in my PHP program using the fgetcsv function, the characters are not read with the correct charset.

/**
 * @Route("/cvsLoad", name="_csv_load")
 * @Template()
 */
public function cvsLoadAction(){
    //setlocale(LC_ALL, 'es_ES.UTF-8');
    $reader = new Reader($this->get('kernel')->getRootDir().'/../web/uploads/documents/question_images/2/41/masiva.csv');

    $i = 1;
    $r = array("hhh" => $reader -> getAll());

    return new Response(json_encode($r), 200);
}

As you can see, I have also tried calling setlocale with es_ES.UTF-8, but nothing gets it working.

The reading part is here:

public function getRow()
{
    if (($row = fgetcsv($this->_handle, 10000, $this->_delimiter)) !== false) {
        $this->_line++;
        return $this->_headers ? array_combine($this->_headers, $row) : $row;
    } else {
        return false;
    }
}

Here is what I get in the $row variable after reading each row:

(Screenshot of the $row contents; image not available.)

Those ? characters are supposed to be accented vowels.

Any clues? Would it work if I used MS Excel for Windows? How can I find out the exact encoding of the file at run time and set it before reading?

(For Spanish speakers: don't be frightened by the awful medical stuff in those texts ;)).

Answered by Esailija

Try this, which converts every field from Windows-1252 to UTF-8 with iconv:

function convert( $str ) {
    return iconv( "Windows-1252", "UTF-8", $str );
}

public function getRow()
{
    if (($row = fgetcsv($this->_handle, 10000, $this->_delimiter)) !== false) {
        $row = array_map( "convert", $row );
        $this->_line++;
        return $this->_headers ? array_combine($this->_headers, $row) : $row;
    } else {
        return false;
    }
}
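
For anyone who wants to try the idea outside the Reader class, here is a minimal, self-contained sketch. It is not part of the original answer: the helper name toUtf8, the sample data, and the temporary file are hypothetical, and it assumes that any field which is not already valid UTF-8 is Windows-1252 (as in the answer above). It needs the iconv and mbstring extensions.

```php
<?php
// Hypothetical helper (not from the original answer): convert a field to
// UTF-8 only when it is not already valid UTF-8.
// Assumption: any non-UTF-8 input is encoded as Windows-1252.
function toUtf8($field)
{
    if (mb_check_encoding($field, 'UTF-8')) {
        return $field; // already valid UTF-8, leave untouched
    }
    return iconv('Windows-1252', 'UTF-8', $field);
}

// Write a small sample CSV whose bytes are Windows-1252 ("\xED" is í there).
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,texto\n1,card\xEDaco\n");

// Read it back with fgetcsv and convert each field of each row.
$handle = fopen($path, 'r');
$rows = array();
while (($row = fgetcsv($handle, 10000, ',', '"', '\\')) !== false) {
    $rows[] = array_map('toUtf8', $row);
}
fclose($handle);
unlink($path);

// json_encode requires valid UTF-8 input, so this only works after conversion.
echo json_encode($rows), "\n";
```

The mb_check_encoding guard is a design choice for this sketch: it avoids converting fields twice if the file later turns out to be saved as UTF-8. It cannot tell Windows-1252 apart from other single-byte encodings, so the source encoding still has to be known or assumed.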

Answered by joedixon

This is likely to do with the way Excel encodes the file when saving.

Try uploading the .xls file to Google Docs and downloading it as a .csv file.