How to import an xls/csv file with a Unicode character set into PHP/MySQL?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/895221/

Time: 2020-08-25 00:14:58  Source: igfitidea

php mysql excel unicode

Asked by Jesper Grann Laursen

I want to give users the ability to import a CSV file into my PHP/MySQL system, but I ran into encoding problems when the language is Russian, which Excel can only store in UTF-16 tab-delimited text files.

Right now my database is in latin1, but I will change that to UTF-8 as described in the question "a-script-to-change-all-tables-and-fields-to-the-utf-8-bin-collation-in-mysql".

But how should I import the file, and how should I store the strings?

Should I, for example, convert the text to HTML entities?

I am using the fgetcsv function to get the data out of the CSV file. My code currently looks something like this:


file_put_contents($tmpfile, str_replace("\t", ";", file_get_contents($tmpfile)));
$filehandle = fopen($tmpfile,'r');
while (($data = fgetcsv($filehandle, 1000, ";")) !== FALSE) {
  $values[] = array(
    'id' => $data[0], 
    'type' => $data[1], 
    'text' => $data[4], 
    'desc' => $data[5], 
    'pdf' => $data[7]);
}

Note that if I save the xls file as CSV in Excel, the special characters are replaced by '_', so the only way I can get the Russian characters out of the file is to save it in Excel as a tab-separated file in UTF-16 format.

Answered by Jesper Grann Laursen

Okay, the solution was to export the file from Excel as UTF-16 Unicode text, replace '\t' with ';', and convert from UTF-16 to UTF-8.

file_put_contents($tmpfile, str_replace("\t", ";",  iconv('UTF-16', 'UTF-8', file_get_contents($tmpfile))));

The table in MySQL has to be changed from latin1 to utf8:

ALTER TABLE  `translation` 
CHANGE  `text`  `text` VARCHAR( 100 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
CHANGE  `desc`  `desc` VARCHAR( 255 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL

And then the file could be imported as before.
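
As a rough illustration of the import steps above, the sketch below converts the UTF-16 export to UTF-8 with iconv and parses it with fgetcsv. It rests on assumptions that are not in the original answer: the helper name read_utf16_tsv is hypothetical, the file is assumed to start with a byte-order mark (falling back to little-endian if it does not), and tabs are kept as the delimiter instead of being replaced by ';'.

```php
<?php
// Sketch only: read Excel "Unicode Text" (assumed UTF-16, tab-separated)
// and return its rows as arrays of UTF-8 strings.
// read_utf16_tsv() is a hypothetical helper name, not part of the answer.
function read_utf16_tsv(string $path): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Pick the byte order from the BOM; little-endian is an assumed default.
    $bom = substr($raw, 0, 2);
    if ($bom === "\xFE\xFF") {
        $utf8 = iconv('UTF-16BE', 'UTF-8', substr($raw, 2));
    } elseif ($bom === "\xFF\xFE") {
        $utf8 = iconv('UTF-16LE', 'UTF-8', substr($raw, 2));
    } else {
        $utf8 = iconv('UTF-16LE', 'UTF-8', $raw);
    }
    if ($utf8 === false) {
        throw new RuntimeException('Input could not be converted from UTF-16');
    }

    // Parse the converted text from an in-memory stream, using tab as delimiter.
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, $utf8);
    rewind($stream);

    $rows = [];
    while (($data = fgetcsv($stream, 0, "\t", '"', '\\')) !== false) {
        if ($data === [null]) {
            continue; // fgetcsv() returns an array with a single null for a blank line
        }
        $rows[] = $data;
    }
    fclose($stream);

    return $rows;
}
```

Each returned row can then be mapped to the 'id', 'type', 'text', 'desc' and 'pdf' keys in the same way as in the question's loop.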

When I want to export the data from the database to an Excel file, the CSV version is not an option. It has to be done in Excel's HTML mode, where the data is escaped with e.g. urlencode() or htmlentities().

Here is some example code.


<?php
// $lines is expected to be an array of rows, each row an array of cell values.
header('Content-type: application/vnd.ms-excel');
header('Content-Disposition: attachment; filename="export.xls"');
print ('<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<body>
<div id="Classeur1_16681" align=center x:publishsource="Excel">
<table x:str border=0 cellpadding=0 cellspacing=0 width=100% style="border-collapse: collapse">');
// Print one table row per line of data.
for ($i = 0; $i < count($lines); $i++) {
    print ('<tr><td>');
    print implode("</td><td>", $lines[$i]);
    print ('</td></tr>');
}
?>
</table>
</div>
</body>
</html>
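
The export code above prints the cell values as they are, while the text mentions escaping them first. A minimal sketch of that escaping step could look like the following. The helper name render_table_rows is hypothetical, and htmlspecialchars() is used here in place of the htmlentities() named in the answer, on the assumption that the data is UTF-8 and only the HTML special characters need escaping.

```php
<?php
// Sketch only: build the <tr> rows for the export, escaping every cell.
// render_table_rows() is a hypothetical helper, not part of the answer.
function render_table_rows(array $lines): string
{
    $html = '';
    foreach ($lines as $cells) {
        // Escape <, >, & and quotes; other UTF-8 characters pass through unchanged.
        $escaped = array_map(
            static fn($cell) => htmlspecialchars((string) $cell, ENT_QUOTES, 'UTF-8'),
            $cells
        );
        $html .= '<tr><td>' . implode('</td><td>', $escaped) . "</td></tr>\n";
    }
    return $html;
}
```

In the export script, the for loop could then be replaced by print render_table_rows($lines);.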

Answered by toluju

Alternatively, you could make use of the MySQL LOAD DATA command. This command lets you specify delimiters, the character set, etc. The one caveat is that the server loading the data must have direct visibility of the file, meaning that the file must reside on a filesystem visible to and readable by the DB server.
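
To give an idea of how the delimiter and character set options might be spelled out, here is a sketch that only builds the statement text. Everything specific in it is an assumption rather than part of the answer: the helper name build_load_data_sql, the LOCAL keyword, the CRLF line ending (typical for files written by Excel on Windows), and the example file and table names. As far as I know, the MySQL manual states that LOAD DATA cannot read files in ucs2, utf16, utf16le, or utf32, so the file is assumed to have been converted to UTF-8 first.

```php
<?php
// Sketch only: build a LOAD DATA statement for a tab-separated UTF-8 file.
// build_load_data_sql() is a hypothetical helper, not part of the answer.
function build_load_data_sql(string $fileName, string $table): string
{
    // Table names cannot be bound as parameters, so accept simple identifiers only.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: $table");
    }
    // addslashes() is a simplification for this sketch; real code should use
    // the escaping function of its database driver.
    $file = addslashes($fileName);

    return "LOAD DATA LOCAL INFILE '$file' INTO TABLE `$table` "
         . "CHARACTER SET utf8 "
         . "FIELDS TERMINATED BY '\\t' OPTIONALLY ENCLOSED BY '\"' "
         . "LINES TERMINATED BY '\\r\\n'";
}

// Example call with made-up names.
echo build_load_data_sql('/tmp/import.txt', 'import_tmp'), "\n";
```

The resulting string would then be passed to whatever query function the application already uses.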

Answered by soulmerge

I would not import it using PHP. Instead, consider creating a temporary table and loading your data into it using LOAD DATA INFILE.

$file_handle = fopen($file_name, 'r');
$first_row = fgetcsv($file_handle, 0, ',', '"');
fclose($file_handle);
# Your usual error checking
if (!is_array($first_row)) {
    ...
}
$columns = 'column'.implode(' TEXT, column', array_keys($first_row)).' TEXT';
query("CREATE TABLE $table ($columns) Engine=MyISAM DEFAULT CHARSET=ucs2");
query("LOAD DATA LOCAL INFILE '$file_name' INTO TABLE $table ...

Then you can do whatever you want with the data in that table.

Answered by zmonteca

Okay, my solution was ALSO to export the file from Excel as UTF-16 Unicode text. The only difference was that I read my file using a tab delimiter:

fgetcsv($fp, 999999, "\t", '"')

Answered by Moxet Jan

I tried lots of alternatives, but the easiest and quickest solution is to use Navicat:

http://www.navicat.com/
