PHP: read an ANSI file and convert it to a UTF-8 string

Question

Asked by user192344

Is there any way to do that with PHP?

The data to be inserted looks fine when I print it out.

But when I insert it in the database the field becomes empty.

Answer 1

Answered by Mark Bekkers

$tmp = iconv('YOUR CURRENT CHARSET', 'UTF-8', $string);

or

$tmp = utf8_encode($string);

The strange thing is that you end up with an empty string in your DB. I could understand ending up with some garbage in your DB, but nothing at all (an empty string) is strange.

I just typed this in my console:

iconv -l | grep -i ansi

It showed me:

ANSI_X3.4-1968
ANSI_X3.4-1986
ANSI_X3.4
ANSI_X3.110-1983
ANSI_X3.110
MS-ANSI

These are possible values for YOUR CURRENT CHARSET. As pointed out before, when your input string contains only chars that are allowed in UTF-8, you don't need to convert anything.
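For the file case in the question, the pieces above might fit together as in this minimal sketch. It assumes the file is Windows-1252, spelled `CP1252` here (`MS-ANSI` from the list above is typically an alias for it). The helper name `read_file_as_utf8` and the temporary sample file are made up for illustration.

```php
<?php
// Hypothetical helper: read a legacy-encoded file and return its contents as UTF-8.
// The default 'CP1252' (Windows-1252, Western European "ANSI") is an assumption.
function read_file_as_utf8(string $path, string $sourceCharset = 'CP1252'): string
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $utf8 = iconv($sourceCharset, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Cannot convert $path from $sourceCharset");
    }

    return $utf8;
}

// Usage: write a small CP1252 sample file, then read it back as UTF-8.
$path = tempnam(sys_get_temp_dir(), 'ansi');
file_put_contents($path, "caf\xE9");   // 0xE9 is "é" in CP1252
$text = read_file_as_utf8($path);
unlink($path);

echo bin2hex($text), "\n";             // 636166c3a9 ("é" became the two bytes c3 a9)
```

If the real file uses a different code page, only the second argument needs to change.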

Change UTF-8 to UTF-8//TRANSLIT when you don't want to omit chars but replace them with a look-alike (when they are not in the UTF-8 set).

Answer 2

Answered by Álvaro González

"ANSI" is not really a charset. It's a short way of saying "whatever charset is the default in the computer that creates the data". So you have a double task:

1. Find out what charset the data is using.
2. Use an appropriate function to convert it into UTF-8.

For #2, I'm normally happy with iconv(), but utf8_encode() can also do the job if the source data happens to use ISO-8859-1.
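A small sketch of why the first task matters: the same byte becomes a different character depending on which source charset is assumed. The byte value and the two charsets are chosen only for illustration. The sketch uses iconv() for both cases, since utf8_encode() handles ISO-8859-1 only and is deprecated as of PHP 8.2.

```php
<?php
// The same byte decodes to different characters depending on the assumed source charset.
$byte = "\xE9";

$asLatin1   = iconv('ISO-8859-1', 'UTF-8', $byte); // "é" (U+00E9)
$asCyrillic = iconv('CP1251', 'UTF-8', $byte);     // "й" (U+0439, Windows-1251)

echo bin2hex($asLatin1), "\n";   // c3a9
echo bin2hex($asCyrillic), "\n"; // d0b9
```

Neither call fails, so a wrong guess at the source charset goes unnoticed until someone reads the text.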

Update

It looks like you don't know what charset your data is using. In some cases, if you know the country and language of the user (e.g., Spain/Spanish), you can figure it out from the default encoding that Microsoft Windows uses in that territory.
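As a rough illustration of that idea, a lookup table from language to the usual Windows "ANSI" code page could look like the sketch below. The function name `guess_windows_charset` is hypothetical and the table is deliberately incomplete. The result is only a guess, because a given machine can be configured differently.

```php
<?php
// Illustrative, incomplete map from a language code to the legacy "ANSI" code page
// that Windows typically uses for it. The function name is hypothetical.
function guess_windows_charset(string $language): ?string
{
    $map = [
        'es' => 'CP1252', // Spanish: Western European
        'en' => 'CP1252',
        'fr' => 'CP1252',
        'de' => 'CP1252',
        'pl' => 'CP1250', // Central European
        'cs' => 'CP1250',
        'ru' => 'CP1251', // Cyrillic
        'el' => 'CP1253', // Greek
        'tr' => 'CP1254', // Turkish
    ];

    // null means "no guess available", so the caller has to decide what to do.
    return $map[strtolower($language)] ?? null;
}

$charset = guess_windows_charset('es');
echo $charset ?? 'unknown', "\n"; // CP1252
```

The returned name can then be passed as the first argument to iconv().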

Answer 3

Answered by Victor Priceputu

Be careful: iconv() can return false if the conversion fails.
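A failed conversion is also one possible explanation for the empty database field, because false turns into an empty string when it is cast to a string. The sketch below checks the return value explicitly. The wrapper name `to_utf8_or_fail` is made up for illustration, and the invalid input is constructed on purpose.

```php
<?php
// Hypothetical wrapper: convert to UTF-8, or fail loudly instead of passing false along.
function to_utf8_or_fail(string $input, string $sourceCharset): string
{
    // @ silences the notice iconv() raises on bad input; the return value is checked instead.
    $converted = @iconv($sourceCharset, 'UTF-8', $input);
    if ($converted === false) {
        throw new UnexpectedValueException("iconv() could not convert from $sourceCharset");
    }
    return $converted;
}

// The byte 0xFF never occurs in valid UTF-8, so declaring this input as UTF-8 makes iconv() fail.
$bad = "abc\xFF";
$result = @iconv('UTF-8', 'UTF-8', $bad);

var_dump($result === false);        // whether the conversion failed
var_dump((string) $result === '');  // false casts to an empty string
```

Throwing an exception is one design choice; logging and skipping the row would work as well, as long as false never reaches the INSERT.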

I am also having a somewhat similar problem: some Chinese characters are mistaken for \n if the file is encoded in UNICODE, but not if it is UTF-8.

To get back to your problem, make sure the encoding of your file is the same as that of your database. Also, using utf8_encode() on text that is already UTF-8 can have unpleasant results. Try using mb_detect_encoding() to see the encoding of the file, but unfortunately this doesn't always work. There is no easy fix for character encoding from what I can see :(
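One way to avoid converting text that is already UTF-8 is to validate it first and convert only when validation fails. This is a sketch, not a complete solution. It swaps detection for a validity check with mb_check_encoding(), which needs the mbstring extension. The function name `ensure_utf8` and the `CP1252` fallback are assumptions, and a legacy-encoded string that happens to be valid UTF-8 would slip through unconverted.

```php
<?php
// Hypothetical helper: leave valid UTF-8 alone, otherwise convert from an assumed charset.
function ensure_utf8(string $input, string $fallbackCharset = 'CP1252'): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already valid UTF-8: converting again would double-encode it
    }

    $converted = iconv($fallbackCharset, 'UTF-8', $input);
    if ($converted === false) {
        throw new UnexpectedValueException("Cannot convert input from $fallbackCharset");
    }
    return $converted;
}

$alreadyUtf8 = ensure_utf8("caf\xC3\xA9"); // valid UTF-8, returned unchanged
$fromLegacy  = ensure_utf8("caf\xE9");     // not valid UTF-8, converted from CP1252

var_dump($alreadyUtf8 === $fromLegacy);    // both now hold the same UTF-8 bytes
```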
