Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/10124930/

Date: 2020-08-24 21:26:31  Source: igfitidea

Encoding SQL_Latin1_General_CP1_CI_AS into UTF-8

Tags: php, xml, sql-server-2008, encoding, utf-8

Asked by SGr

I'm generating an XML file with PHP using DomDocument and I need to handle Asian characters. I'm pulling data from the MSSQL2008 server using the pdo_mssql driver and I apply utf8_encode() to the XML attribute values. Everything works fine as long as there are no special characters.

The server is MS SQL Server 2008 SP3

The database, table and column collation are all SQL_Latin1_General_CP1_CI_AS

I'm using PHP 5.2.17

Here's my PDO object:

$pdo = new PDO("mssql:host=MyServer,1433;dbname=MyDatabase", "user123", "password123");

My query is a basic SELECT.

I know storing special characters in SQL_Latin1_General_CP1_CI_AS columns isn't great, but ideally I'd like to make it work without changing the collation, because other non-PHP programs already use that column and it works fine for them. In SQL Server Management Studio I can see the Asian characters correctly.

Considering all the details above, how should I process the data?

Answered by SGr

I found how to solve it, so hopefully this will be helpful to someone.

First, SQL_Latin1_General_CP1_CI_AS seems to be a strange mix of CP-1252 and a 2-byte Unicode encoding (UCS-2LE, judging by the conversion that ended up working). The basic characters are CP-1252, which is why utf8_encode() was all I needed for them and everything worked. The Asian and other non-Latin characters are encoded on 2 bytes, and the PHP pdo_mssql driver seems to hate varying-length characters, so it seems to do a CAST to varchar (instead of nvarchar), and then all the 2-byte characters become question marks ('?').

I fixed it by casting the column to binary in the query and then rebuilding the text in PHP:

SELECT CAST(MY_COLUMN AS VARBINARY(MAX)) FROM MY_TABLE;

In PHP:

// $bin holds the VARBINARY value fetched by the query above
// Binary to hexadecimal
$hex = bin2hex($bin);

// And then from hex back to a byte string
$str = "";
for ($i = 0; $i < strlen($hex) - 1; $i += 2)
{
    $str .= chr(hexdec($hex[$i] . $hex[$i + 1]));
}
// And then from UCS-2LE (that's how the column's data is stored in the DB) to UTF-8
$str = iconv('UCS-2LE', 'UTF-8', $str);
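
The hex round trip above ends with the same bytes it started from, so the loop may be unnecessary when the driver already returns raw bytes. Here is a minimal sketch of that idea, assuming $bin holds the raw VARBINARY bytes of an nvarchar value. The sample bytes and the helper name ucs2leToUtf8 are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: convert raw UCS-2LE bytes (as stored in an nvarchar
// column) to a UTF-8 string. Returns false if iconv cannot convert the input.
function ucs2leToUtf8($bin)
{
    return iconv('UCS-2LE', 'UTF-8', $bin);
}

// Sample bytes standing in for a fetched VARBINARY value:
// "A" (U+0041) followed by U+65E5, each stored low byte first.
$bin = "\x41\x00\xE5\x65";

// The bin2hex()/hexdec() loop reproduces the input bytes unchanged...
$hex = bin2hex($bin);
$roundTrip = '';
for ($i = 0; $i < strlen($hex) - 1; $i += 2) {
    $roundTrip .= chr(hexdec($hex[$i] . $hex[$i + 1]));
}

// ...so converting $bin directly gives the same result as converting $roundTrip.
$utf8 = ucs2leToUtf8($bin);
echo $utf8, "\n";
```

If your driver returns the VARBINARY column as a hex string instead of raw bytes, decode it first (for example with pack("H*", ...)) before calling iconv.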

Answered by Juliano Barreto

I know this post is old, but the only thing that worked for me was iconv("CP850", "UTF-8//TRANSLIT", $var); I had the same issues with SQL_Latin1_General_CP1_CI_AI, so maybe it works for SQL_Latin1_General_CP1_CI_AS too.
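
One way to check which code page your bytes are actually in is to convert a known byte from each candidate and see which result looks right. This is a small sketch with a sample byte chosen for illustration: 0x82 is "é" in CP850 but a low quotation mark in Windows-1252.

```php
<?php
// Sample byte, chosen for illustration: 0x82 means "é" (U+00E9) in CP850
// but the single low-9 quotation mark (U+201A) in Windows-1252.
$raw = "\x82";

$fromCp850  = iconv('CP850', 'UTF-8//TRANSLIT', $raw);
$fromCp1252 = iconv('CP1252', 'UTF-8//TRANSLIT', $raw);

// Print the resulting UTF-8 bytes as hex so the difference is visible
// regardless of the terminal's own encoding.
echo bin2hex($fromCp850), "\n";
echo bin2hex($fromCp1252), "\n";
```

If the CP850 result is the one that matches what you see in SQL Server Management Studio, the bytes reaching PHP are probably in the DOS code page rather than Windows-1252.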

Answered by Flavio Ariano

You can try this:

header("Content-Type: text/html; charset=utf-8");
$dbhost = "hostname";
$db     = "database";
$query  = "SELECT *
    FROM Estado
    ORDER BY Nome";
$conn = new PDO("sqlsrv:server=$dbhost ; Database = $db", "", "");
// Use a buffered scrollable cursor and ask the driver for the system code page
$stmt = $conn->prepare($query, array(
    PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL,
    PDO::SQLSRV_ATTR_CURSOR_SCROLL_TYPE => PDO::SQLSRV_CURSOR_BUFFERED,
    PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_SYSTEM,
));
$stmt->execute();
while ($row = $stmt->fetch(PDO::FETCH_ASSOC))
{
    // CP1252 is the Windows Latin-1 code page; convert to UTF-8 to match the header above
    print iconv("CP1252", "UTF-8", "$row[Nome] <br>");
}

Answered by M Rostami

Thanks @SGr for the answer.
I found a better way to do that:

SELECT CAST(CAST(MY_COLUMN AS VARBINARY(MAX)) AS VARCHAR(MAX)) as MY_COLUMN FROM MY_TABLE;
You can also try:
SELECT CAST(MY_COLUMN AS VARBINARY(MAX)) as MY_COLUMN FROM MY_TABLE;

And in PHP you just need to convert it to UTF-8:

$string = iconv('UCS-2LE', 'UTF-8', $row['MY_COLUMN']);

Answered by Wan

For me, none of the above was the direct solution, though I did use parts of the above solutions. This worked for me with the Vietnamese alphabet. If you come across this post and none of the above works for you, try:

    $req = "SELECT CAST(MY_COLUMN as VARBINARY(MAX)) as MY_COLUMN FROM MY_TABLE";
    $stmt = $conn->prepare($req);
    $stmt->execute();
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        // The column arrives as a hex string; turn it back into raw bytes
        $str = pack("H*", $row['MY_COLUMN']);
        // Convert the UCS-2LE bytes to HTML entities
        $str = mb_convert_encoding($str, 'HTML-ENTITIES', 'UCS-2LE');
        print_r($str);
    }

And a little bonus: I had to json_encode this data and was (duh) getting HTML entities instead of the special characters. To fix that, just use html_entity_decode() on the strings before sending them through json_encode.
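
If the end goal is JSON rather than HTML, the entity step can probably be skipped by converting straight to UTF-8. This is a minimal sketch under that assumption, not the method used above. The hex string is a made-up sample standing in for $row['MY_COLUMN'], and JSON_UNESCAPED_UNICODE needs PHP 5.4 or later.

```php
<?php
// Made-up sample standing in for $row['MY_COLUMN'] when the driver returns
// the VARBINARY column as a hex string: the word "Việt" in UCS-2LE.
$hexFromDb = '56006900c71e7400';

// Hex string back to raw bytes, then UCS-2LE to UTF-8 in one step.
$bytes = pack('H*', $hexFromDb);
$utf8  = mb_convert_encoding($bytes, 'UTF-8', 'UCS-2LE');

// json_encode() expects UTF-8 input; the flag keeps non-ASCII characters
// readable instead of emitting \uXXXX escapes.
$json = json_encode(array('name' => $utf8), JSON_UNESCAPED_UNICODE);
echo $json, "\n";
```

Without the flag, json_encode() writes \uXXXX escapes, which are still valid JSON and decode to the same text.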

Answered by Klompenrunner

No need for crazy stuff. The character encoding for collation SQL_Latin1_General_CP1_CI_AS is Windows-1252.

This works perfectly for me: $str = mb_convert_encoding($str, 'UTF-8', 'Windows-1252');
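
A small sketch of why the source encoding matters here: utf8_encode() assumes ISO-8859-1, which differs from Windows-1252 in the 0x80 to 0x9F range. The sample byte is chosen for illustration.

```php
<?php
// Sample byte for illustration: 0x80 is the euro sign (U+20AC) in
// Windows-1252, but a control character (U+0080) in ISO-8859-1.
$raw = "\x80";

$asWindows1252 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
$asLatin1      = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

// Print the resulting UTF-8 bytes as hex to make the difference visible.
echo bin2hex($asWindows1252), "\n";
echo bin2hex($asLatin1), "\n";
```

Note that this only helps for text that fits in Windows-1252. Characters outside that code page, such as Asian characters, cannot be represented in a single Windows-1252 byte, so they have to be recovered another way.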

Answered by Pierre-Olivier

By default, PDO uses PDO::SQLSRV_ENCODING_UTF8 for sending/receiving data.

If your current collation is LATIN1, have you tried specifying PDO::SQLSRV_ENCODING_SYSTEM to let PDO know that you want to use the current system encoding instead of UTF-8?

You could even use PDO::SQLSRV_ENCODING_BINARY, which returns data in binary form (no encoding or translation is done when transferring data). This way, you can handle character encoding on your side.
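
A minimal sketch of how these constants might be passed, assuming a connection opened with the pdo_sqlsrv driver. The helper name fetchRawColumn and the table and column names are placeholders, and which bytes come back depends on the column type.

```php
<?php
// Hypothetical helper: $conn is assumed to be a PDO connection opened with
// the pdo_sqlsrv driver; MY_COLUMN and MY_TABLE are placeholder names.
function fetchRawColumn(PDO $conn)
{
    // Request this statement's data as raw bytes, with no conversion by
    // the driver, so the caller can decode it explicitly.
    $stmt = $conn->prepare(
        'SELECT MY_COLUMN FROM MY_TABLE',
        array(PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_BINARY)
    );
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Alternatively, switch the whole connection to the system code page:
// $conn->setAttribute(PDO::SQLSRV_ATTR_ENCODING, PDO::SQLSRV_ENCODING_SYSTEM);
```

The binary option is set per statement here, which keeps the rest of the connection on its default encoding.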

More documentation here: http://ca3.php.net/manual/en/ref.pdo-sqlsrv.php
