Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1388607/

Encoding problem (UTF-8) in PHP

Tags: php, encoding, utf-8

Asked by caw

I want to output the following string in PHP:

ä ö ü ß €

Therefore, I've encoded it to utf8 manually:

Ã¤ Ã¶ Ã¼ ÃŸ ?

So my script is:

<?php
header('content-type: text/html; charset=utf-8');
echo 'Ã¤ Ã¶ Ã¼ ÃŸ ?';
?>

The first 4 characters are correct (ä ö ü ß) but unfortunately the € sign isn't correct:

ä ö ü ß

Here you can see it.

Can you tell me what I've done wrong? My editor (Notepad++) has settings for Encoding (Ansi/UTF-8) and Format (Windows/Unix). Do I have to change them?

I hope you can help me. Thanks in advance!

Answer by Dominic Rodger

That last character just isn't in the file (try viewing the source), which is why you don't see it.

I think you might be better off saving the PHP file as UTF-8 (in Notepad++ that option is available under Format -> Encode in UTF-8 without BOM) and inserting the actual characters in your PHP file (i.e. in Notepad++), rather than hacking around with inserting Ã everywhere. You may find Windows Character Map useful for inserting Unicode characters.
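
To confirm that a file really was saved without a BOM, one option is to look at its first three bytes. This is only a sketch: the helper name `has_utf8_bom` and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: true if the byte string starts with the UTF-8 BOM (EF BB BF).
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Sample data standing in for the contents of two saved files.
$savedWithBom    = "\xEF\xBB\xBFhello";
$savedWithoutBom = "hello";

var_dump(has_utf8_bom($savedWithBom));    // bool(true)
var_dump(has_utf8_bom($savedWithoutBom)); // bool(false)
```

For a real file, something like `has_utf8_bom(file_get_contents($path))` should work, where `$path` is whichever script you want to check. A BOM in front of `<?php` is sent to the browser as output, which can make a later `header()` call fail.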

Answer by Joey

The Euro sign (U+20AC) is encoded in UTF-8 with three bytes, not two. This can be seen here. So your encoding is simply wrong.
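
The byte counts are easy to check from PHP itself. A small sketch, assuming PHP 7.0 or later for the `"\u{...}"` escape syntax; the array keys are just labels chosen for this example.

```php
<?php
// "\u{...}" escapes always produce the UTF-8 bytes of the given code point.
$chars = [
    'a-umlaut' => "\u{00E4}",
    'o-umlaut' => "\u{00F6}",
    'u-umlaut' => "\u{00FC}",
    'sharp-s'  => "\u{00DF}",
    'euro'     => "\u{20AC}",
];

foreach ($chars as $name => $char) {
    // strlen() counts bytes, bin2hex() shows them.
    printf("%-8s %d bytes: %s\n", $name, strlen($char), bin2hex($char));
}
```

The four letters come out as two bytes each, while the Euro sign comes out as the three bytes `e282ac`.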

Answer by velcrow

If you want to output it properly to utf8, your script should be:

<?php
header('content-type: text/html; charset=utf-8');
echo "\xc3\xa4"."\xc3\xb6"."\xc3\xbc"."\xc3\x9f"."\xe2\x82\xac";
?>

That way even if your php script is saved to a non-utf-8 encoding, it will still work.
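
On PHP 7.0 or later the same ASCII-only approach can also be written with code point escapes, which some people find easier to read than raw bytes. A sketch that compares the two forms; it assumes the mbstring extension is loaded for `mb_check_encoding()`.

```php
<?php
// Byte escapes as in the script above: the source file stays pure ASCII.
$bytes = "\xc3\xa4" . "\xc3\xb6" . "\xc3\xbc" . "\xc3\x9f" . "\xe2\x82\xac";

// Equivalent code point escapes (PHP 7.0+).
$codePoints = "\u{E4}\u{F6}\u{FC}\u{DF}\u{20AC}";

var_dump($bytes === $codePoints);             // bool(true)
var_dump(mb_check_encoding($bytes, 'UTF-8')); // bool(true)
```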

Answer by Artelius

You should always set your editor to the same encoding that the generated HTML instructs the browser to use. If the HTML page is intended to be interpreted as UTF-8, then set your text editor to UTF-8. PHP is completely unaware of the encoding settings of the editor used to create the file; it treats strings as a stream of bytes.

In other words, as long as the right bytes are in the file, everything will work. And the easiest way to ensure the right bytes are in the file, is to set your encoding to the same one the web page is supposed to be in. Anything else just makes life more difficult than it needs to be.
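
The "stream of bytes" point shows up when you compare byte-oriented and character-oriented functions. A minimal sketch, assuming PHP 7.0+ and the mbstring extension:

```php
<?php
// Five characters (ä ö ü ß €), written with escapes so the source stays ASCII.
$text = "\u{E4}\u{F6}\u{FC}\u{DF}\u{20AC}";

$byteCount = strlen($text);             // counts bytes
$charCount = mb_strlen($text, 'UTF-8'); // counts code points

echo "bytes: $byteCount, characters: $charCount\n"; // bytes: 11, characters: 5
```

`strlen()` reports 11 because four of the characters take two bytes each and the Euro sign takes three.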

But the best defence is to leave non-ASCII characters out of the code completely. You can pull them out of a database or localisation file instead. This means the code can be modified in essentially any editor without worrying about damaging the encoding.
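
One way this could look is a JSON localisation file whose non-ASCII characters are written as `\uXXXX` escapes. The sketch below is only an illustration: the keys are invented, and the JSON is held in a string here where a real project would read it from a file. `JSON_THROW_ON_ERROR` assumes PHP 7.3 or later.

```php
<?php
// Hypothetical localisation data; the single-quoted string keeps the
// \uXXXX escapes literal so that json_decode() is what interprets them.
$json = '{"currency_symbol": "\u20ac", "accessories": "Zubeh\u00f6r"}';

// json_decode() turns \uXXXX escapes into UTF-8 bytes.
$strings = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Escape for HTML, telling htmlspecialchars() the data is UTF-8.
echo htmlspecialchars($strings['accessories'], ENT_QUOTES, 'UTF-8'), "\n";
```

The PHP source contains only ASCII, so it survives being opened and saved in an editor set to a different encoding.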

Answer by vimal1083

header('Content-Type: text/html; charset=UTF-8');

This just informs the browsers what kind of content you're going to send it and how it should treat it. It does not set the encoding of the actual content you're sending. It's completely up to you to fulfil your own promise. Your content is not going to magically transform from whatever to UTF-8 just because you set that header. If you tell the browser to treat the content as UTF-8, but you're sending it Latin-1 encoded data, of course it will break.
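
If the data really is Latin-1, it has to be converted explicitly before it is sent. A sketch under the assumption that the input is ISO-8859-1 and that the mbstring extension is available; the sample word is only an example.

```php
<?php
// Assumption for this sketch: the data arrives as ISO-8859-1 (Latin-1).
// "\xF6" is o-umlaut in Latin-1.
$latin1 = "Zubeh\xF6r";

// These bytes are not valid UTF-8, so a browser told "charset=utf-8"
// cannot decode them correctly.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Convert explicitly so the bytes match what the header promises.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump($utf8 === "Zubeh\u{F6}r");            // bool(true)
```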

I refer you to "What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text".

Answer by Djomla

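This worked for me:

    // Convert only when $value is NOT already valid UTF-8;
    // re-encoding valid UTF-8 would double-encode it.
    // Assumes other input is ISO-8859-1, which is what utf8_encode() assumed.
    if (!mb_check_encoding($value, 'UTF-8')) {
      return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
    }
    else {
      return $value;
    }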
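
Converting text that is already UTF-8 encodes it a second time, which is where sequences like "Ã¤" come from. The sketch below shows that effect, along with a guard that converts only invalid input. The function name `to_utf8` is hypothetical, and the ISO-8859-1 source encoding is an assumption.

```php
<?php
// Hypothetical guard: leave valid UTF-8 alone, otherwise assume ISO-8859-1.
function to_utf8(string $value): string
{
    return mb_check_encoding($value, 'UTF-8')
        ? $value
        : mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$utf8 = "\u{E4}"; // a-umlaut, already UTF-8: bytes c3 a4

// Treating those two bytes as Latin-1 and converting again encodes each
// byte separately, giving the bytes that display as "Ã¤".
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";          // c3a4
echo bin2hex($double), "\n";        // c383c2a4
echo bin2hex(to_utf8($utf8)), "\n"; // c3a4
```

The validity check is a heuristic. Some Latin-1 byte sequences happen to be valid UTF-8 as well, so knowing the real source encoding is more reliable than guessing.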

Source: https://github.com/jdorn/php-reports/issues/100

Answer by TechyFlick

Try this, it works for me. This code will change Ã¶ to ö:

<?php

header('Content-Type: text/html; charset=UTF-8');
echo $category = 'Computer & ZubehÃ¶r';
exit;

?>

Result: Computer & Zubehör