Original URL: http://stackoverflow.com/questions/9002701/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
$_POST will convert from utf-8 to ?¤ ?? ?? etc
Asked by lungov
I am new here, so I apologize if I am doing anything wrong.
I have a form which submits user input onto another page. User is expected to type ?, ?, é, etc... I have placed all of the following in the document:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
header('Content-Type:text/html; charset=UTF-8');
<form action="whatever.php" accept-charset="UTF-8">
I even tried:
ini_set('default_charset', 'UTF-8');
When the other page loads, I need to check what the user input with something like:
if ( $_POST['field'] == $check ) {
...
}
But if he inputs something like 'München', PHP will compare 'M??nchen' with 'München' and will never trigger TRUE even though it should. Since UTF-8 is specified everywhere, I am guessing that the server is converting to something else (Windows-1252, as I read on another thread) because it does not support or is not configured for UTF-8. I am using Apache on a local server before I load into production; I have not changed (and don't know how to change) any of the default settings. I've been working on Windows 7, editing with Notepad++ and encoding my files in ANSI. If I bin2hex('München') I get '4dc3bc6e6368656e'.
If I echo $_POST['field']; it displays 'München' correctly.
I have researched everywhere for an explanation, all I find is that I should include those tags/headings I already have.
Any help is much appreciated.
Answered by gioele
You are facing many different problems at the same time; let's start with the simplest one.
Problem 1) You say that echo $_POST['field']; will display it correctly? What do you mean by "display"? It can be displayed correctly in two cases:
- either the field is in UTF-8 and your page has been declared as UTF-8 and the browser is displaying it as UTF-8 or,
- the field is in Latin-1 and the browser has decided (through the auto-detection heuristics) that your page is in Latin-1.
So, the fact that echo $_POST['field']; is correct tells you nothing.
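Since the rendered output is ambiguous, it can help to look at the raw bytes instead. The following is only a sketch: describe_bytes() is a hypothetical helper name, and the two sample strings are written with byte escapes so the result does not depend on how the file itself is saved. In a real page you would pass $_POST['field'] instead.

```php
<?php
// Hypothetical helper: describe the raw bytes of a submitted value.
function describe_bytes(string $value): array
{
    return [
        'hex'        => bin2hex($value),
        // preg_match() with the /u modifier fails on byte sequences
        // that are not valid UTF-8, so it doubles as a validity check.
        'valid_utf8' => preg_match('//u', $value) === 1,
    ];
}

$utf8   = "M\xC3\xBCnchen"; // UTF-8: the u-umlaut is the two bytes C3 BC
$latin1 = "M\xFCnchen";     // Latin-1 / Windows-1252: the single byte FC

print_r(describe_bytes($utf8));   // hex is 4dc3bc6e6368656e, as in the question
print_r(describe_bytes($latin1)); // hex is 4dfc6e6368656e, not valid UTF-8
```

If the mbstring extension is available, mb_check_encoding($value, 'UTF-8') is a more readable way to do the same validity check.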
Problem 2) You are using
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
header('Content-Type:text/html; charset=UTF-8');
Is this PHP code? If it is, it will be an error, because the header must be set before any byte of output is sent. If you call header() after the <meta> tag has already been output, the Content-Type header will not be set and PHP should generate a warning.
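One way to see whether this is happening is to check headers_sent() before calling header(). This is a minimal sketch, assuming the call sits at the very top of the script; send_utf8_header() is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: send the charset header only while that is still possible.
function send_utf8_header(): bool
{
    if (headers_sent($file, $line)) {
        // Output has already started, so header() would only raise a warning.
        error_log("Cannot set Content-Type: output started at $file:$line");
        return false;
    }
    header('Content-Type: text/html; charset=UTF-8');
    return true;
}

// Call this first, before any HTML, echo, or whitespace outside the PHP tags.
$sent = send_utf8_header();
```

If $sent is false, the error log entry names the file and line where output began, which is usually where the stray HTML or whitespace lives.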
Problem 3) You are using
<form action="whatever.php" accept-charset="UTF-8">
Some browsers (IE, mostly) ignore accept-charset if they can coerce the data to be sent in ASCII or ISO Latin-1. So the data will either be in UTF-8 and declared as ISO Latin-1, or be declared as ISO Latin-1 and sent as ISO Latin-1 (but this second case is not your case).
Have a look at https://stackoverflow.com/a/8547004/449288 to see how to solve this problem.
Problem 4) Which strings are you comparing? For example, if you have
$city = "München";
$_POST['city'] == $city
The result of this code will depend on the encoding of the PHP file. If the file is encoded in ISO Latin-1 and the $_POST correctly contains UTF-8 data, the == will compare different bytes and will return false.
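The mismatch is easy to reproduce. In this sketch, $fromSourceFile and $fromPost are hypothetical stand-ins for the hardcoded literal and the submitted value, written with byte escapes so the example behaves the same however the file is saved. It assumes the iconv extension is available.

```php
<?php
// Hypothetical stand-ins for the two sides of the comparison.
$fromSourceFile = "M\xFCnchen";     // literal as stored in a Latin-1 / Windows-1252 file
$fromPost       = "M\xC3\xBCnchen"; // the same word as UTF-8 bytes, as a UTF-8 form sends it

var_dump($fromPost === $fromSourceFile); // bool(false): the byte sequences differ

// Convert the literal to UTF-8 before comparing.
$converted = iconv('Windows-1252', 'UTF-8', $fromSourceFile);
var_dump($fromPost === $converted);      // bool(true)
```

Converting at comparison time works, but it only papers over the mismatch; keeping the source file and the form data in the same encoding avoids the conversion altogether.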
Answered by Jeremy Harris
Another solution that may be helpful is in Apache: you can place a directive called AddDefaultCharset in your configuration file (httpd.conf) or .htaccess. It looks like this:
AddDefaultCharset utf-8
http://httpd.apache.org/docs/2.0/mod/core.html#adddefaultcharset
That will override any other default charsets.
Answered by ujjwal singh
I changed "mbstring.detect_order = pass" in my php.ini file and it worked.
Answered by zrvan
This is due to the character encoding of the PHP file(s).
The hardcoded München is stored with the character encoding of the source file(s), in this case ANSI, and when that value is compared to the UTF-8 encoded value provided in the $_POST variable, the two will, quite naturally, differ.
The solution to your problem is one of:
- Serve and process content with the same encoding as that of the source file(s), in this case likely to be windows-1252.
  - This would, for starters, include changing the content="text/html; charset=UTF-8" to content="text/html; charset=windows-1252" whenever serving HTML data.
- Avoid all hardcoded values that could be affected by character encoding issues between UTF-8 and windows-1252, more or less only hardcode values that only include English letters and numbers.
  - Any UTF-8 values would have to be read from a source that ensures they are UTF-8 encoded (for instance a database set to use UTF-8 as storage encoding as well as connection encoding).
- Wrap all hardcoded assignments in utf8_encode(), for instance $value = utf8_encode('München');
- Change the encoding of the source file(s) to UTF-8.
  - This can be accomplished in any number of ways; a decent text editor will be able to do it, or the outstanding libiconv can be used, especially for batch processing.
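For the last option, the conversion can also be scripted in PHP itself. This is a rough sketch, not a replacement for a proper editor or the iconv command line tool: to_utf8() is a hypothetical helper, it assumes the iconv extension is available, and it assumes that anything which is not already valid UTF-8 was saved as Windows-1252 (what Notepad++ calls ANSI on a Western European Windows setup).

```php
<?php
// Hypothetical helper: return $bytes as UTF-8, assuming any input that is
// not already valid UTF-8 was saved as Windows-1252.
function to_utf8(string $bytes): string
{
    if (preg_match('//u', $bytes) === 1) {
        return $bytes; // already valid UTF-8 (plain ASCII included)
    }
    $converted = iconv('Windows-1252', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException('Input is not valid Windows-1252 either');
    }
    return $converted;
}

// The path below is only an example; uncomment to rewrite a file in place.
// $path = 'whatever.php';
// file_put_contents($path, to_utf8(file_get_contents($path)));
```

The validity check up front makes the helper safe to run twice on the same file, since text that is already UTF-8 is returned unchanged. Note also that utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2, so an explicit conversion like this one is the safer long-term choice.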
Either solution 1 or 4 would be my preferred solution, especially if multiple people are involved in the project.
As a side-note, some text editors (notably Notepad++) have the option of using either UTF-8 or UTF-8 without BOM. The BOM (Byte Order Mark) is pointless in UTF-8 and will cause problems when writing headers in PHP (most often when doing a redirect). This is because the BOM is right in front of the initial <?php, causing the server to send the BOM just as it would if there had been any other character in front. The difference is that you would notice a character in front, but the BOM isn't displayed.
Rule of thumb: Always use UTF-8 without BOM.
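Because the BOM is invisible in most editors, checking for it programmatically can save some guessing. A minimal sketch, with strip_utf8_bom() as a hypothetical helper name:

```php
<?php
// The UTF-8 BOM is the three bytes EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: drop a leading BOM if one is present.
function strip_utf8_bom(string $bytes): string
{
    return strncmp($bytes, UTF8_BOM, 3) === 0 ? substr($bytes, 3) : $bytes;
}

// Simulated file contents that start with a BOM.
$withBom = UTF8_BOM . "<?php echo 'hi';";
var_dump(strip_utf8_bom($withBom) === "<?php echo 'hi';"); // bool(true)
```

To check a real file, pass file_get_contents($path) to the helper and compare the lengths before and after.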
Answered by Mohammad Saberi
I've used Unicode characters in my forms and files many times and have not had any problems so far. Try these steps and check the result:
- Remove header('Content-Type:text/html; charset=UTF-8'); from your HTML form codes.
- Use your form just like <form action="whatever.php"> without accept-charset="UTF-8". (It's better to insert the method of sending data in your form tag).
- In target page (whatever.php), insert again <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> in a <head> tag.
I have always set up my projects as described here and have not had any problems with Unicode strings.