PHP escape characters
Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/2870872/
Escaping escape Characters
Asked by Alix Axel
I'm trying to mimic the json_encode bitmask flags implemented in PHP 5.3.0. Here is the string I have:
$s = addslashes('O\'Rei"lly'); // O\'Rei\"lly
Doing json_encode($s, JSON_HEX_APOS | JSON_HEX_QUOT) outputs the following:
"O\\\u0027Rei\\\u0022lly"
And I'm currently doing this in PHP versions older than 5.3.0:
str_replace(array('\"', "\\'"), array('\u0022', '\\\u0027'), json_encode($s))
or
str_replace(array('\"', '\\\''), array('\u0022', '\\\u0027'), json_encode($s))
Which correctly outputs the same result:
"O\\\u0027Rei\\\u0022lly"
I'm having trouble understanding why I need to replace single quotes ('\\\'' or even "\\'" [surrounding quotes excluded]) with '\\\u0027' and not just '\\u0027'.
Here is the code that I'm having trouble porting to PHP < 5.3:
if (get_magic_quotes_gpc() && version_compare(PHP_VERSION, '6.0.0', '<'))
{
    /* JSON_HEX_APOS and JSON_HEX_QUOT are available */
    if (version_compare(PHP_VERSION, '5.3.0', '>=') === true)
    {
        $_GET = json_encode($_GET, JSON_HEX_APOS | JSON_HEX_QUOT);
        $_POST = json_encode($_POST, JSON_HEX_APOS | JSON_HEX_QUOT);
        $_COOKIE = json_encode($_COOKIE, JSON_HEX_APOS | JSON_HEX_QUOT);
        $_REQUEST = json_encode($_REQUEST, JSON_HEX_APOS | JSON_HEX_QUOT);
    }
    /* mimic the behaviour of JSON_HEX_APOS and JSON_HEX_QUOT */
    else if (extension_loaded('json') === true)
    {
        $_GET = str_replace(array(), array('\u0022', '\u0027'), json_encode($_GET));
        $_POST = str_replace(array(), array('\u0022', '\u0027'), json_encode($_POST));
        $_COOKIE = str_replace(array(), array('\u0022', '\u0027'), json_encode($_COOKIE));
        $_REQUEST = str_replace(array(), array('\u0022', '\u0027'), json_encode($_REQUEST));
    }
    $_GET = json_decode(stripslashes($_GET));
    $_POST = json_decode(stripslashes($_POST));
    $_COOKIE = json_decode(stripslashes($_COOKIE));
    $_REQUEST = json_decode(stripslashes($_REQUEST));
}
Answered by awatts
The PHP string
'O\'Rei"lly'
is just PHP's way of getting the literal value
O'Rei"lly
into a string which can be used. Calling addslashes on that string changes it to be literally the following 11 characters
O\'Rei\"lly
i.e. strlen(addslashes('O\'Rei"lly')) == 11
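A quick way to check those character counts, as a minimal sketch (the variable names $raw and $slashed are mine, not from the answer):

```php
<?php
// The PHP literal 'O\'Rei"lly' holds the 9 characters O'Rei"lly.
$raw = 'O\'Rei"lly';

// addslashes() puts a backslash in front of the ' and the ".
$slashed = addslashes($raw);

echo strlen($raw), "\n";     // 9
echo strlen($slashed), "\n"; // 11
echo $slashed, "\n";         // O\'Rei\"lly
```

The backslash in the PHP source literal never reaches the string itself; the two backslashes counted here are the ones addslashes added.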
This is the value which is being sent to json_encode.
In JSON the backslash is an escape character, so it needs to be escaped, i.e.
\ to be \\
Single and double quotes can also cause problems, and converting them to their Unicode escape sequences is one way to avoid that. So later versions of PHP's json_encode can change (with the JSON_HEX_APOS and JSON_HEX_QUOT flags)
' to be \u0027
and
" to be \u0022
So applying these three rules to
O\'Rei\"lly
gives us
O\\\u0027Rei\\\u0022lly
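The three rules can be applied by hand and compared with the real flags. This is only a sketch: it assumes PHP 5.3 or later (so the flags exist), and $rules and $manual are illustrative names. strtr() is used because it replaces every character in a single pass, so a backslash produced by one rule is never picked up again by another.

```php
<?php
$slashed = addslashes('O\'Rei"lly'); // O\'Rei\"lly

$rules = array(
    '\\' => '\\\\',   // \ becomes \\
    "'"  => '\u0027', // ' becomes \u0027
    '"'  => '\u0022', // " becomes \u0022
);

// Apply the rules, then wrap the result in double quotes like json_encode does.
$manual = '"' . strtr($slashed, $rules) . '"';

echo $manual, "\n"; // "O\\\u0027Rei\\\u0022lly"
var_dump($manual === json_encode($slashed, JSON_HEX_APOS | JSON_HEX_QUOT)); // bool(true)
```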
This string is then wrapped in double quotes to make it a JSON string. Your replace expressions include the leading backslashes. Either by accident or on purpose, this means that the leading and trailing double quotes returned by json_encode are not subject to the escaping, which they shouldn't be.
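Here is a sketch of that str_replace emulation next to the real flags, which also shows where the extra backslash in the apostrophe replacement comes from. It assumes PHP 5.3 or later so both versions can run side by side; $plain and $emulated are illustrative names. It only covers quotes that addslashes has already prefixed with a backslash, which is the case in the question.

```php
<?php
$s = addslashes('O\'Rei"lly'); // O\'Rei\"lly

$plain = json_encode($s); // "O\\'Rei\\\"lly"

// Both search strings are two characters long: \" and \'.
// The backslash in front of " was added by json_encode, so it can be dropped.
// The backslash in front of ' is half of an escaped backslash pair (\\),
// so the replacement has to put it back: \\u0027 instead of \u0027.
$emulated = str_replace(
    array('\\"', '\\\''),
    array('\u0022', '\\\u0027'),
    $plain
);

echo $emulated, "\n"; // "O\\\u0027Rei\\\u0022lly"
var_dump($emulated === json_encode($s, JSON_HEX_APOS | JSON_HEX_QUOT)); // bool(true)
```

Because both search strings start with a backslash, the bare double quotes at the start and end of the JSON string are left alone.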
So in earlier versions of PHP
$s = addslashes('O\'Rei"lly');
print json_encode($s);
would print
"O\\'Rei\\\"lly"
and we want to change ' to be \u0027 and \" to be \u0022, because the \ in \" is only there to get the " into a string that begins and ends with double quotes.
So that's why we get
"O\\\u0027Rei\\\u0022lly"
Answered by staticsan
It's escaping the backslash as well as the quote. Dealing with escaped escapes, as you're doing here, is difficult because it quickly turns into a backslash-counting game. :-/
Answered by drawnonward
When you encode a string for JSON, some things have to be escaped regardless of the options. As others have pointed out, that includes '\', so any backslash run through json_encode will be doubled. Since you are first running your string through addslashes, which also adds backslashes to quotes, you are adding a lot of extra backslashes. The following function will emulate how json_encode would encode a string. If the string has already had backslashes added, they will be doubled.
function json_encode_string( $encode , $options ) {
    // The backslash is always escaped. Assumption: the rest of the original
    // character list (a '..' range) was garbled in this copy and is left out.
    $escape = '\\';
    $needle = array();
    $replace = array();
    if ( $options & JSON_HEX_APOS ) {
        $needle[] = "'";
        $replace[] = '\u0027';
    } else {
        $escape .= "'";
    }
    if ( $options & JSON_HEX_QUOT ) {
        $needle[] = '"';
        $replace[] = '\u0022';
    } else {
        $escape .= '"';
    }
    if ( $options & JSON_HEX_AMP ) {
        $needle[] = '&';
        $replace[] = '\u0026';
    }
    if ( $options & JSON_HEX_TAG ) {
        $needle[] = '<';
        $needle[] = '>';
        $replace[] = '\u003C';
        $replace[] = '\u003E';
    }
    // Backslash-escape first, then swap the selected characters for \uXXXX.
    $encode = addcslashes( $encode , $escape );
    $encode = str_replace( $needle , $replace , $encode );
    return $encode;
}
Answered by dabito
If I understand correctly, you just want to know why you need to use
'\\\u0027' and not just '\\u0027'
You're escaping the backslash and the character's Unicode value. With this you are telling JSON that it should put an apostrophe there, but it needs the backslash and the u to know that a Unicode hexadecimal character code is next.
Since you are escaping this string:
<?php $out = json_encode(array(10, "h'ello", addslashes("h'ello re-escaped"))); ?>
<script type="text/javascript">
var out = <?php echo $out; ?>;
alert(out[0]);
alert(out[1]);
alert(out[2]);
</script>
the first backslash is actually escaping the backslash before the apostrophe. The next backslash is then used to escape the backslash used by JSON to identify the character as a Unicode character.
If you were applying the algorithm to O'Reilly instead of O\'Rei\"lly, then the latter would suffice.
I hope you find this useful. I only leave you this link so you can read more on how JSON is constructed, since it's obvious you already understand PHP:
Answered by Zsolti
Since you are going to json_encode the string \', you will have to encode first the \ and then the '. So you will have \\ and \u0027. Concatenating these results in \\\u0027.
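The piece-by-piece view can be checked directly. A minimal sketch, assuming PHP 5.3 or later for JSON_HEX_APOS; the variable names are illustrative:

```php
<?php
$backslash  = json_encode('\\', JSON_HEX_APOS);  // "\\"
$apostrophe = json_encode("'", JSON_HEX_APOS);   // "\u0027"
$both       = json_encode("\\'", JSON_HEX_APOS); // "\\\u0027"

// Drop the surrounding double quotes from the two single-character results,
// concatenate them, and wrap the result in double quotes again.
$joined = '"' . substr($backslash, 1, -1) . substr($apostrophe, 1, -1) . '"';

var_dump($joined === $both); // bool(true)
```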
Answered by Tom
The \ generated by addslashes() gets re-escaped by json_encode(). You probably meant to say "Doing json_encode($s, JSON_HEX_APOS | JSON_HEX_QUOT) outputs the following", but you used $str instead of $s, which confused everyone.
If you evaluate the string "O\\\u0027Rei\\\u0022lly" in JavaScript, you will get "O\'Rei\"lly", and I am pretty sure that's not what you want. When you evaluate it, you probably need all the control codes removed. Go ahead, poke this in a file: alert("O\\\u0027Rei\\\u0022lly").
Conclusion: You are escaping the quotes twice, which is most likely not what you need. json_encode already escapes everything that is needed so that any JavaScript parser would return the original data structure. In your case, that is the string you have obtained after the call to addslashes.
Proof:
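The snippet that belonged here is missing from this copy. As a stand-in (not Tom's original code), here is a minimal round-trip sketch, assuming PHP 5.3 or later; it uses json_decode in place of a JavaScript parser:

```php
<?php
$raw = 'O\'Rei"lly';      // O'Rei"lly
$s   = addslashes($raw);  // O\'Rei\"lly

// Decoding returns exactly what was encoded, with or without the HEX flags,
// so the backslashes added by addslashes() survive the round trip.
var_dump(json_decode(json_encode($s, JSON_HEX_APOS | JSON_HEX_QUOT)) === $s); // bool(true)
var_dump(json_decode(json_encode($s)) === $s);                                // bool(true)

// Without addslashes() the round trip gives back the plain string.
var_dump(json_decode(json_encode($raw, JSON_HEX_APOS | JSON_HEX_QUOT)) === $raw); // bool(true)
```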
