Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/11044136/

Time: 2020-08-24 23:38:59  Source: igfitidea

Right way to escape backslash [ \ ] in PHP regex?

php regex

Asked by Mahmoud Tahan

Just out of curiosity, I'm trying to figure out which exactly is the right way to escape a backslash for use in a PHP regular expression pattern like so:

TEST 01: (3 backslashes)

$pattern = "/^[\\\]{1,}$/";
$string = '\\';

// ----- RETURNS A MATCH -----

TEST 02: (4 backslashes)

$pattern = "/^[\\\\]{1,}$/";
$string = '\\';

// ----- ALSO RETURNS A MATCH -----

According to the articles below, 4 is supposedly the right way but what confuses me is that both tests returned a match. If both are right, then is 4 the preferred way?

RESOURCES:

Accepted answer by Marc B

The thing is, you're using a character class, [], so it doesn't matter how many literal backslashes are embedded in it; they'll be treated as a single backslash.

e.g. the following two regexes:

/[a]/
/[aa]/

are for all intents and purposes identical as far as the regex engine is concerned. Character classes take a list of characters and "collapse" them down to match a single character, along the lines of "for the current character being considered, is it any of the characters listed inside the []?". If you list two backslashes in the class, then it'll be "is the char a backslash or is it a backslash?".
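
To make the "collapse" point concrete, here is a small sketch that runs both classes against a few made-up subjects and collects the results side by side. The subject list and variable names are only for illustration.

```php
<?php
// Duplicates inside a character class do not change what the class matches.
$subjects = ['a', 'aa', 'b', ''];
$results = [];
foreach ($subjects as $subject) {
    $results[$subject] = [
        preg_match('/^[a]$/', $subject),   // class listing "a" once
        preg_match('/^[aa]$/', $subject),  // class listing "a" twice
    ];
}
var_dump($results);
```

Every pair holds the same value twice: only the single-character subject 'a' matches, whichever class is used.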

Answer by MikeM

// PHP 5.4.1

// Either three or four \ can be used to match a '\'.
echo preg_match( '/\\\/', '\\' );        // 1
echo preg_match( '/\\\\/', '\\' );       // 1

// Match two backslashes `\\`.
echo preg_match( '/\\\\\\/', '\\\\' );   // Warning: No ending delimiter '/' found
echo preg_match( '/\\\\\\\/', '\\\\' );  // 1
echo preg_match( '/\\\\\\\\/', '\\\\' ); // 1

// Match one backslash using a character class.
echo preg_match( '/[\\]/', '\\' );       // false, with Warning: missing terminating ]
echo preg_match( '/[\\\]/', '\\' );      // 1
echo preg_match( '/[\\\\]/', '\\' );     // 1

When using three backslashes to match a '\', the pattern below is interpreted as: match a '\' followed by an 's'.

echo preg_match( '/\\\\s/', '\\ ' );    // 0
echo preg_match( '/\\\\s/', '\\s' );    // 1

When using four backslashes to match a '\', the pattern below is interpreted as: match a '\' followed by a space character.

echo preg_match( '/\\\\\s/', '\\ ' );   // 1
echo preg_match( '/\\\\\s/', '\\s' );   // 0

The same applies if inside a character class.

echo preg_match( '/[\\\\s]/', ' ' );   // 0
echo preg_match( '/[\\\\\s]/', ' ' );  // 1

None of the above results are affected by enclosing the strings in double instead of single quotes.
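
One way to check the quoting claim is to compare the string literals directly. This is only a sketch over a handful of patterns like the ones above; the array name is made up. It would not hold for sequences that PHP itself interprets inside double quotes, such as "\x5c" or "\$".

```php
<?php
// Each entry compares a single-quoted literal with its double-quoted twin.
// None of these patterns contains an escape that double quotes treat specially
// (apart from \\ itself, which both quoting styles collapse to one backslash).
$samePatterns = [
    '/\\\/'      === "/\\\/",
    '/\\\\/'     === "/\\\\/",
    '/\\\\\s/'   === "/\\\\\s/",
    '/[\\\\\s]/' === "/[\\\\\s]/",
];
var_dump($samePatterns);
```

All four comparisons come out true, so the regex engine receives the same pattern text either way.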

Conclusions:
Whether inside or outside a bracketed character class, a literal backslash can be matched using just three backslashes '\\\' unless the next character in the pattern is also backslashed, in which case the literal backslash must be matched using four backslashes.

Recommendation:
Always use four backslashes '\\\\' in a regex pattern when seeking to match a backslash.
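
If you would rather not count backslashes by hand, one option (not part of the answer above, just a suggestion) is to let preg_quote() build the escaped form. The sketch below assumes '/' as the delimiter and uses a made-up Windows-style path as the subject.

```php
<?php
$literal = '\\';                                   // one backslash character
$pattern = '/' . preg_quote($literal, '/') . '/';  // preg_quote escapes it for PCRE
$matched = preg_match($pattern, 'C:\\temp');       // subject is C:\temp

echo $pattern, PHP_EOL;                 // prints /\\/
var_dump($matched);                     // int(1)
var_dump($pattern === '/\\\\/');        // bool(true): same as the four-backslash literal
```

The generated pattern is identical to the recommended four-backslash literal, which is a cheap way to confirm the recommendation.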

Escape sequences.

Answer by Олег Всильдеревьев

To avoid this kind of unclear code you can use \x5c, like this :)

echo preg_replace( '/\x5c\w+\.php$/i', '<b>$0</b>', __FILE__ );
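
Here is a small self-contained sketch of the same idea, using a made-up Windows-style path instead of __FILE__ so the result is predictable. Note the single quotes around the pattern: they leave \x5c untouched so that PCRE, rather than PHP, turns it into a backslash.

```php
<?php
$path = 'C:\\Users\\demo\\index.php';   // hypothetical path: C:\Users\demo\index.php

// Count the backslashes in the path without writing a single one in the pattern.
$count = preg_match_all('/\x5c/', $path, $found);

// Wrap the trailing "\name.php" segment in <b> tags.
$result = preg_replace('/\x5c\w+\.php$/i', '<b>$0</b>', $path);

echo $count, PHP_EOL;   // 3
echo $result, PHP_EOL;  // C:\Users\demo<b>\index.php</b>
```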

Answer by Scott Chu

I studied this years ago. That's because the 1st backslash escapes the 2nd one, and together they form a 'true backslash' character in the pattern, and this true one escapes the 3rd one. So it magically makes 3 backslashes work.

However, the usual suggestion is to use 4 backslashes instead of the ambiguous 3.
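
A quick way to see why three and four behave the same in the question's tests is to print the pattern after PHP has parsed the string literal. This is only a sketch: it assumes the reading that PHP collapses \\ to one backslash and leaves an unrecognized sequence such as \] untouched, and the variable names are made up.

```php
<?php
// Layer 1 is PHP string parsing; layer 2 is the regex engine.
// Printing the strings shows what layer 2 actually receives.
$three = "/^[\\\]{1,}$/";    // three backslashes in the source
$four  = "/^[\\\\]{1,}$/";   // four backslashes in the source

echo $three, PHP_EOL;        // /^[\\]{1,}$/
echo $four, PHP_EOL;         // /^[\\]{1,}$/
var_dump($three === $four);  // bool(true)
var_dump(preg_match($three, '\\'));  // int(1)
```

Both literals end up as the same twelve-character pattern, so the regex engine cannot tell them apart.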

If I'm wrong about anything, please feel free to correct me.

Answer by test30

You can also use the following:

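$regexp = <<<EOR
schemaLocation\s*=\s*["'](.*?)["']
EOR;
preg_match_all("/".$regexp."/", $xml, $matches);
print_r($matches);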

keywords: heredoc, nowdoc
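
For the backslash case specifically, the nowdoc form may be the easier of the two to reason about, since PHP interprets nothing inside it. A minimal sketch, with a made-up pattern that matches a string consisting only of backslashes:

```php
<?php
// Nowdoc (note the quoted label): the body is passed through verbatim,
// so the text below is exactly what the regex engine sees.
$regexp = <<<'EOR'
^[\\]+$
EOR;

echo $regexp, PHP_EOL;                            // ^[\\]+$
var_dump(preg_match('/' . $regexp . '/', '\\'));  // int(1)
```

With a plain heredoc (unquoted label) the \\ in the body would be collapsed to a single backslash first, as in a double-quoted string, so the two forms are not interchangeable once backslashes are doubled.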
