php - How to match all special characters except "-" with a regular expression?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/9727097/
How to match with regex all special chars except "-" in PHP?
Asked by CaTz
How can I match all the “special” chars (like +_*&^%$#@!~) except the char - in PHP?
I know that \W will match all the “special” chars, including the -.
Any suggestions in consideration of Unicode letters?
Answered by hakre
- [^-] matches everything except the one special character you want to exclude.
- [\W] matches all special characters, as you know.
- [^\w] matches all special characters as well. Sounds fair?
So [^\w-] is the combination of both: all "special" characters, but without -.
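As a minimal sketch of how the combined class [^\w-] behaves in PHP, the snippet below runs it through preg_match_all. The sample string and variable names are made up for illustration. One thing it makes visible: \w includes the underscore, so _ is not reported as "special" even though the question lists it.

```php
<?php
// Hypothetical sample input: letters, digits, "-", "_" and punctuation.
$input = 'foo-bar_baz+qux*42!';

// [^\w-] matches any character that is neither a word character nor "-".
// The /u modifier tells PCRE to treat pattern and subject as UTF-8.
preg_match_all('/[^\w-]/u', $input, $matches);

// Prints the characters + * ! ("_" is a word character, so it is skipped).
print_r($matches[0]);
```

If the underscore should count as special too, it would have to be handled separately, for example with an alternation such as [^\w-]|_.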
Answered by tchrist
- \pL matches any character with the Unicode Letter character property, which is a major general category group; that is, it matches [\p{Ll}\p{Lt}\p{Lu}\p{Lm}\p{Lo}].
- \pN matches any character with the Unicode Number character property, which is a major general category group; that is, it matches [\p{Nd}\p{Nl}\p{No}].
- Note that the Unicode Alphabetic character property also includes certain combining marks such as U+0345 COMBINING GREEK YPOGEGRAMMENI. I suggest that you also include \pM, which matches any character with the Unicode Mark character property, which is a major general category group; that is, it matches [\p{Mn}\p{Me}\p{Mc}].
- Character U+002D HYPHEN-MINUS is probably the - you're referring to.
- Note though that Unicode v6.1 has 27 characters with the Unicode Dash character property, including such common characters as U+2010 HYPHEN, U+2013 EN DASH, U+2014 EM DASH, and U+2212 MINUS SIGN. Whether you actually want to include or exclude those, I have no idea.
Given all that, it is not unlikely that you want something like:
[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]
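The following is a rough sketch of using that character class from PHP; it is not part of the original answer. It assumes PHP 7 or later (for the "\u{...}" string escapes) and a PCRE build with Unicode property support. The /u modifier is needed so that \x{2010} and the \p escapes are interpreted as code points. The sample string is hypothetical.

```php
<?php
// Hypothetical sample: "café", an EN DASH (U+2013), ASCII "-", "+", a digit,
// "!" and a EURO SIGN (U+20AC). Requires PHP 7+ for the \u{...} escapes.
$input = "caf\u{00E9}\u{2013}bar-baz+1!\u{20AC}";

// Anything that is not a letter, number, mark, or one of the listed dashes.
$pattern = '/[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]/u';

preg_match_all($pattern, $input, $matches);

// Prints "+", "!" and the euro sign; both dashes and the accented letter are skipped.
print_r($matches[0]);
```

Whether the wider dash range should be excluded is the same open question raised above; dropping \x{2010}-\x{2015}\x{2212} from the class would make those dashes count as special again.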
Answered by Austin Brunkhorst
You can try this pattern:
([^a-zA-Z-])
This should match every character that is not a letter (a-z or A-Z) or the - character.
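To see what this ASCII-only class treats as "special", here is a small sketch. The /u modifier is an addition made here so that a multi-byte letter is matched as one character, and the sample string is invented for the example (PHP 7+ for the "\u{...}" escape).

```php
<?php
// Hypothetical sample: ASCII letters, "-", digits, "_", an accented letter, "!".
$input = "Abc-12_\u{00E9}!";

// Only A-Z, a-z and "-" are excluded, so everything else is captured.
preg_match_all('/([^a-zA-Z-])/u', $input, $m);

// Prints 1, 2, _, é and ! : digits and non-ASCII letters count as "special" here.
print_r($m[1]);
```

This may or may not be what is wanted: digits and accented letters are matched as well, which matters given the question's mention of Unicode letters.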

