php - How to match all special characters except "-" with a regular expression?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/9727097/

Time: 2020-08-26 07:31:00  Source: igfitidea

How to match with regex all special chars except "-" in PHP?

Tags: php, regex, unicode, special-characters, non-alphanumeric

Asked by CaTz

How can I match all the "special" chars (like +_*&^%$#@!~) except the char - in PHP?

I know that \W will match all the "special" chars, including the -.
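A minimal sketch of the behaviour described in the question, assuming a made-up sample string. It also shows that \W does not match the underscore, because _ counts as a word character:

```php
<?php
// Sample input is an assumption for illustration, not taken from the question.
$input = 'a-b+c_d';

// \W matches every non-word character, so "-" is reported along with "+".
// "_" belongs to \w and is therefore not matched.
preg_match_all('/\W/', $input, $matches);

print_r($matches[0]); // contains "-" and "+"
```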

Any suggestions that take Unicode letters into account?

Answered by hakre

  • [^-] matches everything that is not - (the special character you want to exclude)
  • [\W] are all special characters, as you know
  • [^\w] are all special characters as well - sounds fair?

So therefore [^\w-] is the combination of both: all "special" characters, but without -.
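As a rough illustration of this character class in PHP, the sketch below wraps it in delimiters and adds the /u modifier. The sample string and the choice of /u are assumptions, not part of the answer:

```php
<?php
// Sample input is an assumption chosen for illustration.
$input = 'foo-bar+baz_qux!';

// [^\w-] matches any character that is neither a word character nor "-".
// The /u modifier makes PCRE treat the pattern and the subject as UTF-8.
preg_match_all('/[^\w-]/u', $input, $matches);

print_r($matches[0]); // "+" and "!"; "-" is excluded and "_" counts as a word character
```

For text with accented or non-Latin letters the result likely depends on /u: without it PCRE works on bytes, so the bytes of a multi-byte letter may be reported as special characters.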

Answered by tchrist

  • \pL matches any character with the Unicode Letter character property, which is a major general category group; that is, it matches [\p{Ll}\p{Lt}\p{Lu}\p{Lm}\p{Lo}].
  • \pN matches any character with the Unicode Number character property, which is a major general category group; that is, it matches [\p{Nd}\p{Nl}\p{No}].
  • Note that the Unicode Alphabetic character property also includes certain combining marks such as U+0345 COMBINING GREEK YPOGEGRAMMENI. I suggest that you also include \pM, which matches any character with the Unicode Mark character property, which is a major general category group; that is, it matches [\p{Mn}\p{Me}\p{Mc}].
  • Character U+002D HYPHEN-MINUS is probably the - you're referring to.
  • Note though that Unicode v6.1 has 27 characters with the Unicode Dash character property, including such common characters as U+2010 HYPHEN, U+2013 EN DASH, U+2014 EM DASH, and U+2212 MINUS SIGN. Whether you actually want to include or exclude those, I have no idea.

Given all that, it is not unlikely that you want something like:

[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]
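A possible way to use this class from PHP, sketched under some assumptions: the / delimiters, the /u modifier (needed so that \x{...} code points above 0xFF and the \p properties apply to UTF-8 text), and the sample string are all additions for illustration. The "\u{2013}" escape needs PHP 7 or later.

```php
<?php
// The character class from the answer; delimiters and /u are added here.
$pattern = '/[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]/u';

// Sample input (an assumption): accented letters, digits, a hyphen-minus,
// an EN DASH written as \u{2013}, plus spaces, "+" and "!".
$input = "naïve-café 42 + x\u{2013}y!";

// Collect every character that is not a letter, number, mark, or listed dash.
preg_match_all($pattern, $input, $matches);

print_r($matches[0]); // three spaces, "+" and "!"
```

Note that whitespace is also matched by this class. If spaces should not count as special, \s could be added inside the brackets.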

Answered by Austin Brunkhorst

You can try this pattern:

([^a-zA-Z-])

This should match all characters that are not a-z, A-Z, or -.
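A short sketch of how this pattern behaves, assuming a made-up sample string. It suggests that digits and whitespace are also treated as special, which may or may not be what is wanted:

```php
<?php
// Sample input is an assumption for illustration.
$input = 'abc-DEF 123!';

// Anything outside a-z, A-Z and "-" is matched, including the space and the digits.
preg_match_all('/[^a-zA-Z-]/', $input, $matches);

print_r($matches[0]); // " ", "1", "2", "3", "!"
```

Because the class lists only ASCII letters, non-ASCII letters would be matched as well, so it is probably not suitable when Unicode letters must be preserved.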