PHP regular expressions: how to express \w without the underscore
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/14858346/
Regular Expressions: How to Express \w Without Underscore
Asked by Dimitri Vorontzov
Is there a concise way to express:
\w but without _
That is, "all characters included in \w, except _"
I'm asking this because I'm looking for the most concise way to express domain name validation. A domain name may include lowercase and uppercase letters, digits, periods and dashes, but no underscores. \w includes all of the above, plus an underscore. So, is there any way to "remove" the underscore from \w via regex syntax?
Edit: I'm asking about regex as used in PHP.
Thanks in advance!
Answer by protist
Use the following character class (in Perl):
[^\W_]
\W is the same as [^\w], so [^\W_] matches any character that is neither a non-word character nor an underscore.
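As a quick illustration of the double negation, here is a minimal sketch using Python's `re` module as a stand-in for PHP's PCRE (an assumption; the character-class syntax itself is the same in both). The `re.ASCII` flag and the sample string are illustrative choices, not part of the answer.

```python
import re

# [^\W_]+ : runs of characters that are neither non-word characters nor "_",
# i.e. \w with the underscore taken out. re.ASCII limits \w to [A-Za-z0-9_].
WORD_NO_UNDERSCORE = re.compile(r"[^\W_]+", re.ASCII)

# The underscore, dash and period all act as separators here.
tokens = WORD_NO_UNDERSCORE.findall("foo_bar-42.com")
print(tokens)  # ['foo', 'bar', '42', 'com']
```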
Answer by Bergi
You could use a negative lookahead: (?!_)\w
However, I think writing [a-zA-Z0-9.-] is more readable.
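The two options can be compared side by side. This is a minimal sketch in Python's `re` (used here as an assumed stand-in for PHP's `preg_match`); the helper names are hypothetical, and the checks only test which characters are allowed, not whether a string is a valid domain name.

```python
import re

# (?!_)\w : a word character, provided the next character is not "_".
LOOKAHEAD = re.compile(r"(?:(?!_)\w)+", re.ASCII)
# Explicit class: ASCII letters, digits, period and dash.
EXPLICIT = re.compile(r"[a-zA-Z0-9.-]+")

def is_word_without_underscore(s):
    """True if s consists only of \\w characters other than '_'."""
    return LOOKAHEAD.fullmatch(s) is not None

def has_only_domain_chars(s):
    """True if s consists only of letters, digits, '.' and '-'."""
    return EXPLICIT.fullmatch(s) is not None

print(is_word_without_underscore("example"))   # True
print(is_word_without_underscore("ex_ample"))  # False
print(has_only_domain_chars("my-site.org"))    # True
print(has_only_domain_chars("my_site.org"))    # False
```

Note that the lookahead version still rejects "." and "-", since \w never included them.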
Answer by Kent
If my understanding is right, \w means [A-Za-z0-9_]; periods and dashes are not included.
Info: http://en.wikipedia.org/wiki/Regular_expression#POSIX_character_classes
So I guess what you want is [a-zA-Z0-9.-]
Answer by nhahtdh
To be on the safe side, we usually use an explicit character class:
[a-zA-Z0-9.-]
The regex fragment above matches the letters of the English alphabet and digits, plus the period . and the dash -. It should work even with the most basic regex support.
Shorter may be better, but only if you know exactly what it represents.
I don't know what language you are using. In a lot of engines, \w is equivalent to [a-zA-Z0-9_] (some require "ASCII mode" for this). However, some engines have Unicode support for regex and may extend \w to match Unicode characters.
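To make the Unicode caveat concrete, here is a small sketch in Python 3, where \w is Unicode-aware by default for text patterns and `re.ASCII` switches it back to [a-zA-Z0-9_]. Python is only an example engine here; how PHP behaves depends on its own modifiers, which this sketch does not cover.

```python
import re

UNICODE_W = re.compile(r"\w+")            # Unicode-aware \w (Python 3 default for str)
ASCII_W = re.compile(r"\w+", re.ASCII)    # \w restricted to [a-zA-Z0-9_]
EXPLICIT = re.compile(r"[a-zA-Z0-9.-]+")  # means the same thing in either mode

word = "caf\u00e9"  # "cafe" with a precomposed e-acute as the last letter

print(bool(UNICODE_W.fullmatch(word)))  # True
print(bool(ASCII_W.fullmatch(word)))    # False
print(bool(EXPLICIT.fullmatch(word)))   # False
```

The explicit class gives the same answer regardless of the engine's mode, which is the point of spelling it out.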
Answer by Zero Piraeus
Some regex flavours have a negative lookbehind syntax you might use:
\w(?<!_)
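As a sanity check, the negated class, the lookahead and the lookbehind proposed in this thread should all accept the same characters. The sketch below verifies that over the ASCII range using Python's `re` (an assumed stand-in for PCRE; `matched_ascii` is a hypothetical helper).

```python
import re
import string

PATTERNS = {
    "negated class": r"[^\W_]",
    "lookahead": r"(?!_)\w",
    "lookbehind": r"\w(?<!_)",
}

def matched_ascii(pattern):
    """Return the set of single ASCII characters the pattern fully matches."""
    rx = re.compile(pattern, re.ASCII)
    return {chr(c) for c in range(128) if rx.fullmatch(chr(c))}

results = {name: matched_ascii(p) for name, p in PATTERNS.items()}
expected = set(string.ascii_letters + string.digits)

for name, chars in results.items():
    print(name, chars == expected)  # each line ends with True
```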
Answer by Zoltán Tamási
I would start with [^_], and then think about which other characters I need to exclude. If you need to filter keyboard input, it's quite simple to enumerate all the unwanted characters.
Answer by MrD
You can write something like this:
/([^\w]|_)/u
If you use preg_filter with this pattern and an empty replacement string, every character outside \w, as well as the underscore itself, is removed, so only the characters in \w other than _ remain.
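The removal approach can be sketched as follows, with Python's `re.sub` standing in for PHP's preg_filter/preg_replace (an assumption). The pattern is left Unicode-aware, in the spirit of the u modifier, and `strip_unwanted` is a hypothetical helper name.

```python
import re

# A non-word character, or an underscore: everything we want to drop.
NOT_WANTED = re.compile(r"[^\w]|_")

def strip_unwanted(text):
    """Remove every character that is not in \\w, and every underscore."""
    return NOT_WANTED.sub("", text)

print(strip_unwanted("user_name@example.com"))  # usernameexamplecom
```

Because the period and the dash are also removed, this is closer to "keep \w minus _" than to the domain-name character set from the question.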

