Original question: http://stackoverflow.com/questions/4722998/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Python: Regular expression to match alpha-numeric not working?
Asked by Tommo
I want to check whether a string submitted from a website is alphanumeric and possibly contains an underscore. My code:
if re.match('[a-zA-Z0-9_]', playerName):
    # do stuff
For some reason, this also matches strings containing unexpected characters, for example: nIg○▲ ☆ ★ ◇ ◆
I only want the regular A-Z, 0-9, and _ characters to match. Is there something I am missing here?
Accepted answer by Rozuur
Python has a special sequence, \w, for matching alphanumeric characters and the underscore when the LOCALE and UNICODE flags are not specified. So you can modify your pattern as follows:
pattern = r'^\w+$'
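One caveat worth illustrating: on Python 3, str patterns are Unicode-aware by default, so \w also matches letters and digits from other scripts unless the re.ASCII flag is passed. The following is a minimal sketch assuming Python 3; the names ASCII_WORD, UNICODE_WORD, and is_ascii_word are made up for this example and are not part of the original answer.

```python
import re

# With re.ASCII, \w is limited to [a-zA-Z0-9_].
ASCII_WORD = re.compile(r'^\w+$', re.ASCII)
# Without it, a Python 3 str pattern treats any Unicode word character as \w.
UNICODE_WORD = re.compile(r'^\w+$')

def is_ascii_word(name):
    """Hypothetical helper: True if name consists only of ASCII word characters."""
    return ASCII_WORD.match(name) is not None

for name in ['player_1', 'plâyer', 'abc○▲']:
    print(repr(name), is_ascii_word(name), UNICODE_WORD.match(name) is not None)
```

On Python 2, where the original answer was written, str patterns behave like the ASCII variant unless re.UNICODE or re.LOCALE is given.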
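Answer by Klaus Byskov Pedersen
Your regex only matches one character: re.match anchors at the start of the string but not at the end, so any name whose first character is in the class passes. Try this instead: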
if re.match('^[a-zA-Z0-9_]+$', playerName):
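To make the difference concrete, here is a small sketch that runs the original and the anchored pattern side by side. The helper names loose_ok, strict_ok, and fullmatch_ok are invented for illustration. The third helper uses re.fullmatch (available from Python 3.4), which is slightly stricter than the anchored version because $ also matches just before a trailing newline.

```python
import re

def loose_ok(name):
    # Original pattern: matches if the FIRST character is in the class.
    return re.match('[a-zA-Z0-9_]', name) is not None

def strict_ok(name):
    # Anchored pattern with '+': every character must be in the class
    # (a single trailing newline is still tolerated by '$').
    return re.match('^[a-zA-Z0-9_]+$', name) is not None

def fullmatch_ok(name):
    # re.fullmatch requires the entire string to match; no anchors needed.
    return re.fullmatch('[a-zA-Z0-9_]+', name) is not None

for name in ['player_1', 'abc○▲', 'player_1\n', '']:
    print(repr(name), loose_ok(name), strict_ok(name), fullmatch_ok(name))
```

If input may carry a trailing newline (for example from a form field), the fullmatch variant or an explicit strip is probably the safer choice.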
Answer by Fred Nurk
…check if is alpha-numeric and possibly contains an underscore.
Do you mean this literally, so that only one underscore is allowed, total? (Not unreasonable for player names; adjacent underscores in particular can be hard for other players to read.) Should "a_b_c" not match?
If so:
if playerName and re.match("^[a-zA-Z0-9]*_?[a-zA-Z0-9]*$", playerName):
The new first part of the condition checks for an empty value, which simplifies the regex.
This places no restrictions on where the underscore can occur, so all of "_a", "a_", and "_" will match. If you instead want to prevent both leading and trailing underscores, which is again reasonable for player names, change to:
if re.match("^[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)?$", playerName):
# this regex doesn't match an empty string, so that check is unneeded
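A quick way to compare the two single-underscore variants is to run them over a few sample names. This is only a sketch: the function names one_underscore_anywhere and one_inner_underscore are invented for illustration, and the sample names are arbitrary.

```python
import re

def one_underscore_anywhere(name):
    # Reject empty input first, then allow at most one underscore anywhere.
    return bool(name) and re.match(r'^[a-zA-Z0-9]*_?[a-zA-Z0-9]*$', name) is not None

def one_inner_underscore(name):
    # At most one underscore, and it must sit between alphanumeric runs.
    return re.match(r'^[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)?$', name) is not None

for name in ['ab', 'a_b', '_a', 'a_', '_', 'a_b_c', 'a__b', '']:
    print(repr(name), one_underscore_anywhere(name), one_inner_underscore(name))
```

Both variants reject "a_b_c" and "a__b"; they differ only on names with a leading or trailing underscore.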

