Python: Regular expression to match alpha-numeric not working?

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/4722998/

Date: 2020-08-18 17:12:35  Source: igfitidea

Python: Regular expression to match alpha-numeric not working?

python regex

Asked by Tommo

I want to match a string submitted from a website to check that it is alphanumeric and possibly contains an underscore. My code:

if re.match('[a-zA-Z0-9_]', playerName):
    # do stuff

For some reason, this also matches strings containing unusual characters, for example: nIg○▲ ☆ ★ ◇ ◆

I only want regular A-Z, 0-9, and _ to match. Is there something I am missing here?

Accepted answer by Rozuur

Python has a special sequence, \w, that matches alphanumeric characters and the underscore when the LOCALE and UNICODE flags are not specified. So you can modify your pattern as follows:

pattern = r'^\w+$'
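
One caveat worth checking against your Python version: in Python 3, str patterns are Unicode-aware by default, so \w also matches non-ASCII letters and digits unless re.ASCII is passed. The following is a small sketch assuming Python 3. The helper name is_ascii_word is made up for illustration, and re.fullmatch is used in place of ^...$ so that a trailing newline is not accepted.

```python
import re

# With re.ASCII, \w is limited to [a-zA-Z0-9_].
WORD_ASCII = re.compile(r'\w+', re.ASCII)
# Without flags, a str pattern in Python 3 treats \w as Unicode-aware.
WORD_UNICODE = re.compile(r'\w+')

def is_ascii_word(name):
    """Hypothetical helper: True if name is non-empty and ASCII word characters only."""
    return WORD_ASCII.fullmatch(name) is not None

print(is_ascii_word("player_1"))                          # True
print(is_ascii_word("caf\u00e9"))                         # False
print(WORD_UNICODE.fullmatch("caf\u00e9") is not None)    # True
```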

Answer by Klaus Byskov Pedersen

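Your regex only matches one character: re.match anchors only at the start of the string, so anything after the first character is ignored. Try this instead: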

if re.match('^[a-zA-Z0-9_]+$',playerName): 
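
To see the difference, here is a minimal sketch assuming Python 3. The sample names and the helper names first_char_ok and whole_name_ok are made up for illustration. The question's pattern succeeds as soon as the first character is acceptable, while the anchored pattern requires every character to be in the class.

```python
import re

# Pattern from the question: one character class, no quantifier, no end anchor.
LOOSE = re.compile(r'[a-zA-Z0-9_]')
# Anchored pattern: every character must belong to the class.
STRICT = re.compile(r'^[a-zA-Z0-9_]+$')

def first_char_ok(name):
    """True if the name merely starts with an allowed character."""
    return LOOSE.match(name) is not None

def whole_name_ok(name):
    """True if the whole name consists of allowed characters."""
    return STRICT.match(name) is not None

for name in ["nIg○▲ ☆ ★ ◇ ◆", "Tommo_42", "○abc"]:
    print(repr(name), first_char_ok(name), whole_name_ok(name))
```

The first name passes the loose check but fails the strict one, which is the behavior described in the question.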

Answer by Fred Nurk

…check if is alpha-numeric and possibly contains an underscore.

Do you mean this literally, so that only one underscore is allowed, total? (Not unreasonable for player names; adjacent underscores in particular can be hard for other players to read.) Should "a_b_c" not match?

If so:

if playerName and re.match("^[a-zA-Z0-9]*_?[a-zA-Z0-9]*$", playerName):

The new first part of the condition checks for an empty value, which simplifies the regex.
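
As a rough illustration, the condition can be wrapped in a function and tried on a few inputs. This is a sketch assuming Python 3; the function name valid_name and the sample inputs are hypothetical.

```python
import re

# At most one underscore, anywhere; only ASCII letters and digits otherwise.
ONE_UNDERSCORE = re.compile(r"^[a-zA-Z0-9]*_?[a-zA-Z0-9]*$")

def valid_name(player_name):
    # The truthiness check rejects "" (and None) before the regex runs,
    # because the pattern on its own would accept an empty string.
    return bool(player_name) and ONE_UNDERSCORE.match(player_name) is not None

for name in ["a_b", "abc", "_", "a_b_c", ""]:
    print(repr(name), valid_name(name))
```

Here "a_b", "abc", and "_" are accepted, while "a_b_c" and the empty string are rejected.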

This places no restrictions on where the underscore can occur, so all of "_a", "a_", and "_" will match. If you instead want to prevent both leading and trailing underscores, which is again reasonable for player names, change to:

if re.match("^[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)?$", playerName):
    # this regex doesn't match an empty string, so that check is unneeded
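
A quick way to check the stricter rule is to run it over names with underscores in different positions. This is only a sketch assuming Python 3; valid_name_strict and the sample inputs are hypothetical.

```python
import re

# Letters/digits, optionally followed by one underscore and more letters/digits.
NO_EDGE_UNDERSCORE = re.compile(r"^[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)?$")

def valid_name_strict(player_name):
    # No separate empty check: the leading "+" already requires one character.
    return NO_EDGE_UNDERSCORE.match(player_name) is not None

for name in ["a_b", "abc", "_a", "a_", "_", "a__b", ""]:
    print(repr(name), valid_name_strict(name))
```

Only "a_b" and "abc" pass. If several non-adjacent underscores should be allowed after all (so that "a_b_c" matches), one option would be to change the trailing ? on the group to *.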