Notice: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/38162444/

Time: 2020-08-19 20:26:50  Source: igfitidea

Python regex match space only

Tags: python, regex

Asked by Dimitry

In Python 3, how do I match only the space character, and not a newline \n or a tab \t?

I've seen the \s+[^\n] answer from the question "Regex match space not \n", but it does not work for the following example:

import re
a = 'rasd\nsa sd'
# \s+ also matches the newline, so the search succeeds at index 4
print(re.search(r'\s+[^ \n]', a))

The result is <_sre.SRE_Match object; span=(4, 6), match='\ns'>, so the newline is what gets matched.

Answered by Resonance

No need for special groups. Just create a regex with a space character. The space character has no special meaning in a pattern; it just means "match a space".

RE = re.compile(' +')

So for your case

import re
a = 'rasd\nsa sd'
print(re.search(' +', a))

would give

<_sre.SRE_Match object; span=(7, 8), match=' '>
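
As a further illustration of the literal-space approach, here is a minimal sketch (the sample string and variable names are made up for this example, not taken from the answer). It shows that ' +' leaves tabs and newlines alone. One caveat: under the re.VERBOSE flag, bare whitespace in a pattern is ignored, so the space has to be placed in a character class or escaped.

```python
import re

# Hypothetical sample text containing spaces, a tab and a newline
text = 'a  b\tc\nd e'

# ' +' matches runs of literal spaces only
space_runs = re.findall(' +', text)
print(space_runs)        # ['  ', ' ']

# Collapse each run of spaces to a single space; tab and newline are untouched
collapsed = re.sub(' +', ' ', text)
print(repr(collapsed))   # 'a b\tc\nd e'

# Under re.VERBOSE a bare space in the pattern is ignored,
# so the space is put inside a character class here
verbose_matches = re.findall(r'[ ]+', text, flags=re.VERBOSE)
print(verbose_matches)   # ['  ', ' ']
```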

Answered by Wiktor Stribiżew

If you want to match one or more whitespace chars other than the newline and the tab, use

r"[^\S\n\t]+"

[^\S] matches any char that is not a non-whitespace char, that is, any whitespace char. However, since the character class is a negated one, any characters you add to it are excluded from matching.

Python demo:

import re
a='rasd\nsa sd'
print(re.findall(r'[^\S\n\t]+',a))
# => [' ']
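
To see how the double negation narrows the match, the sketch below compares \s, [^\S] and [^\S\n\t] on a made-up sample string (the string and the variable names are assumptions for illustration only).

```python
import re

# Hypothetical sample: space, tab, newline and carriage return between letters
sample = 'a b\tc\nd\re'

any_ws = re.findall(r'\s', sample)            # every whitespace char
double_neg = re.findall(r'[^\S]', sample)     # "not non-whitespace" = whitespace
no_nl_tab = re.findall(r'[^\S\n\t]', sample)  # whitespace minus \n and \t

print(any_ws)      # [' ', '\t', '\n', '\r']
print(double_neg)  # [' ', '\t', '\n', '\r']
print(no_nl_tab)   # [' ', '\r']
```

Note that the carriage return is still matched by [^\S\n\t]. If only the space character itself is wanted, a literal space in the pattern is the narrower choice.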

Some more considerations: \s matches [ \t\n\r\f\v] if the ASCII flag is used. So, if you plan to match only ASCII, you might as well use [ \r\f\v], which leaves out the chars you want to exclude. If you need to work with Unicode strings, the solution above is a viable one.
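
The ASCII versus Unicode difference can be checked with a non-breaking space (U+00A0), which Python 3 counts as whitespace for str patterns by default. This is a small sketch with a made-up sample string and variable names, not part of the original answer.

```python
import re

# Hypothetical sample: regular space, non-breaking space (U+00A0), newline
sample = 'a b\u00a0c\nd'

# Default (Unicode) matching for str patterns: U+00A0 counts as whitespace
unicode_ws = re.findall(r'[^\S\n\t]', sample)

# With re.ASCII, \s and \S only consider ASCII whitespace
ascii_ws = re.findall(r'[^\S\n\t]', sample, flags=re.ASCII)

# The explicit ASCII class never matches U+00A0
explicit = re.findall(r'[ \r\f\v]', sample)

print(unicode_ws)  # [' ', '\xa0']
print(ascii_ws)    # [' ']
print(explicit)    # [' ']
```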