
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original source: http://stackoverflow.com/questions/3604105/


Regular expression matching anything greater than eight letters in length, in Python

python regex

Asked by magnetar

Despite attempts to master grep and related GNU software, I haven't come close to mastering regular expressions. I do like them, but I find them a bit of an eyesore all the same.


I suppose this question isn't difficult for some, but I've spent hours trying to figure out how to search through my favorite book for words greater than a certain length, and in the end, came up with some really ugly code:


twentyfours = [w for w in vocab if re.search('^........................$', w)]
twentyfives = [w for w in vocab if re.search('^.........................$', w)]
twentysixes = [w for w in vocab if re.search('^..........................$', w)]
twentysevens = [w for w in vocab if re.search('^...........................$', w)]
twentyeights = [w for w in vocab if re.search('^............................$', w)]

... a line for each length, all the way from a certain length to another one.


What I want instead is to be able to say 'give me every word in vocab that's greater than eight letters in length.' How would I do that?


Accepted answer by kennytm

You don't need regex for this.


result = [w for w in vocab if len(w) >= 8]  # 8 or more; use len(w) > 8 for strictly more than eight

but if regex must be used:


import re

rx = re.compile('^.{8,}$')
#                  ^^^^ {8,} means 8 or more.
result = [w for w in vocab if rx.match(w)]

See http://www.regular-expressions.info/repeat.html for details on the {a,b} syntax.

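A minimal sketch comparing the two approaches from this answer. The sample `vocab` list is made up for illustration (the question builds it from a book), and `re.fullmatch` is used here in place of the `^...$` anchors as an equivalent way to require that the whole word matches.

```python
import re

# Hypothetical sample vocabulary, not from the original question.
vocab = ["cat", "elephant", "hippopotamus", "regex", "expression"]

# Plain length check: keeps words of 8 or more characters.
by_len = [w for w in vocab if len(w) >= 8]

# Regex version: .{8,} must cover the entire word.
rx = re.compile(r".{8,}")
by_regex = [w for w in vocab if rx.fullmatch(w)]

print(by_len)
print(by_regex)
```

Both lists should contain the same words, which is why the length check is usually the simpler choice.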

Answer by unholysampler

^.{8,}$


This will match anything that has at least 8 characters. You can also place a number after the comma to set an upper bound, or remove the first number to leave the lower bound unrestricted.

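The three bound forms described above can be sketched as follows. The `samples` list and the upper bound of 12 are arbitrary choices for illustration; in Python's `re` module, omitting the first number is treated as a lower bound of zero.

```python
import re

# Hypothetical sample strings with lengths 5, 8, 12, and 20.
samples = ["short", "eightchr", "twelve_chars", "a_much_longer_string"]

# {8,}   : 8 or more characters
at_least_8 = [s for s in samples if re.fullmatch(r".{8,}", s)]

# {8,12} : between 8 and 12 characters, inclusive
between_8_and_12 = [s for s in samples if re.fullmatch(r".{8,12}", s)]

# {,12}  : at most 12 characters (lower bound defaults to 0)
at_most_12 = [s for s in samples if re.fullmatch(r".{,12}", s)]

print(at_least_8)
print(between_8_and_12)
print(at_most_12)
```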

Answer by Andy

If you do want to use a regular expression:


result = [w for w in vocab if re.search('^.{24}', w)]

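{x} means match exactly x characters. Because the pattern above has no trailing $, it also accepts words longer than 24 characters. It is probably better to use len(w), though.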

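A short sketch of how anchoring changes what `{24}` accepts. The test strings are made up; `re.fullmatch` is one way (an assumption about intent, not part of the original answer) to require exactly 24 characters.

```python
import re

word24 = "a" * 24  # exactly 24 characters
word30 = "a" * 30  # longer than 24 characters

unanchored = re.compile(r"^.{24}")  # only the first 24 characters must match
exact = re.compile(r".{24}")        # used with fullmatch: the whole string must be 24 characters

# search with ^.{24} succeeds on the longer word as well
matches_longer = unanchored.search(word30) is not None

# fullmatch accepts only the 24-character word
exact_on_24 = exact.fullmatch(word24) is not None
exact_on_30 = exact.fullmatch(word30) is not None

print(matches_longer, exact_on_24, exact_on_30)
```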

Answer by Ivo van der Wijk

\w matches letters, digits, and underscores, and {min,max} lets you set how many times it repeats (max is optional). An expression like


\w{9,}

will match every run of 9 or more letters, digits, or underscores.

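As a rough illustration, `\w{9,}` can be applied to running text with `re.findall`. The sentence below is invented for the example, and the letters-only character class is an assumed alternative for cases where digits and underscores should not count.

```python
import re

# Hypothetical sample text, not from the original question.
text = "The extraordinary hippopotamus ignored snake_case_name and 1234567890."

# \w covers letters, digits, and underscores, so identifiers and numbers match too.
long_tokens = re.findall(r"\w{9,}", text)

# Assumed alternative: restrict the match to ASCII letters only.
long_letter_runs = re.findall(r"[A-Za-z]{9,}", text)

print(long_tokens)
print(long_letter_runs)
```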

Answer by unbeli

.{9,} for "more than eight", .{8,} for "eight or more".
Or just len(w) > 8

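The boundary between the two patterns can be checked with one 8-letter and one 9-letter word. Both words are arbitrary examples chosen for their length.

```python
import re

eight = "elephant"   # 8 letters
nine = "crocodile"   # 9 letters

more_than_eight = re.compile(r".{9,}")
eight_or_more = re.compile(r".{8,}")

# Only the 9-letter word is strictly longer than eight characters.
strict = [w for w in (eight, nine) if more_than_eight.fullmatch(w)]

# Both words have eight or more characters.
inclusive = [w for w in (eight, nine) if eight_or_more.fullmatch(w)]

print(strict)
print(inclusive)
```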