python regular expression "\1"
Original question: http://stackoverflow.com/questions/20802056/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Mengwen
Can anyone tell me what "\1" means in the following regular expression in Python?
re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
Accepted answer by Patrick Collins
\1 is equivalent to re.search(...).group(1): the text matched by the first parentheses-delimited expression inside the regex.
Fun fact: backreferences are also part of the reason that regular expressions in Python and other programming languages are significantly slower than CS theory requires them to be.
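The equivalence with group(1) described in this answer can be checked directly. The snippet below is a small sketch that reuses the pattern and string from the question; the variable names are only illustrative.

```python
import re

text = 'cat in the the hat'
match = re.search(r'(\b[a-z]+) \1', text)

first_group = match.group(1)  # text captured by the first (...) group
whole_match = match.group(0)  # that text, a space, then the same text again

print(first_group)
print(whole_match)
```

Here the \1 in the pattern has to match exactly the text that group(1) reports, so the whole match is that text repeated with a space in between.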
Answer by doctorlove
The first \1 refers to the first group, i.e. the first bracketed expression (\b[a-z]+).
From the docs, for \number:
"Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group)"
In your case it is looking for a repeated "word" (well, a block of lowercase letters).
The second \1 is the replacement to use in case of a match, so a repeated word will be replaced by a single word.
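To see both uses of \1 at work, here is a short sketch that runs the substitution from the question. The second half is an addition that is not part of the original answer: because the pattern has no trailing word boundary, it can also match a word followed by a longer word that starts with the same letters, and adding \b is one possible way to avoid that.

```python
import re

text = 'cat in the the hat'

# \1 in the pattern: match the same text as group 1 again.
# \1 in the replacement: insert the text of group 1 once.
deduplicated = re.sub(r'(\b[a-z]+) \1', r'\1', text)
print(deduplicated)

# Caveat (illustrative): without a trailing \b, 'the thesis' also matches,
# because 'thesis' starts with 'the'.
loose = re.sub(r'(\b[a-z]+) \1', r'\1', 'the thesis')
strict = re.sub(r'\b([a-z]+) \1\b', r'\1', 'the thesis')
print(loose)
print(strict)
```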
Answer by stranac
From the python docs for the re module:
\number: Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of number is 0, or number is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value number. Inside the '[' and ']' of a character class, all numeric escapes are treated as characters.
Your example is basically the same as what is explained in the docs.
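The cases listed in the quoted documentation can be tried out as follows. This is only a sketch: it uses re.fullmatch so that the entire string has to fit the pattern, and the variable names are made up for illustration.

```python
import re

pattern = re.compile(r'(.+) \1')

doubled_words = pattern.fullmatch('the the')   # group, space, same text
doubled_digits = pattern.fullmatch('55 55')
no_space = pattern.fullmatch('thethe')         # no space after the group

# Three octal digits are read as a character code, not as a group number:
# octal 101 is 65, the code point of 'A'.
octal_escape = re.fullmatch(r'\101', 'A')

print(doubled_words.group(1), doubled_digits.group(1))
print(no_space)
print(octal_escape.group())
```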
Answer by Michel Feldheim
\1 is a backreference. It matches whatever was matched in your brackets, in this case the word "the".
You are basically saying
- match empty string at the beginning of a word (\b)
- match alphabetical characters from a-z, one or more times
- match the term in brackets again
cat in (the) the hat
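The steps listed in this answer can be written out one per line with re.VERBOSE. This is a sketch rather than the answerer's own code; the literal space is written as [ ] because bare whitespace is ignored in verbose mode.

```python
import re

pattern = re.compile(r"""
    (          # start of group 1
      \b       # empty string at the beginning of a word
      [a-z]+   # letters from a to z, one or more times
    )          # end of group 1
    [ ]        # a literal space
    \1         # the text matched by group 1, again
""", re.VERBOSE)

match = pattern.search('cat in the the hat')
matched_text = match.group()
matched_span = match.span()
print(matched_text, matched_span)
```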
Answer by Nuwan Madushanka
Example
The following code uses a Python regex to find a repeated digit in the given string:
import re
# (\d) captures one digit; \1{3} requires that same digit three more times
result = re.search(r'(\d)\1{3}', '54222267890')
print(result.group())
This gives the output
2222

