Python: regular expression to match a dot

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/13989640/

Date: 2020-08-18 10:06:25  Source: igfitidea

Regular expression to match a dot

python regex

Question

I was wondering what the best way is to match "test.this" from "blah blah blah test.this@gmail.com blah blah", using Python.

I've tried re.split(r"\b\w.\w@")

Answer by Yuushi

A . in regex is a metacharacter; it matches any character (except a newline, by default). To match a literal dot, you need to escape it: \.
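
As a minimal sketch of the difference, assuming Python's standard re module (the strings "abc" and "a.c" are made up for illustration), the unescaped dot accepts any character while the escaped dot only accepts a literal one:

```python
import re

# Unescaped: "." is a metacharacter, so "a.c" also matches "abc".
any_char = re.fullmatch(r"a.c", "abc")

# Escaped: "\." only matches a literal dot.
escaped_on_abc = re.fullmatch(r"a\.c", "abc")
escaped_on_dot = re.fullmatch(r"a\.c", "a.c")

# re.escape produces the escaped form of a literal string.
escaped_pattern = re.escape(".")

print(any_char is not None)        # True
print(escaped_on_abc is not None)  # False
print(escaped_on_dot is not None)  # True
print(escaped_pattern)             # \.
```

re.escape is handy when the literal text comes from a variable rather than being typed into the pattern by hand.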

Answer by Rohit Jain

In your regex you need to escape the dot "\." or use it inside a character class "[.]", as it is a metacharacter in regex that matches any character.

Also, you need \w+ instead of \w to match one or more word characters.
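
Both points can be checked with a short sketch, assuming the question's sample string (the variable names are only illustrative):

```python
import re

s = "blah blah blah test.this@gmail.com blah blah"

# \w matches exactly one word character, so "test" and "this" cannot fit.
single = re.search(r"\b\w\.\w@", s)

# \w+ matches one or more; "\." and "[.]" are two spellings of a literal dot.
escaped = re.search(r"\b\w+\.\w+@", s)
bracketed = re.search(r"\b\w+[.]\w+@", s)

print(single)             # None
print(escaped.group())    # test.this@
print(bracketed.group())  # test.this@
```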



Now, if you want the test.this content, then split is not what you need: split will split your string around test.this. For example:

>>> import re
>>> s = "blah blah blah test.this@gmail.com blah blah"
>>> re.split(r"\b\w+\.\w+@", s)
['blah blah blah ', 'gmail.com blah blah']


You can use re.findall:

>>> re.findall(r'\w+[.]\w+(?=@)', s)   # look ahead
['test.this']
>>> re.findall(r'(\w+[.]\w+)@', s)     # capture group
['test.this']

Answer by StackUser

"In the default mode, Dot (.) matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline." (Python docs)

So, if you want to treat the dot literally, I think you should put it in square brackets:

>>> import re
>>> p = re.compile(r'\b(\w+[.]\w+)')
>>> resp = p.search("blah blah blah test.this@gmail.com blah blah")
>>> resp.group()
'test.this'
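
The DOTALL behaviour quoted above can be checked with a small sketch, assuming a made-up two-line string; it also shows that a bracketed dot stays literal regardless of the flag:

```python
import re

text = "a\nb"

# Default mode: "." does not match the newline, so "a.b" finds nothing.
default_match = re.search(r"a.b", text)

# With re.DOTALL, "." also matches the newline.
dotall_match = re.search(r"a.b", text, flags=re.DOTALL)

# Inside square brackets the dot is a literal ".", with or without DOTALL.
literal_match = re.search(r"a[.]b", text, flags=re.DOTALL)

print(default_match)             # None
print(dotall_match is not None)  # True
print(literal_match)             # None
```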

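Answer by Zibri

In JavaScript you have to use \. to match a dot. When the pattern is passed as a string, as in the examples here, the backslash must itself be escaped (\\.), because a lone \. inside a string literal is just a plain ".".

Example

"blah.tests.zibri.org".match('test\\..*')
null

and

"blah.test.zibri.org".match('test\\..*')
["test.zibri.org", index: 5, input: "blah.test.zibri.org", groups: undefined]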

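Answer by Emma

This expression,

(?<=\s|^)[^.\s]+\.[^.\s]+(?=@)

might also work OK for those specific types of input strings. Python's built-in re module only accepts fixed-width lookbehinds and rejects (?<=\s|^), so the test here uses the equivalent negative lookbehind (?<!\S).

Test

import re

# (?<!\S): the match must start at the beginning of the string or right after whitespace
expression = r'(?<!\S)[^.\s]+\.[^.\s]+(?=@)'
string = '''
blah blah blah test.this@gmail.com blah blah
blah blah blah test.this @gmail.com blah blah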
blah blah blah [email protected] blah blah
'''

matches = re.findall(expression, string)

print(matches)

Output

['test.this']
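
One caveat when running this kind of expression in Python: the standard-library re module only accepts fixed-width lookbehinds, so a lookbehind such as (?<=^|\s), whose alternatives have different widths, raises re.error there. A minimal sketch, assuming the standard re module and the question's sample string (other regex engines may accept the original form):

```python
import re

s = "blah blah blah test.this@gmail.com blah blah"

# Variable-width lookbehind: "^" has width 0 and "\s" has width 1.
try:
    re.compile(r"(?<=^|\s)[^.\s]+\.[^.\s]+(?=@)")
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = str(exc)

# (?<!\S) means "not preceded by a non-whitespace character", i.e. the
# match starts at the beginning of the string or right after whitespace.
matches = re.findall(r"(?<!\S)[^.\s]+\.[^.\s]+(?=@)", s)

print(lookbehind_error)
print(matches)  # ['test.this']
```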


If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com. If you'd like, you can also watch there how it matches against some sample inputs.