Original question: http://stackoverflow.com/questions/14225608/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python: How to use RegEx in an if statement?
Asked by
I have the following code which looks through the files in one directory and copies files that contain a certain string into another directory, but I am trying to use Regular Expressions as the string could be upper and lowercase or a mix of both.
Here is the code that works, before I tried to use RegEx's
import os
import re
import shutil

def test():
    os.chdir("C:/Users/David/Desktop/Test/MyFiles")
    files = os.listdir(".")
    os.mkdir("C:/Users/David/Desktop/Test/MyFiles2")
    for x in (files):
        inputFile = open((x), "r")
        content = inputFile.read()
        inputFile.close()
        if ("Hello World" in content):
            shutil.copy(x, "C:/Users/David/Desktop/Test/MyFiles2")
Here is my code when I have tried to use RegEx's
import os
import re
import shutil

def test2():
    os.chdir("C:/Users/David/Desktop/Test/MyFiles")
    files = os.listdir(".")
    os.mkdir("C:/Users/David/Desktop/Test/MyFiles2")
    regex_txt = "facebook.com"
    for x in (files):
        inputFile = open((x), "r")
        content = inputFile.read()
        inputFile.close()
        regex = re.compile(regex_txt, re.IGNORECASE)
I'm guessing that I need a line of code that is something like
if regex = re.compile(regex_txt, re.IGNORECASE) == True
But I can't seem to get anything to work. If someone could point me in the right direction, it would be appreciated.
Accepted answer by aw4lly
if re.match(regex, content) is not None:
    blah..
You could also use re.search, depending on how you want it to match.
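The difference between the two calls can be sketched as follows. This is a minimal illustration, not part of the original answer: the sample text is made up, and the pattern is the one from the question with the dot escaped.

```python
import re

# Hypothetical sample text; the string being searched for is from the question.
content = "Visit FACEBOOK.COM today"
regex = re.compile(r"facebook\.com", re.IGNORECASE)

# match() only succeeds if the pattern matches starting at position 0.
matched_at_start = regex.match(content) is not None

# search() scans the whole string and returns the first match anywhere.
found_anywhere = regex.search(content) is not None

if found_anywhere:
    print("found")
```

For a "does this file contain the string anywhere" test like the one in the question, search is usually the call you want.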
Answer by Silas Ray
First you compile the regex, then you have to use it with match, search, or some other method to actually run it against some input.
import os
import re
import shutil

def test():
    os.chdir("C:/Users/David/Desktop/Test/MyFiles")
    files = os.listdir(".")
    os.mkdir("C:/Users/David/Desktop/Test/MyFiles2")
    regex_txt = "facebook.com"  # pattern text from the question
    # Compile once, outside the loop, then reuse the compiled pattern.
    pattern = re.compile(regex_txt, re.IGNORECASE)
    for x in (files):
        with open((x), 'r') as input_file:
            for line in input_file:
                if pattern.search(line):
                    shutil.copy(x, "C:/Users/David/Desktop/Test/MyFiles2")
                    break  # one match is enough; move on to the next file
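The same compile-then-search idea could be wrapped in a reusable function. The sketch below is an assumption-laden variant, not the answerer's code: the function name copy_matching and its parameters are hypothetical, it uses os.makedirs(..., exist_ok=True) so that an existing destination directory is not an error, and it ignores undecodable bytes when reading.

```python
import os
import re
import shutil


def copy_matching(src_dir, dst_dir, regex_txt):
    """Copy files from src_dir whose text matches regex_txt, ignoring case.

    Hypothetical helper for illustration; returns the copied file names.
    """
    pattern = re.compile(regex_txt, re.IGNORECASE)
    os.makedirs(dst_dir, exist_ok=True)  # no error if dst_dir already exists
    copied = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        # errors="ignore" is an assumption: undecodable bytes are dropped.
        with open(path, "r", errors="ignore") as input_file:
            # any() stops reading at the first matching line.
            if any(pattern.search(line) for line in input_file):
                shutil.copy(path, dst_dir)
                copied.append(name)
    return copied
```

Passing the directories as arguments avoids os.chdir, so the function does not change the working directory of the calling program.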
Answer by Mike Samuel
The REPL makes it easy to learn APIs. Just run python, create an object and then ask for help:
$ python
>>> import re
>>> help(re.compile(r''))
at the command line shows, among other things:
search(...)
search(string[, pos[, endpos]]) --> match object or None. Scan through string looking for a match, and return a corresponding MatchObject instance. Return None if no position in the string matches.
so you can do
regex = re.compile(regex_txt, re.IGNORECASE)
match = regex.search(content)  # From your file reading code.
if match is not None:
    pass  # use match
Incidentally,
regex_txt = "facebook.com"
has a . which matches any character, so re.compile("facebook.com").search("facebookkcom") is not None is true. Maybe
regex_txt = r"(?i)facebook\.com"
The \. matches a literal "." character instead of treating . as a special regular expression operator.
The r"..." bit means that the regular expression compiler gets the escape in \. instead of the python parser interpreting it.
The (?i) makes the regex case-insensitive like re.IGNORECASE but self-contained.
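The effect of escaping the dot can be checked directly. This is a small sketch using the strings from the answer; the URL-like test string is made up, and the use of re.escape is an addition that the answer does not mention.

```python
import re

unescaped = re.compile("facebook.com", re.IGNORECASE)
escaped = re.compile(r"(?i)facebook\.com")

# The bare "." matches any character, so this unintended string matches.
loose_hit = unescaped.search("facebookkcom") is not None

# With "\." only a literal dot is accepted.
strict_miss = escaped.search("facebookkcom") is None
strict_hit = escaped.search("www.FaceBook.COM/page") is not None

# re.escape() produces the escaped pattern text from a plain string.
auto_escaped = re.escape("facebook.com")
```

re.escape is convenient when the text to search for comes from user input or a config file and should always be treated literally.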
Answer by Jon Clements
Regexes shouldn't really be used in this fashion unless you want something more complicated than what you're trying to do. For instance, you could just normalise your content string and comparison string:
if 'facebook.com' in content.lower():
    shutil.copy(x, "C:/Users/David/Desktop/Test/MyFiles2")
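As a rough sketch, the normalise-then-compare idea can be put in a small helper. The function name contains_ignore_case is hypothetical, and lowercasing both sides (rather than only the content) is an assumption that lets the search text be written in any case.

```python
def contains_ignore_case(needle, haystack):
    """Return True if needle occurs in haystack, ignoring case."""
    # Lowercase both sides so the caller need not pre-normalise the needle.
    return needle.lower() in haystack.lower()


has_link = contains_ignore_case("Facebook.com", "see FACEBOOK.COM/page")
no_link = contains_ignore_case("facebook.com", "facebookkcom")
```

Unlike the unescaped regex, the plain substring test treats the dot literally, so "facebookkcom" is not a hit.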
Answer by Bob Stein
if re.search(r'pattern', string):
Simple if-test:
if re.search(r'ing\b', "seeking a great perhaps"):  # any words end with ing?
    print("yes")
Pattern check, extract a substring, case insensitive:
match_object = re.search(r'^OUGHT (.*) BE$', "ought to be", flags=re.IGNORECASE)
if match_object:
    assert "to" == match_object.group(1)  # what's between ought and be?
Notes:
- Use re.search() not re.match. Match restricts to the start of strings, a confusing convention if you ask me. If you do want a string-starting match, use caret or \A instead, re.search(r'^...', ...)
- Use raw string syntax r'pattern' for the first parameter. Otherwise you would need to double up backslashes, as in re.search('ing\\b', ...)
- In this example, \b is a special sequence meaning word-boundary in regex. Not to be confused with backspace.
- re.search() returns None if it doesn't find anything, which is always falsy.
- re.search() returns a Match object if it finds anything, which is always truthy.
- a group is what matched inside parentheses
- group numbering starts at 1
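A couple of these notes can be checked with a short sketch. It reuses the strings from the examples above; the variable names are made up for illustration, and group(0), which returns the entire match, is an addition that the notes do not mention.

```python
import re

match_object = re.search(r"^OUGHT (.*) BE$", "ought to be", flags=re.IGNORECASE)

whole = match_object.group(0)  # group 0 is the entire match
inner = match_object.group(1)  # group 1 is the first parenthesized group

# With the r prefix, \b reaches the regex engine as a word boundary.
raw_hit = re.search(r"ing\b", "seeking a great perhaps")

# Without it, Python turns "\b" into a backspace character before the regex
# engine sees it, so the pattern looks for "ing" followed by a backspace.
plain_hit = re.search("ing\b", "seeking a great perhaps")
```

Here raw_hit is a Match object and plain_hit is None, which is why the raw string form is the safer default for patterns.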

