Python: Replace with regex

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/3997525/

Time: 2020-08-18 13:46:20  Source: igfitidea

python, regex

Asked by Pickels

I need to replace part of a string. I was looking through the Python documentation and found re.sub.

import re
s = '<textarea id="Foo"></textarea>'
output = re.sub(r'<textarea.*>(.*)</textarea>', 'Bar', s)
print(output)
# prints: Bar

I was expecting this to print '<textarea id="Foo">Bar</textarea>' and not 'Bar'.

Could anybody tell me what I did wrong?

Accepted answer by Mark Byers

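Instead of capturing the part you want to replace, you can capture the parts you want to keep and then refer to them with backreferences such as \1 to include them in the substituted string. re.sub replaces the entire matched text, not just the captured group, which is why the whole tag was replaced by 'Bar'.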
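To see why the original call returned only 'Bar', it can help to look at what the pattern from the question actually matches. This is a minimal sketch that reuses the question's string and pattern; the variable names whole_match and inner are only illustrative.

```python
import re

s = '<textarea id="Foo"></textarea>'
pattern = r'<textarea.*>(.*)</textarea>'

match = re.search(pattern, s)

# group(0) is the full match; this is the span re.sub replaces.
whole_match = match.group(0)

# group(1) is only the captured textarea content (empty here).
inner = match.group(1)

print(repr(whole_match))
print(repr(inner))
```

Because the full match covers the opening and closing tags, substituting 'Bar' for it removes the tags as well.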

Try this instead:

output = re.sub(r'(<textarea.*>).*(</textarea>)', r'\1Bar\2', s)
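As a self-contained sketch of the backreference approach, the snippet below runs the substitution on the string from the question. It also shows the \g<1> spelling of the same backreference, which is useful when the replacement text starts with a digit; the names same_output and with_digits are illustrative and not part of the original answer.

```python
import re

s = '<textarea id="Foo"></textarea>'
pattern = r'(<textarea.*>).*(</textarea>)'

# \1 and \2 re-insert the captured opening and closing tags.
output = re.sub(pattern, r'\1Bar\2', s)

# \g<1> is an equivalent, more explicit way to write \1.
same_output = re.sub(pattern, r'\g<1>Bar\g<2>', s)

# With a numeric replacement, \g<1> avoids \1 running into the digits.
with_digits = re.sub(pattern, r'\g<1>42\g<2>', s)

print(output)
```

Note that .* is greedy, so on input containing several textarea elements on one line this pattern would span from the first opening tag to the last closing tag.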

Also, assuming this is HTML, you should consider using an HTML parser for this task, for example Beautiful Soup.

Answer by Rahul Agarwal

Or you could just use the search function instead:

match = re.search(r'(<textarea.*>).*(</textarea>)', s)
output = match.group(1) + 'bar' + match.group(2)
print(output)
# prints: <textarea id="Foo">bar</textarea>
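One caveat with the search-based approach is that re.search returns None when nothing matches, and concatenating only the two groups drops any text outside the match. A possible sketch that guards against both is shown below; replace_textarea_body is a hypothetical helper name, not something from the original answers.

```python
import re

def replace_textarea_body(html, new_body):
    """Hypothetical helper: swap the textarea content, keep everything else."""
    match = re.search(r'(<textarea.*>).*(</textarea>)', html)
    if match is None:
        # No textarea found: return the input unchanged instead of raising.
        return html
    # Keep everything up to the end of the opening tag and from the start
    # of the closing tag onward, so surrounding text is preserved.
    return html[:match.end(1)] + new_body + html[match.start(2):]

print(replace_textarea_body('<textarea id="Foo"></textarea>', 'bar'))
```

Slicing with match.end(1) and match.start(2) keeps the text before and after the element, which matches what re.sub would do for unmatched parts of the string.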