How do I return a string from a regex match in Python?

Question

Asked by Hyman Dalton

I am running through the lines of a text file using a Python script. I want to search for an img tag within the text document and return the tag as text.

When I run the regex with re.match, it returns a _sre.SRE_Match object. How do I get it to return a string?

import sys
import string
import re

f = open("sample.txt", 'r' )
l = open('writetest.txt', 'w')

count = 1

for line in f:
    line = line.rstrip()
    imgtag  = re.match(r'<img.*?>',line)
    print("yo it's a {}".format(imgtag))

When run, it prints:

yo it's a None
yo it's a None
yo it's a None
yo it's a <_sre.SRE_Match object at 0x7fd4ea90e578>
yo it's a None
yo it's a <_sre.SRE_Match object at 0x7fd4ea90e578>
yo it's a None
yo it's a <_sre.SRE_Match object at 0x7fd4ea90e578>
yo it's a <_sre.SRE_Match object at 0x7fd4ea90e5e0>
yo it's a None
yo it's a None

Answer 1

Accepted answer by wflynny

You should call group(0) on the match object (documented as re.MatchObject.group), like this:

imgtag = re.match(r'<img.*?>', line).group(0)

Edit:

You might also be better off doing something like

imgtag = re.match(r'<img.*?>', line)
if imgtag:
    print("yo it's a {}".format(imgtag.group(0)))

to eliminate all the None results.
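As a minimal sketch of that check, the helper below returns the matched text as a string, or None when the line does not match. The names IMG_TAG and first_img_tag and the sample lines are made up for illustration; they are not part of the original question.

```python
import re

# Hypothetical names for illustration; the pattern is the one from the question.
IMG_TAG = re.compile(r'<img.*?>')

def first_img_tag(line):
    """Return the <img ...> tag at the start of line as a str, or None."""
    match = IMG_TAG.match(line)
    # group(0) is only safe to call when a match object was returned
    return match.group(0) if match else None

# Made-up sample lines standing in for the contents of sample.txt
lines = ['<img src="a.png"> caption', 'plain text', '<img src="b.png">']
tags = [tag for tag in map(first_img_tag, lines) if tag is not None]
print(tags)
```

Calling .group(0) directly on the result of re.match, without the check, raises AttributeError on any line that does not match, because re.match returns None there.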

Answer 2

Answer by Explosion Pills

Use imgtag.group(0) or imgtag.group(). Either returns the entire match as a string. Your pattern has no capture groups, so there is nothing else to retrieve.

http://docs.python.org/release/2.5.2/lib/match-objects.html
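To illustrate the difference between the whole match and a capture group, here is a small sketch. The sample line and the extra parenthesised group around the src value are assumptions added for the example; the pattern in the question has no groups.

```python
import re

# Made-up sample line; the capture group around the src value is an addition
line = '<img src="cat.png" alt="cat">'
m = re.match(r'<img\s+src="([^"]*)".*?>', line)

whole = m.group()   # same as m.group(0): the entire matched text, as a str
src = m.group(1)    # text captured by the first parenthesised group

print(whole)
print(src)
```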

Answer 3

Answer by newtover

Considering there might be several img tags on a line, I would recommend re.findall:

import re

with open("sample.txt", 'r') as f_in, open('writetest.txt', 'w') as f_out:
    for line in f_in:
        # findall returns every <img ...> tag on the line as a string
        for img in re.findall(r'<img[^>]+>', line):
            print("yo it's a {}".format(img), file=f_out)
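For a quick look at what re.findall hands back, the sketch below runs the same pattern on two made-up strings. Because the pattern has no capture groups, the result is a plain list of strings, so no .group() call is needed, and a string with no tags gives an empty list rather than None.

```python
import re

# Made-up input lines for illustration
line = 'a <img src="1.png"> b <img src="2.png"> c'

tags = re.findall(r'<img[^>]+>', line)               # one str per tag found
no_tags = re.findall(r'<img[^>]+>', 'no tags here')  # empty list, not None

print(tags)
print(no_tags)
```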

Answer 4

Answer by Sergii Shcherbak

Note that re.match(pattern, string, flags=0) only returns matches at the beginning of the string. If you want to locate a match anywhere in the string, use re.search(pattern, string, flags=0) instead (https://docs.python.org/3/library/re.html). This scans the string and returns the first match object. You can then extract the matching string with match_object.group(0), as the other answers suggest.
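The difference is easy to see on a line where the tag is not the first thing on the line. This is a small sketch with a made-up sample line; the pattern is the one from the question.

```python
import re

# Made-up sample line where the tag appears in the middle
line = 'Intro text <img src="logo.png"> more text'
pattern = r'<img.*?>'

at_start = re.match(pattern, line)    # None: the line does not start with <img
anywhere = re.search(pattern, line)   # match object for the first tag found

# Guard against None before calling group(0)
tag = anywhere.group(0) if anywhere else None
print(at_start)
print(tag)
```

This likely explains why so many lines in the question's output print None: any line where the img tag is preceded by other text, including leading whitespace, cannot match with re.match.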
