Python re.findall prints output as list instead of string
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/29325809/
Asked by Ilea
My re.findall search is matching and returning the right string, but when I try to print the result, it prints it as a list instead of a string. Example below:
> import re
> line = "ID=id5;Parent=rna1;Dbxref=GeneID:653635,Genbank:NR_024540.1,HGNC:38034;gbkey=misc_RNA;gene=WASH7P;product=WAS protein family homolog 7 pseudogene;transcript_id=NR_024540.1"
> print(re.findall(r'gene=[^;\n]+', line))
> ['gene=WASH7P']
I would like the print function to output just gene=WASH7P, without the brackets and quotes around it.
How can I adjust my code so that it prints just the match?
Thank you!
Answered by avinash pandey
The error you are getting is probably because your regex found no match, so re.findall returned an empty list. Check what re.findall returned before you try to index it. Run a check like the one below before indexing, so that an empty list does not raise an IndexError.
x = re.findall(r'Name=[^;]+', line)
if not x:
    # no match: write your fallback logic here instead of indexing x[0]
    pass
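As a rough sketch of that check, the guard could be wrapped in a small helper. The function name first_or_default, the "NA" default, and the shortened sample line are assumptions made for illustration, not part of the original answer.

```python
import re

# Shortened sample record, modeled on the line in the question.
line = "ID=id5;Parent=rna1;gbkey=misc_RNA;gene=WASH7P;transcript_id=NR_024540.1"

def first_or_default(pattern, text, default="NA"):
    """Return the first re.findall match, or `default` if nothing matched."""
    matches = re.findall(pattern, text)
    if not matches:  # an empty list is falsy, so len() is not needed
        return default
    return matches[0]

print(first_or_default(r'gene=[^;\n]+', line))  # this field exists in the line
print(first_or_default(r'Name=[^;]+', line))    # no Name= field, so the default is used
```

Testing the list directly (if not matches) behaves the same as if not len(x) and is the more common Python idiom.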
Answered by Fermi paradox
It prints it as a list, because.. it is a list.
From the re.findall documentation: "Return all non-overlapping matches of pattern in string, as a list of strings."
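Since the result is a list, another way to print it without brackets is to join its elements into one string. This is only a sketch: the shortened sample line and the ", " separator are assumptions, and the second pattern is there just to show what happens with more than one match.

```python
import re

# Shortened sample record, modeled on the line in the question.
line = "ID=id5;Parent=rna1;gene=WASH7P;transcript_id=NR_024540.1"

genes = re.findall(r'gene=[^;\n]+', line)
joined = ", ".join(genes)  # an empty list would give an empty string
print(joined)

# With several matches, join puts the separator between them.
ids = re.findall(r'(?:ID|Parent)=[^;\n]+', line)
print(", ".join(ids))
```

Unlike indexing with [0], join does not raise an error when the list is empty, and it keeps every match rather than only the first.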
To print only the string, use print(re.findall(r'Name=[^;]+', line)[0]) instead.
That code assumes you have at least one match. If you have 0 matches, you'll get an IndexError. If you have more than one, you'll print only the first match.
To make sure you don't get an error, check that a match was found before you use [0] (or .group() for re.search()).
s = re.search(r'Name=[^;]+', my_str)
if s:
    print(s.group())  # or print(s[0]) on Python 3.6+
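To make the re.search behavior concrete, here is a small sketch. The two input strings are hypothetical and were made up for illustration.

```python
import re

my_str = "ID=id5;Name=WASH7P;gbkey=misc_RNA"  # hypothetical input with a Name= field

m = re.search(r'Name=[^;]+', my_str)
if m:
    whole = m.group()  # the entire matched text
    same = m.group(0)  # group 0 is also the entire match
    print(whole)

# When nothing matches, re.search returns None rather than an empty list.
missing = re.search(r'Name=[^;]+', "ID=id5;gbkey=misc_RNA")
print(missing is None)
```

Because None is falsy and a match object is truthy, the plain if s: test is enough to guard the .group() call.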
Answered by Ilea
Thank you for everyone's help!
Both of the snippets below successfully printed the output as a string.
> re.findall(r'gene=[^;\n]+', line)[0]
> re.search(r'gene=[^;\n]+', line).group()
However, I kept getting "list index out of range" errors with one of my regexes, even though results were printed when I used re.findall() on its own.
> re.findall(r'transcript_id=[^\s]+',line)
I realized that this seemingly impossible result occurred because I was calling re.findall() within a for loop that iterated over every line in a file. Some lines had matches and others did not, so I received the "list index out of range" error on the lines with no match.
The code below resolved the issue:
> if re.findall(r'transcript_id=[^\s]+', line):
>     transcript = re.findall(r'transcript_id=[^\s]+', line)[0]
> else:
>     transcript = "NA"
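The same per-line fallback can also be written so that the regex runs only once per line. This is a sketch rather than the code I actually ran: the function name transcripts, the compiled pattern, and the two sample lines are assumptions for illustration.

```python
import re

# Compile once and reuse; \S+ matches the same text as [^\s]+.
TRANSCRIPT_RE = re.compile(r'transcript_id=\S+')

def transcripts(lines):
    """Return one entry per line: the matched text, or "NA" if the line has no match."""
    results = []
    for line in lines:
        m = TRANSCRIPT_RE.search(line)  # a match object, or None
        results.append(m.group() if m else "NA")
    return results

sample = [
    "ID=id5;gene=WASH7P;transcript_id=NR_024540.1",
    "ID=id6;gene=WASH7P",  # no transcript_id field on this line
]
print(transcripts(sample))
```

An open file object can be passed in place of the list, since iterating over a file yields its lines, and \S+ stops before the trailing newline.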
Thank you!