Python: find a string between two substrings
Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and author information and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/3368969/
Find string between two substrings
Asked by John Howard
How do I find a string between two substrings ('123STRINGabc' -> 'STRING')?
My current method is like this:
>>> start = 'asdf=5;'
>>> end = '123jasd'
>>> s = 'asdf=5;iwantthis123jasd'
>>> print((s.split(start))[1].split(end)[0])
iwantthis
However, this seems very inefficient and un-pythonic. What is a better way to do something like this?
Forgot to mention:
The string might not start and end with start and end. There may be more characters before and after them.
Accepted answer by Nikolaus Gradwohl
import re
s = 'asdf=5;iwantthis123jasd'
result = re.search('asdf=5;(.*)123jasd', s)
print(result.group(1))
Answer by Tim McNamara
s[len(start):-len(end)]
Answer by josh
My method will be to do something like,
find index of start string in s => i
find index of end string in s => j
substring = substring(i+len(start) to j-1)
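The steps above can be sketched in Python roughly as follows. The function name between_indices and the choice to return None when either delimiter is missing are assumptions made for illustration, not part of the original answer.

```python
def between_indices(s, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    i = s.find(start)              # index of the start string in s
    if i == -1:
        return None
    begin = i + len(start)         # first character after the start string
    j = s.find(end, begin)         # index of the end string, searched after start
    if j == -1:
        return None
    return s[begin:j]              # slice end is exclusive, so this stops at j-1


print(between_indices('asdf=5;iwantthis123jasd', 'asdf=5;', '123jasd'))  # iwantthis
```

Searching for end only from begin onwards avoids picking up an occurrence of end that sits before start.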
Answer by cji
s = "123123STRINGabcabc"

# search from the left: first occurrence of `first`, then the next `last`
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

# search from the right: last occurrence of `first`, then the last `last`
def find_between_r(s, first, last):
    try:
        start = s.rindex(first) + len(first)
        end = s.rindex(last, start)
        return s[start:end]
    except ValueError:
        return ""

print(find_between(s, "123", "abc"))
print(find_between_r(s, "123", "abc"))
gives:
123STRING
STRINGabc
I thought it should be noted: depending on what behavior you need, you can mix index and rindex calls or go with one of the above versions (it's the equivalent of the regex (.*) and (.*?) groups).
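For comparison, here is a small sketch of the two regex groups mentioned above, run on the same string. The variable names are illustrative only. Note that the greedy group lines up with mixing index (for the start) and rindex (for the end), rather than with find_between_r exactly.

```python
import re

s = "123123STRINGabcabc"

# Lazy group: stops at the first "abc" after the first "123".
lazy = re.search(r"123(.*?)abc", s).group(1)

# Greedy group: runs to the last "abc" in the string.
greedy = re.search(r"123(.*)abc", s).group(1)

# Mixing index (first "123") with rindex (last "abc") matches the greedy result.
start = s.index("123") + len("123")
mixed = s[start:s.rindex("abc")]

print(lazy)    # 123STRING
print(greedy)  # 123STRINGabc
print(mixed)   # 123STRINGabc
```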
Answer by John La Rooy
Here is one way to do it:
_, _, rest = s.partition(start)
result, _, _ = rest.partition(end)
print(result)
Another way, using a regexp:
import re
print(re.findall(re.escape(start) + "(.*)" + re.escape(end), s)[0])
or
print(re.search(re.escape(start) + "(.*)" + re.escape(end), s).group(1))
Answer by Tony Veijalainen
I posted this before as a code snippet on Daniweb:
# picking up a piece of string between separators
# function using partition; like partition, but drops the separators
def between(left, right, s):
    before, _, a = s.partition(left)
    a, _, after = a.partition(right)
    return before, a, after

s = "bla bla blaa <a>data</a> lsdjfasdjöf (important notice) 'Daniweb forum' tcha tcha tchaa"
print(between('<a>', '</a>', s))
print(between('(', ')', s))
print(between("'", "'", s))
""" Output (Python 3):
('bla bla blaa ', 'data', " lsdjfasdjöf (important notice) 'Daniweb forum' tcha tcha tchaa")
('bla bla blaa <a>data</a> lsdjfasdjöf ', 'important notice', " 'Daniweb forum' tcha tcha tchaa")
('bla bla blaa <a>data</a> lsdjfasdjöf (important notice) ', 'Daniweb forum', ' tcha tcha tchaa')
"""
Answer by Tim McNamara
String formatting adds some flexibility to what Nikolaus Gradwohl suggested. start and end can now be amended as desired.
import re
s = 'asdf=5;iwantthis123jasd'
start = 'asdf=5;'
end = '123jasd'
result = re.search('%s(.*)%s' % (start, end), s).group(1)
print(result)
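A caveat worth noting with the formatting approach: if start or end contain characters that are special in regular expressions, formatting them straight into the pattern changes its meaning. A minimal sketch using re.escape (the sample strings below are made up for illustration):

```python
import re

s = 'price(usd)=42;end'
start = 'price(usd)='
end = ';end'

# Unescaped, "(usd)" is read as a regex group, so the literal parentheses never match.
unescaped = re.search('%s(.*)%s' % (start, end), s)
print(unescaped)  # None

# re.escape makes both delimiters match literally.
pattern = '%s(.*)%s' % (re.escape(start), re.escape(end))
escaped = re.search(pattern, s).group(1)
print(escaped)  # 42
```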
Answer by Reinstate Monica - Goodbye SE
To extract STRING, try:
myString = '123STRINGabc'
startString = '123'
endString = 'abc'
mySubString=myString[myString.find(startString)+len(startString):myString.find(endString)]
Answer by ansetou
start = 'asdf=5;'
end = '123jasd'
s = 'asdf=5;iwantthis123jasd'
print(s[s.find(start)+len(start):s.rfind(end)])
gives
iwantthis
Answer by tstoev
source = 'your token _here0@df and maybe _here1@df or maybe _here2@df'
start_sep = '_'
end_sep = '@df'
result = []
# each piece after a start separator may contain a token followed by the end separator
tmp = source.split(start_sep)
for par in tmp:
    if end_sep in par:
        result.append(par.split(end_sep)[0])
print(result)
This should show: here0, here1, here2
A regex is better, but it requires an additional library, and you may want to stick with plain Python.
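For reference, the regex version alluded to above might look like the sketch below. The re module is part of Python's standard library, so no third-party install is needed. The pattern variable is an illustrative name, and the non-greedy group is an assumption about the intended behavior (one token per separator pair).

```python
import re

source = 'your token _here0@df and maybe _here1@df or maybe _here2@df'
start_sep = '_'
end_sep = '@df'

# Non-greedy group so each match stops at the nearest end separator.
pattern = re.escape(start_sep) + '(.*?)' + re.escape(end_sep)
result = re.findall(pattern, source)
print(result)  # ['here0', 'here1', 'here2']
```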

