Original question: http://stackoverflow.com/questions/34834258/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Python - Most elegant way to extract a substring, being given left and right borders
Asked by Vincent
I have a string in Python:
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
The expected output is:
"Atlantis-GPS-coordinates"
I know that the expected output is ALWAYS surrounded by "/bar/" on the left and "/" on the right:
"/bar/Atlantis-GPS-coordinates/"
My current solution looks like this:
a = string.find("/bar/")
b = string.find("/",a+5)
output = string[a+5:b]
This works, but I don't like it. Does anyone know a more elegant function or trick?
Accepted answer by dawg
You can use split:
>>> string.split("/bar/")[1].split("/")[0]
'Atlantis-GPS-coordinates'
Adding a maxsplit of 1 should be slightly more efficient, I suppose:
>>> string.split("/bar/", 1)[1].split("/", 1)[0]
'Atlantis-GPS-coordinates'
Or use partition:
>>> string.partition("/bar/")[2].partition("/")[0]
'Atlantis-GPS-coordinates'
Or a regex:
>>> import re
>>> re.search(r'/bar/([^/]+)', string).group(1)
'Atlantis-GPS-coordinates'
Which one to use depends on what speaks to you and on your data.
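The one-liners above assume both delimiters are present. As a minimal sketch, the partition approach can be wrapped in a small helper that reports a missing delimiter instead of returning a misleading value. The name extract_between and the choice to return None are assumptions made for illustration, not part of the original answer.

```python
def extract_between(text, left, right):
    """Return the text between the first `left` and the next `right`, or None."""
    # str.partition returns (head, separator, tail); the separator comes back
    # as an empty string when it was not found in the text.
    _, found_left, rest = text.partition(left)
    if not found_left:
        return None
    middle, found_right, _ = rest.partition(right)
    if not found_right:
        return None
    return middle


string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
print(extract_between(string, "/bar/", "/"))            # Atlantis-GPS-coordinates
print(extract_between("/no/match/here", "/bar/", "/"))  # None
```

Without the checks, a string that lacks "/bar/" would make the bare partition one-liner return an empty string, which is hard to tell apart from a genuinely empty segment.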
Answer by D.Shawley
What you have isn't all that bad. I'd write it as:
start = string.find('/bar/') + 5
end = string.find('/', start)
output = string[start:end]
as long as you know that /bar/WHAT-YOU-WANT/ is always going to be present. Otherwise, I would reach for the regular expression knife:
>>> import re
>>> PATTERN = re.compile(r'^.*/bar/([^/]*)/.*$')
>>> s = '/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/'
>>> match = PATTERN.match(s)
>>> match.group(1)
'Atlantis-GPS-coordinates'
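One difference between the two versions in this answer is worth seeing on a concrete input. On the sample string they agree, but when "/bar/" occurs more than once, the find version picks the first occurrence while the greedy ^.* prefix of the pattern makes the regex pick the last one. The sketch below illustrates this; the test string and the LAZY variant are made up for the example and are not from the original answer.

```python
import re

GREEDY = re.compile(r'^.*/bar/([^/]*)/.*$')  # pattern from the answer above
LAZY = re.compile(r'^.*?/bar/([^/]*)/.*$')   # hypothetical variant with a lazy prefix


def by_find(text):
    # Same logic as the find-based snippet: slice after the first '/bar/'.
    start = text.find('/bar/') + 5
    end = text.find('/', start)
    return text[start:end]


s = '/bar/first/x/bar/second/'        # made-up input with two '/bar/' segments
print(by_find(s))                     # first
print(GREEDY.match(s).group(1))       # second
print(LAZY.match(s).group(1))         # first
```

If the first occurrence is the one you want, the lazy prefix (or re.search without the leading ^.*) keeps the regex consistent with the find version.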
Answer by heemayl
Using re (slower than other solutions):
>>> import re
>>> string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
>>> re.search(r'(?<=/bar/)[^/]+(?=/)', string).group()
'Atlantis-GPS-coordinates'
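The speed claim is easy to check on your own machine. The following is a rough sketch using the standard timeit module; the function names and the repetition count are arbitrary choices for illustration, and the actual numbers will vary with the Python version and the input.

```python
import re
import timeit

string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"


def with_partition():
    return string.partition("/bar/")[2].partition("/")[0]


def with_regex():
    return re.search(r'(?<=/bar/)[^/]+(?=/)', string).group()


# Both approaches must agree before their timings are worth comparing.
assert with_partition() == with_regex()

for func in (with_partition, with_regex):
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.3f} s for 100,000 calls")
```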
Answer by crajun
import re
pattern = '(?<=/bar/).+?/'
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
result = re.search(pattern, string)
print string[result.start():result.end() - 1]
# "Atlantis-GPS-coordinates"
That is a Python 2.x example. Here is how the pattern works:
1. (?<=/bar/) is a lookbehind: the rest of the pattern only matches where /bar/ comes immediately before it, and /bar/ itself is not part of the match.
2. .+?/ matches as few characters as possible up to and including the next '/' character, which is why the slice drops the last character with result.end() - 1.
Hope that helps.
If you need to run this kind of search many times, it is better to compile the pattern for performance; if you only need to do it once, don't bother.
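As a sketch of what compiling looks like in practice, the pattern can be built once with re.compile and reused for every input. The names BAR_SEGMENT and extract_all, and the extra sample paths, are assumptions for illustration. This version uses a lookahead instead of consuming the trailing '/', so no slicing is needed.

```python
import re

# Compiled once and reused for every call.
BAR_SEGMENT = re.compile(r'(?<=/bar/)[^/]+(?=/)')


def extract_all(paths):
    """Return the segment after '/bar/' for each path, or None when absent."""
    results = []
    for path in paths:
        match = BAR_SEGMENT.search(path)
        results.append(match.group() if match else None)
    return results


paths = [
    "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/",
    "/foo/bar/El-Dorado/baz/",   # made-up path
    "/nothing/to/see/",          # made-up path without '/bar/'
]
print(extract_all(paths))
# ['Atlantis-GPS-coordinates', 'El-Dorado', None]
```

The re module also keeps an internal cache of recently used patterns, so the gain from compiling explicitly is often modest; the clearer benefit is giving the pattern a name.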