How would I get everything before a : in a string in Python?
Original question: http://stackoverflow.com/questions/27387415/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by 0Cool
I am looking for a way to get all of the letters in a string before a :, but I have no idea where to start. Would I use regex? If so, how?
string = "Username: How are you today?"
Can someone show me an example of what I could do?
Accepted answer by fredtantini
Just use the split function. It returns a list, so you can keep the first element:
>>> s1 = "Username: How are you today?"
>>> s1.split(':')
['Username', ' How are you today?']
>>> s1.split(':')[0]
'Username'
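If the string may contain more than one colon, or none at all, a small wrapper makes the behaviour explicit. The following is only a sketch: the helper name text_before is made up for this example and is not part of the answer above. It passes maxsplit=1 so that split stops at the first delimiter.

```python
def text_before(s, delimiter=":"):
    """Return the part of s before the first delimiter (hypothetical helper).

    If the delimiter is absent, split returns [s], so the whole string comes back.
    """
    # maxsplit=1 stops after the first delimiter, so the rest is not split further.
    return s.split(delimiter, 1)[0]


print(text_before("Username: How are you today?"))  # Username
print(text_before("a:b:c"))                         # a
print(text_before("no delimiter here"))             # no delimiter here
```

Whether returning the whole string is the right result for a missing delimiter depends on the use case; check for the delimiter first if that case should be treated as an error.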
Answer by Cory Kramer
You don't need regex for this.
>>> s = "Username: How are you today?"
You can use the split method to split the string on the ':' character:
>>> s.split(':')
['Username', ' How are you today?']
Then take element [0] to get the first part of the string:
>>> s.split(':')[0]
'Username'
Answer by Hackaholic
Using index:
>>> string = "Username: How are you today?"
>>> string[:string.index(":")]
'Username'
index gives you the position of the first : in the string, and you can then slice the string up to that position.
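One thing to watch with this approach: str.index raises ValueError when the substring is not found, while str.find returns -1. A minimal sketch of both ways of handling that case follows. The function names are hypothetical, and returning the whole string when there is no colon is an assumption made for illustration only.

```python
def before_colon_index(s):
    """Slice up to the first ':'; fall back to the whole string if there is none."""
    try:
        return s[:s.index(":")]
    except ValueError:
        # str.index raises ValueError when the substring is not found.
        return s


def before_colon_find(s):
    """Same idea with str.find, which returns -1 instead of raising."""
    pos = s.find(":")
    return s if pos == -1 else s[:pos]


print(before_colon_index("Username: How are you today?"))  # Username
print(before_colon_find("no colon here"))                  # no colon here
```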
If you want to use regex:
>>> import re
>>> re.match("(.*?):", string).group(1)
'Username'
match matches from the start of the string.
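A slightly more defensive variant of the regex approach might look like the sketch below. The names BEFORE_COLON and before_colon_regex are hypothetical. Note that group(1) returns only the captured text, whereas group() with no argument returns the whole match, including the ':' itself. re.match returns None when nothing matches, so that case has to be checked before calling group.

```python
import re

# Compile once and reuse; the group (.*?) captures everything before the first ':'.
BEFORE_COLON = re.compile(r"(.*?):")


def before_colon_regex(s):
    """Return the text before the first ':' or None if no ':' is matched (hypothetical helper)."""
    m = BEFORE_COLON.match(s)
    # group(1) is the captured text; group() or group(0) would include the ':' itself.
    return m.group(1) if m else None


print(before_colon_regex("Username: How are you today?"))  # Username
print(before_colon_regex("no colon here"))                 # None
```

By default `.` does not match a newline, so this pattern returns None if a line break comes before the first colon; passing re.DOTALL to re.compile changes that.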
Answer by Aristide
I have benchmarked these various techniques under Python 3.7.0 (IPython).
TLDR
- fastest (when the split symbol c is known): pre-compiled regex.
- fastest (otherwise): s.partition(c)[0].
- safe (i.e., when c may not be in s): partition, split.
- unsafe: index, regex.
Code
# Run in IPython: %timeit is an IPython magic, not plain Python.
import string, random, re

SYMBOLS = string.ascii_uppercase + string.digits
SIZE = 100

def create_test_set(string_length):
    # Yield (split symbol, random string) pairs; the symbol always occurs in the string.
    for _ in range(SIZE):
        random_string = ''.join(random.choices(SYMBOLS, k=string_length))
        yield (random.choice(random_string), random_string)

for string_length in (2**4, 2**8, 2**16, 2**32):
    print("\nString length:", string_length)
    # Note: the test strings are always 16 characters long; string_length is only printed.
    test_set = list(create_test_set(16))
    print("  regex (compiled):", end=" ")
    # Note: this is a generator, so it is exhausted after the first timing loop.
    test_set_for_regex = ((re.compile("(.*?)" + c).match, s) for (c, s) in test_set)
    %timeit [re_match(s).group() for (re_match, s) in test_set_for_regex]
    print("  partition:       ", end=" ")
    %timeit [s.partition(c)[0] for (c, s) in test_set]
    print("  index:           ", end=" ")
    %timeit [s[:s.index(c)] for (c, s) in test_set]
    print("  split (limited): ", end=" ")
    %timeit [s.split(c, 1)[0] for (c, s) in test_set]
    print("  split:           ", end=" ")
    %timeit [s.split(c)[0] for (c, s) in test_set]
    print("  regex:           ", end=" ")
    %timeit [re.match("(.*?)" + c, s).group() for (c, s) in test_set]
Results
String length: 16
regex (compiled): 156 ns ± 4.41 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
partition: 19.3 μs ± 430 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
index: 26.1 μs ± 341 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split (limited): 26.8 μs ± 1.26 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split: 26.3 μs ± 835 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
regex: 128 μs ± 4.02 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
String length: 256
regex (compiled): 167 ns ± 2.7 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
partition: 20.9 μs ± 694 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
index: 28.6 μs ± 2.73 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split (limited): 27.4 μs ± 979 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split: 31.5 μs ± 4.86 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
regex: 148 μs ± 7.05 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
String length: 65536
regex (compiled): 173 ns ± 3.95 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
partition: 20.9 μs ± 613 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
index: 27.7 μs ± 515 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split (limited): 27.2 μs ± 796 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split: 26.5 μs ± 377 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
regex: 128 μs ± 1.5 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
String length: 4294967296
regex (compiled): 165 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
partition: 19.9 μs ± 144 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
index: 27.7 μs ± 571 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split (limited): 26.1 μs ± 472 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
split: 28.1 μs ± 1.69 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
regex: 137 μs ± 6.53 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
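For readers without IPython, a comparable measurement can be sketched with the standard timeit module. This is an illustration under assumptions, not the script that produced the numbers above. The helper names make_test_set and bench, the string lengths and the repetition count are all chosen for this example, and the regex variants are left out to keep it short.

```python
import random
import string
import timeit

SYMBOLS = string.ascii_uppercase + string.digits


def make_test_set(string_length, size=100):
    """Build (delimiter, text) pairs where the delimiter is guaranteed to occur in text."""
    pairs = []
    for _ in range(size):
        text = "".join(random.choices(SYMBOLS, k=string_length))
        pairs.append((random.choice(text), text))
    return pairs


def bench(test_set, number=200):
    """Time each approach over the whole test set; returns seconds per pass."""
    candidates = {
        "partition": lambda: [s.partition(c)[0] for c, s in test_set],
        "index": lambda: [s[:s.index(c)] for c, s in test_set],
        "split (limited)": lambda: [s.split(c, 1)[0] for c, s in test_set],
        "split": lambda: [s.split(c)[0] for c, s in test_set],
    }
    return {
        name: timeit.timeit(fn, number=number) / number
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    # Hypothetical lengths; each test set really is built with the stated length.
    for length in (2**4, 2**8, 2**12):
        print("String length:", length)
        for name, seconds in bench(make_test_set(length)).items():
            print(f"  {name:16s} {seconds * 1e6:8.1f} µs per pass")
```

Because the test data is built once as a list and reused on every pass, each timed call does the same amount of work.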
Answer by Marv-CZ
partition() may be better than split() for this purpose, as it gives more predictable results when the string contains no delimiter or more than one delimiter.
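To illustrate the point, str.partition always returns a 3-tuple (head, separator, tail), whatever the input. A short sketch follows; the helper name before_delimiter is hypothetical.

```python
def before_delimiter(s, delimiter=":"):
    """Return the text before the first delimiter using str.partition (hypothetical helper)."""
    head, sep, tail = s.partition(delimiter)
    # sep is '' when the delimiter was not found; head is then the whole string.
    return head


print("Username: How are you today?".partition(":"))  # ('Username', ':', ' How are you today?')
print("a:b:c".partition(":"))                         # ('a', ':', 'b:c')
print("no delimiter".partition(":"))                  # ('no delimiter', '', '')
```

Checking whether sep is empty is a simple way to tell "delimiter missing" apart from "nothing before the delimiter".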