In Python, how do I get everything before a : in a string?

Question

Asked by 0Cool

I am looking for a way to get all of the letters in a string before a : but I have no idea where to start. Would I use regex? If so, how?

string = "Username: How are you today?"

Can someone show me an example of what I could do?

Answer 1

Accepted answer by fredtantini

Just use the split method. It returns a list, so you can keep the first element:

>>> s1 = "Username: How are you today?"
>>> s1.split(':')
['Username', ' How are you today?']
>>> s1.split(':')[0]
'Username'
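
If the text after the first colon may itself contain colons, the optional second argument of split (maxsplit) stops after one split. A minimal sketch, where the sample string is made up for illustration:

```python
# Hypothetical sample: the message part contains a second colon.
line = "Username: see you at 10:30"

# Without a limit, every colon produces a split.
all_parts = line.split(":")

# maxsplit=1 stops after the first colon, so the rest stays in one piece.
head, rest = line.split(":", 1)

print(all_parts)  # ['Username', ' see you at 10', '30']
print(head)       # Username
print(rest)       # ' see you at 10:30' without the quotes (note the leading space)
```

Taking element [0] gives the same result either way; the limit only avoids splitting the remainder needlessly.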

Answer 2

Answer by Cory Kramer

You don't need regex for this.

>>> s = "Username: How are you today?"

You can use the split method to split the string on the ':' character:

>>> s.split(':')
['Username', ' How are you today?']

Then take element [0] to get the first part of the string:

>>> s.split(':')[0]
'Username'

Answer 3

Answer by Hackaholic

Using index:

>>> string = "Username: How are you today?"
>>> string[:string.index(":")]
'Username'

index gives you the position of : in the string, and you can then slice up to that position.
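
When the colon might be absent, str.find is a close relative of index that returns -1 instead of raising. A small sketch of that variant; the helper name before_colon is invented for this example:

```python
def before_colon(text):
    """Return the part of text before the first ':', or None if there is no ':'."""
    position = text.find(":")  # find returns -1 when ':' is absent
    if position == -1:
        return None
    return text[:position]

print(before_colon("Username: How are you today?"))  # Username
print(before_colon("no colon here"))                 # None
```

The explicit -1 check matters: slicing with text[:-1] would silently drop the last character rather than signal a missing colon.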

If you want to use regex:

>>> import re
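>>> re.match("(.*?):", string).group(1)
'Username'

match matches from the start of the string, and group(1) returns the text captured before the first : (group() with no argument would return the whole match, colon included).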
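
Note that re.match returns None when the string has no colon, so calling a group method on the result directly would raise AttributeError. A hedged sketch of a guarded version; the pattern [^:]* and the function name prefix_before_colon are choices made for this example, not part of the answer:

```python
import re

# [^:]* matches any run of characters that are not ':'; the group captures it.
PREFIX = re.compile(r"([^:]*):")

def prefix_before_colon(text):
    """Return the text before the first ':', or None when there is no ':'."""
    found = PREFIX.match(text)
    return found.group(1) if found else None

print(prefix_before_colon("Username: How are you today?"))  # Username
print(prefix_before_colon("no colon here"))                 # None
```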

Answer 4

Answer by Aristide

I have benchmarked these various techniques under Python 3.7.0 (IPython).

TLDR

fastest (when the split symbol c is known): pre-compiled regex.
fastest (otherwise): s.partition(c)[0].
safe (i.e., when c may not be in s): partition, split.
unsafe: index, regex.
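
To make the safe/unsafe distinction concrete, the sketch below runs each approach on a string that does not contain the separator. The sample values are made up for illustration.

```python
import re

s = "no separator here"  # hypothetical input without the separator
c = ":"

# Safe: both return the whole string when c is missing.
print(s.partition(c)[0])  # no separator here
print(s.split(c, 1)[0])   # no separator here

# Unsafe: index raises ValueError when c is missing.
try:
    s[:s.index(c)]
    index_failed = False
except ValueError:
    index_failed = True

# Unsafe: re.match returns None, so .group(1) raises AttributeError.
try:
    re.match("(.*?)" + re.escape(c), s).group(1)
    regex_failed = False
except AttributeError:
    regex_failed = True

print(index_failed, regex_failed)  # True True
```

"Safe" here only means that no exception is raised; the caller still cannot tell from the result alone whether the separator was present.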

Code

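import string, random, re

SYMBOLS = string.ascii_uppercase + string.digits
SIZE = 100

def create_test_set(string_length):
    # Yield SIZE pairs (c, s): a random string s and a symbol c that occurs in s.
    for _ in range(SIZE):
        random_string = ''.join(random.choices(SYMBOLS, k=string_length))
        yield (random.choice(random_string), random_string)

# Run in IPython: %timeit is an IPython magic, not plain Python.
# Caution: at 2**32 each string needs at least 4 GiB; drop that size if memory is limited.
for string_length in (2**4, 2**8, 2**16, 2**32):
    print("\nString length:", string_length)
    # Build the test set before any timing, using the current string length.
    test_set = list(create_test_set(string_length))
    print("  regex (compiled):", end=" ")
    # Use a list, not a generator: a generator would be exhausted after the first
    # %timeit loop, and every later loop would time an empty comprehension.
    test_set_for_regex = [(re.compile("(.*?)" + c).match, s) for (c, s) in test_set]
    # group(1) is the text before c; group() would also include c itself.
    %timeit [re_match(s).group(1) for (re_match, s) in test_set_for_regex]
    print("  partition:       ", end=" ")
    %timeit [s.partition(c)[0] for (c, s) in test_set]
    print("  index:           ", end=" ")
    %timeit [s[:s.index(c)] for (c, s) in test_set]
    print("  split (limited): ", end=" ")
    %timeit [s.split(c, 1)[0] for (c, s) in test_set]
    print("  split:           ", end=" ")
    %timeit [s.split(c)[0] for (c, s) in test_set]
    print("  regex:           ", end=" ")
    %timeit [re.match("(.*?)" + c, s).group(1) for (c, s) in test_set]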
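
Outside IPython, the %timeit magic is not available. A rough equivalent with the standard timeit module might look like the sketch below; the sizes, the repeat count, and the names make_test_set, CANDIDATES, and run_benchmark are assumptions made for this example, and absolute timings will vary by machine.

```python
import random
import string
import timeit

SYMBOLS = string.ascii_uppercase + string.digits

def make_test_set(string_length, size=100):
    """Build (c, s) pairs where the symbol c is guaranteed to occur in s."""
    pairs = []
    for _ in range(size):
        s = "".join(random.choices(SYMBOLS, k=string_length))
        pairs.append((random.choice(s), s))
    return pairs

# Each candidate returns the part of s before the first occurrence of c.
CANDIDATES = {
    "partition": lambda c, s: s.partition(c)[0],
    "index": lambda c, s: s[:s.index(c)],
    "split (limited)": lambda c, s: s.split(c, 1)[0],
    "split": lambda c, s: s.split(c)[0],
}

def run_benchmark(string_length, number=200):
    """Return {name: average seconds for one pass over the test set}."""
    test_set = make_test_set(string_length)
    timings = {}
    for name, func in CANDIDATES.items():
        total = timeit.timeit(
            lambda: [func(c, s) for (c, s) in test_set], number=number
        )
        timings[name] = total / number
    return timings

for length in (2**4, 2**8, 2**12):
    print("String length:", length)
    for name, seconds in run_benchmark(length).items():
        print(f"  {name:<16} {seconds * 1e6:8.1f} µs")
```

The regex variants are left out to keep the sketch short; they could be added to CANDIDATES in the same way.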

Results

String length: 16
  regex (compiled): 156 ns ± 4.41 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
  partition:        19.3 μs ± 430 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
  index:            26.1 μs ± 341 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split (limited):  26.8 μs ± 1.26 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split:            26.3 μs ± 835 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  regex:            128 μs ± 4.02 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

String length: 256
  regex (compiled): 167 ns ± 2.7 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
  partition:        20.9 μs ± 694 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  index:            28.6 μs ± 2.73 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split (limited):  27.4 μs ± 979 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split:            31.5 μs ± 4.86 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  regex:            148 μs ± 7.05 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

String length: 65536
  regex (compiled): 173 ns ± 3.95 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
  partition:        20.9 μs ± 613 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
  index:            27.7 μs ± 515 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split (limited):  27.2 μs ± 796 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split:            26.5 μs ± 377 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  regex:            128 μs ± 1.5 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

String length: 4294967296
  regex (compiled): 165 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
  partition:        19.9 μs ± 144 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
  index:            27.7 μs ± 571 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split (limited):  26.1 μs ± 472 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  split:            28.1 μs ± 1.69 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  regex:            137 μs ± 6.53 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Answer 5

Answer by Marv-CZ

partition() may be better than split() for this purpose, as it gives more predictable results when the string contains no delimiter or several delimiters.
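
A short illustration of that predictability, using made-up sample strings: partition always returns exactly three items, whatever the input.

```python
# partition always returns a 3-tuple: (before, separator, after).
print("Username: How are you today?".partition(":"))
# ('Username', ':', ' How are you today?')

# Several delimiters: only the first one is used.
print("a:b:c".partition(":"))     # ('a', ':', 'b:c')

# No delimiter: the whole string comes back first, then two empty strings.
print("no colon".partition(":"))  # ('no colon', '', '')

# The separator slot tells you whether the delimiter was found at all.
before, sep, after = "no colon".partition(":")
found = bool(sep)
print(found)  # False
```

Because the tuple length is fixed, unpacking into three names never fails, whereas unpacking the result of split can raise ValueError when the number of parts differs from what was expected.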
