Python: split a string at the nth occurrence of a given character

Question

Asked by cherrun

Is there a Pythonic way to split a string after the nth occurrence of a given delimiter?

Given a string:

'20_231_myString_234'

It should be split into (with the delimiter being '_', after its second occurrence):

['20_231', 'myString_234']

Or is the only way to accomplish this to count, split and join?

Answer 1

Accepted answer by jamylak

>>> text = '20_231_myString_234'
>>> n = 2
>>> groups = text.split('_')
>>> '_'.join(groups[:n]), '_'.join(groups[n:])
('20_231', 'myString_234')

This seems like the most readable way; the alternative is a regex.
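
The split-and-join approach can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the name `split_at_nth` and the choice to raise `ValueError` when the delimiter occurs fewer than `n` times are assumptions made for illustration.

```python
def split_at_nth(text, delim, n):
    """Split text into two parts at the nth occurrence of delim.

    Hypothetical helper; the error handling is an assumption,
    not something the original answer specifies.
    """
    groups = text.split(delim)
    # n delimiters produce at least n + 1 groups
    if n < 1 or len(groups) <= n:
        raise ValueError("delimiter occurs fewer than n times")
    return delim.join(groups[:n]), delim.join(groups[n:])


print(split_at_nth('20_231_myString_234', '_', 2))
# ('20_231', 'myString_234')
```

Without the length check, the bare split/join would silently return an empty second part when there are too few delimiters, which may or may not be what you want.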

Answer 2

Answered by perreal

Using re to build a regex of the form ^((?:[^_]*_){n-1}[^_]*)_(.*) where n is a variable:

import re

n = 2
s = '20_231_myString_234'
m = re.match(r'^((?:[^_]*_){%d}[^_]*)_(.*)' % (n - 1), s)
if m:
    print(m.groups())

Or wrap it in a function:

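import re

def nthofchar(s, c, n):
    c = re.escape(c)  # keep regex metacharacters such as '.' or ']' literal
    regex = r'^((?:[^%s]*%s){%d}[^%s]*)%s(.*)' % (c, c, n - 1, c, c)
    m = re.match(regex, s)
    # Empty tuple when s has fewer than n occurrences of c
    return m.groups() if m else ()

s = '20_231_myString_234'
print(nthofchar(s, '_', 2))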

Or without regexes, using iterative find:

def nth_split(s, delim, n):
    p, c = -1, 0
    while c < n:
        # str.index raises ValueError if delim occurs fewer than n times
        p = s.index(delim, p + 1)
        c += 1
    return s[:p], s[p + len(delim):]

s1, s2 = nth_split('20_231_myString_234', '_', 2)
print(s1, ":", s2)

Answer 3

Answered by Michał Fita

It depends on the pattern behind this split. If, for example, the first two elements are always numbers, you can build a regular expression and use the re module, which can split your string as well.
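
A minimal sketch of that idea, assuming (as in the example above) that the first two fields consist only of digits. The pattern below is an illustration and is not taken from the original answer.

```python
import re

# Assumption: the string starts with two all-digit fields separated by '_'.
pattern = re.compile(r'(\d+_\d+)_(.*)')

m = pattern.fullmatch('20_231_myString_234')
if m:
    print(m.groups())  # ('20_231', 'myString_234')
```

Unlike the generic solutions, this rejects input that does not match the expected shape, which can be useful as a cheap validation step.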

Answer 4

Answered by pypat

I like this solution because it works without any actual regex and can easily be adapted to another "nth" or delimiter.

import re

string = "20_231_myString_234"
occur = 2  # the occurrence at which to split

# Start index of every '_' in the string
indices = [x.start() for x in re.finditer("_", string)]
part1 = string[0:indices[occur-1]]
part2 = string[indices[occur-1]+1:]

print(part1, ' ', part2)

Answer 5

Answered by Nullify

>>> import re
>>> s = '20_231_myString_234'
>>> occurrences = [m.start() for m in re.finditer('_', s)]  # positions of every '_'
>>> occurrences
[2, 6, 15]
>>> result = [s[:occurrences[1]], s[occurrences[1] + 1:]]  # [s[:6], s[7:]]
>>> result
['20_231', 'myString_234']

Answer 6

Answered by AllBlackt

I had a larger string to split at every nth occurrence of a character, and ended up with the following code:

# Split at every 6th space
n = 6
sep = ' '
n_split_groups = []

groups = err_str.split(sep)  # err_str is the string to split
while groups:
    n_split_groups.append(sep.join(groups[:n]))
    groups = groups[n:]

print(n_split_groups)

Thanks @perreal!
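
The same grouping can be sketched as a reusable function with a list comprehension. The name `join_every_n` and the sample input are assumptions made for illustration; the behavior is meant to mirror the loop above.

```python
def join_every_n(text, sep, n):
    """Split text on sep, then rejoin the pieces in groups of n.

    Hypothetical helper mirroring the while loop in this answer.
    """
    parts = text.split(sep)
    # Step through the pieces n at a time; the last group may be shorter
    return [sep.join(parts[i:i + n]) for i in range(0, len(parts), n)]


print(join_every_n('a b c d e f g', ' ', 3))
# ['a b c', 'd e f', 'g']
```

Slicing with a stepped range avoids rebuilding the `groups` list on every iteration, which may matter for very long strings.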

Answer 7

Answered by Yuval

I thought I would contribute my two cents. The second parameter to split() lets you limit the number of splits, so everything after the nth delimiter stays in one piece:

def split_at(s, delim, n):
    r = s.split(delim, n)[n]
    return s[:-len(r)-len(delim)], r

On my machine, the two good answers by @perreal, iterative find and regular expressions, actually measure 1.4 and 1.6 times slower (respectively) than this method.

It's worth noting that it can become even quicker if you don't need the initial bit. Then the code becomes:

def remove_head_parts(s, delim, n):
    return s.split(delim, n)[n]

Not so sure about the naming, I admit, but it does the job. Somewhat surprisingly, it is 2 times faster than iterative find and 3 times faster than regular expressions.

I put up my testing script online. You are welcome to review and comment.
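
For readers who want to repeat the comparison, here is a minimal timing sketch using the standard timeit module. It is not the testing script mentioned above; the input string, the iteration count, and the choice of which two functions to compare are assumptions, and the timings will vary by machine and Python version.

```python
import timeit


def split_at(s, delim, n):
    # split() with maxsplit=n leaves everything after the nth delimiter intact
    r = s.split(delim, n)[n]
    return s[:-len(r) - len(delim)], r


def nth_split(s, delim, n):
    # Iterative find, as in perreal's answer
    p, c = -1, 0
    while c < n:
        p = s.index(delim, p + 1)
        c += 1
    return s[:p], s[p + 1:]


s = '20_231_myString_234'

# Sanity check: both approaches must agree before timing them
assert split_at(s, '_', 2) == nth_split(s, '_', 2)

for func in (split_at, nth_split):
    seconds = timeit.timeit(lambda: func(s, '_', 2), number=100_000)
    print(f'{func.__name__}: {seconds:.3f} s')
```

A short input like this mostly measures call overhead; longer strings and larger n would give a more representative picture.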
