Is there a way to split a string by every nth separator in Python?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1621906/

Time: 2020-11-03 22:43:45  Source: igfitidea

python string split

Asked by Gnuffo1

For example, if I had the following string:

"this-is-a-string"

Could I split it by every 2nd "-" rather than every "-" so that it returns two values ("this-is" and "a-string") rather than returning four?

Answered by Gumbo

Here's another solution:

span = 2
words = "this-is-a-string".split("-")
# Rejoin each run of `span` consecutive words with the separator.
print(["-".join(words[i:i+span]) for i in range(0, len(words), span)])
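
The split-and-rejoin approach above can be wrapped in a small reusable function. This is a minimal sketch: the name `split_every_nth` and its parameters are illustrative and not part of the original answer.

```python
def split_every_nth(text, sep, n):
    """Split text at every nth occurrence of sep (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    parts = text.split(sep)
    # Rejoin consecutive groups of n parts; the last group may be shorter.
    return [sep.join(parts[i:i + n]) for i in range(0, len(parts), n)]


print(split_every_nth("this-is-a-string", "-", 2))  # ['this-is', 'a-string']
print(split_every_nth("a-b-c-d-e", "-", 2))         # ['a-b', 'c-d', 'e']
```

Because the slices simply run off the end of the list, a leftover part is kept as a shorter final piece rather than dropped.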

Answered by John La Rooy

>>> s="a-b-c-d-e-f-g-h-i-j-k-l"         # use zip(*[i]*n)
>>> i=iter(s.split('-'))                # for the nth case
>>> list(map("-".join,zip(i,i)))        # list() is needed on Python 3, where map() is lazy
['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l']

>>> i=iter(s.split('-'))
>>> list(map("-".join,zip(*[i]*3)))
['a-b-c', 'd-e-f', 'g-h-i', 'j-k-l']
>>> i=iter(s.split('-'))
>>> list(map("-".join,zip(*[i]*4)))
['a-b-c-d', 'e-f-g-h', 'i-j-k-l']
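
The `zip(*[i]*n)` idiom can look opaque, so here is a small sketch of why it groups items. The variable names are illustrative, and the example assumes Python 3.

```python
s = "a-b-c-d-e-f-g"
n = 3

it = iter(s.split("-"))
# [it] * n is a list holding the same iterator object n times, so zip()
# pulls n consecutive items from it to build each tuple.
groups = list(zip(*[it] * n))
print(groups)  # [('a', 'b', 'c'), ('d', 'e', 'f')]

# zip() stops at the first incomplete group, so the trailing 'g' is dropped.
print(["-".join(g) for g in groups])  # ['a-b-c', 'd-e-f']
```

Note the consequence for inputs whose number of parts is not a multiple of n: the leftover parts are silently discarded.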

In Python 2, itertools.izip is sometimes faster, as you can see in the timings further down. (In Python 3 the built-in zip is already lazy and izip no longer exists.)

>>> from itertools import izip
>>> s="a-b-c-d-e-f-g-h-i-j-k-l"
>>> i=iter(s.split("-"))
>>> ["-".join(x) for x in izip(i,i)]
['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l']

Here is a version that sort of works with an odd number of parts, depending on what output you want in that case. You might prefer to trim the '-' off the end of the last element with .rstrip('-'), for example.

>>> from itertools import izip_longest
>>> s="a-b-c-d-e-f-g-h-i-j-k-l-m"
>>> i=iter(s.split('-'))
>>> map("-".join,izip_longest(i,i,fillvalue=""))
['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l', 'm-']
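
On Python 3 the same idea would use `itertools.zip_longest`, the renamed successor of `izip_longest`. The following is a sketch under that assumption, with the suggested `.rstrip('-')` applied to each joined pair.

```python
from itertools import zip_longest

s = "a-b-c-d-e-f-g-h-i-j-k-l-m"
it = iter(s.split("-"))
# fillvalue="" pads the last pair, which leaves a trailing "-" to strip.
pairs = ["-".join(p).rstrip("-") for p in zip_longest(it, it, fillvalue="")]
print(pairs)  # ['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l', 'm']
```

One caveat: `rstrip` would also remove a trailing separator that came from a genuinely empty field in the input, so this suits data without empty parts.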

Here are some timings:

$ python -m timeit -s 'import re;r=re.compile("[^-]+-[^-]+");s="a-b-c-d-e-f-g-h-i-j-k-l"' 'r.findall(s)'
100000 loops, best of 3: 4.31 usec per loop

$ python -m timeit -s 'from itertools import izip;s="a-b-c-d-e-f-g-h-i-j-k-l"' 'i=iter(s.split("-"));["-".join(x) for x in izip(i,i)]'
100000 loops, best of 3: 5.41 usec per loop

$ python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l"' 'i=iter(s.split("-"));["-".join(x) for x in zip(i,i)]'
100000 loops, best of 3: 7.3 usec per loop

$ python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l"' 't=s.split("-");["-".join(t[i:i+2]) for i in range(0, len(t), 2)]'
100000 loops, best of 3: 7.49 usec per loop

$ python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l"' '["-".join([x,y]) for x,y in zip(s.split("-")[::2], s.split("-")[1::2])]'
100000 loops, best of 3: 9.51 usec per loop
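
The measurements above come from the Python 2 command line. To repeat a comparison like this on a current interpreter, something along these lines could be used. It is only a sketch: the function names and the `number` value are arbitrary choices, and the absolute numbers will differ from those quoted above.

```python
import re
import timeit

s = "a-b-c-d-e-f-g-h-i-j-k-l"
r = re.compile("[^-]+-[^-]+")

def with_regex():
    return r.findall(s)

def with_zip():
    it = iter(s.split("-"))
    return ["-".join(x) for x in zip(it, it)]

def with_slices():
    t = s.split("-")
    return ["-".join(t[i:i + 2]) for i in range(0, len(t), 2)]

# All candidates should agree before their speed is compared.
assert with_regex() == with_zip() == with_slices()

number = 10000  # calls per run; an arbitrary choice for this sketch
for func in (with_regex, with_zip, with_slices):
    # Best of 3 runs, reported per call in microseconds.
    best = min(timeit.repeat(func, number=number, repeat=3))
    print("%-12s %.2f usec per loop" % (func.__name__, best / number * 1e6))
```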

Answered by recursive

Regular expressions handle this easily:

import re
s = "aaaa-aa-bbbb-bb-c-ccccc-d-ddddd"
print(re.findall("[^-]+-[^-]+", s))

Output:

['aaaa-aa', 'bbbb-bb', 'c-ccccc', 'd-ddddd']

Update for Nick D:

n = 3
print(re.findall("-".join(["[^-]+"] * n), s))

Output:

['aaaa-aa-bbbb', 'bb-c-ccccc']
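
Note that the n = 3 output above stops at 'bb-c-ccccc': the pattern only matches complete groups, so the leftover 'd-ddddd' is not returned. If the remainder should be kept, one option is to let the later fields be optional. This is a sketch, not the original answer's code; `regex_split_every_nth` is a hypothetical helper that assumes a single-character separator and no empty fields.

```python
import re

def regex_split_every_nth(text, sep, n):
    """Group text at every nth sep with one regex (hypothetical helper).

    Assumes sep is a single character and fields are non-empty.
    """
    field = "[^%s]+" % re.escape(sep)
    # One field followed by up to n-1 more "sep + field" repetitions; the
    # greedy quantifier takes full groups first and lets the last one be short.
    pattern = "%s(?:%s%s){0,%d}" % (field, re.escape(sep), field, n - 1)
    return re.findall(pattern, text)

s = "aaaa-aa-bbbb-bb-c-ccccc-d-ddddd"
print(regex_split_every_nth(s, "-", 3))
# ['aaaa-aa-bbbb', 'bb-c-ccccc', 'd-ddddd']
```

The group is non-capturing (`(?:...)`) so that `re.findall` returns whole matches rather than the contents of the group.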

Answered by EmFi

EDIT: The original code I posted didn't work. This version does:

I don't think you can split on every other one, but you could split on every - and join every pair.

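chunks = []
content = "this-is-a-string"
split_string = content.split('-')

# Step over the parts two at a time. The range must run to len(split_string),
# otherwise the last part of an odd-length list is never visited.
for i in range(0, len(split_string), 2):
    if i < len(split_string) - 1:
        chunks.append("-".join([split_string[i], split_string[i+1]]))
    else:
        # Unpaired final part
        chunks.append(split_string[i])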

Answered by elzapp

I think several of the already given solutions are good enough, but just for fun, I did this version:

def twosplit(s, sep):
    # Find the first two separators; cut at the second and recurse on the rest.
    first = s.find(sep)
    if first >= 0:
        second = s.find(sep, first + len(sep))
        if second >= 0:
            return [s[0:second]] + twosplit(s[second + len(sep):], sep)
        else:
            return [s]
    else:
        return [s]

print(twosplit("this-is-a-string", "-"))
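
The recursive version makes one call per piece, so a very long input could run into Python's recursion limit. An iterative variant of the same find-based idea, generalized to every nth separator, might look like the sketch below. `iter_split_every_nth` is a hypothetical name, not part of the original answer.

```python
def iter_split_every_nth(s, sep, n=2):
    """Yield pieces of s cut at every nth sep (hypothetical helper)."""
    if not sep or n < 1:
        raise ValueError("sep must be non-empty and n at least 1")
    start = 0   # where the current piece begins
    pos = 0     # where the next search for sep begins
    count = 0   # separators seen in the current piece
    while True:
        idx = s.find(sep, pos)
        if idx < 0:
            yield s[start:]          # whatever is left after the last cut
            return
        count += 1
        pos = idx + len(sep)
        if count == n:
            yield s[start:idx]       # cut here and start a new piece
            start = pos
            count = 0

print(list(iter_split_every_nth("this-is-a-string", "-")))  # ['this-is', 'a-string']
```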

Answered by SpliFF

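l = 'this-is-a-string'.split('-')
nl = []
ss = ""
c = 0
for s in l:
    c += 1
    if c % 2 == 1:
        ss = s                    # first part of a pair
    else:
        ss = "%s-%s" % (ss, s)    # second part: complete the pair
        nl.append(ss)
if c % 2 == 1:
    nl.append(ss)                 # keep an unpaired final part

print(nl)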