python re - split a string before a character

Question

Asked by kakarukeys

How do I split a string at the positions before a character?

split a string before 'a'
input: "fffagggahhh"
output: ["fff", "aggg", "ahhh"]

The obvious way doesn't work:

>>> import re
>>> h = re.compile("(?=a)")
>>> h.split("fffagggahhh")
['fffagggahhh']
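
A note for readers on newer interpreters: the result above is what older Python versions print. Since Python 3.7, re.split also splits on patterns that can match an empty string, so the lookahead approach does work there. A minimal sketch, assuming Python 3.7 or later (the variable names are only illustrative):

```python
import re

# Assumes Python 3.7+, where re.split accepts zero-width matches.
pieces = re.split("(?=a)", "fffagggahhh")
print(pieces)  # ['fff', 'aggg', 'ahhh']

# A leading 'a' gives an empty first piece, because the split point is at index 0.
leading = re.split("(?=a)", "afff")
print(leading)  # ['', 'afff']
```

On versions before 3.7 the answers below are still needed.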

Answer 1

Answered by adamk

>>> r=re.compile("(a?[^a]+)")
>>> r.findall("fffagggahhh")
['fff', 'aggg', 'ahhh']

EDIT:

This won't correctly handle a double a, as in the following string (one of the two a's is dropped):

>>> r.findall("fffagggaahhh")
['fff', 'aggg', 'ahhh']

KennyTM's regex seems better suited.
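
One way to see the difference is a round-trip check: a correct split-before result should join back into the original string. A small sketch, where loses_characters is a hypothetical helper name and the two patterns are the one from this answer and the one from KennyTM's answer:

```python
import re

def loses_characters(pattern, text):
    """Return True if the pieces found by `pattern` do not rebuild `text`."""
    return "".join(re.findall(pattern, text)) != text

text = "fffagggaahhh"
print(loses_characters("(a?[^a]+)", text))     # True: one 'a' is skipped
print(loses_characters("(?:a|^)[^a]*", text))  # False: nothing is lost
```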

Answer 2

Answered by pyfunc

OK, this is not exactly the solution you asked for, but I thought it would be a useful addition to the problem here.

Solution without re:

>>> x = "fffagggahhh"
>>> k = x.split('a')
>>> j = [k[0]] + ['a' + part for part in k[1:]]
>>> j
['fff', 'aggg', 'ahhh']

Answer 3

Answered by Igor Serebryany

split() takes the character to split on as an argument:

>>> "fffagggahhh".split('a')
['fff', 'ggg', 'hhh']

Answer 4

Answered by kennytm

>>> rx = re.compile("(?:a|^)[^a]*")
>>> rx.findall("fffagggahhh")
['fff', 'aggg', 'ahhh']
>>> rx.findall("aaa")
['a', 'a', 'a']
>>> rx.findall("fgh")
['fgh']
>>> rx.findall("")
['']

Answer 5

Answered by Amber

>>> foo = "abbcaaaabbbbcaaab"
>>> bar = foo.split("c")
>>> baz = [bar[0]] + ["c"+x for x in bar[1:]]
>>> baz
['abb', 'caaaabbbb', 'caaab']

Due to how slicing works, this will work properly even if there are no occurrences of c in foo: bar[1:] is then simply an empty list.
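
The same idea can be wrapped in a small function. This is only a sketch: split_before_char is a hypothetical name, and it assumes char is a non-empty string (str.split raises ValueError for an empty separator).

```python
def split_before_char(text, char):
    """Split `text` before each occurrence of `char`, keeping `char` on each piece."""
    parts = text.split(char)
    # parts[1:] is an empty list when `char` never occurs, so the
    # comprehension adds nothing and the whole string comes back unchanged.
    return [parts[0]] + [char + piece for piece in parts[1:]]

print(split_before_char("abbcaaaabbbbcaaab", "c"))  # ['abb', 'caaaabbbb', 'caaab']
print(split_before_char("abb", "c"))                # ['abb']
print(split_before_char("cab", "c"))                # ['', 'cab']
```

As the last call shows, a string that starts with the character produces an empty first piece; filter it out if that is not wanted.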

Answer 6

Answered by Terrel Shumway

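import re

def split_before(pattern, text):
    """Yield pieces of text, cutting just before each match of pattern."""
    prev = 0
    for m in re.finditer(pattern, text):
        # Emit everything up to (but not including) this match.
        yield text[prev:m.start()]
        prev = m.start()
    # Emit the remainder, which starts at the last match.
    yield text[prev:]


if __name__ == '__main__':
    print(list(split_before("a", "fffagggahhh")))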

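re.split treats the pattern as a delimiter, so the matched text does not stay attached to the pieces; split_before keeps each match at the start of the piece that follows it.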

>>> print(list(split_before("a", "afffagggahhhaab")))
['', 'afff', 'aggg', 'ahhh', 'a', 'ab']
>>> print(list(split_before("a", "ffaabcaaa")))
['ff', 'a', 'abc', 'a', 'a', 'a']
>>> print(list(split_before("a", "aaaaa")))
['', 'a', 'a', 'a', 'a', 'a']
>>> print(list(split_before("a", "bbbb")))
['bbbb']
>>> print(list(split_before("a", "")))
['']

Answer 7

Answered by John Machin

This one works on repeated a's:

  >>> re.findall("a[^a]*|^[^a]*", "aaaaa")
  ['a', 'a', 'a', 'a', 'a']
  >>> re.findall("a[^a]*|[^a]+", "ffaabcaaa")
  ['ff', 'a', 'abc', 'a', 'a', 'a']

Approach: the main chunks that you are looking for are an a followed by zero or more not-a characters. That covers all possibilities except for a run of zero or more not-a characters with no a in front of it, which can happen only at the start of the input string.
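
The approach can be sketched for an arbitrary character as follows. split_before_repeated is a hypothetical name, and the sketch assumes char is exactly one character, since it is placed inside a character class.

```python
import re

def split_before_repeated(text, char="a"):
    """Split `text` before every `char`, including runs of repeated `char`."""
    c = re.escape(char)  # assumption: `char` is a single character
    # First alternative: `char` followed by zero or more non-`char` characters.
    # Second alternative: a run of non-`char` characters at the very start.
    pattern = f"{c}[^{c}]*|^[^{c}]*"
    return re.findall(pattern, text)

print(split_before_repeated("aaaaa"))        # ['a', 'a', 'a', 'a', 'a']
print(split_before_repeated("ffaabcaaa"))    # ['ff', 'a', 'abc', 'a', 'a', 'a']
print(split_before_repeated("fffagggahhh"))  # ['fff', 'aggg', 'ahhh']
```

Unlike the split-based versions, this one does not produce an empty first piece when the string starts with the character.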
