python: A pythonic way to insert a space before capital letters

Question

Asked by Electrons_Ahoy

I've got a file whose format I'm altering via a python script. I have several camel cased strings in this file where I just want to insert a single space before the capital letter - so "WordWordWord" becomes "Word Word Word".

My limited regex experience just stalled out on me - can someone think of a decent regex to do this, or (better yet) is there a more pythonic way to do this that I'm missing?

Answer 1

Answered by Greg Hewgill

You could try:

>>> import re
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'

Answer 2

Answered by Greg Hewgill

If there are consecutive capitals, then Greg's result may not be what you are looking for, since the \w consumes the character in front of the capital letter to be replaced.

>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWWWWWWWord")
'Word Word WW WW WW Word'

A look-behind would solve this:

>>> re.sub(r"(?<=\w)([A-Z])", r" \1", "WordWordWWWWWWWord")
'Word Word W W W W W W Word'
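Since the look-behind is zero-width, the capital itself does not have to be captured either. As a minimal sketch (the name `space_before_capitals` is made up for illustration and is not from any of the answers), the look-behind can be paired with a look-ahead so that the match is the empty gap between two characters and the replacement is just a space:

```python
import re

# Zero-width match: preceded by a word character, followed by an ASCII capital.
_GAP = re.compile(r"(?<=\w)(?=[A-Z])")

def space_before_capitals(text):
    """Insert one space at every gap matched by _GAP (hypothetical helper)."""
    return _GAP.sub(" ", text)

print(space_before_capitals("WordWordWord"))
print(space_before_capitals("WordWordWWWWWWWord"))
```

Nothing is consumed by the match, so runs of capitals are split one letter at a time, the same as with the look-behind version above.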

Answer 3

Answered by tzot

Perhaps shorter:

>>> re.sub(r"\B([A-Z])", r" \1", "DoIThinkThisIsABetterAnswer?")
'Do I Think This Is A Better Answer?'
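To see what \B is doing here: it matches only where there is no word boundary, so a capital gets a space only when the character right before it is a word character. A small sketch of that behaviour (the function name `split_on_nonboundary` and the sample strings are my own, chosen just for illustration):

```python
import re

def split_on_nonboundary(text):
    # \B matches only where there is NO word boundary, i.e. the capital
    # must be directly preceded by another word character.
    return re.sub(r"\B([A-Z])", r" \1", text)

print(split_on_nonboundary("RightNow!Cool"))   # the capital after "!" is left alone
print(split_on_nonboundary("Already Spaced"))  # existing spaces are not doubled
```

So a capital at the start of the string, after a space, or after punctuation is not touched, which may or may not be what you want for input like "Now!Cool".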

Answer 4

Answered by Markus Jarderot

Have a look at my answer on .NET - How can you split a “caps” delimited string into an array?

Edit: Maybe better to include it here.

re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)

For example:

"SimpleHTTPServer" => ["Simple", "HTTP", "Server"]

Answer 5

Answered by Yaroslav Surzhikov

Maybe you would be interested in a one-liner implementation without using regexp:

''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()

Answer 6

Answered by Dan Lenski

With regexes you can do this:

re.sub('([A-Z])', r' \1', text)

Of course, that will only work for ASCII characters; if you want to do Unicode it's a whole new can of worms :-)
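For what it's worth, here is one way the Unicode case could be approached without a regex. This is only a sketch under my own assumptions (Python 3 `str` input, and "capital" meaning whatever `str.isupper()` reports); the function name `insert_spaces_unicode` is hypothetical:

```python
def insert_spaces_unicode(text):
    """Put a space before each capital, using Unicode case rules (sketch)."""
    pieces = []
    for index, char in enumerate(text):
        # str.isupper() follows Unicode case properties, so "É" counts as
        # a capital even though it is outside the ASCII range [A-Z].
        if index > 0 and char.isupper() and not text[index - 1].isspace():
            pieces.append(" ")
        pieces.append(char)
    return "".join(pieces)

print(insert_spaces_unicode("ÉcoleÉlémentaire"))
```

Like the regex above, this splits acronyms letter by letter, but it does not add a leading space and it leaves existing spaces alone.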

Answer 7

Answered by David Underhill

If you have acronyms, you probably do not want spaces inside them. This two-stage regex will keep acronyms intact (it also inserts a space after punctuation and other non-uppercase characters when a capital letter follows):

import re
re_outer = re.compile(r'([^A-Z ])([A-Z])')
re_inner = re.compile(r'(?<!^)([A-Z])([^A-Z])')
re_outer.sub(r'\1 \2', re_inner.sub(r' \1\2', 'DaveIsAFKRightNow!Cool'))

The output will be: 'Dave Is AFK Right Now! Cool'
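To make the two stages easier to follow, here they are wrapped in a function with a comment per stage. The patterns are the same as above; the function name `split_keep_acronyms` and the upper-case constant names are just my own choice for this sketch:

```python
import re

# Same two patterns as above, only wrapped in a function for reuse.
RE_OUTER = re.compile(r'([^A-Z ])([A-Z])')
RE_INNER = re.compile(r'(?<!^)([A-Z])([^A-Z])')

def split_keep_acronyms(text):
    # Stage 1: put a space before a capital that starts a new word
    # (a capital followed by a non-capital), except at the very start.
    spaced = RE_INNER.sub(r' \1\2', text)
    # Stage 2: put a space between a non-capital, non-space character
    # and the capital that follows it.
    return RE_OUTER.sub(r'\1 \2', spaced)

print(split_keep_acronyms('DaveIsAFKRightNow!Cool'))
print(split_keep_acronyms('SimpleHTTPServer'))
```

Stage 1 is what separates the end of an acronym from the next word ("AFK" from "Right"), and stage 2 is what separates a lowercase word or punctuation from the capital after it.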

Answer 8

Answered by monkut

I agree that the regex solution is the easiest, but I wouldn't say it's the most pythonic.

How about:

text = 'WordWordWord'
new_text = ''

for i, letter in enumerate(text):
    if i and letter.isupper():
        new_text += ' '

    new_text += letter

Answer 9

Answered by Brian

I think regexes are the way to go here, but just to give a pure python version without (hopefully) any of the problems ΤΖΩΤΖΙΟΥ has pointed out:

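def splitCaps(s):
    result = []
    # Pad with a space so the last character also gets a successor.
    for ch, nxt in window(s+" ", 2):
        result.append(ch)
        # Add a space before a capital unless one is already there.
        if nxt.isupper() and not ch.isspace():
            result.append(' ')
    return ''.join(result)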

window() is a utility function I use to operate on a sliding window of items, defined as:

import collections, itertools

def window(it, winsize, step=1):
    it = iter(it)  # Ensure we have an iterator
    l = collections.deque(itertools.islice(it, winsize))
    while True:
        yield tuple(l)
        for i in range(step):
            try:
                l.append(next(it))  # Python 3: next(it) replaces it.next()
            except StopIteration:
                return  # Input exhausted; end the generator cleanly (PEP 479)
            l.popleft()
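If you don't want to carry the window() helper around, the same pairing of each character with its successor can be had from zip(). This is only a sketch of that alternative, not a replacement for the general utility; the name `split_caps_pairwise` is made up here:

```python
def split_caps_pairwise(s):
    """Same idea as splitCaps, using zip() instead of a sliding window."""
    result = []
    # Pair every character with its successor; the trailing space in the
    # second argument gives the last character a partner.
    for ch, nxt in zip(s, s[1:] + " "):
        result.append(ch)
        # Add a space before a capital unless one is already there.
        if nxt.isupper() and not ch.isspace():
            result.append(" ")
    return "".join(result)

print(split_caps_pairwise("WordWordWord"))
```

This only covers the window-of-two case on strings, which is all this particular problem needs.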
