How can I simplify this conversion from underscore to camelcase in Python?

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4303492/

Time: 2020-08-18 15:10:44  Source: igfitidea

How can I simplify this conversion from underscore to camelcase in Python?

python

Asked by Serge Tarkovski

I have written the function below, which converts underscore-separated names to camelCase with the first word in lowercase, i.e. "get_this_value" -> "getThisValue". I also need to preserve leading and trailing underscores, as well as double (triple, etc.) underscores, if any, i.e.

"_get__this_value_" -> "_get_ThisValue_".

The code:

def underscore_to_camelcase(value):
    output = ""
    first_word_passed = False
    for word in value.split("_"):
        if not word:
            output += "_"
            continue
        if first_word_passed:
            output += word.capitalize()
        else:
            output += word.lower()
        first_word_passed = True
    return output

The code above works as expected, but it feels non-Pythonic to me, so I am looking for a way to simplify it, for example with list comprehensions.

Accepted answer by Dave Webb

Your code is fine. The problem I think you're trying to solve is that `if first_word_passed` looks a little bit ugly.

One option for fixing this is a generator. We can easily make it return one thing for the first entry and another for all subsequent entries. As Python has first-class functions, we can get the generator to return the function we want to use to process each word.

We then just need to use the conditional operator so we can handle the blank entries returned by double underscores within a list comprehension.

So if we have a word, we call the generator to get the function to use to set the case, and if we don't, we just use `_`, leaving the generator untouched.

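def underscore_to_camelcase(value):
    def camelcase():
        # First word: lowercase; every later word: capitalized.
        yield str.lower
        while True:
            yield str.capitalize

    c = camelcase()
    # Empty pieces (from leading, trailing, or doubled underscores) become "_"
    # without advancing the generator.
    return "".join(next(c)(x) if x else '_' for x in value.split("_"))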
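
The same "one thing for the first entry, another for the rest" idea can also be sketched without a hand-written generator. The version below is only an illustration and not part of the original answer: it assumes itertools.chain and itertools.repeat to build the sequence of case functions, and the name underscore_to_camelcase_chain is made up for this example.

```python
from itertools import chain, repeat


def underscore_to_camelcase_chain(value):
    # Hypothetical variant: str.lower for the first word,
    # str.capitalize for every later one.
    case_functions = chain([str.lower], repeat(str.capitalize))
    # Empty pieces come from leading, trailing, or doubled underscores and
    # become "_" without consuming a case function.
    return "".join(
        next(case_functions)(piece) if piece else "_"
        for piece in value.split("_")
    )


print(underscore_to_camelcase_chain("get_this_value"))     # getThisValue
print(underscore_to_camelcase_chain("_get__this_value_"))  # _get_ThisValue_
```

Whether this reads better than the explicit generator is a matter of taste; both rely on next() being called only for non-empty pieces.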

Answer by Gareth Rees

I think the code is fine. You've got a fairly complex specification, so if you insist on squashing it into the Procrustean bed of a list comprehension, then you're likely to harm the clarity of the code.

The only changes I'd make would be:

  1. To use the `join` method to build the result in O(n) space and time, rather than repeated applications of `+=`, which is O(n^2).
  2. To add a docstring.
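
The first point can be sketched in isolation. The two helpers below are hypothetical names introduced only for this comparison; they produce the same string, but the `+=` version may copy the accumulated result on every step in the worst case, while `join` assembles it in one pass.

```python
def build_with_concat(pieces):
    # Repeated += can copy the accumulated string on each iteration,
    # which is quadratic in the worst case.
    output = ""
    for piece in pieces:
        output += piece
    return output


def build_with_join(pieces):
    # str.join builds the result in a single pass over the pieces.
    return "".join(pieces)


pieces = ["get", "This", "Value"]
assert build_with_concat(pieces) == build_with_join(pieces) == "getThisValue"
```

For names as short as identifiers the difference is unlikely to be measurable; the join form is mainly the more conventional idiom.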

Like this:

def underscore_to_camelcase(s):
    """Take the underscore-separated string s and return a camelCase
    equivalent.  Initial and final underscores are preserved, and medial
    pairs of underscores are turned into a single underscore."""
    def camelcase_words(words):
        first_word_passed = False
        for word in words:
            if not word:
                yield "_"
                continue
            if first_word_passed:
                yield word.capitalize()
            else:
                yield word.lower()
            first_word_passed = True
    return ''.join(camelcase_words(s.split('_')))

Depending on the application, another change I would consider making would be to memoize the function. I presume you're automatically translating source code in some way, and you expect the same names to occur many times. So you might as well store the conversion instead of re-computing it each time. An easy way to do that would be to use the `@memoized` decorator from the Python decorator library.
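
A minimal sketch of the memoization idea, assuming the standard library's functools.lru_cache in place of the decorator named in the answer (a swapped-in technique, chosen here only because it ships with Python 3). The unbounded cache size is also an assumption.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # assumption: an unbounded cache is acceptable
def underscore_to_camelcase(s):
    """Same conversion as above, cached per distinct input string."""
    def camelcase_words(words):
        first_word_passed = False
        for word in words:
            if not word:
                yield "_"
                continue
            yield word.capitalize() if first_word_passed else word.lower()
            first_word_passed = True
    return "".join(camelcase_words(s.split("_")))


underscore_to_camelcase("get_this_value")  # computed on the first call
underscore_to_camelcase("get_this_value")  # served from the cache
print(underscore_to_camelcase.cache_info().hits)  # 1
```

This only pays off if the same names really do recur often; for a one-off conversion the cache adds nothing.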

Answer by Pär Wieslander

I agree with Gareth that the code is ok. However, if you really want a shorter, yet readable approach you could try something like this:

def underscore_to_camelcase(value):
    # Make a list of capitalized words and underscores to be preserved
    capitalized_words = [w.capitalize() if w else '_' for w in value.split('_')]

    # Convert the first word to lowercase
    for i, word in enumerate(capitalized_words):
        if word != '_':
            capitalized_words[i] = word.lower()
            break

    # Join all words to a single string and return it
    return "".join(capitalized_words)

Answer by vonPetrushev

This is the most compact way to do it:

def underscore_to_camelcase(value):
    words = [word.capitalize() for word in value.split('_')]
    words[0]=words[0].lower()
    return "".join(words)

Answer by Jocelyn delalande

For regexp's sake!

import re

def underscore_to_camelcase(value):
    def rep(m):
        if m.group(1) is not None:
            return m.group(2) + m.group(3).lower() + '_'
        else:
            return m.group(3).capitalize()

    ret, nb_repl = re.subn(r'(^)?(_*)([a-zA-Z]+)', rep, value)
    return ret if (nb_repl > 1) else ret[:-1]

Answer by unutbu

The problem calls for a function that returns a lowercase word the first time, but capitalized words afterwards. You can do that with an `if` clause, but then the `if` clause has to be evaluated for every word. An appealing alternative is to use a generator. It can return one thing on the first call, and something else on successive calls, and it does not require as many `if`s.

def lower_camelcase(seq):
    it=iter(seq)
    for word in it:
        yield word.lower()
        if word.isalnum(): break
    for word in it:
        yield word.capitalize()

def underscore_to_camelcase(text):
    return ''.join(lower_camelcase(word if word else '_' for word in text.split('_')))

Here is some test code to show that it works:

tests=[('get__this_value','get_ThisValue'),
       ('_get__this_value','_get_ThisValue'),
       ('_get__this_value_','_get_ThisValue_'),
       ('get_this_value','getThisValue'),        
       ('get__this__value','get_This_Value'),        
       ]
for test,answer in tests:
    result=underscore_to_camelcase(test)
    try:
        assert result==answer
    except AssertionError:
        print('{r!r} != {a!r}'.format(r=result,a=answer))

Answer by Oben Sonne

Another regexp solution:

import re

def conv(s):
    """Convert underscore-separated strings to camelCase equivalents.

    >>> conv('get')
    'get'
    >>> conv('_get')
    '_get'
    >>> conv('get_this_value')
    'getThisValue'
    >>> conv('__get__this_value_')
    '_get_ThisValue_'
    >>> conv('_get__this_value__')
    '_get_ThisValue_'
    >>> conv('___get_this_value')
    '_getThisValue'

    """
    # convert case:
    s = re.sub(r'(_*[A-Z])', lambda m: m.group(1).lower(), s.title(), count=1)
    # remove/normalize underscores:
    s = re.sub(r'__+|^_+|_+$', '|', s).replace('_', '').replace('|', '_')
    return s

if __name__ == "__main__":
    import doctest
    doctest.testmod()

It works for your examples, but it might fail for names containing digits - it depends on how you would capitalize them.
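
To make the digit caveat concrete: conv relies on str.title(), which starts a new word after any non-letter, digits included. The snippet below is only an illustration; the sample name get_2nd_value is made up.

```python
name = "get_2nd_value"

# str.title() treats the digit as a word boundary and capitalizes
# the letter that follows it.
print(name.title())  # Get_2Nd_Value

# str.capitalize() applied to each piece leaves "2nd" unchanged.
print("".join(w.capitalize() for w in name.split("_")))  # Get2ndValue
```

Which of the two results is "right" depends on the naming convention you are targeting.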

Answer by kevpie

Here is a list comprehension style generator expression.

from itertools import count
def underscore_to_camelcase(value):
    words = value.split('_')
    counter = count()
    return ''.join('_' if w == '' else w.capitalize() if next(counter) else w for w in words)

Answer by Hugh Bothwell

A slightly modified version:

import re

def underscore_to_camelcase(value):
    first = True
    res = []

    for u,w in re.findall('([_]*)([^_]*)',value):
        if first:
            res.append(u+w)
            first = False
        elif len(w)==0:    # trailing underscores
            res.append(u)
        else:   # trim an underscore and capitalize
            res.append(u[:-1] + w.title())

    return ''.join(res)

Answer by Siegfried Gevatter

This one works except for leaving the first word as lowercase.

def convert(word):
    return ''.join(x.capitalize() or '_' for x in word.split('_'))

(I know this isn't exactly what you asked for, and this thread is quite old, but since it's quite prominent when searching for such conversions on Google I thought I'd add my solution in case it helps anyone else).
