Returning the lowest index for the first non-whitespace character in a string in Python

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/2378962/

Date: 2020-11-04 00:30:37  Source: igfitidea

Tags: python, string, string-matching

Asked by Pablo

What's the shortest way to do this in Python?

string = "   xyz"

must return index = 3

Answered by Frank

>>> s = "   xyz"
>>> len(s) - len(s.lstrip())
3

Answered by SilentGhost

>>> next(i for i, j in enumerate('   xyz') if j.strip())
3

or

>>> import string
>>> next(i for i, j in enumerate('   xyz') if j not in string.whitespace)
3

In versions of Python before 2.6, which lack the built-in next(), you'll have to do:

(...).next()
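
One thing to keep in mind with the next() approach: if the string contains no non-whitespace character, the generator is exhausted and next() raises StopIteration. A minimal sketch of one way to handle that, assuming a sentinel value is acceptable (the name first_non_ws and the -1 default are illustrative and not part of the answer):

```python
def first_non_ws(s, default=-1):
    """Index of the first non-whitespace character, or `default` if there is none."""
    # The second argument to next() is returned when the generator is exhausted,
    # so an empty or all-whitespace string no longer raises StopIteration.
    return next((i for i, ch in enumerate(s) if not ch.isspace()), default)


print(first_non_ws("   xyz"))
print(first_non_ws("   "))
```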

Answered by John Machin

Looks like the "regexes can do anything" brigade have taken the day off, so I'll fill in:

>>> tests = [u'foo', u' foo', u'\xA0foo']
>>> import re
>>> for test in tests:
...     print(len(re.match(r"\s*", test, re.UNICODE).group(0)))
...
0
1
1
>>>

FWIW: time taken is O(the_answer), not O(len(input_string))
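
The point about cost can be illustrated with a small sketch: re.match anchors at the start of the string and \s* consumes only the leading run of whitespace, so the scan stops at the first non-whitespace character. The helper name leading_ws_len is an assumption made for illustration; in Python 3, str patterns are Unicode-aware by default, so the re.UNICODE flag is left out here.

```python
import re

# Compiled once; \s* can match the empty string, so match() never returns None.
_LEADING_WS = re.compile(r"\s*")


def leading_ws_len(s):
    """Length of the leading whitespace run, which is also the index asked for."""
    return _LEADING_WS.match(s).end()


for test in ["foo", " foo", "\xa0foo", "   "]:
    print(repr(test), leading_ws_len(test))
```

For an all-whitespace string this returns len(s) rather than raising, which matches the behaviour of the len(s) - len(s.lstrip()) answer.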

Answered by D.Shawley

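import re

def prefix_length(s):
    # Raw string, so that \s reaches the regex engine as written.
    m = re.match(r'\s+', s)
    if m:
        return len(m.group(0))
    # No leading whitespace: the first character is at index 0.
    return 0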

Answered by DevPlayer

Many of the previous solutions iterate at several points, and some make copies of the data (the string). re.match(), strip(), enumerate() and isspace() all duplicate work behind the scenes. The expressions

next(idx for idx, chr in enumerate(string) if not chr.isspace())
next(idx for idx, chr in enumerate(string) if chr not in whitespace)  # requires: from string import whitespace

are good choices for testing strings against various kinds of leading whitespace, such as vertical tabs, but that adds cost too.
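
As a rough illustration of what "various whitespace types" means in practice, the sketch below compares the two tests on a few characters. It assumes Python 3, where str.isspace() follows Unicode while string.whitespace lists only the ASCII whitespace characters; the helper names index_isspace and index_ascii are made up for this example.

```python
import string


def index_isspace(s):
    # Unicode-aware test: also treats e.g. the no-break space as whitespace.
    return next(i for i, ch in enumerate(s) if not ch.isspace())


def index_ascii(s):
    # ASCII-only test: space, tab, newline, carriage return, vertical tab, form feed.
    return next(i for i, ch in enumerate(s) if ch not in string.whitespace)


for ch in [" ", "\t", "\v", "\f", "\xa0"]:
    print(repr(ch), ch.isspace(), ch in string.whitespace)

sample = "\v\xa0xyz"  # vertical tab, no-break space, then text
print(index_isspace(sample), index_ascii(sample))
```

The two tests agree on ASCII whitespace and differ on characters such as the no-break space, so the choice depends on what the input can contain.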

However, if your string uses only space characters or tab characters, then the following more basic solution is clear and fast, and also uses less memory.

def get_indent(astr):
    """Return index of first non-space character of a sequence else False."""

    # Raises TypeError if astr is not iterable.
    iter(astr)

    # OR, to avoid raising exceptions at all:
    # if not hasattr(astr, '__getitem__'): return False

    idx = 0
    while idx < len(astr) and astr[idx] == ' ':
        idx += 1
    # No leading space (this also covers the empty string): return False.
    if idx == 0:
        return False
    return idx

Although this may not be the absolute fastest or visually simplest solution, it has some benefits: you can easily port it to other languages and other versions of Python, and it is likely the easiest to debug, as there is little magic behavior. If you put the body of the function inline in your code instead of in a function, you would remove the function call overhead and make this solution similar in bytecode to the other solutions.
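
Which variant is actually fastest depends on the interpreter and on the input, so it is worth measuring rather than guessing. A minimal timing sketch using the standard timeit module; the function names by_lstrip, by_next and by_loop, the sample string and the repetition count are all arbitrary choices for illustration:

```python
import timeit

s = "   xyz"


def by_lstrip(s):
    return len(s) - len(s.lstrip())


def by_next(s):
    return next(i for i, ch in enumerate(s) if not ch.isspace())


def by_loop(s):
    i = 0
    while i < len(s) and s[i] == ' ':
        i += 1
    return i


for fn in (by_lstrip, by_next, by_loop):
    # Each lambda is timed inside the same loop iteration, so `fn` is the intended function.
    seconds = timeit.timeit(lambda: fn(s), number=10_000)
    print(f"{fn.__name__}: {seconds:.4f}s")
```

No timings are quoted here because they vary from machine to machine.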

Additionally, this solution allows for more variations, such as adding a test for tabs:

or astr[idx] == '\t':
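
Put together, the tab variation could look like the sketch below. The name get_indent_ws is hypothetical, and unlike get_indent this version simply returns 0 when there is no leading whitespace rather than False; that is a simplification chosen for the example.

```python
def get_indent_ws(astr):
    """Return the index of the first character that is neither a space nor a tab."""
    idx = 0
    # Same loop as before, with the extra test for tabs added to the condition.
    while idx < len(astr) and (astr[idx] == ' ' or astr[idx] == '\t'):
        idx += 1
    return idx


print(get_indent_ws("\t  xyz"))
```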

Alternatively, you can test the whole data set for iterability once instead of checking whether each line is iterable. Remember that ""[0] raises an exception whereas ""[0:] does not.
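
The difference between indexing and slicing an empty string can be checked directly. A short sketch (the variable names are illustrative):

```python
empty = ""

# Slicing never raises for out-of-range bounds; it just returns an empty string.
sliced = empty[0:]
print(repr(sliced))

# Indexing, by contrast, raises IndexError when the position does not exist.
try:
    empty[0]
    raised = False
except IndexError:
    raised = True
print(raised)
```

This is why a loop guarded by idx < len(astr) is safe on empty input, while an unguarded astr[0] is not.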

If you want to inline the solution, you could go the non-Pythonic route:

s = "   xyz"
i = 0
while i < len(s) and s[i] == ' ': i += 1

print(i)
3

Answered by Adrien Plisson

>>> string = "   xyz"
>>> next(idx for idx, chr in enumerate(string) if not chr.isspace())
3

Answered by ghostdog74

>>> string = "   xyz"
>>> list(map(str.isspace, string)).index(False)
3