String split formatting in Python 3

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/17222355/

Time: 2020-08-19 00:48:40  Source: igfitidea

Tags: python, formatting, newline

Asked by Student J

I'm trying to format the string below so that each row contains five words. However, I keep getting this as the output:

I love cookies yes I do Let s see a dog


First, I am not getting five words per line; instead, everything ends up on one line.

Second, why does "Let's" get split? I thought that splitting the string into words would only split where there is a space in between.

Any suggestions?

import re

string = """I love cookies. yes I do. Let's see a dog."""

# split string on runs of non-word characters
words = re.split(r'\W+', string)

# drop the empty strings left at the edges
words = [i for i in words if i != '']

counter = 0
output = ''
for i in words:
    if counter == 0:
        output += "{0:>15s}".format(i)

    # if counter is a multiple of 5, start a new row
    elif counter % 5 == 0:
        output += '\n'
        output += "{0:>15s}".format(i)

    else:
        output += "{0:>15s}".format(i)

    # Increase the counter by 1
    counter += 1

print(output)

Accepted answer by Fredrik Pihl

To start, don't name a variable "string", since it shadows the standard-library module with the same name.
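A minimal sketch of why this matters, assuming the standard-library string module has been imported earlier in the same script (the question's code does not import it, so there the clash is harmless but still confusing):

```python
import string  # standard-library module of string constants

# At this point the name refers to the module, so its attributes are available.
lowercase_before = string.ascii_lowercase[:5]

# Rebinding the name to a str hides the module for the rest of the scope.
string = "I love cookies."

try:
    string.ascii_lowercase
    module_still_visible = True
except AttributeError:
    # A plain str object has no attribute named ascii_lowercase.
    module_still_visible = False

print(lowercase_before, module_still_visible)
```

Using a different name such as s or text avoids the problem entirely.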

Secondly, use split() to do your word-splitting:

>>> s = """I love cookies. yes I do. Let's see a dog."""
>>> s.split()
['I', 'love', 'cookies.', 'yes', 'I', 'do.', "Let's", 'see', 'a', 'dog.']

From the re module documentation:

\W Matches any character which is not a Unicode word character. This is the opposite of \w. If the ASCII flag is used this becomes the equivalent of [^a-zA-Z0-9_] (but the flag affects the entire regular expression, so in such cases using an explicit [^a-zA-Z0-9_] may be a better choice).


Since the apostrophe (') is not a word character, it matches \W, so the regexp used splits the "Let's" string into two parts:

>>> words = re.split('\W+', s)
>>> words
['I', 'love', 'cookies', 'yes', 'I', 'do', 'Let', 's', 'see', 'a', 'dog', '']
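To see the difference side by side, here is a small sketch. The pattern [\w']+ used with re.findall is an illustration of mine and not part of the original answer; it is one possible way to drop the punctuation while keeping apostrophes inside words:

```python
import re

s = """I love cookies. yes I do. Let's see a dog."""

# str.split() splits on whitespace only, so punctuation stays attached.
by_whitespace = s.split()

# \W+ treats every non-word character as a separator, including the apostrophe.
by_non_word = [w for w in re.split(r'\W+', s) if w]

# Illustrative alternative: match runs of word characters or apostrophes.
keep_apostrophes = re.findall(r"[\w']+", s)

print(by_whitespace)
print(by_non_word)
print(keep_apostrophes)
```

Only the second list breaks "Let's" apart, because it is the only one that uses the apostrophe as a separator.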

This is the output I get using the split() approach above:

$ ./sp3.py 
              I           love       cookies.            yes              I
            do.          Let's            see              a           dog.

The code could probably be simplified to the following, since the counter == 0 branch and the else clause do the same thing. I threw in an enumerate as well to get rid of the counter:

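#!/usr/bin/env python3

s = """I love cookies. yes I do. Let's see a dog."""
words = s.split()

output = ''
for n, i in enumerate(words):
    # after every five words, start a new row (skip n == 0 to avoid a leading blank line)
    if n and n % 5 == 0:
        output += '\n'
    # right-align each word in a 15-character column
    output += "{0:>15s}".format(i)
print(output)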
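Another way to express the same idea, sketched here as an alternative rather than as the answer's own code, is to step through the word list five at a time and join the finished rows with newlines. The helper name format_rows and its parameters are made up for this example:

```python
def format_rows(text, per_row=5, width=15):
    """Right-align words in fixed-width columns, per_row words per line."""
    words = text.split()
    rows = []
    # Walk the list in steps of per_row and format one chunk per row.
    for start in range(0, len(words), per_row):
        chunk = words[start:start + per_row]
        rows.append("".join("{0:>{1}s}".format(w, width) for w in chunk))
    return "\n".join(rows)


s = """I love cookies. yes I do. Let's see a dog."""
print(format_rows(s))
```

Building a list of rows and joining once means no special case is needed for the first word or the last row.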

Answer by Stephan

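# "string" is the text from the question
words = string.split()
while words:
    # print the first five words on one line
    for word in words[:5]:
        print(word, end=" ")
    print()
    # drop the words just printed
    words = words[5:]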

That's the basic concept: split the string using the split() method.

Then use slice notation to get the first 5 words.

Then slice off the first 5 words, and loop again.
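Put together, the three steps might look like the sketch below. The function name rows_of_five and the size parameter are assumptions for illustration; it collects the rows in a list instead of printing inside the loop so that the result is easy to inspect:

```python
def rows_of_five(text, size=5):
    """Split text on whitespace and group the words size at a time."""
    words = text.split()
    rows = []
    while words:  # loop until every word has been consumed
        rows.append(" ".join(words[:size]))  # take the first `size` words
        words = words[size:]  # slice them off and repeat
    return rows


for row in rows_of_five("I love cookies. yes I do. Let's see a dog."):
    print(row)
```

Each pass copies the remaining list when slicing, which is fine for short inputs like this one.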