Python: Use a regular expression to remove whitespace from all lines

Question

Asked by user469652

^(\s+) only removes the whitespace from the first line. How do I remove the leading whitespace from all the lines?

Answer 1

Accepted answer by AndiDog

Python's re module does not default to multi-line ^ matching, so you need to specify that flag explicitly.

import re

r = re.compile(r"^\s+", re.MULTILINE)
r.sub("", "a\n b\n c")  # "a\nb\nc"

# or without compiling (only possible in Python 2.7+ because the flags
# argument didn't exist in earlier versions of re.sub)
re.sub(r"^\s+", "", "a\n b\n c", flags=re.MULTILINE)

# but mind that \s includes newlines, so blank lines are removed as well:
r.sub("", "a\n\n\n\n b\n c")  # "a\nb\nc"

It's also possible to include the flag inline in the pattern:

re.sub(r"(?m)^\s+", "", "a\n b\n c")
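To see why the flag matters, here is a minimal sketch comparing the default behaviour with re.MULTILINE. The sample string and the variable names are illustrative assumptions, not part of the original answer.

```python
import re

text = " a\n b\n c"

# Without re.MULTILINE, ^ matches only at the very start of the string,
# so only the first line loses its leading whitespace.
default_result = re.sub(r"^\s+", "", text)

# With re.MULTILINE, ^ also matches right after every newline.
multiline_result = re.sub(r"^\s+", "", text, flags=re.MULTILINE)

print(repr(default_result))    # 'a\n b\n c'
print(repr(multiline_result))  # 'a\nb\nc'
```

This reproduces the symptom from the question: the unflagged pattern touches only the first line.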

An easier solution is to avoid regular expressions because the original problem is very simple:

content = 'a\n b\n\n c'
stripped_content = ''.join(line.lstrip(' \t') for line in content.splitlines(True))
# stripped_content == 'a\nb\n\nc'
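The explicit ' \t' argument is what keeps blank lines intact here. The following sketch, using illustrative variable names that are not from the original answer, contrasts it with a bare lstrip() call.

```python
content = "a\n b\n\n c"
lines = content.splitlines(True)  # keepends=True preserves each "\n"

# lstrip() with no argument strips all leading whitespace, including "\n",
# so a blank line ("\n") becomes "" and disappears from the result.
all_whitespace = "".join(line.lstrip() for line in lines)

# lstrip(" \t") strips only spaces and tabs, so blank lines survive.
spaces_and_tabs = "".join(line.lstrip(" \t") for line in lines)

print(repr(all_whitespace))   # 'a\nb\nc'
print(repr(spaces_and_tabs))  # 'a\nb\n\nc'
```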

Answer 2

Answer by ghostdog74

You can try strip() if you want to remove both leading and trailing whitespace, or lstrip() if you only want to remove leading whitespace.

>>> s="  string with front spaces and back   "
>>> s.strip()
'string with front spaces and back'
>>> s.lstrip()
'string with front spaces and back   '

with open("file") as f:
    for line in f:
        print(line.lstrip())

If you really want to use regex:

>>> import re
>>> re.sub(r"^\s+", "", s)   # remove the front
'string with front spaces and back   '
>>> re.sub(r"\s+\Z", "", s)  # remove the back
'  string with front spaces and back'
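As a quick check, the two patterns above behave like lstrip() and rstrip() when applied to a single string with default flags. This is a small sketch with assumed variable names, not part of the original answer.

```python
import re

s = "  string with front spaces and back   "

# With default flags, ^ anchors at the start of the string and \Z at its end.
front_removed = re.sub(r"^\s+", "", s)
back_removed = re.sub(r"\s+\Z", "", s)

print(front_removed == s.lstrip())  # True
print(back_removed == s.rstrip())   # True
```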

Answer 3

Answer by Tony Veijalainen

nowhite = ''.join(mytext.split())

No whitespace will remain, as you asked (everything ends up as one word). It is usually more useful to join everything with ' ' or '\n' to keep the words separate.
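The three joining options can be compared side by side. The sample text and variable names below are assumptions made for illustration.

```python
mytext = "  first  line\n\tsecond line \n"

# split() with no arguments splits on any run of whitespace and discards
# empty strings, so leading and trailing whitespace vanish as well.
words = mytext.split()

no_whitespace = "".join(words)
single_spaced = " ".join(words)
one_word_per_line = "\n".join(words)

print(repr(no_whitespace))      # 'firstlinesecondline'
print(repr(single_spaced))      # 'first line second line'
print(repr(one_word_per_line))  # 'first\nline\nsecond\nline'
```

Note that all of these discard the original line structure, which may not be what the question intends.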

Answer 4

Answer by tzot

You'll have to use the re.MULTILINE option:

re.sub(r"(?m)^\s+", "", text)

The "(?m)" part enables multiline.
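A short sketch, with assumed variable names, confirming that the inline "(?m)" form and the explicit re.MULTILINE argument yield equivalent compiled patterns:

```python
import re

inline = re.compile(r"(?m)^\s+")
explicit = re.compile(r"^\s+", re.MULTILINE)

# Both compiled patterns report the MULTILINE flag as set.
print(bool(inline.flags & re.MULTILINE))    # True
print(bool(explicit.flags & re.MULTILINE))  # True

text = "a\n b\n c"
print(inline.sub("", text) == explicit.sub("", text))  # True
```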

Answer 5

Answer by John Machin

@AndiDog acknowledges in his (currently accepted) answer that it munches consecutive newlines.

Here's how to fix that deficiency, which is caused by the fact that \n is both whitespace and a line separator. What we need to do is make an re character class that includes only whitespace characters other than newline.

We want "whitespace and not newline", which can't be expressed directly in an re character class. Let's rewrite that as not not (whitespace and not newline), i.e. not (not whitespace or not not newline) (thanks, Augustus), i.e. not (not whitespace or newline), i.e. [^\S\n] in re notation.

So:

>>> re.sub(r"(?m)^[^\S\n]+", "", "  a\n\n   \n\n b\n c\nd  e")
'a\n\n\n\nb\nc\nd  e'
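For comparison, here is a sketch that runs both character classes on the same input. The variable names are illustrative assumptions. It shows how many line breaks each pattern leaves behind.

```python
import re

text = "  a\n\n   \n\n b\n c\nd  e"

# \s matches "\n", so a match that starts on a blank line runs on through
# the following blank lines and into the next line's indentation.
greedy = re.sub(r"(?m)^\s+", "", text)

# [^\S\n] matches any whitespace character except the newline,
# so each match stays within its own line.
careful = re.sub(r"(?m)^[^\S\n]+", "", text)

print(repr(greedy))   # 'a\nb\nc\nd  e'
print(repr(careful))  # 'a\n\n\n\nb\nc\nd  e'
print(text.count("\n"), greedy.count("\n"), careful.count("\n"))  # 6 3 6
```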

Answer 6

Answer by Tim McNamara

You don't actually need regular expressions for this most of the time. If you are only looking to remove common indentation across multiple lines, try the textwrap module:

>>> import textwrap
>>> messy_text = " grrr\n whitespace\n everywhere"
>>> print(textwrap.dedent(messy_text))
grrr
whitespace
everywhere

Note that if the indentation is irregular, only the common part is removed and the rest is preserved:

>>> very_messy_text = " grrr\n \twhitespace\n everywhere"
>>> print(textwrap.dedent(very_messy_text))
grrr
        whitespace
everywhere
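The difference between removing common indentation and removing all leading whitespace can be sketched as follows. The sample text and variable names are assumptions for illustration.

```python
import textwrap

text = "    def f():\n        return 1\n"

# dedent() removes only the whitespace prefix shared by every line,
# so relative indentation is kept.
dedented = textwrap.dedent(text)

# Stripping each line individually removes all leading spaces and tabs.
flattened = "".join(line.lstrip(" \t") for line in text.splitlines(True))

print(repr(dedented))   # 'def f():\n    return 1\n'
print(repr(flattened))  # 'def f():\nreturn 1\n'
```

Which one is appropriate depends on whether the relative indentation carries meaning, as it does for source code.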
