Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/35830924/

Date: 2020-08-19 17:02:14  Source: igfitidea

How to remove \n and \r from a string

Tags: python, html, python-3.x, file-writing

Asked by HittmanA

I am currently trying to get the code from this website: http://netherkingdom.netai.net/pycake.html and then have a Python script parse out all the code inside the HTML div tags and, finally, write the text between the div tags to a file. The problem is that it adds a bunch of \r and \n to the file. How can I either avoid this or remove the \r and \n? Here is my code:

import urllib.request
from html.parser import HTMLParser
import re
page = urllib.request.urlopen('http://netherkingdom.netai.net/pycake.html')
t = page.read()
class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        print(data)
        f = open('/Users/austinhitt/Desktop/Test.py', 'r')
        t = f.read()
        f = open('/Users/austinhitt/Desktop/Test.py', 'w')
        f.write(t + '\n' + data)
        f.close()
parser = MyHTMLParser()
t = t.decode()
parser.feed(t)

And here is the resulting file it makes:

b'
import time as t\r\n
from os import path\r\n
import os\r\n
\r\n
\r\n
\r\n
\r\n
\r\n'

Ideally, I would also like to remove the leading b' and the trailing '. I am using Python 3.5.1 on a Mac.

Answered by cdarke

A simple solution is to strip trailing whitespace:

with open('gash.txt', 'r') as var:
    for line in var:
        line = line.rstrip()
        print(line)

The advantage of rstrip() over a [:-2] slice is that it is also safe for UNIX-style files, whose lines end in a single \n rather than \r\n.
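
To illustrate the difference, here is a minimal sketch comparing the two approaches on a Windows-style and a UNIX-style line. The sample strings and the helper name strip_eol are made up for illustration, and rstrip('\r\n') is used instead of a bare rstrip() so that other trailing whitespace is left alone.

```python
def strip_eol(line):
    """Remove any trailing carriage returns and newlines (hypothetical helper)."""
    return line.rstrip('\r\n')


windows_line = 'import os\r\n'  # made-up sample with a Windows line ending
unix_line = 'import os\n'       # made-up sample with a UNIX line ending

# rstrip removes only the line-ending characters that are actually present.
print(repr(strip_eol(windows_line)))  # 'import os'
print(repr(strip_eol(unix_line)))     # 'import os'

# The slice always drops two characters, so it eats the 's' of the UNIX line.
print(repr(windows_line[:-2]))        # 'import os'
print(repr(unix_line[:-2]))           # 'import o'
```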

However, if you only want to get rid of \r, and the characters might not be at the end of a line, then str.replace() is your friend:

line = line.replace('\r', '')
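
As a rough sketch of how this plays out on a multi-line string, the snippet below shows two variants: dropping only the \r characters, and dropping both \r and \n everywhere with str.translate(). The sample text is invented, and the second variant is an alternative suggested here, not part of the answer above.

```python
# Made-up sample resembling the file contents in the question.
text = 'import time as t\r\nfrom os import path\r\n\r\n'

# Variant 1: drop only carriage returns, keeping UNIX-style newlines.
unix_text = text.replace('\r', '')

# Variant 2: drop both characters everywhere using a translation table.
# The third argument of str.maketrans lists the characters to delete.
table = str.maketrans('', '', '\r\n')
flat_text = text.translate(table)

print(repr(unix_text))  # 'import time as t\nfrom os import path\n\n'
print(repr(flat_text))  # 'import time as tfrom os import path'
```

Note that the second variant joins the lines together, so it is only appropriate when the line structure does not matter.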

If you have a bytes object (that is what the leading b' indicates), then you can convert it to a native Python 3 string using:

line = line.decode()
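
A short sketch of the difference between str() and decode() on bytes may help here. The sample bytes are invented, UTF-8 is an assumed encoding, and the idea that the b'...' wrapper in the question's file came from a str() call is only a guess, since the code shown does call decode().

```python
# Made-up sample bytes, not the real page contents.
raw = b'import time as t\r\nfrom os import path\r\n'

# str() on bytes gives the repr: the b'...' wrapper plus literal backslash escapes.
as_repr = str(raw)

# decode() gives real text; UTF-8 is an assumption about the page's encoding.
as_text = raw.decode('utf-8')

# splitlines() splits on \r\n, \n and \r (among other line boundaries),
# so rejoining with '\n' normalises the line endings.
clean = '\n'.join(as_text.splitlines())

print(as_repr)
print(repr(clean))  # 'import time as t\nfrom os import path'
```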

Answered by will.fiset

One simple solution is just to strip off the last two characters of each line:

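# newline='' keeps \r\n intact; by default Python 3 text mode converts it to \n
with open('yourfile', newline='') as f:
    for line in f:
        line = line[:-2]  # Removes the last two characters (assumes every line ends in \r\n)
        print(repr(line))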
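
One caveat worth checking: in Python 3, open() in text mode translates \r\n to \n by default, so the two-character slice only removes \r\n when that translation is switched off with newline=''. The sketch below demonstrates this with a temporary file; the file name and contents are made up for illustration.

```python
import os
import tempfile

# Write a small Windows-style file in binary mode so the \r\n bytes are kept as-is.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')  # hypothetical file
with open(path, 'wb') as f:
    f.write(b'import os\r\nimport re\r\n')

# Default text mode: universal newlines turn \r\n into \n while reading.
with open(path, 'r') as f:
    translated = f.readlines()

# newline='' disables the translation, so \r\n reaches the program untouched.
with open(path, 'r', newline='') as f:
    untranslated = f.readlines()

print(translated)    # ['import os\n', 'import re\n']
print(untranslated)  # ['import os\r\n', 'import re\r\n']
```

With the default mode, line[:-2] would therefore remove the \n plus one real character from each line.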