How to remove this \xa0 from a string in Python?

Question

Asked by slopeofhope

I have the following string:

 word = u'Buffalo,\xa0IL\xa060625'

I don't want the "\xa0" in there. How can I get rid of it? The string I want is:

word = 'Buffalo, IL 60625'

Answer 1

Accepted answer by mgilson

If you know for sure that is the only character you don't want, you can .replace it:

>>> word.replace(u'\xa0', ' ')
u'Buffalo, IL 60625'

If you need to handle all non-ascii characters, encoding and replacing bad characters might be a good start...:

>>> word.encode('ascii', 'replace')
'Buffalo,?IL?60625'
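
The session above is from Python 2. As a minimal sketch of the same two approaches, assuming Python 3 (where every str is already Unicode and encode returns a bytes object), something like the following should behave the same way. The names cleaned and ascii_bytes are illustrative and not part of the original answer.

```python
word = 'Buffalo,\xa0IL\xa060625'

# Replace only the no-break space (U+00A0) with a regular space.
cleaned = word.replace('\xa0', ' ')
print(cleaned)  # Buffalo, IL 60625

# Encode to ASCII, substituting '?' for every character that cannot be encoded.
# In Python 3 the result is bytes, so decode it again to get a str.
ascii_bytes = word.encode('ascii', 'replace')
print(ascii_bytes.decode('ascii'))  # Buffalo,?IL?60625
```

The 'replace' error handler keeps the string length intact, which makes it easy to spot where unwanted characters were.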

Answer 2

Answered by khelwood

This seems to work for getting rid of non-ascii characters:

fixedword = word.encode('ascii','ignore')
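
One thing worth seeing in practice: 'ignore' drops the offending characters outright, so the words on either side of each \xa0 end up joined. A small sketch, assuming Python 3, where the encoded result is bytes and is decoded back to str here:

```python
word = 'Buffalo,\xa0IL\xa060625'

# 'ignore' silently drops every character that is not ASCII.
fixedword = word.encode('ascii', 'ignore').decode('ascii')
print(fixedword)  # Buffalo,IL60625
```

If the separating spaces matter, replacing \xa0 with a regular space before encoding is probably the safer order.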

Answer 3

Answered by abarnert

There is no \xa there. If you try to put that into a string literal, you're going to get a syntax error if you're lucky, or it's going to swallow up the next attempted character if you're not, because \x sequences always have to be followed by two hexadecimal digits.

What you have is \xa0, which is an escape sequence for the character U+00A0, aka "NO-BREAK SPACE".

I think you want to replace them with spaces, but whatever you want to do is pretty easy to write:

word.replace(u'\xa0', u' ') # replaced with space
word.replace(u'\xa0', u'0') # closest to what you were literally asking for
word.replace(u'\xa0', u'')  # removed completely
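
To check which character an escape like \xa0 actually stands for, the standard unicodedata module can report its name. The helper below is a hypothetical illustration (non_ascii_names is not from the original answer), written as a sketch assuming Python 3:

```python
import unicodedata


def non_ascii_names(text):
    """Return (code point, Unicode name) pairs for each distinct non-ASCII character."""
    return [
        ('U+%04X' % ord(ch), unicodedata.name(ch, 'UNKNOWN'))
        for ch in sorted(set(text))
        if ord(ch) > 127
    ]


word = 'Buffalo,\xa0IL\xa060625'
print(non_ascii_names(word))  # [('U+00A0', 'NO-BREAK SPACE')]
```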

Answer 4

Answered by Mark Ransom

The most robust way would be to use the unidecode module to convert all non-ASCII characters to their closest ASCII equivalent automatically.

The character \xa0 (not \xa as you stated) is a NO-BREAK SPACE, and the closest ASCII equivalent would of course be a regular space.

import unidecode
word = unidecode.unidecode(word)

Answer 5

Answered by Amir Imani

You can easily use unicodedata to get rid of all of the \x... characters.

>>> from unicodedata import normalize
>>> normalize('NFKD', word)
'Buffalo, IL 60625'
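
This works for \xa0 because NFKD applies compatibility decompositions, and U+00A0 decomposes to a regular space. It is not a general filter for non-ASCII characters, though. A minimal sketch assuming Python 3, with 'caf\xe9' as a made-up second input for contrast:

```python
from unicodedata import normalize

word = 'Buffalo,\xa0IL\xa060625'

# U+00A0 has a compatibility decomposition to U+0020, so NFKD yields a plain space.
normalized = normalize('NFKD', word)
print(normalized)  # Buffalo, IL 60625

# An accented letter is only split into base letter + combining mark (U+0301),
# so the result still contains a non-ASCII character.
accented = normalize('NFKD', 'caf\xe9')
print(len(accented))                        # 5
print(all(ord(c) < 128 for c in accented))  # False
```

To end up with pure ASCII, the normalized string can still be passed through encode('ascii', 'ignore') afterwards, which then drops the leftover combining marks.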
