Python: How to remove those "\x00\x00"

Question

Asked by Luffy Cyliu

How do I remove those "\x00\x00" in a string? I have many such strings (example shown below). I can use re.sub to replace the "\x00" characters, but I am wondering whether there is a better way to do that. Converting between unicode, bytes and string is always confusing.

'Hello\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'.

Answer 1

Answered by warownia1

Use rstrip:

>>> text = 'Hello\x00\x00\x00\x00'
>>> text.rstrip('\x00')
'Hello'

It removes all \x00 characters at the end of the string.
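Since the question mentions confusion between bytes and str, here is a minimal sketch showing that rstrip exists on both types, as long as the argument matches the type of the object. The variable names are illustrative only.

```python
# rstrip works on both str and bytes; the argument must match the type.
text = 'Hello\x00\x00\x00\x00'
raw = b'Hello\x00\x00\x00\x00'

clean_text = text.rstrip('\x00')   # str argument for a str
clean_raw = raw.rstrip(b'\x00')    # bytes argument for bytes

# rstrip only touches the right-hand end: leading and interior NULs stay.
mixed = '\x00Hel\x00lo\x00\x00'.rstrip('\x00')

print(repr(clean_text))
print(repr(clean_raw))
print(repr(mixed))
```

Passing a str argument to bytes.rstrip (or the reverse) raises a TypeError, so it helps to decide early whether you are cleaning the raw bytes or the decoded text.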

Answer 2

Answered by galaxyan

>>> a = 'Hello\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' 
>>> a.replace('\x00','')
'Hello'
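One thing worth noting about replace is that it removes every \x00, not only the trailing ones. A short sketch with made-up sample values shows the difference when a NUL sits in the middle of the data:

```python
# replace removes every NUL character, wherever it appears.
padded = 'Hello\x00\x00\x00'
embedded = 'Hel\x00lo\x00\x00'

print(repr(padded.replace('\x00', '')))
print(repr(embedded.replace('\x00', '')))

# The same method exists on bytes, with bytes arguments.
raw = b'Hel\x00lo\x00\x00'
print(repr(raw.replace(b'\x00', b'')))
```

That is fine when NULs are pure padding, but it silently joins fields together if the interior NULs were meant as separators.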

Answer 3

Answered by anregen

I think the more general solution is to use:

cleanstring = nullterminatedstring.split('\x00',1)[0]

This splits the string once, using \x00 as the delimiter. split(...) then returns a two-element list: everything before the first null and everything after it (the delimiter itself is dropped). Appending [0] returns only the portion of the string before the first null (\x00) character, which I believe is what you're looking for. If the string contains no null at all, split returns a one-element list, so [0] is simply the whole string.

The convention in some languages, specifically C-like ones, is that a single null character marks the end of the string. For example, you should also expect to see strings that look like:

'Hello\x00dpiecesofsomeoldstring\x00\x00\x00'

The answer supplied here will handle that situation as well as the other examples.
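As a minimal sketch of this approach, the split can be wrapped in a small helper. The name c_string is hypothetical, and str.partition is used here as an equivalent alternative to split('\x00', 1)[0]; it is not part of the original answer.

```python
def c_string(value: str) -> str:
    """Return the part of value before the first NUL, like a C string."""
    # partition returns (head, separator, tail); if no NUL is present,
    # head is the whole string and the other two parts are empty.
    head, _sep, _tail = value.partition('\x00')
    return head


sample = 'Hello\x00dpiecesofsomeoldstring\x00\x00\x00'

print(repr(c_string(sample)))
print(repr(sample.split('\x00', 1)[0]))  # same result via split
print(repr(c_string('Hello')))           # no NUL: returned unchanged
```

Both forms discard everything after the first NUL, so they suit C-style buffers where leftover bytes may follow the terminator.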

Answer 4

Answered by sarlacii

Building on the answers supplied, I suggest that strip() is more generic than rstrip() for cleaning up a data packet: strip() removes chars from both the beginning and the end of the supplied string, whereas rstrip() removes chars only from the end.

However, strip() does not treat NUL chars as whitespace by default, so you need to specify them explicitly. This can catch you out, because print() will typically not show the NUL chars visibly. The solution I used was to clean the string using ".strip().strip('\x00')":

>>> arbBytesFromSocket = b'\x00\x00\x00\x00hello\x00\x00\x00\x00'
>>> arbBytesAsString = arbBytesFromSocket.decode('ascii')
>>> print(arbBytesAsString)
hello
>>> str(arbBytesAsString)
'\x00\x00\x00\x00hello\x00\x00\x00\x00'
>>> arbBytesAsString = arbBytesFromSocket.decode('ascii').strip().strip('\x00')
>>> str(arbBytesAsString)
'hello'
>>>

This gives you the required string/byte array without the NUL chars on each end, and it also preserves any NUL chars inside the "data packet", which is useful for received byte data that may contain valid NUL chars (e.g. a C-type structure).
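To make the trade-offs between the approaches on this page concrete, the following sketch runs each one on the same input. The packet value is invented for illustration: it has NUL padding on both ends and one NUL inside the payload.

```python
# Hypothetical packet: NUL padding on both ends, one NUL inside the data.
packet = b'\x00\x00hello\x00world\x00\x00'
text = packet.decode('ascii')

both_ends = text.strip('\x00')         # trims both ends, keeps interior NUL
right_only = text.rstrip('\x00')       # trims only the right-hand end
everywhere = text.replace('\x00', '')  # removes every NUL
first_field = text.split('\x00', 1)[0] # text before the first NUL

# strip also works directly on the bytes, before decoding.
stripped_bytes = packet.strip(b'\x00')

for label, value in [
    ('strip', both_ends),
    ('rstrip', right_only),
    ('replace', everywhere),
    ('split[0]', first_field),
    ('bytes.strip', stripped_bytes),
]:
    print(f'{label:12} {value!r}')
```

Note that split[0] yields an empty string here because the packet starts with a NUL, so the right choice depends on whether the NULs are padding, terminators, or part of the data.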
