Original question: http://stackoverflow.com/questions/16071461/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Best way to replace \x00 in python lists?
Asked by user2292661
I have a list of values from a parsed PE file that include \x00 null bytes at the end of each section. I want to remove the \x00 bytes from the strings without removing every "x" from them. I have tried .replace and re.sub, but without much success.
Using Python 2.6.6
Example:
import re
List = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
count = 0
while count < len(List):
    # str() is applied to the inner list, as in the original attempt
    test = re.sub('\\x00', '', str(List[count]))
    print test
    count += 1
>>>test (removes x, but I want to keep it)
>>>data
>>>rsrc
I want to get the following output:
text data rsrc
Any ideas on the best way of going about this?
Accepted answer by jamylak
>>> L = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
>>> [[x[0]] for x in L]
[['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
>>> [[x[0].replace('\x00', '')] for x in L]
[['.text'], ['.data'], ['.rsrc']]
Or to modify the list in place instead of creating a new one:
for x in L:
    x[0] = x[0].replace('\x00', '')
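The same idea can be wrapped in a small helper that also copes with inner lists holding more than one string. This is only a sketch: the name strip_nulls is hypothetical, and it is written with print() so that it runs on Python 3 as well.

```python
def strip_nulls(rows):
    # Build a new list of lists with every NUL character removed from each string.
    return [[s.replace('\x00', '') for s in row] for row in rows]

L = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
cleaned = strip_nulls(L)
print(cleaned)
```

The original list L is left untouched, which can be handy if you still need the padded names later.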
Answer by thkang
from itertools import chain
List = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
new_list = [x.replace("\x00", "") for x in chain(*List)]
#['.text', '.data', '.rsrc']
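If a flat list of names is what you are after, chain.from_iterable does the same flattening without unpacking the outer list into arguments. A minimal sketch, assuming the same sample data (the variable names are illustrative only):

```python
from itertools import chain

sections = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]

# Flatten the one-element inner lists, then drop NUL characters from each string.
flat_names = [s.replace('\x00', '') for s in chain.from_iterable(sections)]
print(flat_names)
```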
Answer by Chris Doggett
Try a Unicode pattern, like this:
re.sub(u'\x00', '', s)
It should give the following results:
import re
l = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
for x in l:
    # iterate over the strings inside each inner list
    for s in x:
        print re.sub(u'\x00', '', s)
.text
.data
.rsrc
Or, using list comprehensions:
[[re.sub(u'\x00', '', s) for s in x] for x in l]
Actually, it should work without the 'u' in front of the string. Just remove the first 3 slashes, and use this as your regex pattern:
'\x00'
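One possible reason the attempt in the question misbehaved (an assumption, since the exact code that was run isn't shown): calling str() on a list returns its repr, in which each NUL is written out as the visible characters backslash, x, 0, 0, so there is no real NUL character left for the pattern to match. A short sketch, written for Python 3, that compares the two cases:

```python
import re

row = ['.text\x00\x00\x00']

# str() of a list is its repr: each NUL becomes the visible text \x00.
as_text = str(row)
has_real_nul = '\x00' in as_text
print(has_real_nul)

# Applying the pattern to the string itself removes the NUL characters.
cleaned = re.sub('\x00', '', row[0])
print(cleaned)
```

So applying the substitution to the string inside the list, rather than to the list converted with str(), is likely the safer route.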
Answer by Luka Rahne
lst = (i[0].rstrip('\x00') for i in List)
for j in lst:
    print j,
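Keep in mind that rstrip only trims NULs from the right-hand end, which fits the padded section names here, whereas replace removes them anywhere in the string. A small sketch of the difference (the value with an embedded NUL is made up for illustration):

```python
padded = '.text\x00\x00\x00'
embedded = 'a\x00b\x00\x00'   # hypothetical value with a NUL in the middle

trimmed_padded = padded.rstrip('\x00')        # trailing padding removed
trimmed_embedded = embedded.rstrip('\x00')    # the inner NUL is kept
replaced_embedded = embedded.replace('\x00', '')  # every NUL removed

print(repr(trimmed_padded))
print(repr(trimmed_embedded))
print(repr(replaced_embedded))
```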
Answer by martineau
What you really want to do is replace '\x00' characters in the strings in a list.
Towards that goal, people often overlook the fact that in Python 2 the non-Unicode string translate() method will also optionally (or only) delete 8-bit characters, as illustrated below. (It doesn't accept this argument in Python 3 because strings are Unicode objects by default.)
Your List data structure seems a little odd, since it's a list of one-element lists, each consisting of just a single string. Regardless, in the code below I've renamed it sections, since capitalized words should only be used for the names of classes according to PEP 8 -- Style Guide for Python Code.
sections = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
for section in sections:
    test = section[0].translate(None, '\x00')
    print test
Output:
.text
.data
.rsrc
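For readers on Python 3, a rough equivalent: bytes.translate still takes a set of bytes to delete, and str.translate accepts a mapping from code points to None. The sketch below assumes Python 3, and its variable names are illustrative only.

```python
raw_sections = [[b'.text\x00\x00\x00'], [b'.data\x00\x00\x00'], [b'.rsrc\x00\x00\x00']]

# bytes: pass None as the table and list the bytes to delete.
names_from_bytes = [row[0].translate(None, b'\x00') for row in raw_sections]

# str: map code point 0 to None to delete it.
text_sections = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
names_from_str = [row[0].translate({0: None}) for row in text_sections]

print(names_from_bytes)
print(names_from_str)
```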
Answer by Atri Basu
I think a better way to take care of this particular problem is to use the following function:
import string
for item in List:
    # In Python 2, filter() on a str returns a str; assign it back to keep the result.
    item[0] = filter(lambda x: x in string.printable, item[0])
This will get rid of not just \x00 but also any other non-printable characters that are appended to your string.
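On Python 3, filter returns an iterator rather than a string, so the kept characters have to be joined back together. A minimal sketch, assuming you only want to keep characters found in string.printable (the helper name keep_printable is hypothetical):

```python
import string

PRINTABLE = set(string.printable)

def keep_printable(text):
    # Keep only the characters that appear in string.printable.
    return ''.join(ch for ch in text if ch in PRINTABLE)

List = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
cleaned = [[keep_printable(s) for s in item] for item in List]
print(cleaned)
```

Be aware that this also drops non-ASCII characters, since string.printable only covers ASCII.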

