Source: http://stackoverflow.com/questions/3283984/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Decode Hex String in Python 3
Asked by chimeracoder
In Python 2, converting the hexadecimal form of a string into the corresponding unicode was straightforward:
comments.decode("hex")
where the variable 'comments' is part of a line in a file (the rest of the line does not need to be converted, as it is represented only in ASCII).
Now in Python 3, however, this doesn't work (I assume because of the bytes/string vs. string/unicode switch). I feel like there should be a one-liner in Python 3 to do the same thing, rather than reading the entire line as a series of bytes (which I don't want to do) and then converting each part of the line separately. If it's possible, I'd like to read the entire line as a unicode string (because the rest of the line is in unicode) and only convert this one part from a hexadecimal representation.
Accepted answer by unbeli
Something like:
>>> bytes.fromhex('4a4b4c').decode('utf-8')
'JKL'
Just substitute the encoding your data actually uses in place of 'utf-8'.
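To connect this to the question's use case, here is a minimal sketch of reading a line as an ordinary string and converting only the hex part. The tab-separated layout, the sample line, and the helper name `decode_hex_field` are assumptions made for illustration; they are not taken from the question.

```python
def decode_hex_field(hex_text, encoding="utf-8"):
    """Turn a hex string such as '4a4b4c' into text (hypothetical helper)."""
    return bytes.fromhex(hex_text).decode(encoding)

# Assumed layout: tab-separated fields, with the hex-encoded part in the middle.
line = "42\t4a4b4c\tplain text"
fields = line.split("\t")
fields[1] = decode_hex_field(fields[1])
print(fields)  # ['42', 'JKL', 'plain text']
```

The rest of the line stays a regular str throughout; only the one field passes through bytes.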
Answer by Niklas
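import codecs
# The decoder returns a (bytes, length_consumed) tuple, hence the [0] below.
decode_hex = codecs.getdecoder("hex_codec")
# for a list of hex strings
msgs = [decode_hex(msg)[0] for msg in msgs]
# for a single hex string
string = decode_hex(string)[0]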
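Note that the hex codec produces bytes rather than text, so a further decode step is needed to get a str. A small sketch, assuming the data is UTF-8 (the sample value is made up for illustration):

```python
import codecs

decode_hex = codecs.getdecoder("hex_codec")

# The decoder returns a tuple: the decoded bytes and the amount of input consumed.
raw, consumed = decode_hex("4a4b4c")

# raw is a bytes object; decode it with the real encoding to get text.
text = raw.decode("utf-8")
print(raw, text)  # b'JKL' JKL
```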
Answer by HackerBoss
The answers from @unbeli and @Niklas are good, but @unbeli's answer does not work for all hex strings and it is desirable to do the decoding without importing an extra library (codecs). The following should work (but will not be very efficient for large strings):
>>> result = bytes.fromhex((lambda s: ("%s%s00" * (len(s)//2)) % tuple(s))('4a82fdfeff00')).decode('utf-16-le')
>>> result == '\x4a\x82\xfd\xfe\xff\x00'
True
Basically, it works around having invalid utf-8 bytes by padding with zeros and decoding as utf-16.
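As a side note that is not part of the original answer: padding each byte with a zero byte and decoding as UTF-16-LE maps every byte value to the code point with the same number, which is also what the 'latin-1' codec does. The sketch below checks that equivalence on the same sample input, and shows why a plain UTF-8 decode fails for it.

```python
hex_text = "4a82fdfeff00"

# The zero-padding approach, written out step by step.
padded_hex = "".join(hex_text[i:i + 2] + "00" for i in range(0, len(hex_text), 2))
via_utf16 = bytes.fromhex(padded_hex).decode("utf-16-le")

# latin-1 maps each byte 0x00-0xFF to the code point with the same value.
via_latin1 = bytes.fromhex(hex_text).decode("latin-1")

assert via_utf16 == via_latin1 == "\x4a\x82\xfd\xfe\xff\x00"

# These bytes are not valid UTF-8, so a direct UTF-8 decode raises an error.
try:
    bytes.fromhex(hex_text).decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)  # False
```

Whether one code point per byte is the right interpretation depends on how the hex data was produced in the first place.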
Answer by tripleee
Here's another which is IMHO simpler.
''.join([chr(int(hex, 16)) for hex in sequence])
Example:
>>> ''.join([chr(int(x, 16)) for x in ('68', '00e1', '006c', '00f6', '3077', '1e05a')])
'hálöぷ\U0001e05a'
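One thing worth keeping in mind with this approach is that each hex string is read as a Unicode code point, not as encoded bytes, so it answers a slightly different question than bytes.fromhex(...).decode(...). A short sketch of the difference; the helper name `from_code_points` is made up for illustration:

```python
def from_code_points(hex_values):
    """Join hex-encoded Unicode code points into one string (hypothetical helper)."""
    return "".join(chr(int(value, 16)) for value in hex_values)

greeting = from_code_points(["68", "00e1", "006c", "00f6", "3077"])
print(greeting)  # hálöぷ

# The same character, once as a code point and once as UTF-8 encoded bytes:
from_code_point = chr(int("e1", 16))                    # U+00E1
from_utf8_bytes = bytes.fromhex("c3a1").decode("utf-8")  # bytes C3 A1
assert from_code_point == from_utf8_bytes == "á"
```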

