Decoding a hex string in Python 3

Question

Asked by chimeracoder

In Python 2, converting the hexadecimal form of a string into the corresponding unicode was straightforward:

comments.decode("hex")

where the variable 'comments' is part of a line in a file (the rest of the line does not need to be converted, as it is represented only in ASCII).

In Python 3, however, this doesn't work (I assume because of the bytes/string vs. string/unicode switch). I feel like there should be a one-liner in Python 3 to do the same thing, rather than reading the entire line as a series of bytes (which I don't want to do) and then converting each part of the line separately. If possible, I'd like to read the entire line as a unicode string (because the rest of the line is in unicode) and convert only this one part from its hexadecimal representation.

Answer 1

Accepted answer by unbeli

Something like:

>>> bytes.fromhex('4a4b4c').decode('utf-8')
'JKL'

Just put the actual encoding you are using.
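
To connect this back to the question, here is a minimal sketch of decoding only one field of a line that was read as text. The tab-separated layout, the sample line, and the helper name `decode_hex_field` are assumptions made for illustration; they are not part of the original question.

```python
def decode_hex_field(line, index, sep="\t", encoding="utf-8"):
    """Split a text line and hex-decode only the field at `index`.

    Hypothetical helper: the separator and field position are assumptions.
    """
    fields = line.rstrip("\n").split(sep)
    # bytes.fromhex turns the hex digits into raw bytes;
    # .decode then interprets those bytes in the given encoding.
    fields[index] = bytes.fromhex(fields[index]).decode(encoding)
    return fields


# Sample line: two plain ASCII fields followed by a hex-encoded field.
line = "42\tchimeracoder\t68656c6c6f20776f726c64\n"
print(decode_hex_field(line, 2))
```

The rest of the line stays a normal `str` throughout; only the selected field makes the round trip through `bytes`.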

Answer 2

Answer by Niklas

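import codecs

# getdecoder returns a function that yields a (decoded_bytes, length_consumed) tuple
decode_hex = codecs.getdecoder("hex_codec")

# for a list of hex strings
msgs = [decode_hex(msg)[0] for msg in msgs]

# for a single hex string
string = decode_hex(string)[0]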
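
Note that the hex codec produces `bytes`, not `str`, so a second decoding step is still needed to get text. A small sketch, where the sample input and the choice of UTF-8 are assumptions:

```python
import codecs

decode_hex = codecs.getdecoder("hex_codec")

# The decoder returns a tuple; the first item is the decoded bytes,
# the second is the amount of input that was consumed.
raw, consumed = decode_hex("4a4b4c")
print(raw)

# Turn the bytes into text with whatever encoding the data really uses
# (UTF-8 is assumed here).
text = raw.decode("utf-8")
print(text)
```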

Answer 3

Answer by HackerBoss

The answers from @unbeli and @Niklas are good, but @unbeli's answer does not work for all hex strings and it is desirable to do the decoding without importing an extra library (codecs). The following should work (but will not be very efficient for large strings):

>>> result = bytes.fromhex((lambda s: ("%s%s00" * (len(s)//2)) % tuple(s))('4a82fdfeff00')).decode('utf-16-le')
>>> result == '\x4a\x82\xfd\xfe\xff\x00'
True

Basically, it works around having invalid utf-8 bytes by padding with zeros and decoding as utf-16.
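
As a side note that is not part of the original answer: padding each byte with a zero byte and decoding as UTF-16-LE maps every byte value 0x00 to 0xFF to the code point with the same number, which is also what the `latin-1` codec does. The sketch below compares the two on the same sample string and shows the UTF-8 failure being worked around.

```python
hex_string = "4a82fdfeff00"
raw = bytes.fromhex(hex_string)

# Decoding as UTF-8 fails because 0x82 is not a valid start byte.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# latin-1 maps each byte directly to the code point of the same value.
latin1_text = raw.decode("latin-1")

# The zero-padding approach, written as a loop over byte pairs.
padded_hex = "".join(
    hex_string[i:i + 2] + "00" for i in range(0, len(hex_string), 2)
)
padded_text = bytes.fromhex(padded_hex).decode("utf-16-le")

print(utf8_failed, latin1_text == padded_text)
```

Whether this byte-to-code-point mapping is what you want depends on how the hex data was produced; if it was written as UTF-8, decoding as UTF-8 remains the right choice.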

Answer 4

Answer by tripleee

Here's another which is IMHO simpler.

''.join([chr(int(x, 16)) for x in sequence])

Example:

>>> ''.join([chr(int(x, 16)) for x in ('68', '00e1', '006c', '00f6', '3077', '1e05a')])
'hálöぷ\U0001e05a'
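
This approach treats each hex string as a Unicode code point, which is different from treating hex digits as encoded bytes. A short sketch of the distinction, using `á` (U+00E1) as an assumed sample character:

```python
# Hex digits interpreted as a Unicode code point number.
code_point_hex = "00e1"
from_code_point = chr(int(code_point_hex, 16))

# Hex digits interpreted as UTF-8 encoded bytes of the same character.
utf8_hex = "c3a1"
from_utf8_bytes = bytes.fromhex(utf8_hex).decode("utf-8")

print(from_code_point == from_utf8_bytes)
```

Both paths arrive at the same character, but the hex input differs, so it matters which form the file actually contains.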
