Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4299802/

Time: 2020-08-18 15:09:22  Source: igfitidea

Python: convert string from UTF-8 to Latin-1

Tags: python, encoding

Asked by romor

I'm stuck here trying to change encodings with Python 2.5.

I have an XML response, which I encode to UTF-8: response.encode('utf-8'). That is fine, but the program which uses this info doesn't like this encoding, and I have to convert it to another code page. A real example: I use the ghostscript Python module to embed pdfmark data in a PDF file, and the end result shows wrong characters in Acrobat.

I've tried numerous combinations of .encode() and .decode() between 'utf-8' and 'latin-1', and it drives me crazy that I can't produce the correct result.

If I output the string to a file with .encode('utf-8'), then convert this file from UTF-8 to CP1252 (nearly identical to latin-1) with e.g. iconv.exe, and embed the data, everything is fine.

Basically, can someone help me convert e.g. the character á, which is UTF-8 encoded as hex C3 A1, to latin-1 as hex E1?

Thanks in advance

Accepted answer by Ignacio Vazquez-Abrams

Instead of .encode('utf-8'), use .encode('latin-1').
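
A minimal sketch of this suggestion, assuming `response` is already a unicode string (the literal below is a made-up stand-in for the XML response, not the asker's data). The `u""` and `b""` literals keep it runnable on Python 2.6+ and Python 3.3+; on Python 2.5 the `b` prefix would have to be dropped.

```python
# Hypothetical stand-in for the XML response text (a unicode string).
response = u"\u00e1"  # the character á

# Encoding the same text with two different codecs gives different bytes.
utf8_bytes = response.encode("utf-8")      # two bytes: C3 A1
latin1_bytes = response.encode("latin-1")  # one byte: E1

assert utf8_bytes == b"\xc3\xa1"
assert latin1_bytes == b"\xe1"
```

The point is that the conversion happens on the unicode text, not on already-encoded bytes: the text is encoded once, directly into the target code page.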

Answer by Utku Zihnioglu

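# Python 2: data is a byte string holding UTF-8 encoded text
data = "UTF-8 data"
# Decode the UTF-8 bytes into a unicode object
udata = data.decode("utf-8")
# Re-encode as Latin-1; "ignore" silently drops characters Latin-1 cannot represent
data = udata.encode("latin-1", "ignore")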

Should do it.
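
Note that the "ignore" handler silently drops any character Latin-1 cannot represent, which may hide data loss. A small sketch comparing the error handlers, under the assumption that the input is a UTF-8 byte string; the helper name `utf8_to_latin1` and the sample text (which contains a euro sign, absent from Latin-1) are made up for illustration.

```python
def utf8_to_latin1(data, errors="strict"):
    """Re-encode UTF-8 bytes as Latin-1 bytes (hypothetical helper)."""
    return data.decode("utf-8").encode("latin-1", errors)

# Made-up sample: "café €5" as UTF-8 bytes; the euro sign is not in Latin-1.
sample = u"caf\u00e9 \u20ac5".encode("utf-8")

dropped = utf8_to_latin1(sample, "ignore")    # euro sign is removed
replaced = utf8_to_latin1(sample, "replace")  # euro sign becomes "?"

try:
    utf8_to_latin1(sample)  # strict mode raises instead of losing data
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# CP1252 is close to Latin-1 but does have a euro sign (byte 0x80).
cp1252_bytes = sample.decode("utf-8").encode("cp1252")
```

If the consuming program really expects CP1252, as the iconv step in the question suggests, encoding with "cp1252" rather than "latin-1" may preserve characters that "ignore" would otherwise discard.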

Answer by amit

Can you provide more details about what you are trying to do? In general, if you have a unicode string, you can use encode to convert it into a string with the appropriate encoding. E.g.:

>>> a = u"\u00E1"
>>> type(a)
<type 'unicode'>
>>> a.encode('utf-8')
'\xc3\xa1'
>>> a.encode('latin-1')
'\xe1'

Answer by handle

If the previous answers do not solve your problem, check the source of the data that won't print/convert properly.

In my case, I was using json.load on data that had been read incorrectly from a file because I did not pass encoding="utf-8". Trying to decode or encode the resulting string to latin-1 just does not help...
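
A hedged sketch of that failure mode, using a made-up temporary JSON file. It assumes the wrong encoding was Latin-1 so the result is deterministic; in practice the default depends on the platform. `io.open` is used because it accepts `encoding` on both Python 2.6+ and Python 3.

```python
import io
import json
import os
import tempfile

# Write a small UTF-8 encoded JSON file (made-up sample data).
path = os.path.join(tempfile.mkdtemp(), "sample.json")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u'{"name": "\u00e1"}')

# Correct: state the file's real encoding when reading.
with io.open(path, encoding="utf-8") as f:
    good = json.load(f)["name"]

# Wrong: the same bytes read as Latin-1 give two characters for one á.
with io.open(path, encoding="latin-1") as f:
    bad = json.load(f)["name"]

good_latin1 = good.encode("latin-1")  # the single byte E1, as wanted
bad_latin1 = bad.encode("latin-1")    # still the UTF-8 bytes C3 A1
```

Once the text has been decoded with the wrong codec, encoding it to Latin-1 only reproduces the original UTF-8 bytes, which is why the fix belongs at the point where the file is read.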
