Encoding issues with python's etree.tostring
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/1428172/
Asked by smock
I'm using python 2.6.2's xml.etree.cElementTree to create an xml document:
import xml.etree.cElementTree as etree
elem = etree.Element('tag')
elem.text = (u"Würth Elektronik Midcom").encode('utf-8')
xml = etree.tostring(elem,encoding='UTF-8')
At the end of the day, xml looks like:
<?xml version='1.0' encoding='UTF-8'?>
<tag>Würth Elektronik Midcom</tag>
It looks like tostring ignored the encoding parameter and encoded 'ü' into some other character encoding ('ü' is a valid utf-8 encoding, I'm fairly sure).
Any advice as to what I'm doing wrong would be greatly appreciated.
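Answered by John Millikin
You're encoding the text twice: elem.text is already a UTF-8 byte string, and tostring then encodes it again. Assign the unicode string directly and let tostring do the encoding. Try this: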
import xml.etree.cElementTree as etree
elem = etree.Element('tag')
elem.text = u"Würth Elektronik Midcom"
xml = etree.tostring(elem, encoding='UTF-8')
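To make the double encoding visible, here is a minimal sketch. It assumes Python 3 and the standard library's xml.etree.ElementTree rather than the Python 2.6 cElementTree used in the question, so it only simulates the problem: the UTF-8 bytes are deliberately misread as Latin-1 text before serialising. The variable names are illustrative only.

```python
import xml.etree.ElementTree as etree

name = "Würth Elektronik Midcom"

# Correct: assign text and let tostring perform the single encoding step.
good = etree.Element('tag')
good.text = name
good_xml = etree.tostring(good, encoding='UTF-8')

# Simulated double encoding: the UTF-8 bytes are misread as Latin-1 text,
# then tostring encodes that already-encoded text to UTF-8 a second time.
bad = etree.Element('tag')
bad.text = name.encode('utf-8').decode('latin-1')
bad_xml = etree.tostring(bad, encoding='UTF-8')

print(good_xml.decode('utf-8'))
print(bad_xml.decode('utf-8'))
```

In the first result 'ü' is stored as the two bytes C3 BC; in the second, each of those bytes has itself been encoded, which is what shows up as garbled text.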
Answered by BaiJiFeiLong
etree.tostring(elem, encoding=str)
returns a str rather than bytes in Python 3.
You can also serialise to a Unicode string without declaration by passing the unicode function as encoding (or str in Py3), or the name 'unicode'. This changes the return value from a byte string to an unencoded unicode string.
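The difference between the two return types can be sketched as follows. This assumes Python 3 and the standard library's xml.etree.ElementTree, which accepts the name 'unicode'. Passing the str function itself, as in this answer, appears to be an lxml convention and is not used here.

```python
import xml.etree.ElementTree as etree

elem = etree.Element('tag')
elem.text = "Würth Elektronik Midcom"

# A real encoding name yields an encoded byte string.
as_bytes = etree.tostring(elem, encoding='utf-8')

# The special name 'unicode' yields unencoded text with no XML declaration.
as_text = etree.tostring(elem, encoding='unicode')

print(type(as_bytes).__name__, type(as_text).__name__)
print(as_text)
```

The text form is convenient for printing or further string handling. The bytes form is usually what should be written to a file or sent over the network.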