python: json.dumps can't handle utf-8?
Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/4184108/
Asked by Bin Chen
Below is the test program, including a Chinese character:
# -*- coding: utf-8 -*-
import json
j = {"d":"中", "e":"a"}
json = json.dumps(j, encoding="utf-8")
print json
Below is the result. Look: json.dumps converts the UTF-8 character into its numeric \u escape!
{"e": "a", "d": "\u4e2d"}
Why is this broken? Or am I doing something wrong?
Accepted answer by Boldewyn
You should read json.org. The complete JSON specification is in the white box on the right.
There is nothing wrong with the generated JSON. Generators are allowed to generate either UTF-8 strings or plain ASCII strings in which characters are escaped with the \uXXXX notation. In your case, the Python json module opted for escaping, and 中 has the escaped notation \u4e2d.
By the way: Any conforming JSON interpreter will correctly unescape this sequence again and give you back the actual character.
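The round trip described above can be checked with a minimal sketch. It is written for Python 3, which is an assumption (the original post uses Python 2), and the variable names are illustrative only.

```python
import json

# Same data as in the question.
data = {"d": "中", "e": "a"}

# With the default ensure_ascii=True, non-ASCII characters are
# written as \uXXXX escapes, so the result is pure ASCII.
escaped = json.dumps(data)
print(escaped)

# Parsing the escaped text gives back the original character.
restored = json.loads(escaped)
print(restored["d"])
print(restored == data)
```

The escaped form and the unescaped form are two spellings of the same JSON string, so no information is lost.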
Answer by Ignacio Vazquez-Abrams
Looks like valid JSON to me. If you want json to output a string that has non-ASCII characters in it, then you need to pass ensure_ascii=False and then encode manually afterward.
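A minimal sketch of that suggestion follows. It assumes Python 3 (the original post uses Python 2), where json.dumps returns a str and the "encode manually" step is an explicit call to str.encode. The variable names are illustrative only.

```python
import json

# Same data as in the question.
data = {"d": "中", "e": "a"}

# ensure_ascii=False keeps non-ASCII characters as they are
# instead of escaping them.
text = json.dumps(data, ensure_ascii=False)
print(text)

# Encode manually afterward to get UTF-8 bytes, for example
# before writing to a binary file or a socket.
raw = text.encode("utf-8")
print(raw)

# Decoding and parsing the bytes recovers the original data.
print(json.loads(raw.decode("utf-8")) == data)
```

Note that this Python 3 sketch passes no encoding argument to json.dumps, unlike the Python 2 code in the question.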
Answer by alemol
Use simplejson with the mentioned options:
# -*- coding: utf-8 -*-
import simplejson as json
j = {"d":"中", "e":"a"}
json = json.dumps(j, ensure_ascii=False, encoding="utf-8")
print json
Output:
{"e": "a", "d": "中"}