Python Django: Non-ASCII character
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/4635188/
Django: Non-ASCII character
Asked by dkgirl
My Django view/template is not able to handle special characters. The simple view below fails because of the ñ. I get the error below:
Non-ASCII character '\xf1' in file"
from django.http import HttpResponse

def test(request):
    return HttpResponse('español')
Is there some general setting that I need to set? It would be weird if I had to handle all strings separately: non-American letters are pretty common!
EDIT: This is in response to the comments below. It still fails :(
I added the coding comment to my view and the meta info to my html, as suggested by Gabi.
Now my example above doesn't give an error, but the ñ is displayed incorrectly.
I tried return render_to_response('tube/mysite.html', {"s": 'español'}). No error, but the string doesn't display (it does if s = 'hello'). The other information on the HTML page displays fine.
I tried hardcoding 'español' into my HTML and that fails:
UnicodeDecodeError 'utf8' codec can't decode byte 0xf.
I tried with the u in front of the string:
SyntaxError (unicode error) 'utf8' codec can't decode byte 0xf1
Does this help at all??
Answered by Gabi Purcaru
Do you have this at the beginning of your script:
# -*- coding: utf-8 -*-
...?
See this: http://www.python.org/dev/peps/pep-0263/
EDIT: The second problem is about the HTML encoding. Put this in the head of your HTML page (you should serve the response as an HTML page, otherwise I don't think you will be able to output that character correctly):
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
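The effect of the coding declaration can be checked without touching a real file. The following is a minimal sketch, assuming Python 3 (where compile accepts raw bytes and applies the PEP 263 declaration). The helper name compiles and the sample sources are made up for illustration; they are not part of Django or of the original answer.

```python
# Sketch: how the PEP 263 declaration decides how source bytes are decoded.
# Assumes Python 3; `compiles` is a hypothetical helper, not a Django or stdlib API.

def compiles(source_bytes):
    """Return True if the raw source bytes can be compiled."""
    try:
        compile(source_bytes, "<demo>", "exec")
        return True
    except (SyntaxError, UnicodeError):
        # CPython reports a failed source decode as a SyntaxError;
        # UnicodeError is caught too, as a precaution.
        return False

# The same line of code, as an editor might save it in two encodings.
line = "s = 'español'\n"
utf8_header = b"# -*- coding: utf-8 -*-\n"
latin1_header = b"# -*- coding: latin-1 -*-\n"

# Declaration and actual file encoding agree.
ok_utf8 = compiles(utf8_header + line.encode("utf-8"))
ok_latin1 = compiles(latin1_header + line.encode("latin-1"))

# Declared UTF-8, but saved as Latin-1 (ñ becomes the single byte 0xf1).
mismatch = compiles(utf8_header + line.encode("latin-1"))

print(ok_utf8, ok_latin1, mismatch)
```

The declaration only tells the interpreter how to decode the file; it does not change how the editor saves it. In Python 2, the setting of the question, a file with no declaration is assumed to be ASCII, which is what produces the Non-ASCII character '\xf1' error.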
Answered by Dominique Guardiola
You need the coding comment Gabi mentioned, and you also need the unicode "u" prefix before your string:
return HttpResponse(u'español')
The best page I found on the web explaining the whole ASCII/Unicode mess is this one: http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror
Enjoy!
Answered by rubayeet
Set DEFAULT_CHARSET to 'utf-8' in your settings.py file.
Answered by Ryan
I was struggling with the same issue as @dkgirl, yet despite making all of the changes suggested here I still could not get constant strings containing ñ, defined in settings.py, to show up in pages rendered from my templates.
Instead, I replaced every instance of "utf-8" that the above solutions had put in my Python code with "ISO-8859-1" (Latin-1). It works fine now.
Odd, since everything seems to indicate that ñ is supported by utf-8 (and in fact I'm still using utf-8 in my templates). Perhaps this is an issue only on older Django versions? I'm running 1.2 beta 1.
Any other ideas what may have caused the problem? Here's my old traceback:
Traceback (most recent call last):
  File "manage.py", line 4, in <module>
    import settings # Assumed to be in the same directory.
  File "C:\dev\xxxxx\settings.py", line 53
    ('es', ugettext(u'Espa±ol') ),
SyntaxError: (unicode error) 'utf8' codec can't decode byte 0xf1 in position 0: unexpected end of data
Answered by Cedric
Insert this at the top of views.py:
# -*- coding: utf-8 -*-
And add "u" before your string:
my_str = u"plus de détails"
Solved!
Answered by xus
Quoted from: https://docs.djangoproject.com/en/1.8/ref/unicode/
"If your code only uses ASCII data, it's safe to use your normal strings, passing them around at will, because ASCII is a subset of UTF-8.
Don't be fooled into thinking that if your DEFAULT_CHARSET setting is set to something other than 'utf-8' you can use that other encoding in your bytestrings! DEFAULT_CHARSET only applies to the strings generated as the result of template rendering (and email). Django will always assume UTF-8 encoding for internal bytestrings. The reason for this is that the DEFAULT_CHARSET setting is not actually under your control (if you are the application developer). It's under the control of the person installing and using your application – and if that person chooses a different setting, your code must still continue to work. Ergo, it cannot rely on that setting.
In most cases when Django is dealing with strings, it will convert them to Unicode strings before doing anything else. So, as a general rule, if you pass in a bytestring, be prepared to receive a Unicode string back in the result."
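The rule in the quote can be sketched without Django itself. The helper below is hypothetical (to_text is not a Django API); it only mimics the documented behaviour of treating bytestrings as UTF-8 and handing back Unicode text, and it is written for Python 3.

```python
# Hypothetical helper, for illustration only: bytestrings are assumed to be
# UTF-8 and are converted to text; text is passed through unchanged.

def to_text(value, encoding="utf-8"):
    """Return text for `value`, decoding bytestrings as UTF-8 by default."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)

utf8_bytes = "español".encode("utf-8")      # b'espa\xc3\xb1ol'
latin1_bytes = "español".encode("latin-1")  # b'espa\xf1ol'

print(to_text(utf8_bytes) == "español")     # True

try:
    to_text(latin1_bytes)
except UnicodeDecodeError as exc:
    # Latin-1 bytes are not valid UTF-8, so the UTF-8 assumption fails here.
    print("not valid UTF-8:", exc.reason)
```

This is why a bytestring in another encoding breaks regardless of DEFAULT_CHARSET: under the UTF-8 assumption, the single byte 0xf1 is simply not decodable.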
Answered by Dennis Degryse
The thing about encoding is that, apart from declaring that you use UTF-8 (via <meta> and the project's settings.py file), you should of course respect your declaration: make sure your files are actually saved using UTF-8 encoding.
The reason is simple: you tell the interpreter to do I/O using a specific charset. If your files were not saved with that charset, the interpreter cannot decode them correctly.
Some IDEs and editors use Latin-1 (ISO-8859-1) by default, which explains why Ryan's answer could work. It is a quick fix, though, not a valid solution to the original question.
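To check whether a file really matches its declaration, one can look at its raw bytes. This is a minimal sketch in Python 3, assuming the file was saved as Latin-1; the function names is_valid_utf8 and latin1_to_utf8 are invented for illustration.

```python
def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def latin1_to_utf8(data):
    """Re-encode bytes that were saved as Latin-1 (ISO-8859-1) to UTF-8."""
    return data.decode("latin-1").encode("utf-8")

saved_by_editor = "Español".encode("latin-1")   # what a Latin-1 editor writes
print(is_valid_utf8(saved_by_editor))            # False: a lone 0xf1 is not UTF-8

fixed = latin1_to_utf8(saved_by_editor)
print(is_valid_utf8(fixed))                      # True
print(fixed.decode("utf-8"))                     # Español

# The opposite mismatch does not raise; it silently produces mojibake.
print("Español".encode("utf-8").decode("latin-1"))  # EspaÃ±ol
```

Note that the check only works in one direction: every byte sequence decodes as Latin-1, so UTF-8 text read as Latin-1 shows up as garbled characters rather than as an error, which matches the "displayed incorrectly" symptom in the question.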

