Using utf-8 characters in a Jinja2 template

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/22181944/

Date: 2020-08-19 00:26:36  Source: igfitidea

Tags: python, python-2.7, utf-8, character-encoding, jinja2

Asked by alex.ac

I'm trying to use utf-8 characters when rendering a template with Jinja2. Here is what my template looks like:

<!DOCTYPE HTML>
<html manifest="" lang="en-US">
<head>
    <meta charset="UTF-8">
    <title>{{title}}</title>
...

The title variable is set something like this:

index_variables = {'title':''}
index_variables['title'] = myvar.encode("utf8")

template = env.get_template('index.html')
index_file = open(preview_root + "/" + "index.html", "w")

index_file.write(
    template.render(index_variables)
)
index_file.close()

Now, the problem is that myvar is a message read from a message queue and can contain those special utf8 characters (e.g. "Séptimo Cine").

The rendered template looks something like:

...
    <title>S\u00e9ptimo Cine</title>
...

and I want it to be:

...
    <title>Séptimo Cine</title>
...

I have made several tests but I can't get this to work.

  • I have tried to set the title variable without .encode("utf8"), but it throws an exception (ValueError: Expected a bytes object, not a unicode object), so my guess is that the initial message is unicode

  • I have used chardet.detect to get the encoding of the message (it's "ascii"), then did the following: myvar.decode("ascii").encode("cp852"), but the title is still not rendered correctly.

  • I also made sure that my template is a UTF-8 file, but it didn't make a difference.

Any ideas on how to do this?

Accepted answer by Lukas Graf

TL;DR:

  • Pass Unicode to template.render()
  • Encode the rendered unicode result to a bytestring before writing it to a file

This had me puzzled for a while. Because you do

index_file.write(
    template.render(index_variables)
)

in one statement, that's basically just one line as far as Python is concerned, so the traceback you get is misleading: the exception I got when recreating your test case didn't happen in template.render(index_variables), but in index_file.write() instead. So splitting the code up like this

output = template.render(index_variables)
index_file.write(output)

was the first step to diagnose where exactly the UnicodeEncodeError happens.

Jinja returns unicode when you let it render the template. Therefore you need to encode the result to a bytestring before you can write it to a file:

index_file.write(output.encode('utf-8'))
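
For readers on Python 3, the same step might look like the minimal sketch below. The `rendered` string stands in for the result of template.render(), and the temporary output path is made up for illustration; neither comes from the original code.

```python
import io
import os
import tempfile

# Stand-in for the text returned by template.render() (an assumption for this sketch).
rendered = u"<title>S\u00e9ptimo Cine</title>"

# Hypothetical output location, used only for this example.
out_path = os.path.join(tempfile.mkdtemp(), "index.html")

# Option 1: encode explicitly and write the bytes in binary mode.
with open(out_path, "wb") as index_file:
    index_file.write(rendered.encode("utf-8"))

# Option 2: let the file object do the encoding via the `encoding` argument.
with io.open(out_path, "w", encoding="utf-8") as index_file:
    index_file.write(rendered)

# Read the raw bytes back to see how the accented character was stored.
with open(out_path, "rb") as index_file:
    raw = index_file.read()
```

Either option should leave the same bytes on disk; the second keeps the encoding decision next to the open() call instead of at every write.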

The second error is that you pass a UTF-8 encoded bytestring to template.render(), but Jinja wants unicode. So assuming your myvar contains UTF-8, you need to decode it to unicode first:

index_variables['title'] = myvar.decode('utf-8')
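
As a rough illustration of why the codec passed to decode() matters, here is a small sketch written for Python 3. The byte string is a stand-in for the queue message and assumes it really is UTF-8 encoded, which the question does not confirm.

```python
# Stand-in for the raw message from the queue: UTF-8 encoded bytes (an assumption).
myvar = b"S\xc3\xa9ptimo Cine"

# Decode once, at the boundary, so everything handed to the template is text.
title = myvar.decode("utf-8")
index_variables = {"title": title}

# Decoding the same bytes with the wrong codec raises no error here,
# but it silently yields different (garbled) text.
wrong = myvar.decode("latin-1")
```

The decoded title is twelve characters long, while the latin-1 result has one extra character where the two UTF-8 bytes of "é" were read separately.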

So, to put it all together, this works for me:

# -*- coding: utf-8 -*-

from jinja2 import Environment, PackageLoader
env = Environment(loader=PackageLoader('myproject', 'templates'))


# Make sure we start with a UTF-8 encoded bytestring
myvar = 'Séptimo Cine'

index_variables = {'title':''}

# Decode the UTF-8 string to get unicode
index_variables['title'] = myvar.decode('utf-8')

template = env.get_template('index.html')

with open("index_file.html", "w") as index_file:
    output = template.render(index_variables)

    # jinja returns unicode - so `output` needs to be encoded to a bytestring
    # before writing it to a file
    index_file.write(output.encode('utf-8'))

Answer by Andrew Kloos

Try changing your render command to this...

template.render(index_variables).encode( "utf-8" )

Jinja2's documentation says "This will return the rendered template as unicode string."

http://jinja.pocoo.org/docs/api/?highlight=render#jinja2.Template.render

Hope this helps!

Answer by alfonso olivas

And if nothing works because you have a mix of languages, as in my case, just replace "utf-8" with "utf-16".

All the encoding options here:

https://docs.python.org/2.4/lib/standard-encodings.html
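
As a side note, UTF-8 can represent any Unicode character, so mixed languages alone should not require UTF-16, and the template in the question declares <meta charset="UTF-8">, which a UTF-16 file would no longer match. The sketch below, written for Python 3 with a made-up sample string, compares the two encodings.

```python
import codecs

# Hypothetical mixed-language text, chosen only for illustration.
text = u"S\u00e9ptimo Cine / \u4e03 / \u0421\u0435\u0434\u044c\u043c\u043e\u0439"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16")

# Both encodings round-trip the same text without loss.
same_after_utf8 = utf8_bytes.decode("utf-8") == text
same_after_utf16 = utf16_bytes.decode("utf-16") == text

# The "utf-16" codec prepends a byte order mark in the platform's native order.
has_bom = utf16_bytes.startswith(codecs.BOM_UTF16)
```

If switching to UTF-16 appears to fix a rendering problem, it may be worth checking how the input was decoded before assuming UTF-8 cannot hold the text.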

Answer by asmaier

Add the following lines to the beginning of your script and it will work fine without any further changes:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
reload(sys)
sys.setdefaultencoding("utf-8")
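
Note that reload(sys) and sys.setdefaultencoding() are specific to Python 2, so this snippet cannot be carried over to Python 3 as written. A minimal sketch of what the situation looks like on Python 3, using a sample title as a stand-in for the real data:

```python
import sys

# On Python 3 this attribute does not exist, so there is nothing to reload or set.
has_setdefaultencoding = hasattr(sys, "setdefaultencoding")

# The default encoding for str/bytes conversions on Python 3.
default = sys.getdefaultencoding()

# Python 3 str objects are already Unicode text; encoding happens explicitly
# at the boundary, for example just before writing bytes to a file.
title = u"S\u00e9ptimo Cine"
encoded = title.encode("utf-8")
```

Encoding explicitly at the point of output keeps the behaviour local to that call, rather than changing a process-wide default that other libraries may rely on.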