python: Using the safe filter on rich text fields in Django

Question

Asked by Ned Batchelder

I am using the TinyMCE editor for textarea fields in Django forms.

Now, in order to display the rich text back to the user, I am forced to use the "safe" filter in Django templates so that HTML rich text can be displayed on the browser.

Suppose JavaScript is disabled in the user's browser: TinyMCE won't load, and the user could submit <script> or other XSS tags through such a textarea field. Such HTML won't be safe to display back to the user.

How do I handle unsafe HTML text that doesn't come from TinyMCE?

Answer 1

Answered by Ned Batchelder

You are right to be concerned about raw HTML, but not just for Javascript-disabled browsers. When considering the security of your server, you have to ignore any work done in the browser, and look solely at what the server accepts and what happens to it. Your server accepts HTML and displays it on the page. This is unsafe.

The fact that TinyMCE quotes HTML gives a false sense of security: the server trusts what it accepts, which it should not.

The solution to this is to process the HTML when it arrives, to remove dangerous constructs. This is a complicated problem to solve. Take a look at the XSS Cheat Sheet to see the wide variety of inputs that could cause a problem.

lxml has a function to clean HTML: http://lxml.de/lxmlhtml.html#cleaning-up-html, but I've never used it, so I can't vouch for its quality.

Answer 2

Answered by seddonym

Use django-bleach. This provides you with a bleach template filter that lets through only the tags you allow:

{% load bleach_tags %}
{{ mymodel.my_html_field|bleach }}

The trick is to configure the editor to produce the same tags as you're willing to 'let through' in your bleach settings.

Here's an example of my bleach settings:

# Which HTML tags are allowed
BLEACH_ALLOWED_TAGS = ['p', 'h3', 'h4', 'em', 'strong', 'a', 'ul', 'ol', 'li', 'blockquote']
# Which HTML attributes are allowed
BLEACH_ALLOWED_ATTRIBUTES = ['href', 'title', 'name']
BLEACH_STRIP_TAGS = True

Then you can configure TinyMCE (or whatever WYSIWYG editor you're using) to show only the buttons that create the allowed tags.

Answer 3

Answered by AbeEstrada

You can use the template filter "removetags" and just remove 'script'.

Note that removetags has been removed from Django 2.0. Here is the deprecation notice from the docs:

Deprecated since version 1.8: removetags cannot guarantee HTML safe output and has been deprecated due to security concerns. Consider using bleach instead.
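
To see why removing a named tag cannot guarantee safe output, here is a minimal sketch using only the standard library. The function naive_remove_script is hypothetical and is not Django's removetags implementation; it only stands in for any filter that deletes a fixed list of "bad" tags in a single pass.

```python
import re


def naive_remove_script(text):
    # Hypothetical blocklist filter: deletes <script ...> and </script> tags
    # in a single pass and leaves everything else untouched.
    return re.sub(r"</?script[^>]*>", "", text, flags=re.IGNORECASE)


# A payload that hides a script tag inside a broken one. Deleting the inner
# tags glues the outer fragments back together into a working script tag.
nested = "<scr<script>ipt>alert(1)</scr</script>ipt>"
print(naive_remove_script(nested))

# A payload that needs no script tag at all passes through unchanged.
handler = "<img src=x onerror=alert(1)>"
print(naive_remove_script(handler))
```

Both results are still dangerous to render with the safe filter, which is why an allowlist-based cleaner such as bleach is recommended instead.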

Answer 4

Answered by Paul McMillan

There isn't a good answer to this one. TinyMCE generates HTML, and Django's auto-escaping specifically escapes HTML so that it is shown as text instead of being rendered.
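
As a rough illustration of what escaping does to rich text, the sketch below uses Python's standard html.escape as a stand-in. It is an assumption that this is close enough to Django's template auto-escaping for illustration; it is not the template engine itself, and the sample input is made up.

```python
from html import escape

# Made-up input mixing a script tag with legitimate editor markup.
user_input = '<script>alert("xss")</script><p>Hello</p>'

# Escaping turns every markup character into an entity, so the browser
# shows the tags as literal text instead of interpreting them.
escaped = escape(user_input)
print(escaped)
```

The script tag is neutralised, but so is the paragraph tag, which is why rich text ends up needing the safe filter and therefore needs sanitising before it is stored or displayed.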

The traditional solution to this problem has been to either use some non-html markup language in the user input side (bbcode, markdown, etc.) or to whitelist a limited number of HTML tags. TinyMCE/HTML are generally only appropriate input solutions for more or less trusted users.

The whitelist approach is tricky to implement without any security holes. The one thing you don't want to do is try to just detect "bad" tags - you WILL miss edge cases.
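
The whitelist idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not a vetted sanitizer: the names AllowlistSanitizer, sanitize and is_safe_url are hypothetical, the tag and attribute lists simply mirror the bleach settings shown in Answer 2, and the list of safe URL schemes is an assumption. For production use, a maintained library is the safer choice.

```python
import re
from html import escape
from html.parser import HTMLParser

# Assumed allowlists, mirroring the bleach settings from Answer 2.
ALLOWED_TAGS = {"p", "h3", "h4", "em", "strong", "a", "ul", "ol", "li", "blockquote"}
ALLOWED_ATTRS = {"href", "title", "name"}
SAFE_SCHEMES = {"http", "https", "mailto"}  # assumption: adjust as needed


def is_safe_url(value):
    # Drop whitespace and control characters, which browsers may ignore
    # inside a scheme, then look for a scheme before any path/query/fragment.
    compact = "".join(ch for ch in value if ord(ch) > 32).lower()
    head = re.split(r"[/?#]", compact, maxsplit=1)[0]
    if ":" not in head:
        return True  # relative URL without a scheme
    return head.split(":", 1)[0] in SAFE_SCHEMES


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowed tags and attributes; all text is escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text content is kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS or value is None:
                continue
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p>Hello <script>alert(1)</script><strong>world</strong></p>'))
print(sanitize('<a href="javascript:alert(1)" onclick="x()">link</a>'))
```

The design point is that output is rebuilt from what is explicitly allowed, instead of deleting what is known to be bad. The sketch does not balance unclosed tags or handle every parser quirk, which is part of why this approach is tricky to get right by hand.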
