How to sanitize HTML code to prevent XSS attacks in Java or JSP?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3587199/

Time: 2020-10-30 02:32:40  Source: igfitidea

java jsp xss

Asked by KeatsPeeks

I'm writing a servlet-based application in which I need to provide a messaging system. I'm in a rush, so I chose CKEditor to provide editing capabilities, and I currently insert the generated HTML directly into the web page that displays all messages (messages are stored in a MySQL database, FYI). CKEditor already filters HTML based on a white list, but a user can still inject malicious code with a POST request, so this is not enough.

A good library already exists to prevent XSS attacks by filtering HTML tags, but it's written in PHP: HTML Purifier

So, is there a similar mature library that can be used in Java? A simple string replacement based on a white list doesn't seem to be enough, since I'd like to filter malformed tags too (which could alter the design of the page on which the message is displayed).
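
To illustrate why plain string replacement falls short, here is a minimal sketch using only the Java standard library. The class name `NaiveFilterDemo` and the method `naiveStrip` are hypothetical, made up for this illustration; they are not from any library mentioned in this thread. The input is crafted so that deleting the inner `<script>` text reassembles a working tag out of the surrounding fragments.

```java
public class NaiveFilterDemo {
    // Hypothetical naive filter: deletes the literal strings "<script>" and
    // "</script>" in a single pass each. It does not parse the HTML at all.
    static String naiveStrip(String html) {
        return html.replace("<script>", "").replace("</script>", "");
    }

    public static void main(String[] args) {
        // Removing the inner tags glues the outer fragments back together.
        String crafted = "<scr<script>ipt>alert(1)</scr</script>ipt>";
        System.out.println(naiveStrip(crafted));
        // prints: <script>alert(1)</script>

        // Malformed markup passes through untouched, so an unclosed tag
        // can still affect the rest of the page.
        System.out.println(naiveStrip("<b>never closed"));
        // prints: <b>never closed
    }
}
```

A filter that parses the input into a document tree and re-serializes only allowed elements avoids both problems, which is the approach the parser-based libraries suggested in the answers take.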

If there isn't, then how should I proceed? An XML parser seems overkill.

Note: There are a lot of questions about this on SO, but all the answers refer to filtering ALL HTML tags: I want to keep valid formatting tags.

Accepted answer by Thierry-Dimitri Roy

You should use AntiSamy. (That's what I did)

Answer by BalusC

I'd recommend using Jsoup for this. Here's a relevant extract from its site.

Sanitize untrusted HTML

Problem

You want to allow untrusted users to supply HTML for output on your website (e.g. as comment submission). You need to clean this HTML to avoid cross-site scripting (XSS) attacks.

Solution

Use the jsoup HTML Cleaner with a configuration specified by a Whitelist.

String unsafe =
    "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
// now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>
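
Since the question asks to keep only valid formatting tags, a narrower configuration than `basic()` may be a better fit. The following is a rough sketch, not part of the jsoup extract above. It assumes jsoup is on the classpath and uses `Safelist`, the name newer jsoup releases give to the `Whitelist` class; on an older release, substituting `Whitelist` should work the same way. The class name `MessageSanitizer` and the particular tag list are assumptions chosen for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class MessageSanitizer {
    // Hypothetical allowlist: simple formatting tags only, with no attributes
    // permitted. Adjust the tag list to whatever the editor should produce.
    private static final Safelist FORMATTING_ONLY = Safelist.none()
            .addTags("b", "i", "u", "em", "strong", "p", "br", "ul", "ol", "li");

    // Parses the untrusted input and re-serializes only the allowed elements.
    public static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, FORMATTING_ONLY);
    }

    public static void main(String[] args) {
        String unsafe =
            "<p onclick='stealCookies()'>Hello <b>world</b><script>alert(1)</script></p>";
        // The onclick attribute and the script element are dropped;
        // the <p> and <b> formatting is kept.
        System.out.println(sanitize(unsafe));
    }
}
```

Calling `sanitize` on the server side for every submitted message, before storing or rendering it, covers the case where a user bypasses CKEditor with a hand-made POST request.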

Jsoup offers more advantages than that as well. See also Pros and Cons of HTML parsers in Java.

Answer by wilsona

If none of the ready-made options seems sufficient, there is an excellent series of articles on XSS and attack prevention at Google Code. It should provide plenty of information to work with if you end up going down that path.
