Java XSS filter to remove all scripts
Original question: http://stackoverflow.com/questions/31308968/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
XSS filter to remove all scripts
Asked by Cool Techie
I am implementing an XSS filter for my web application and also using the ESAPI encoder to sanitise the input.
The patterns I am using are given below:
// Script fragments
Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE),
// src='...'
Pattern.compile("src[\r\n]*=[\r\n]*\\'(.*?)\\'", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
// lonely script tags
Pattern.compile("</script>", Pattern.CASE_INSENSITIVE),
Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
// eval(...)
Pattern.compile("eval\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
// expression(...)
Pattern.compile("expression\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
// javascript:...
Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE),
// vbscript:...
Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE),
// onload(...)=...
Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL)
However, a few scripts are still not being filtered, especially ones that are appended to a parameter, such as:
url?sourceId=abx;alert('hello');
How do I handle these?
Answered by avgvstvs
This isn't the right approach. It's mathematically impossible to write a regex capable of correctly filtering out XSS. (Regular expressions describe regular languages, but HTML and JavaScript are both at least context-free languages.)
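To make the point concrete, here is a minimal sketch that runs the question's patterns (with the string escapes corrected so they compile) over a few inputs. The class name `RegexFilterDemo`, the `strip` helper, and the `<img ... onerror=...>` payload are assumptions chosen for illustration; they are not from the original post.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: applies the question's blacklist patterns in order.
final class RegexFilterDemo {
    private static final int FLAGS =
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL;

    // The patterns from the question, with corrected escaping.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE),
        Pattern.compile("src[\r\n]*=[\r\n]*\\'(.*?)\\'", FLAGS),
        Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", FLAGS),
        Pattern.compile("</script>", Pattern.CASE_INSENSITIVE),
        Pattern.compile("<script(.*?)>", FLAGS),
        Pattern.compile("eval\\((.*?)\\)", FLAGS),
        Pattern.compile("expression\\((.*?)\\)", FLAGS),
        Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE),
        Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE),
        Pattern.compile("onload(.*?)=", FLAGS)
    };

    // Removes every match of every pattern, one pattern after another.
    static String strip(String value) {
        String result = value;
        for (Pattern pattern : PATTERNS) {
            result = pattern.matcher(result).replaceAll("");
        }
        return result;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "<script>alert(1)</script>",     // matched by the first pattern
            "<img src=x onerror=alert(1)>",  // unquoted src, handler other than onload
            "abx;alert('hello');"            // the value from the question
        };
        for (String input : inputs) {
            System.out.println(input + " -> [" + strip(input) + "]");
        }
    }
}
```

The second and third inputs come back unchanged, because none of the listed patterns describes them. Adding more patterns only moves the gap somewhere else, which is why the filter keeps missing cases.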
You can, however, guarantee that whenever you switch contexts (that is, hand off a piece of data that is going to be interpreted), the data is correctly escaped for that context. So, when sending data to a browser, escape it for HTML if it's being handled as HTML, or for JavaScript if it's being handled by JavaScript.
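As a rough illustration of escaping per output context, the sketch below hand-rolls two encoders using only the JDK. The class `ContextEscaper` and both method names are hypothetical. In a real application you would more likely call a maintained encoder, such as the ESAPI encoder the question already uses (`ESAPI.encoder().encodeForHTML(...)` and `encodeForJavaScript(...)`), rather than code like this.

```java
// Hypothetical helper: escapes untrusted text for two different output contexts.
final class ContextEscaper {

    // For text placed in an HTML element body or a quoted attribute value.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // For text placed inside a quoted JavaScript string literal:
    // everything except ASCII letters and digits becomes a \\uXXXX escape.
    static String escapeJsString(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sourceId = "abx;alert('hello');"; // the value from the question
        System.out.println(escapeHtml(sourceId));
        System.out.println(escapeJsString(sourceId));
    }
}
```

Note that the stored or received value is left alone; it is encoded only at the point where it is written into a page, and the encoder is chosen by where it lands. Whether `abx;alert('hello');` is dangerous at all depends on that destination, which is something an input-side regex cannot know.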
If you DO need to allow HTML/JavaScript into your application, then you'll want a web application firewall or a framework like HDIV.
Answered by Alessandro Giannone
You can combine ESAPI and JSoup to clear out XSS vulnerabilities. I would definitely avoid trying to write all the regexes by hand when other libraries are built to handle this for you.
Here is an XSS filter implementation for Jersey 2.x: How to Modify QueryParam and PathParam in Jersey 2