Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/1336776/

Time: 2020-08-25 02:05:22  Source: igfitidea

XSS filtering function in PHP

Tags: php, filter, xss

Asked by codecowboy

Does anyone know of a good function out there for filtering generic input from forms? Zend_Filter_input seems to require prior knowledge of the contents of the input and I'm concerned that using something like HTML Purifier will have a big performance impact.

What about something like http://snipplr.com/view/1848/php--sacar-xss/ ?

Many thanks for any input.

Answered by cletus

Simple way? Use strip_tags():

$str = strip_tags($input);

You can also use filter_var() for that:

$str = filter_var($input, FILTER_SANITIZE_STRING);

The advantage of filter_var() is that you can control the behaviour by, for example, stripping or encoding low and high characters.

Here is a list of sanitizing filters.
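
As a rough illustration of the flag-based control mentioned above, the sketch below strips tags and then removes low or high bytes with filter_var(). Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1, so this sketch uses FILTER_UNSAFE_RAW with explicit flags instead. The sample input and variable names are made up for the example.

```php
<?php
// Hypothetical input: UTF-8 text with a tag and a control character (BEL).
$input = "Caf\xC3\xA9 <b>menu</b>\x07";

// Remove HTML tags first.
$noTags = strip_tags($input);

// Strip bytes with a value below 32 (control characters).
$noLow = filter_var($noTags, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

// Additionally strip bytes above 127. This also removes the multi-byte
// "é", so it only suits fields that must be plain ASCII.
$asciiOnly = filter_var(
    $noTags,
    FILTER_UNSAFE_RAW,
    FILTER_FLAG_STRIP_LOW | FILTER_FLAG_STRIP_HIGH
);

echo $noLow, "\n";
echo $asciiOnly, "\n";
```

Neither call escapes anything for HTML output, so the result still needs escaping when it is printed into a page.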

Answered by Sarfraz

Hackers use many different techniques for XSS attacks, and PHP's built-in functions do not cover all of them. Hence, functions such as strip_tags, filter_var, mysql_real_escape_string, htmlentities, htmlspecialchars, etc. do not protect us 100%. You need a better mechanism; here is a solution:

function xss_clean($data)
{
    // Fix &entity\n;
    $data = str_replace(array('&amp;','&lt;','&gt;'), array('&amp;amp;','&amp;lt;','&amp;gt;'), $data);
    $data = preg_replace('/(&#*\w+)[\x00-\x20]+;/u', '$1;', $data);
    $data = preg_replace('/(&#x*[0-9A-F]+);*/iu', '$1;', $data);
    $data = html_entity_decode($data, ENT_COMPAT, 'UTF-8');

    // Remove any attribute starting with "on" or xmlns
    $data = preg_replace('#(<[^>]+?[\x00-\x20"\'])(?:on|xmlns)[^>]*+>#iu', '$1>', $data);

    // Remove javascript: and vbscript: protocols
    $data = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $data);
    $data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $data);
    $data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $data);

    // Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
    $data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
    $data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
    $data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#iu', '$1>', $data);

    // Remove namespaced elements (we do not need them)
    $data = preg_replace('#</*\w+:\w[^>]*+>#i', '', $data);

    do
    {
        // Remove really unwanted tags; repeat until nothing changes,
        // so that nested constructs such as <scr<script>ipt> are caught
        $old_data = $data;
        $data = preg_replace('#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data);
    }
    while ($old_data !== $data);

    // we are done...
    return $data;
}

Answered by opHASnoNAME

The best and most secure way is to use HTML Purifier. Follow this link for some hints on using it with Zend Framework.

HTML Purifier with Zend Framework

Answered by 3eighty

I have a similar problem. I need users to submit HTML content to a profile page with a great WYSIWYG editor (Redactorjs!), so I wrote the following function to clean the submitted HTML:

<?php
function filterxss($str) {
    // Initialize DOM:
    $dom = new DOMDocument();
    // Load content and add UTF-8 hint:
    $dom->loadHTML('<meta http-equiv="content-type" content="text/html; charset=utf-8">'.$str);
    // Array holds allowed attributes and validation rules:
    $check = array(
        'src'  => '#(http://[^\s]+(?=\.(jpe?g|png|gif)))#i',
        'href' => '|^http(s)?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i',
    );
    // Loop over all elements:
    foreach ($dom->getElementsByTagName('*') as $node) {
        // Iterate backwards, because attributes are removed while looping:
        for ($i = $node->attributes->length - 1; $i >= 0; $i--) {
            // Get the attribute:
            $attribute = $node->attributes->item($i);
            // Check if attribute is allowed:
            if (isset($check[$attribute->name])) {
                // Validate by regex:
                if (!preg_match($check[$attribute->name], $attribute->value)) {
                    // No match? Remove the attribute:
                    $node->removeAttributeNode($attribute);
                }
            } else {
                // Not allowed? Remove the attribute:
                $node->removeAttributeNode($attribute);
            }
        }
    }
    // Return the filtered markup:
    return $dom->saveHTML();
}
?>

The $check array holds all the allowed attributes and validation rules. Maybe this is useful for some of you. I haven't tested it yet, so tips are welcome.

Answered by ingnorant

function clean($data){
    $data = rawurldecode($data);
    return filter_var($data, FILTER_SANITIZE_SPECIAL_CHARS);
}

Answered by Doug Amos

htmlspecialchars() is perfectly adequate for filtering user input that is displayed in HTML forms.
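
A minimal sketch of that approach, assuming the value is echoed back into a form field. The helper name e() and the sample value are hypothetical, not part of the answer:

```php
<?php
// Hypothetical helper: escapes a value for use in HTML text or attributes.
function e(string $value): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical submitted value that tries to break out of the attribute.
$name = '"><script>alert(1)</script>';

// The escaped value stays inside the quoted attribute.
echo '<input type="text" name="name" value="' . e($name) . '">' . "\n";
```

The escaping happens at output time, so the stored value stays unchanged and can be escaped differently for other contexts.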

Answered by ymakux

None of the methods above lets you preserve some tags such as <a>, <table>, etc. There is an ultimate solution: http://sourceforge.net/projects/kses/ (Drupal uses it).
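
For comparison, strip_tags() does accept a list of allowed tags, but it keeps every attribute on the tags it preserves, which is one reason an attribute-aware allow-list filter is suggested here. The sketch below only shows that limitation. It does not show how kses works, and the sample markup is made up:

```php
<?php
// Hypothetical user-submitted markup with an event handler and a script.
$html = '<a href="/home" onclick="evil()">Home</a><script>alert(1)</script>';

// Keep <a> and <table>; every other tag is removed, but its text content
// and all attributes of the kept tags remain.
$kept = strip_tags($html, '<a><table>');

echo $kept, "\n";
```

The onclick attribute survives, so allowing tags this way is not safe on its own.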

Answered by Taras

According to www.mcafeesecure.com, a general-purpose filter function against cross-site scripting (XSS) vulnerabilities can be:

function xss_cleaner($input_str) {
    $return_str = str_replace( array('<','>',"'",'"',')','('), array('&lt;','&gt;','&apos;','&#x22;','&#x29;','&#x28;'), $input_str );
    $return_str = str_ireplace( '%3Cscript', '', $return_str );
    return $return_str;
}

Answered by Georg

I found a solution for my problem with posts containing German umlauts. To prevent the filter from completely wiping (killing) the posts, I encode the incoming data:

    $data = utf8_encode($data);
    ... function ...

Finally, I decode the output to get the correct characters:

    $data = utf8_decode($data);

Now the posts go through the filter function and I get a correct result...
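
A likely reason the posts were wiped is that regular expressions using the /u modifier fail on input that is not valid UTF-8, in which case preg_replace() returns null. The sketch below reproduces that, assuming the incoming data is ISO-8859-1. Since utf8_encode() and utf8_decode() are deprecated as of PHP 8.2, it uses iconv() for the conversion; the sample string and pattern are made up:

```php
<?php
// "Müller" encoded as ISO-8859-1 (assumed input encoding for this sketch).
$latin1 = "M\xFCller";

// With the /u modifier, invalid UTF-8 makes preg_replace() fail and
// return null, which wipes the whole string.
$failed = preg_replace('/\s+/u', ' ', $latin1);

// Convert to UTF-8 first; the same pattern then works as expected.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
$ok = preg_replace('/\s+/u', ' ', $utf8);

var_dump($failed);
var_dump($ok);

// Convert back only if the output really has to be ISO-8859-1.
$back = iconv('UTF-8', 'ISO-8859-1', $ok);
```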

Answered by user3580379

Try using this to clean XSS:

xss_clean($data): "><script>alert(String.fromCharCode(74,111,104,116,111,32,82,111,98,98,105,101))</script>