How can I sanitize user input with PHP?

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/129677/

Date: 2020-08-24 21:35:20  Source: igfitidea

How can I sanitize user input with PHP?

Tags: php, security, xss, sql-injection, user-input

Asked by Brent

Is there a catchall function somewhere that works well for sanitizing user input for SQL injection and XSS attacks, while still allowing certain types of HTML tags?

Answer by troelskn

It's a common misconception that user input can be filtered. PHP even has a (now deprecated) "feature", called magic-quotes, that builds on this idea. It's nonsense. Forget about filtering (or cleaning, or whatever people call it).

What you should do, to avoid problems, is quite simple: whenever you embed a string within foreign code, you must escape it, according to the rules of that language. For example, if you embed a string in some SQL targeting MySQL, you must escape the string with MySQL's function for this purpose (mysqli_real_escape_string). (Or, in the case of databases, using prepared statements is a better approach, when possible.)

Another example is HTML: if you embed strings within HTML markup, you must escape them with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.
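
A minimal sketch of that rule: a small helper wraps htmlspecialchars so that every echo goes through it. The helper name e() and the sample string are hypothetical, and the ENT_QUOTES | ENT_SUBSTITUTE flags and the 'UTF-8' charset are assumptions about a typical UTF-8 page, not something the answer prescribes.

```php
<?php
// Hypothetical helper: escape a value for HTML text or a quoted attribute.
function e(string $value): string
{
    // ENT_QUOTES escapes both single and double quotes;
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning ''.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("xss")</script>';
echo '<p>' . e($comment) . '</p>', PHP_EOL;
```

The markup in $comment reaches the browser as inert text because the angle brackets and quotes are turned into entities at output time.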

A third example could be shell commands: if you are going to embed strings (such as arguments) in external commands and call them with exec, then you must use escapeshellcmd and escapeshellarg.
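
As a hedged illustration of the shell case, the sketch below quotes each untrusted value with escapeshellarg before it is placed in a command line. The helper name buildGrepCommand, the grep invocation and the file name are hypothetical, and the quoting shown applies to Unix-like systems (Windows quoting differs).

```php
<?php
// Hypothetical helper: build a grep command line from untrusted values.
function buildGrepCommand(string $needle, string $file): string
{
    // escapeshellarg() quotes each value so the shell sees exactly one argument.
    // "--" tells grep that no further options follow.
    return 'grep -F -- ' . escapeshellarg($needle) . ' ' . escapeshellarg($file);
}

$command = buildGrepCommand('error; rm -rf /', 'app.log');
echo $command, PHP_EOL;
// exec($command, $outputLines, $exitCode); // run only once the command looks right
```

The semicolon in the search term stays inside the quoted argument, so the shell does not treat it as a command separator.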

And so on and so forth ...

The only case where you need to actively filter data is if you're accepting preformatted input, for example if you let your users post HTML markup that you plan to display on the site. However, you would be wise to avoid this at all costs, since no matter how well you filter it, it will always be a potential security hole.

Answer by Andy Lester

Do not try to prevent SQL injection by sanitizing input data.

Instead, do not allow data to be used in creating your SQL code. Use prepared statements (i.e. parameters in a template query) with bound variables. It is the only way to be guaranteed against SQL injection.
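
A small sketch of a prepared statement with a bound parameter follows. To stay self-contained it assumes an in-memory SQLite database through PDO (the pdo_sqlite extension); the users table, its columns and the findUsersByName helper are hypothetical.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hypothetical helper: the query text is fixed; the value travels separately.
function findUsersByName(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

var_dump(count(findUsersByName($pdo, 'alice')));       // int(1)
var_dump(count(findUsersByName($pdo, "' OR 1=1 --"))); // int(0)
```

The hostile string is compared as a plain value, so it matches no row instead of rewriting the WHERE clause.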

Please see my website http://bobby-tables.com/ for more about preventing SQL injection.

Answer by Daniel Papasian

No. You can't generically filter data without any context of what it's for. Sometimes you'd want to take a SQL query as input and sometimes you'd want to take HTML as input.

You need to filter input on a whitelist -- ensure that the data matches some specification of what you expect. Then you need to escape it before you use it, depending on the context in which you are using it.

The process of escaping data for SQL - to prevent SQL injection - is very different from the process of escaping data for (X)HTML, to prevent XSS.
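
Whitelist validation can be sketched as follows. The scenario (choosing a sort column, which cannot be bound as a query parameter) and the names ALLOWED_SORT_COLUMNS and pickSortColumn are hypothetical illustrations, not part of the answer.

```php
<?php
// Hypothetical whitelist of sortable columns for a listing page.
const ALLOWED_SORT_COLUMNS = ['name', 'created_at', 'price'];

// Return the requested column only if it is on the whitelist, else the default.
function pickSortColumn(string $requested, string $default = 'name'): string
{
    return in_array($requested, ALLOWED_SORT_COLUMNS, true) ? $requested : $default;
}

echo pickSortColumn('price'), PHP_EOL;               // price
echo pickSortColumn('price; DROP TABLE x'), PHP_EOL; // name
```

Anything outside the expected set is replaced rather than repaired, and the accepted value must still be escaped for whatever context it is used in afterwards.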

Answer by SchizoDuckie

PHP now has the nice new filter_input functions, which for instance free you from hunting for 'the ultimate e-mail regex' now that there is a built-in FILTER_VALIDATE_EMAIL type.
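
As a brief sketch, the built-in e-mail filter can be used through filter_var. The wrapper name isValidEmail and the sample addresses are hypothetical.

```php
<?php
// filter_var() returns the value when it passes the filter and false otherwise.
function isValidEmail(string $candidate): bool
{
    return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(isValidEmail('user@example.com')); // bool(true)
var_dump(isValidEmail('blah@bla.'));        // bool(false)
```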

My own filter class (uses JavaScript to highlight faulty fields) can be initiated by either an ajax request or normal form post. (see the example below)

/**
 *  Pork.FormValidator
 *  Validates arrays or properties by setting up simple arrays. 
 *  Note that some of the regexes are for dutch input!
 *  Example:
 * 
 *  $validations = array('name' => 'anything','email' => 'email','alias' => 'anything','pwd'=>'anything','gsm' => 'phone','birthdate' => 'date');
 *  $required = array('name', 'email', 'alias', 'pwd');
 *  $sanitize = array('alias');
 *
 *  $validator = new FormValidator($validations, $required, $sanitize);
 *                  
 *  if($validator->validate($_POST))
 *  {
 *      $_POST = $validator->sanitize($_POST);
 *      // now do your saving, $_POST has been sanitized.
 *      die($validator->getScript()."<script type='text/javascript'>alert('saved changes');</script>");
 *  }
 *  else
 *  {
 *      die($validator->getScript());
 *  }   
 *  
 * To validate just one element:
 * $validated = FormValidator::validateItem('blah@bla.', 'email');
 * 
 * To sanitize just one element:
 * $sanitized = FormValidator::sanitizeItem('<b>blah</b>', 'string');
 * 
 * @package pork
 * @author SchizoDuckie
 * @copyright SchizoDuckie 2008
 * @version 1.0
 * @access public
 */
class FormValidator
{
    public static $regexes = Array(
            'date' => "^[0-9]{1,2}[-/][0-9]{1,2}[-/][0-9]{4}$",
            'amount' => "^[-]?[0-9]+$",
            'number' => "^[-]?[0-9,]+$",
            'alfanum' => "^[0-9a-zA-Z ,.-_\s\?\!]+$",
            'not_empty' => "[a-z0-9A-Z]+",
            'words' => "^[A-Za-z]+[A-Za-z \s]*$",
            'phone' => "^[0-9]{10,11}$",
            'zipcode' => "^[1-9][0-9]{3}[a-zA-Z]{2}$",
            'plate' => "^([0-9a-zA-Z]{2}[-]){2}[0-9a-zA-Z]{2}$",
            'price' => "^[0-9.,]*(([.,][-])|([.,][0-9]{2}))?$",
            '2digitopt' => "^\d+(\,\d{2})?$",
            '2digitforce' => "^\d+\,\d\d$",
            'anything' => "^[\d\D]{1,}$"
    );
    private $validations, $sanitations, $mandatories, $errors, $corrects, $fields;


    public function __construct($validations=array(), $mandatories = array(), $sanitations = array())
    {
        $this->validations = $validations;
        $this->sanitations = $sanitations;
        $this->mandatories = $mandatories;
        $this->errors = array();
        $this->corrects = array();
    }

    /**
     * Validates an array of items (if needed) and returns true or false
     *
     */
    public function validate($items)
    {
        $this->fields = $items;
        $havefailures = false;
        foreach($items as $key=>$val)
        {
            // skip fields that are empty or have no validation rule, unless they are mandatory
            if((strlen($val) == 0 || !array_key_exists($key, $this->validations)) && array_search($key, $this->mandatories) === false) 
            {
                $this->corrects[] = $key;
                continue;
            }
            $result = self::validateItem($val, $this->validations[$key]);
            if($result === false) {
                $havefailures = true;
                $this->addError($key, $this->validations[$key]);
            }
            else
            {
                $this->corrects[] = $key;
            }
        }

        return(!$havefailures);
    }

    /**
     *
     *  Adds unvalidated class to those elements that are not validated. Removes them from classes that are.
     */
    public function getScript() {
        $output = ''; // initialise so the .= below works when there are no errors
        if(!empty($this->errors))
        {
            $errors = array();
            foreach($this->errors as $key=>$val) { $errors[] = "'INPUT[name={$key}]'"; }

            $output = '$$('.implode(',', $errors).').addClass("unvalidated");'; 
            $output .= "new FormValidator().showMessage();";
        }
        if(!empty($this->corrects))
        {
            $corrects = array();
            foreach($this->corrects as $key) { $corrects[] = "'INPUT[name={$key}]'"; }
            $output .= '$$('.implode(',', $corrects).').removeClass("unvalidated");';   
        }
        $output = "<script type='text/javascript'>{$output} </script>";
        return($output);
    }


    /**
     *
     * Sanitizes an array of items according to the $this->sanitations
     * sanitations will be standard of type string, but can also be specified.
     * For ease of use, this syntax is accepted:
     * $sanitations = array('fieldname', 'otherfieldname'=>'float');
     */
    public function sanitize($items)
    {
        foreach($items as $key=>$val)
        {
            if(array_search($key, $this->sanitations) === false && !array_key_exists($key, $this->sanitations)) continue;
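            // use the type given as 'field' => 'type'; plain 'field' entries default to 'string'
            $type = array_key_exists($key, $this->sanitations) ? $this->sanitations[$key] : 'string';
            $items[$key] = self::sanitizeItem($val, $type);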
        }
        return($items);
    }


    /**
     *
     * Adds an error to the errors array.
     */ 
    private function addError($field, $type='string')
    {
        $this->errors[$field] = $type;
    }

    /**
     *
     * Sanitize a single var according to $type.
     * Allows for static calling to allow simple sanitization
     */
    public static function sanitizeItem($var, $type)
    {
        $flags = 0; // filter_var() expects an int or array here, not NULL
        switch($type)
        {
            case 'url':
                $filter = FILTER_SANITIZE_URL;
            break;
            case 'int':
                $filter = FILTER_SANITIZE_NUMBER_INT;
            break;
            case 'float':
                $filter = FILTER_SANITIZE_NUMBER_FLOAT;
                $flags = FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND;
            break;
            case 'email':
                $var = substr($var, 0, 254);
                $filter = FILTER_SANITIZE_EMAIL;
            break;
            case 'string':
            default:
                // note: FILTER_SANITIZE_STRING is deprecated as of PHP 8.1
                $filter = FILTER_SANITIZE_STRING;
                $flags = FILTER_FLAG_NO_ENCODE_QUOTES;
            break;

        }
        $output = filter_var($var, $filter, $flags);        
        return($output);
    }

    /** 
     *
     * Validates a single var according to $type.
     * Allows for static calling to allow simple validation.
     *
     */
    public static function validateItem($var, $type)
    {
        if(array_key_exists($type, self::$regexes))
        {
            $returnval =  filter_var($var, FILTER_VALIDATE_REGEXP, array("options"=> array("regexp"=>'!'.self::$regexes[$type].'!i'))) !== false;
            return($returnval);
        }
        $filter = false;
        switch($type)
        {
            case 'email':
                $var = substr($var, 0, 254);
                $filter = FILTER_VALIDATE_EMAIL;    
            break;
            case 'int':
                $filter = FILTER_VALIDATE_INT;
            break;
            case 'boolean':
                $filter = FILTER_VALIDATE_BOOLEAN;
            break;
            case 'ip':
                $filter = FILTER_VALIDATE_IP;
            break;
            case 'url':
                $filter = FILTER_VALIDATE_URL;
            break;
        }
        // parenthesised: unparenthesised nested ternaries are a fatal error in PHP 8
        return ($filter === false) ? false : (filter_var($var, $filter) !== false);
    }       



}

Of course, keep in mind that you need to do your SQL query escaping too, depending on what type of database you are using (mysql_real_escape_string() is useless for SQL Server, for instance). You probably want to handle this automatically at the appropriate application layer, such as an ORM. Also, as mentioned above: for outputting to HTML, use PHP's other dedicated functions, like htmlspecialchars ;)

To really allow HTML input, for example with stripped classes and/or tags, rely on one of the dedicated XSS validation packages. DO NOT WRITE YOUR OWN REGEXES TO PARSE HTML!

Answer by Peter Bailey

No, there is not.

First of all, SQL injection is an input filtering problem, and XSS is an output escaping one - so you wouldn't even execute these two operations at the same time in the code lifecycle.

Basic rules of thumb

  • For SQL queries, bind parameters (as with PDO) or use a driver-native escaping function for query variables (such as mysql_real_escape_string())
  • Use strip_tags() to filter out unwanted HTML
  • Escape all other output with htmlspecialchars(), and be mindful of the 2nd and 3rd parameters here.
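
The last two rules can be sketched side by side. The sample input is hypothetical; the point is what strip_tags leaves behind when a tag is allowed, and what the 2nd (flags) and 3rd (encoding) parameters of htmlspecialchars look like in a call. ENT_QUOTES and 'UTF-8' are assumed choices.

```php
<?php
$input = '<b onclick="evil()">bold</b><script>alert(1)</script>';

// Allow only <b>: other tags are removed, but their text content is kept
// and attributes on allowed tags are left untouched.
$stripped = strip_tags($input, '<b>');
echo $stripped, PHP_EOL;

// Escape for HTML output; the 2nd parameter sets the quote handling and
// the 3rd sets the character encoding.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;
```

Because the onclick attribute survives strip_tags, tag stripping on its own does not make user-supplied markup safe to display.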

Answer by jasonbar

To address the XSS issue, take a look at HTML Purifier. It is fairly configurable and has a decent track record.

As for SQL injection attacks, make sure you check the user input, and then run it through mysql_real_escape_string(). The function won't defeat all injection attacks, though, so it is important that you check the data before dumping it into your query string.

A better solution is to use prepared statements. The PDO library and the mysqli extension support these.

Answer by dangel

PHP 5.2 introduced the filter_var function.

It supports a wide range of SANITIZE and VALIDATE filters.

http://php.net/manual/en/function.filter-var.php
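
A few of those filters in action, as a rough sketch. The sample values are hypothetical; the filter and flag constants are the ones documented on the page above.

```php
<?php
// FILTER_NULL_ON_FAILURE makes the boolean filter return null for
// unrecognised input instead of false.
var_dump(filter_var('yes', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE));   // bool(true)
var_dump(filter_var('off', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE));   // bool(false)
var_dump(filter_var('maybe', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE)); // NULL

// VALIDATE filters return the value on success and false on failure.
var_dump(filter_var('192.168.0.1', FILTER_VALIDATE_IP)); // string(11) "192.168.0.1"
var_dump(filter_var('999.1.1.1', FILTER_VALIDATE_IP));   // bool(false)

// SANITIZE filters strip characters instead of rejecting the value.
var_dump(filter_var('12abc3', FILTER_SANITIZE_NUMBER_INT)); // string(3) "123"
```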

Answer by Hamish Downer

One trick that can help in the specific circumstance where you have a page like /mypage?id=53 and you use the id in a WHERE clause is to ensure that id definitely is an integer, like so:

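if (isset($_GET['id'])) {
  $id = $_GET['id'];
  settype($id, 'integer'); // coerce to int; non-numeric input becomes 0
  // mysql_query() was removed in PHP 7.0, so mysqli is used here instead;
  // $mysqli is assumed to be an already open mysqli connection
  $result = $mysqli->query("SELECT * FROM mytable WHERE id = $id");
  # now use the result
}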

But of course that only cuts out one specific attack, so read all the other answers. (And yes I know that the code above isn't great, but it shows the specific defence.)
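
One possible variation, sketched under the assumption that rejecting bad input is preferable to coercing it: settype turns 'abc' into 0 and '53 OR 1=1' into 53, whereas FILTER_VALIDATE_INT refuses both. The helper name parseId and the min_range of 1 are hypothetical choices.

```php
<?php
// Hypothetical helper: return a positive integer id, or null if the input is not one.
function parseId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

var_dump(parseId('53'));        // int(53)
var_dump(parseId('53 OR 1=1')); // NULL
var_dump(parseId('abc'));       // NULL
```

A caller can then answer with a 400 or 404 response when parseId returns null instead of querying with id 0.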

Answer by Mark Martin

Methods for sanitizing user input with PHP:

  • Use modern versions of MySQL and PHP.

  • Set the charset explicitly:

    • $mysqli->set_charset("utf8");
    • $pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);
    • $pdo->exec("set names utf8");
    • $pdo = new PDO(
      "mysql:host=$host;dbname=$db", $user, $pass, 
      array(
      PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
      PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
      )
      );
    • mysql_set_charset('utf8')
      [deprecated in PHP 5.5.0, removed in PHP 7.0.0].
  • Use secure charsets:

    • Select utf8, latin1, ascii, etc.; don't use the vulnerable charsets big5, cp932, gb2312, gbk, sjis.
  • Use specialized functions:

    • MySQLi prepared statements:
      $stmt = $mysqli->prepare('SELECT * FROM test WHERE name = ? LIMIT 1'); 
      $param = "' OR 1=1 /*";
      $stmt->bind_param('s', $param);
      $stmt->execute();
    • PDO::quote() - places quotes around the input string (if required) and escapes special characters within the input string, using a quoting style appropriate to the underlying driver:

      // explicitly set the character set
      $pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);
      // disable emulated prepared statements to prevent fallback to emulating statements that MySQL can't prepare natively (to prevent injection)
      $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
      // not only escapes the literal, but also quotes it (in single-quote ' characters)
      $var = $pdo->quote("' OR 1=1 /*");
      $stmt = $pdo->query("SELECT * FROM test WHERE name = $var LIMIT 1");

    • PDO prepared statements: compared with MySQLi prepared statements, PDO supports more database drivers and named parameters:

      // explicitly set the character set
      $pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);
      // disable emulated prepared statements to prevent fallback to emulating statements that MySQL can't prepare natively (to prevent injection)
      $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
      $stmt = $pdo->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
      $stmt->execute(["' OR 1=1 /*"]);

    • mysql_real_escape_string [deprecated in PHP 5.5.0, removed in PHP 7.0.0].
    • mysqli_real_escape_string - escapes special characters in a string for use in an SQL statement, taking into account the current charset of the connection. Prepared statements are still recommended, because they are not simply escaped strings: a statement comes with a complete query execution plan, including which tables and indexes it would use, which makes it an optimized approach.
    • Use single quotes (' ') around your variables inside your query.
  • Check that the variable contains what you expect:

    • If you are expecting an integer, use:
      ctype_digit - Check for numeric character(s);
      $value = (int) $value;
      $value = intval($value);
      $var = filter_var('0755', FILTER_VALIDATE_INT, $options);
    • For strings, use:
      is_string() - Find whether the type of a variable is string

      Use the filter function filter_var() - filters a variable with a specified filter:
      $email = filter_var($email, FILTER_SANITIZE_EMAIL);
      $newstr = filter_var($str, FILTER_SANITIZE_STRING); // deprecated as of PHP 8.1
    • filter_input() - Gets a specific external variable by name and optionally filters it:
      $search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
    • preg_match() - Perform a regular expression match;
    • Write your own validation function.
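
A hand-written validation function built on preg_match might look like the sketch below. The rule itself (3 to 20 letters, digits or underscores) and the name isValidUsername are hypothetical.

```php
<?php
// Hypothetical validator: 3-20 characters, letters, digits or underscore only.
function isValidUsername(string $value): bool
{
    // \A and \z anchor the whole string; unlike $, \z does not match
    // before a trailing newline.
    return preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $value) === 1;
}

var_dump(isValidUsername('brent_42'));  // bool(true)
var_dump(isValidUsername("brent\n"));   // bool(false)
var_dump(isValidUsername("x' OR 1=1")); // bool(false)
```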

Answer by Andrew

What you are describing here is two separate issues:

  1. Sanitizing / filtering of user input data.
  2. Escaping output.

1) User input should always be assumed to be bad.

Using prepared statements and/or filtering with mysql_real_escape_string is definitely a must. PHP also has filter_input built in, which is a good place to start.

2) This is a large topic, and it depends on the context of the data being output. For HTML there are solutions such as htmlpurifier out there. As a rule of thumb, always escape anything you output.

Both issues are far too big to go into in a single post, but there are lots of posts which go into more detail:

Methods PHP output

Safer PHP output
