PHP: When to use filter_input()
Original question: http://stackoverflow.com/questions/15102796/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
When to use filter_input()
Asked by Jonathan
This question was originally asked in a comment here.
Is filter_input() still necessary if you're using parameterized queries and htmlspecialchars() before you print any user-supplied data?
It seems unnecessary to me, but I've always been told to "Filter Input, Escape Output". So, aside from a database (or another form of storage), is there any need to filter inputted data?
Answered by Sverri M. Olsen
Well, there are going to be differing opinions.
My take is that you should always use it (or the filter extension in general). There are at least 3 reasons for this:
1. Sanitizing input is something you should always do. Since the function gives you this capability, there is really no reason to find other ways of sanitizing input. Since it is an extension, the filter will also be much faster and most likely safer than most PHP solutions out there, which certainly does not hurt. The only exception is if you need a more specialized filter. Even then you should grab the value using the FILTER_UNSAFE_RAW filter (see #3).
2. There are a lot of goodies in the filter extension. It can save you hours of writing sanitizing and validation code. Of course, it does not cover every single case, but there is enough so that you can focus more on specific filtering/validating code.
3. Using the function is very good for when you are debugging/auditing your code. When the function is used you know exactly what the input will be. For example, if you use the FILTER_SANITIZE_NUMBER_INT filter then you can be sure that the input will be a number -- no SQL injections, no HTML or JavaScript code, etc. If you, on the other hand, use something like FILTER_UNSAFE_RAW then you know that it should be treated carefully, and that it can easily cause security problems.
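As a rough illustration of the third point, the sketch below runs the same value through three of the filters named in this answer. It uses filter_var() on a hard-coded, hypothetical string so that it can run from the command line; filter_input(INPUT_GET, 'id', ...) applies the same filters to request data. Note that FILTER_SANITIZE_NUMBER_INT keeps plus and minus signs as well as digits, so the result is not guaranteed to be a well-formed integer.

```php
<?php
// Hypothetical request value; in a real page this would come from
// filter_input(INPUT_GET, 'id', ...) rather than a hard-coded string.
$raw = "42; DROP TABLE users";

// Sanitize: strips every character except digits, "+" and "-".
$id_sanitized = filter_var($raw, FILTER_SANITIZE_NUMBER_INT);

// Validate: returns an int on success, or false if the value is not an integer.
$id_validated = filter_var($raw, FILTER_VALIDATE_INT);

// Raw: the value comes back unchanged, so it must still be treated as unsafe.
$id_raw = filter_var($raw, FILTER_UNSAFE_RAW);

var_dump($id_sanitized, $id_validated, $id_raw);
```

The variable names make it visible at a glance which values have been filtered and which have not, which is the auditing benefit described above.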
Answered by toxalot
As Sverri M. Olsen says, there are differing opinions on this.
I agree very much with the philosophy Filter Input, Escape Output.
Is filter_input() still necessary if you're using parameterized queries and htmlspecialchars() before you print any user-supplied data?
Short answer: IMO, no. It's not necessary, but it can be useful in some cases.
The filter_input function has many useful filters, and I do use some of them (e.g. FILTER_VALIDATE_EMAIL). The validate filters are useful for validating input. However, IMO, the ones that transform data should only be used on output.
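To make the distinction concrete, here is a minimal sketch of a validate filter in use. A validate filter either returns the value or returns false, so it does not rewrite the data the way a sanitize filter can. The addresses are hypothetical, and filter_var() stands in for filter_input() so the snippet runs without a web request.

```php
<?php
// Hypothetical form values; a real page would read them with
// filter_input(INPUT_POST, 'email', FILTER_VALIDATE_EMAIL).
$good = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$bad  = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);

// On success the original string is returned; on failure the result is false.
var_dump($good, $bad);

if ($bad === false) {
    echo "Rejected: not a valid e-mail address\n";
}
```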
Some people encourage escaping input. Indeed, the examples given on the filter_input manual page seem to encourage this as well.
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
$search_url = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_ENCODED);
The only examples are for escaping. That, combined with the name of the function (filter_input), seems to suggest that escaping input is good practice. Escaping is necessary but, IMO, should be done before output, not on input. At least the return values are being stored in appropriately named variables.
I strongly disagree with escaping input. I've already come across real world situations where transforming data too early is a problem.
For example, Google Analytics processes input in a way that causes my encoded ampersands (%26) to be decoded before query parameters are excluded. The result is that I have stats for query parameters that don't actually exist in my URLs. See my question regarding this issue, which remains unsolved.
You may also want to read Why escape-on-input is a bad idea. Here are some excerpts that I agree with, just in case the article disappears [emphasis in the original].
[...] escape-on-input is just wrong[...] it is a layering violation — it mixes an output formatting concern into input handling. Layering violations make your code much harder to understand and maintain, because you have to take into account other layers instead of letting each component and layer do its own job.
and
You have corrupted your data by default. The system [...] is now lying about what data has come in.
and
Escaping on input will not only fail to deal with the problems of more than one output, it will actually make your data incorrect for many outputs.
and
PHP used to have a feature called magic quotes. It was an escape-on-input feature that [...] caused all kinds of problems. [...] According to Lerdorf, the much newer PHP 'filter' extension is "magic_quotes done right". But it still suffers from almost all the problems described here.
So how is the filter extension better than magic quotes (other than the fact that it has many different filters)? The filters cause many of the same issues that magic quotes did.
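The problem described in the excerpts can be reproduced in a few lines. The sketch below is only an illustration with a hypothetical input value: it escapes on input with a sanitize filter, then shows what happens when that stored value later reaches an HTML template and a query string, compared with escaping the raw value once per output.

```php
<?php
// Hypothetical user input, standing in for a request value.
$raw = 'Tom & Jerry';

// Escape-on-input: the stored value is no longer what the user typed.
$stored = filter_var($raw, FILTER_SANITIZE_SPECIAL_CHARS);

// Later, a template escapes on output (as it should), so the ampersand
// of the entity is escaped a second time.
$double_html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// A query string built from the stored value carries the HTML entity along.
$wrong_url = rawurlencode($stored);

// Escaping the raw value once, separately for each output context.
$name_html = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
$name_url  = rawurlencode($raw);

var_dump($stored, $double_html, $wrong_url, $name_html, $name_url);
```

Keeping $raw untouched and deriving $name_html and $name_url from it means each output gets exactly one layer of the escaping it needs.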
Here are the coding conventions I use:
- values in $_POST, $_GET, $_REQUEST, etc. should not be escaped and should always be considered unsafe
- values should be validated [1] before being written to database or stored in $_SESSION
- values expected to be numeric or boolean should be sanitized [2] before being written to database or stored in $_SESSION
- trust that numeric and boolean values from database and $_SESSION are indeed numeric or boolean
- string values should be SQL-escaped before being used directly in any SQL query (non-string values should be sanitized [2]) or use prepared statements
- string values should be HTML-escaped before being used in HTML output (non-string values should be sanitized [2])
- string values should be percent-encoded before being used in query strings (non-string values should be sanitized [2])
- use a variable naming convention (such as *_url, *_html, *_sql) to store transformed data
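The conventions above might look roughly like this in practice. This is a sketch under assumptions, not the author's actual code: the values and the names $q, $page, $page_num and $link_html are hypothetical, and the database step is only described in a comment.

```php
<?php
// Hypothetical raw values, standing in for $_GET['q'] and $_GET['page'].
// They are left unescaped and treated as unsafe.
$q    = 'fish & chips <b>';
$page = '3';

// Numeric value: confirm it is exactly an integer >= 1, else fall back to 1.
$page_num = filter_var($page, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
if ($page_num === false) {
    $page_num = 1;
}

// For SQL, $q would be bound as a parameter of a prepared statement,
// so no hand-escaped $q_sql variable is needed.

// Transformed copies are stored under names that say what they are safe for.
$q_html = htmlspecialchars($q, ENT_QUOTES, 'UTF-8'); // for HTML output
$q_url  = rawurlencode($q);                          // for query strings

$link_html = '<a href="?q=' . $q_url . '&amp;page=' . $page_num . '">' . $q_html . '</a>';
echo $link_html, "\n";
```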
Terminology
For my purposes here, this is how I define the terms used above.
- [1] to validate means to confirm any assumptions being made about the data, such as having a specific format or required fields having a value
- [2] to sanitize means to confirm values are exactly as expected (e.g. $id_num should contain nothing but digits)
Summary
In general (there may be some exceptions), I'd recommend the following:
- use validate filters on input
- use sanitize filters on output
- remember TIMTOWTDI (there is more than one way to do it). For example, I prefer htmlspecialchars() (which has more options) over FILTER_SANITIZE_FULL_SPECIAL_CHARS or FILTER_SANITIZE_SPECIAL_CHARS (which escapes line breaks)
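The line-break difference mentioned in the last point can be checked with a short sketch. The sample text is hypothetical; the comparison relies on FILTER_SANITIZE_SPECIAL_CHARS encoding characters with an ASCII value below 32, which includes the newline.

```php
<?php
// Hypothetical multi-line user text.
$comment = "line one\nline two & <more>";

// htmlspecialchars() escapes the HTML-special characters and leaves the newline alone.
$comment_html = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// FILTER_SANITIZE_SPECIAL_CHARS also encodes the newline as an entity.
$comment_filtered = filter_var($comment, FILTER_SANITIZE_SPECIAL_CHARS);

var_dump($comment_html, $comment_filtered);
```

One practical consequence is that nl2br() still finds the newline in $comment_html, while in $comment_filtered there is no literal newline left for it to convert.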

