Do htmlspecialchars and mysql_real_escape_string keep my PHP code safe from injection?
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/110575/
Asked by Cheekysoft
Earlier today a question was asked regarding input validation strategies in web apps.
The top answer, at time of writing, suggests just using htmlspecialchars and mysql_real_escape_string in PHP.
My question is: Is this always enough? Is there more we should know? Where do these functions break down?
Answered by Cheekysoft
When it comes to database queries, always try to use prepared parameterised queries. The mysqli and PDO libraries support this. This is infinitely safer than using escaping functions such as mysql_real_escape_string.
Yes, mysql_real_escape_string is effectively just a string escaping function. It is not a magic bullet. All it will do is escape dangerous characters so that they are safe to use in a single query string. However, if you do not sanitise your inputs beforehand, then you will be vulnerable to certain attack vectors.
Imagine the following SQL:
$result = "SELECT fields FROM table WHERE id = ".mysql_real_escape_string($_POST['id']);
You should be able to see that this is vulnerable to exploit.
Imagine the id parameter contained the common attack vector:
1 OR 1=1
There are no risky chars in there to escape, so it will pass straight through the escaping filter, leaving us:
SELECT fields FROM table WHERE id= 1 OR 1=1
Which is a lovely SQL injection vector and would allow the attacker to return all the rows. Or
1 or is_admin=1 order by id limit 1
which produces
SELECT fields FROM table WHERE id=1 or is_admin=1 order by id limit 1
Which allows the attacker to return the first administrator's details in this completely fictional example.
Whilst these functions are useful, they must be used with care. You need to ensure that all web inputs are validated to some degree. In this case, we can be exploited because we didn't check that a variable we were using as a number was actually numeric. In PHP you should make wide use of the functions that check whether inputs are integers, floats, alphanumeric etc. But when it comes to SQL, heed most of all the value of the prepared statement. The above code would have been secure as a prepared statement, because the database functions would have known that 1 OR 1=1 is not a valid literal.
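As a rough illustration of the prepared-statement point, here is a minimal sketch using PDO. It assumes the pdo_sqlite extension is available so that an in-memory database can stand in for a real one; the users table, its columns, and the findById helper are hypothetical names chosen for the example.

```php
<?php
// Sketch only: an in-memory SQLite database stands in for the real one,
// and the "users" table and its columns are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob')");

/**
 * Look up rows by id with a bound parameter instead of string concatenation.
 */
function findById(PDO $pdo, $id): array
{
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE id = :id');
    // The value is passed as data, so it never becomes part of the SQL text.
    $stmt->execute([':id' => $id]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$rows  = findById($pdo, '1 OR 1=1'); // treated as one value, not as SQL
$legit = findById($pdo, '1');

printf("payload rows: %d, legitimate rows: %d\n", count($rows), count($legit));
```

Because the whole payload is compared against the id column as a single value, the OR clause is never parsed as SQL and cannot widen the result to every row. How a non-numeric value is compared with an integer column still depends on the database, so validating the input first remains worthwhile.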
As for htmlspecialchars(), that's a minefield of its own.
There's a real problem in PHP in that it has a whole selection of different html-related escaping functions, and no clear guidance on exactly which functions do what.
Firstly, if you are inside an HTML tag, you are in real trouble. Look at
echo '<img src= "' . htmlspecialchars($_GET['imagesrc']) . '" />';
We're already inside an HTML tag, so we don't need < or > to do anything dangerous. Our attack vector could just be javascript:alert(document.cookie)
Now the resultant HTML looks like
<img src= "javascript:alert(document.cookie)" />
The attack gets straight through.
It gets worse. Why? Because htmlspecialchars (when called this way, with the default flags before PHP 8.1) only encodes double quotes and not single quotes. So if we had
echo "<img src= '" . htmlspecialchars($_GET['imagesrc']) . "' />";
Our evil attacker can now inject whole new attributes
pic.png' onclick='location.href=xxx' onmouseover='...
gives us
<img src='pic.png' onclick='location.href=xxx' onmouseover='...' />
In these cases, there is no magic bullet; you just have to sanitise the input yourself. If you try to filter out bad characters you will surely fail. Take a whitelist approach and only let through the chars which are good. Look at the XSS cheat sheet for examples of how diverse vectors can be.
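One possible shape for the whitelist approach, applied to the image example above, is sketched below. The safeImageName and renderImage helpers, the /images/ path prefix, and the set of allowed extensions are all assumptions made for illustration, not something the answer prescribes.

```php
<?php
/**
 * Hypothetical helper: accept only plain image file names such as "pic.png".
 * Returns the name if it matches the whitelist, or null otherwise.
 */
function safeImageName(string $input): ?string
{
    // Letters, digits, underscore and hyphen, then one allowed extension.
    if (preg_match('/\A[A-Za-z0-9_-]+\.(?:png|jpe?g|gif)\z/', $input) === 1) {
        return $input;
    }
    return null;
}

/**
 * Hypothetical renderer: emits the tag only for whitelisted names.
 */
function renderImage(string $input): string
{
    $name = safeImageName($input);
    if ($name === null) {
        return ''; // reject rather than try to repair bad input
    }
    // The "/images/" prefix is an assumed location for illustration.
    return '<img src="/images/' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '" />';
}

echo renderImage('pic.png'), "\n";
echo renderImage('javascript:alert(document.cookie)'), "\n";
echo renderImage("pic.png' onclick='location.href=xxx"), "\n";
```

Only the first call produces a tag. The other two inputs contain characters outside the whitelist (a colon, parentheses, quotes, spaces), so they are rejected outright instead of being cleaned up.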
Even if you use htmlspecialchars($string) outside of HTML tags, you are still vulnerable to multi-byte charset attack vectors.
The most effective approach is to use a combination of mb_convert_encoding and htmlentities, as follows.
$str = mb_convert_encoding($str, 'UTF-8', 'UTF-8');
$str = htmlentities($str, ENT_QUOTES, 'UTF-8');
Even this leaves IE6 vulnerable, because of the way it handles UTF. However, you could fall back to a more limited encoding, such as ISO-8859-1, until IE6 usage drops off.
For a more in-depth study of the multibyte problems, see https://stackoverflow.com/a/12118602/1820
Answered by MarkR
In addition to Cheekysoft's excellent answer:
- Yes, they will keep you safe, but only if they're used absolutely correctly. Use them incorrectly and you will still be vulnerable, and may have other problems (for example data corruption)
- Please use parameterised queries instead (as stated above). You can use them through e.g. PDO or via a wrapper like PEAR DB
- Make sure that magic_quotes_gpc and magic_quotes_runtime are off at all times, and never get accidentally turned on, not even briefly. These are an early and deeply misguided attempt by PHP's developers to prevent security problems (which destroys data)
There isn't really a silver bullet for preventing HTML injection (e.g. cross site scripting), but you may be able to achieve it more easily if you're using a library or templating system for outputting HTML. Read the documentation for that for how to escape things appropriately.
In HTML, things need to be escaped differently depending on context. This is especially true of strings being placed into Javascript.
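To make the point about context concrete, here is a small sketch that escapes the same untrusted value for two different contexts. The $name value is a made-up payload, and using json_encode with the JSON_HEX_* flags for the script context is one common approach, not necessarily what a given templating library does.

```php
<?php
// Hypothetical untrusted value used in two different output contexts.
$name = "</script><script>alert('x')</script>";

// HTML text or attribute context: encode the HTML metacharacters,
// including both quote types.
$forHtml = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// JavaScript context inside a <script> block: emit a JSON string literal
// and hex-escape characters that could terminate the script element.
$forJs = json_encode($name, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);

echo '<p>Hello, ' . $forHtml . '</p>' . "\n";
echo '<script>var name = ' . $forJs . ';</script>' . "\n";
```

Swapping the two escapers would be a mistake in either direction: HTML entities are not decoded inside a script block, and a JSON literal does nothing to protect HTML markup.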
Answered by BrilliantWinter
I would definitely agree with the above posts, but I have one small thing to add in reply to Cheekysoft's answer, specifically:
When it comes to database queries, always try and use prepared parameterised queries. The mysqli and PDO libraries support this. This is infinitely safer than using escaping functions such as mysql_real_escape_string.
Yes, mysql_real_escape_string is effectively just a string escaping function. It is not a magic bullet. All it will do is escape dangerous characters in order that they can be safe to use in a single query string. However, if you do not sanitise your inputs beforehand, then you will be vulnerable to certain attack vectors.
Imagine the following SQL:
$result = "SELECT fields FROM table WHERE id = ".mysql_real_escape_string($_POST['id']);
You should be able to see that this is vulnerable to exploit. Imagine the id parameter contained the common attack vector:
1 OR 1=1
There's no risky chars in there to encode, so it will pass straight through the escaping filter. Leaving us:
SELECT fields FROM table WHERE id = 1 OR 1=1
I coded up a quick little function that I put in my database class that strips out anything that isn't a number. It uses preg_replace, so there is probably a more optimized function, but it works in a pinch...
function Numbers($input) {
    // Strip every character that is not a digit.
    $input = preg_replace("/[^0-9]/", "", $input);
    // Fall back to 0 when nothing is left.
    if ($input == '') $input = 0;
    return $input;
}
So instead of using
$result = "SELECT fields FROM table WHERE id = ".mysql_real_escape_string("1 OR 1=1");
I would use
$result = "SELECT fields FROM table WHERE id = ".Numbers("1 OR 1=1");
and it would safely run the query
SELECT fields FROM table WHERE id = 111
Sure, that just stopped it from displaying the correct row, but I don't think that is a big issue for whoever is trying to inject SQL into your site ;)
Answered by Lucas Oman
An important piece of this puzzle is contexts. Someone sending "1 OR 1=1" as the ID is not a problem if you quote every argument in your query:
$result = "SELECT fields FROM table WHERE id='".mysql_real_escape_string($_GET['id'])."'";
Which results in:
SELECT fields FROM table WHERE id='1 OR 1=1'
which is ineffectual. Since you're escaping the string, the input cannot break out of the string context. I've tested this as far as version 5.0.45 of MySQL, and using a string context for an integer column does not cause any problems.
Answered by cnizzardini
$result = "SELECT fields FROM table WHERE id = ".(INT) $_GET['id'];
Works well, even better on 64 bit systems. Beware of your system's limitations on handling large numbers though, but for database ids this works great 99% of the time.
You should be using a single function/method for cleaning your values as well, even if this function is just a wrapper for mysql_real_escape_string(). Why? Because one day, when an exploit for your preferred method of cleaning data is found, you only have to update it in one place, rather than doing a system-wide find and replace.
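A single cleaning function for integer ids might look something like the sketch below. The cleanId name is hypothetical, and using filter_var with FILTER_VALIDATE_INT is a swapped-in technique chosen for the example; it rejects malformed input, whereas the plain (int) cast shown above silently keeps the leading digits.

```php
<?php
/**
 * Hypothetical single entry point for cleaning an integer id.
 * Returns the integer, or null when the input is not a plain integer.
 */
function cleanId($value): ?int
{
    $id = filter_var($value, FILTER_VALIDATE_INT);
    return $id === false ? null : $id;
}

var_dump(cleanId('42'));        // a well-formed id is returned as an integer
var_dump(cleanId('1 OR 1=1'));  // a malformed id is rejected with null
var_dump((int) '1 OR 1=1');     // the cast keeps the leading "1" instead
```

If the validation strategy ever needs to change, only the body of this one function has to be updated, and every caller picks up the fix.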
Answered by Jarett L
why, oh WHY, would you not include quotes around user input in your sql statement? seems quite silly not to! including quotes in your sql statement would render "1 or 1=1" a fruitless attempt, no?
so now, you'll say, "what if the user includes a quote (or double quotes) in the input?"
well, easy fix for that: just remove user input'd quotes. eg: input =~ s/'//g;. now, it seems to me anyway, that user input would be secured...

