Original question: http://stackoverflow.com/questions/2077576/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
PHP & mySQL: When exactly to use htmlentities?
Asked by Devner
PLATFORM: PHP & MySQL
For experimentation purposes, I have tried out a few XSS injections on my own website. Consider a situation where my form has a textarea input. As this is a textarea, I am able to enter text and all sorts of (English) characters. Here are my observations:
A). If I apply only strip_tags and mysql_real_escape_string and do not use htmlentities on my input just before inserting the data into the database, the query breaks and, due to the abnormal termination, I am hit with an error that shows my table structure.
B). If I apply strip_tags, mysql_real_escape_string and htmlentities on my input just before inserting the data into the database, the query does NOT break and I am able to successfully insert data from the textarea into my database.
So I do understand that htmlentities must be used at all costs, but I am unsure when exactly it should be used. With the above in mind, I would like to know:
1. When exactly should htmlentities be used? Should it be used just before inserting the data into the DB, or should I somehow get the data into the DB and then apply htmlentities when I am trying to show the data from the DB?
2. If I follow the method described in point B) above (which I believe is the most obvious and efficient solution in my case), do I still need to apply htmlentities when I am trying to show the data from the DB? If so, why? If not, why not? I ask this because it's really confusing for me after I have gone through the post at: http://shiflett.org/blog/2005/dec/google-xss-example
3. Then there is one more PHP function called html_entity_decode. Can I use that to show my data from the DB (after following my procedure as indicated in point B), as htmlentities was applied on my input? Which one should I prefer, html_entity_decode or htmlentities, and when?
PREVIEW PAGE:
I thought it might help to add some more specific details of a specific situation here. Consider that there is a 'Preview' page. When I submit the input from a textarea, the Preview page receives the input and shows it as HTML and, at the same time, a hidden input collects this input. When the submit button on the Preview page is hit, the data from the hidden input is POSTed to a new page, and that page inserts the data contained in the hidden input into the DB. If I do not apply htmlentities when the form is initially submitted (but apply only strip_tags and mysql_real_escape_string) and there's a malicious input in the textarea, the hidden input is broken and the last few characters of the hidden input are visible on the page as " />, which is undesirable. So, keeping this in mind, I need to do something to preserve the integrity of the hidden input on the Preview page and yet collect the data in the hidden input without breaking it. How do I go about this? Apologies for the delay in posting this info.
Thank you in advance.
Answered by nickf
Here's the general rule of thumb.
Escape variables at the last possible moment.
You want your variables to be clean representations of the data. That is, if you are trying to store the last name of someone named "O'Brien", then you definitely don't want these:
O&#39;Brien
O\'Brien
.. because, well, that's not his name: there are no ampersands or slashes in it. When you take that variable and output it in a particular context (e.g. insert it into an SQL query, or print it to an HTML page), that is when you modify it.
$name = "O'Brien";
$sql = "SELECT * FROM people "
. "WHERE lastname = '" . mysql_real_escape_string($name) . "'";
$html = "<div>Last Name: " . htmlentities($name, ENT_QUOTES) . "</div>";
You never want to have htmlentities-encoded strings stored in your database. What happens when you want to generate a CSV or PDF, or anything which isn't HTML?
Keep the data clean, and only escape for the specific context of the moment.
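The rule above can be sketched as follows. This is a minimal illustration rather than part of the original answer: the same clean value is kept in one variable and escaped separately for each output context. The array key and the choice of JSON as the non-HTML context are assumptions made for the example.

```php
<?php
$name = "O'Brien"; // the clean value, exactly as it would be stored

// HTML context: encode characters that are special in HTML.
$html = '<div>Last Name: ' . htmlentities($name, ENT_QUOTES, 'UTF-8') . '</div>';

// JSON context: json_encode applies JSON's own escaping rules instead.
$json = json_encode(['lastname' => $name]);

// Plain-text context (a log line, a text e-mail): no escaping at all.
$text = 'Last Name: ' . $name;

echo $html, PHP_EOL; // <div>Last Name: O&#039;Brien</div>
echo $json, PHP_EOL; // {"lastname":"O'Brien"}
echo $text, PHP_EOL; // Last Name: O'Brien
```

Because $name itself is never modified, each output format gets exactly the escaping it needs and nothing more.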
Answered by John Parker
In essence, you should use mysql_real_escape_string prior to database insertion (to prevent SQL injection) and then htmlentities, etc. at the point of output.
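Note that mysql_real_escape_string belongs to the old mysql extension, which was removed in PHP 7.0. On current PHP versions the same idea (protect the query at the point of the query, encode for HTML at the point of output) is usually expressed with prepared statements. A minimal sketch, assuming PDO with an in-memory SQLite database standing in for MySQL so that it runs without a server; the people table and its column are made up for illustration:

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; requires pdo_sqlite.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE people (lastname TEXT)'); // hypothetical table

$name = "O'Brien"; // raw, unmodified user input

// The placeholder keeps the value out of the SQL text entirely, so no
// manual escaping is needed and the stored value stays clean.
$insert = $pdo->prepare('INSERT INTO people (lastname) VALUES (:lastname)');
$insert->execute([':lastname' => $name]);

$select = $pdo->prepare('SELECT lastname FROM people WHERE lastname = :lastname');
$select->execute([':lastname' => $name]);
$stored = $select->fetchColumn();

// Encode for HTML only now, at the point of output.
$html = '<div>Last Name: ' . htmlentities($stored, ENT_QUOTES, 'UTF-8') . '</div>';
echo $html, PHP_EOL;
```

With a real MySQL server only the DSN passed to the PDO constructor would differ; the prepare/execute calls stay the same.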
You'll also want to apply sanity checking to all user input to ensure (for example) that numerical values are really numeric, etc. Functions such as is_int, is_float, etc. are useful at this point. (See the variable handling functions section of the PHP manual for more information on these functions and other similar ones.)
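One caveat when applying this to form data: values in $_GET and $_POST always arrive as strings, so is_int('42') is false even though the input looks numeric. For raw request values, is_numeric or filter_var are the usual tools. A small sketch, where parse_age and its 0 to 150 range are hypothetical and chosen only for illustration:

```php
<?php
// Hypothetical helper: returns the age as an int, or null if the raw
// input is not a whole number in the assumed range 0..150.
function parse_age(string $raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    return $age === false ? null : $age;
}

var_dump(is_int('42'));       // bool(false): form input is a string
var_dump(is_numeric('42'));   // bool(true)
var_dump(parse_age('42'));    // int(42)
var_dump(parse_age('42abc')); // NULL
var_dump(parse_age('999'));   // NULL (outside the assumed range)
```

Validation of this kind complements escaping; it does not replace it, since a value can be perfectly valid and still contain characters that are special in SQL or HTML.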
Answered by BarsMonster
1. Only just before you print a value (whether from the DB or from $_GET/$_POST) into HTML. htmlentities has nothing to do with the database.
2. B is overkill. You should apply mysql_real_escape_string before inserting into the DB, and htmlentities before printing to HTML. You don't need to strip tags: after htmlentities, tags are displayed on screen as literal text such as <br />.
Theoretically you may apply htmlentities before inserting into the DB, but this might make further data processing harder if you later need the original text.
3. See above
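The Preview page from the question is a case of the same rule: the hidden input's value attribute is HTML output, so the raw text is encoded at the moment it is printed there. A minimal sketch, assuming $comment stands in for the raw textarea value and that the field is named comment:

```php
<?php
// Assumption: $comment stands in for the raw value from $_POST.
$comment = 'He said "hi" /><script>alert(1)</script>';

// Unescaped, the first double quote ends the value attribute early and
// the rest of the input spills into the page.
$broken = '<input type="hidden" name="comment" value="' . $comment . '" />';

// Encoded for the HTML attribute context, the value stays intact.
$encoded = htmlentities($comment, ENT_QUOTES, 'UTF-8');
$safe = '<input type="hidden" name="comment" value="' . $encoded . '" />';

// A browser decodes the entities when it submits the form, so the next
// page receives the original text; html_entity_decode simulates that here.
$roundTrip = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $safe, PHP_EOL;
var_dump($roundTrip === $comment); // bool(true)
```

The page that finally inserts the data therefore still receives the clean text and only has to protect the SQL query; nothing encoded ends up in the database.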
Answered by netrox
I've been through this before and learned two important things:
If you're getting values from $_POST/$_GET/$_REQUEST and plan to add to DB, use mysql_real_escape_string function to sanitize the values. Do not encode them with htmlentities.
Why not just encode them with htmlentities and put them in the database? Well, here's the thing - the goal is to keep the data as meaningful and clean as possible. When you encode the data with htmlentities (using ENT_QUOTES), Jeff's Dog becomes Jeff&#039;s Dog, and the data loses its meaning. And if you decide to implement REST services and you fetch that string from the DB and put it in JSON, it'll come out as Jeff&#039;s Dog, which isn't pretty. You'd have to add another function to decode it as well.
Suppose you want to search for "Jeff's Dog" using the SQL "select * from table where field='Jeff\'s Dog'"; you won't find it, since "Jeff's Dog" does not match "Jeff&#039;s Dog". Bad, eh?
To output alphanumeric strings (from CHAR type) to a webpage, use htmlentities - ALWAYS!
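The problems described above are easy to reproduce. In this sketch the $rows array is a hypothetical stand-in for a database column that was (wrongly) stored HTML-encoded; everything else is standard PHP:

```php
<?php
$clean   = "Jeff's Dog";
$encoded = htmlentities($clean, ENT_QUOTES, 'UTF-8'); // Jeff&#039;s Dog

// Hypothetical rows standing in for a column stored HTML-encoded.
$rows = [$encoded];

// An exact-match search for the real text finds nothing.
$found = in_array($clean, $rows, true);

// JSON consumers see the entity, not the apostrophe.
$json = json_encode(['title' => $encoded]);

// html_entity_decode recovers the text, but only as an extra clean-up
// step that every non-HTML consumer would have to remember.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

var_dump($found);            // bool(false)
echo $json, PHP_EOL;         // {"title":"Jeff&#039;s Dog"}
var_dump($decoded === $clean); // bool(true)
```

This is also the practical answer to the html_entity_decode part of the question: it is only needed to undo encoding that should not have been stored in the first place.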
