How to prevent XSS with HTML/PHP?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1996122/

Time: 2020-08-25 04:35:33  Source: igfitidea

php xss

Asked by TimTim

How do I prevent XSS (cross-site scripting) using just HTML and PHP?

I've seen numerous other posts on this topic, but I have not found an article that clearly and concisely states how to actually prevent XSS.

Answered by Alix Axel

Basically you need to use the function htmlspecialchars() whenever you want to output something to the browser that came from user input.

The correct way to use this function is something like this:

echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
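
To illustrate why the ENT_QUOTES flag in the call above matters, here is a minimal sketch that compares it with ENT_COMPAT on a value placed inside a single-quoted attribute. The payload string and the variable names $compat and $quotes are made up for illustration.

```php
<?php
// Hypothetical attacker-controlled value, for illustration only.
$string = "' onmouseover='alert(1)";

// ENT_COMPAT escapes double quotes but leaves single quotes alone.
$compat = htmlspecialchars($string, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both double and single quotes.
$quotes = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

// With ENT_COMPAT the single quote still closes the attribute early.
echo "<input value='" . $compat . "'>\n";

// With ENT_QUOTES the single quote is emitted as &#039; and stays inside the value.
echo "<input value='" . $quotes . "'>\n";
```

In a double-quoted attribute both flags would behave the same, so ENT_QUOTES is simply the safer default when you do not control how the surrounding markup is quoted.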

Google Code University also has these very educational videos on Web Security:

Answered by Wahyu Kristianto

One of my favorite OWASP references is the Cross-Site Scripting explanation because, while there are a large number of XSS attack vectors, following a few rules can defend against the great majority of them!

This is the PHP Security Cheat Sheet

Answered by James Kolpack

One of the most important steps is to sanitize any user input before it is processed and/or rendered back to the browser. PHP has some "filter" functions that can be used.

XSS attacks usually take the form of inserting a link to some off-site JavaScript that is malicious toward the user. Read more about it here.

You'll also want to test your site - I can recommend the Firefox add-on XSS Me.

Answered by Scott Arciszewski

In order of preference:

  1. If you are using a templating engine (e.g. Twig, Smarty, Blade), check that it offers context-sensitive escaping. I know from experience that Twig does. {{ var|e('html_attr') }}
  2. If you want to allow HTML, use HTML Purifier. Even if you think you only accept Markdown or ReStructuredText, you still want to purify the HTML these markup languages output.
  3. Otherwise, use htmlentities($var, ENT_QUOTES | ENT_HTML5, $charset) and make sure the rest of your document uses the same character set as $charset. In most cases, 'UTF-8' is the desired character set.

Also, make sure you escape on output, not on input.
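
A minimal sketch of "escape on output, not on input", assuming a hypothetical helper named escape_html() that wraps the htmlentities() call from item 3. The sample comment and variable names are invented for illustration.

```php
<?php
// Hypothetical helper; wraps the htmlentities() call from item 3 above.
function escape_html(string $var, string $charset = 'UTF-8'): string
{
    return htmlentities($var, ENT_QUOTES | ENT_HTML5, $charset);
}

// Pretend this value arrived from a form submission.
$comment = '<b onclick="steal()">hi</b>';

// Keep the raw value unchanged when storing or passing it around...
$stored = $comment;

// ...and escape only at the point where it is written into HTML.
$html = '<p>' . escape_html($stored) . '</p>';
echo $html, "\n";
```

Keeping the stored value raw means the same data can later be escaped differently for another context (a URL, a JSON response, a plain-text email) without first undoing an earlier encoding.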

Answered by Matt S

Cross-posting this as a consolidated reference from the SO Documentation beta which is going offline.

Problem

Cross-site scripting is the unintended execution of remote code by a web client. Any web application might expose itself to XSS if it takes input from a user and outputs it directly on a web page. If input includes HTML or JavaScript, remote code can be executed when this content is rendered by the web client.

For example, if a 3rd party site contains a JavaScript file:

// http://example.com/runme.js
document.write("I'm running");

And a PHP application directly outputs a string passed into it:

<?php
echo '<div>' . $_GET['input'] . '</div>';

If an unchecked GET parameter contains <script src="http://example.com/runme.js"></script> then the output of the PHP script will be:

<div><script src="http://example.com/runme.js"></script></div>

The 3rd party JavaScript will run and the user will see "I'm running" on the web page.

Solution

As a general rule, never trust input coming from a client. Every GET, POST, and cookie value could be anything at all, and should therefore be validated. When outputting any of these values, escape them so they will not be evaluated in an unexpected way.

Keep in mind that even in the simplest applications data can be moved around and it will be hard to keep track of all sources. Therefore it is a best practice to always escape output.

PHP provides a few ways to escape output depending on the context.

Filter Functions

PHP's Filter Functions allow the input data to the PHP script to be sanitized or validated in many ways. They are useful when saving or outputting client input.
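
As a rough sketch of the validation side of the filter functions, the following uses filter_var() with two of the built-in validate filters. The raw values and the 0 to 150 range are assumptions chosen for illustration; in a real request they would come from $_GET or $_POST.

```php
<?php
// Hypothetical raw values standing in for $_GET / $_POST entries.
$rawAge   = '42';
$rawEmail = 'not-an-email<script>';

// FILTER_VALIDATE_INT returns the integer on success or false on failure.
$age = filter_var($rawAge, FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);

// FILTER_VALIDATE_EMAIL returns the string on success or false on failure.
$email = filter_var($rawEmail, FILTER_VALIDATE_EMAIL);

var_dump($age);   // int(42)
var_dump($email); // bool(false)
```

Validation rejects values that do not fit the expected shape; it does not replace escaping when the accepted value is later written into a page.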

HTML Encoding

htmlspecialchars will convert any "HTML special characters" into their HTML encodings, meaning they will then not be processed as standard HTML. To fix our previous example using this method:

<?php
echo '<div>' . htmlspecialchars($_GET['input']) . '</div>';
// or
echo '<div>' . filter_input(INPUT_GET, 'input', FILTER_SANITIZE_SPECIAL_CHARS) . '</div>';

Would output:

<div>&lt;script src=&quot;http://example.com/runme.js&quot;&gt;&lt;/script&gt;</div>

Everything inside the <div> tag will not be interpreted as a script tag by the browser, but instead as a simple text node. The user will safely see:

<script src="http://example.com/runme.js"></script>

URL Encoding

When outputting a dynamically generated URL, PHP provides the urlencode function to safely output valid URLs. So, for example, if a user is able to input data that becomes part of another GET parameter:

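<?php
// urlencode() percent-encodes the value for use as a query-string parameter
$input = urlencode($_GET['input']);
// or
// (FILTER_SANITIZE_URL only strips characters that are illegal in URLs; it keeps
// quotes and angle brackets, so the value is also HTML-escaped on output below)
$input = filter_input(INPUT_GET, 'input', FILTER_SANITIZE_URL);
echo '<a href="http://example.com/page?input=' . htmlspecialchars((string) $input, ENT_QUOTES, 'UTF-8') . '">Link</a>';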

Any malicious input will be converted to an encoded URL parameter.
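
To make the difference between the two approaches concrete, here is a small sketch that runs both on the same value. The payload is hypothetical, and filter_var() is used in place of filter_input() only so the example runs without a real request.

```php
<?php
// Hypothetical malicious value standing in for $_GET['input'].
$input = '"><script>alert(1)</script>';

// urlencode() percent-encodes every character that is not alphanumeric or one of -_.
$encoded = urlencode($input);

// FILTER_SANITIZE_URL only removes characters that are not allowed in URLs;
// quotes and angle brackets are allowed, so this value passes through unchanged.
$sanitized = filter_var($input, FILTER_SANITIZE_URL);

echo $encoded, "\n";   // %22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E
echo $sanitized, "\n"; // "><script>alert(1)</script>
```

For a single query-string value, urlencode() is therefore the safer of the two; a value that only went through FILTER_SANITIZE_URL still needs HTML escaping before it is placed in an href attribute.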

Using specialised external libraries or OWASP AntiSamy lists

Sometimes you will want to send HTML or other kinds of code input. You will need to maintain a list of authorised words (whitelist) and unauthorised words (blacklist).

You can download standard lists available at the OWASP AntiSamy website. Each list is fit for a specific kind of interaction (ebay api, tinyMCE, etc...). And it is open source.

There are existing libraries that filter HTML and prevent XSS attacks for the general case, perform at least as well as the AntiSamy lists, and are very easy to use. For example, you have HTML Purifier.

Answered by webaholik

Many frameworks help handle XSS in various ways. When rolling your own or if there's some XSS concern, we can leverage filter_input_array (available in PHP 5 >= 5.2.0, PHP 7). I typically will add this snippet to my SessionController, because all calls go through there before any other controller interacts with the data. In this manner, all user input gets sanitized in one central location. If this is done at the beginning of a project or before your database is poisoned, you shouldn't have any issues at time of output; it stops garbage in, garbage out.

/* Prevent XSS input */
/* Note: FILTER_SANITIZE_STRING is deprecated as of PHP 8.1. */
$_GET   = filter_input_array(INPUT_GET, FILTER_SANITIZE_STRING);
$_POST  = filter_input_array(INPUT_POST, FILTER_SANITIZE_STRING);
/* I prefer not to use $_REQUEST...but for those who do: */
$_REQUEST = (array)$_POST + (array)$_GET + (array)$_REQUEST;

The above will remove ALL HTML & script tags. If you need a solution that allows safe tags, based on a whitelist, check out HTML Purifier.
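
Because FILTER_SANITIZE_STRING is deprecated as of PHP 8.1, one possible variation on the same "sanitize everything in one place" idea is sketched below. It swaps in a different filter, FILTER_SANITIZE_FULL_SPECIAL_CHARS, which encodes special characters rather than stripping tags, so it is not a drop-in equivalent. It also uses filter_var_array() on a literal array (a made-up stand-in for request data) so that it runs outside a web request.

```php
<?php
// Hypothetical request data; in a web request this would come from
// filter_input_array(INPUT_GET, ...) rather than a literal array.
$raw = [
    'name'    => 'Alice',
    'comment' => '<script>alert("x")</script>',
];

// Apply the same filter to every element. Unlike FILTER_SANITIZE_STRING,
// this filter encodes special characters instead of removing tags.
$clean = filter_var_array($raw, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

print_r($clean);
```

Values sanitized this way are already HTML-encoded, so escaping them again at output time would double-encode them; that trade-off is one reason other answers here prefer escaping on output only.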

If your database is already poisoned or you want to deal with XSS at time of output, OWASP recommends creating a custom wrapper function for echo, and using it EVERYWHERE you output user-supplied values:

//xss mitigation functions
function xssafe($data,$encoding='UTF-8')
{
   return htmlspecialchars($data,ENT_QUOTES | ENT_HTML401,$encoding);
}
function xecho($data)
{
   echo xssafe($data);
}
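
A short usage sketch for the two wrappers. The functions are repeated so the example runs on its own, and the $name value is a hypothetical user-supplied string.

```php
<?php
// Same wrapper functions as in the answer, repeated so this sketch is self-contained.
function xssafe($data, $encoding = 'UTF-8')
{
    return htmlspecialchars($data, ENT_QUOTES | ENT_HTML401, $encoding);
}
function xecho($data)
{
    echo xssafe($data);
}

// Hypothetical user-supplied value.
$name = '<img src=x onerror=alert(1)>';

echo '<p>Hello, ';
xecho($name);   // prints &lt;img src=x onerror=alert(1)&gt;
echo "</p>\n";
```

Static markup is still written with plain echo; only the user-supplied part goes through xecho().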

Answered by chris

You are also able to set some XSS-related HTTP response headers via header(...)

X-XSS-Protection "1; mode=block"

to be sure the browser XSS protection mode is enabled.

Content-Security-Policy "default-src 'self'; ..."

to enable browser-side content security. See this one for Content Security Policy (CSP) details: http://content-security-policy.com/ Especially setting up CSP to block inline scripts and external script sources is helpful against XSS.

For a general bunch of useful HTTP response headers concerning the security of your webapp, look at OWASP: https://www.owasp.org/index.php/List_of_useful_HTTP_headers
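
A minimal sketch of sending these two headers from PHP. The helper name xss_response_headers() is hypothetical, and the CSP value contains only the default-src 'self' part shown in the answer; the directives it elides are left out rather than guessed.

```php
<?php
// Hypothetical helper that returns the header lines discussed in this answer.
// The CSP value is only the default-src part; the answer elides the rest.
function xss_response_headers(): array
{
    return [
        'X-XSS-Protection: 1; mode=block',
        "Content-Security-Policy: default-src 'self'",
    ];
}

// Headers can only be set before any body output has been sent.
if (!headers_sent()) {
    foreach (xss_response_headers() as $line) {
        header($line);
    }
}
```

Calling this at the very top of a front controller or bootstrap file keeps it ahead of any echo or template output.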

Answered by Abdo-Host

<?php
function xss_clean($data)
{
// Fix &entity\n;
$data = str_replace(array('&amp;','&lt;','&gt;'), array('&amp;amp;','&amp;lt;','&amp;gt;'), $data);
$data = preg_replace('/(&#*\w+)[\x00-\x20]+;/u', '$1;', $data);
$data = preg_replace('/(&#x*[0-9A-F]+);*/iu', '$1;', $data);
$data = html_entity_decode($data, ENT_COMPAT, 'UTF-8');

// Remove any attribute starting with "on" or xmlns
$data = preg_replace('#(<[^>]+?[\x00-\x20"\'])(?:on|xmlns)[^>]*+>#iu', '$1>', $data);

// Remove javascript: and vbscript: protocols
$data = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $data);
$data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $data);
$data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $data);

// Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#iu', '$1>', $data);

// Remove namespaced elements (we do not need them)
$data = preg_replace('#</*\w+:\w[^>]*+>#i', '', $data);

do
{
    // Remove really unwanted tags
    $old_data = $data;
    $data = preg_replace('#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data);
}
while ($old_data !== $data);

// we are done...
return $data;
}

Answered by Pablo

Use htmlspecialchars on PHP. On HTML try to avoid using:

element.innerHTML = "…";
element.outerHTML = "…";
document.write(…);
document.writeln(…);

where var is controlled by the user.

Also obviously try to avoid eval(var). If you have to use any of them, then JS-escape them and HTML-escape them; you might have to do some more, but for the basics this should be enough.
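
One way to do the JS-escaping step from PHP is sketched below, assuming the value has to be embedded in an inline script. It uses json_encode() with the JSON_HEX_* flags; the variable names and the payload are hypothetical.

```php
<?php
// Hypothetical user-controlled value that tries to break out of a <script> block.
$var = '</script><script>alert(1)</script>';

// json_encode() produces a valid JavaScript string literal. The JSON_HEX_* flags
// additionally turn <, >, &, ' and " into \uXXXX escapes, so the literal cannot
// close the surrounding <script> element or an HTML attribute.
$js = json_encode($var, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);

echo "<script>var userValue = {$js};</script>\n";
```

The browser-side script then receives the original string as data in userValue; it should still be assigned with a text-only API rather than innerHTML.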

Answered by Marco Concas

The best way to protect your input is to use the htmlentities function. Example:

htmlentities($target, ENT_QUOTES, 'UTF-8');

You can get more information here.
