Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3451203/

Time: 2020-09-06 13:14:23  Source: igfitidea

How does the billion laughs XML DoS attack work?

xml

Asked by l--''''''---------''''''''''''

<!DOCTYPE root [
 <!ENTITY ha "Ha !">
 <!ENTITY ha2 "&ha; &ha;">
 <!ENTITY ha3 "&ha2; &ha2;">
 <!ENTITY ha4 "&ha3; &ha3;">
 <!ENTITY ha5 "&ha4; &ha4;">
 ...
 <!ENTITY ha128 "&ha127; &ha127;">
 ]>
 <root>&ha128;</root>

Supposedly this is called a billion laughs DoS attack.

Does anyone know how it works?

Answered by cytinus

The Billion Laughs attack is a denial-of-service attack that targets XML parsers. The Billion Laughs attack is also known as an XML bomb, or more esoterically, the exponential entity expansion attack. A Billion Laughs attack can occur even when using well-formed XML and can also pass XML schema validation.

The vanilla Billion Laughs attack is illustrated in the XML file represented below.

<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

In this example, there are 10 different XML entities, lol through lol9. The first entity, lol, is defined to be the string "lol". Each of the other entities is defined as 10 references to the previous entity. The document content section of this XML file contains only a single reference to the entity lol9. However, when a DOM or SAX parser encounters lol9, it expands it into 10 lol8s, each of which is expanded into 10 lol7s, and so on. By the time everything is expanded to the text lol, there are 100,000,000 instances of the string "lol" (about 300 million characters). If there were one more entity, or if lol were defined as 10 copies of "lol", there would be a billion "lol"s, hence the name of the attack. Needless to say, this many expansions consume an amount of memory and time that grows exponentially with the nesting depth, causing the DoS.
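
The arithmetic above can be checked without ever building the expanded text. The following is a minimal Python sketch, not part of the original answer: the helper names build_dtd and expanded_length are hypothetical, and the regular expressions only handle simple internal entities like the ones in the example (predefined entities such as &amp; are not covered).

```python
import re
from functools import lru_cache

ENTITY_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+"([^"]*)">')
ENTITY_REF = re.compile(r'&(\w+);')


def build_dtd(levels=9, fanout=10):
    """Rebuild the entity declarations of the example above (lol .. lol9)."""
    lines = ['<!ENTITY lol "lol">']
    prev = "lol"
    for n in range(2, levels + 1):
        refs = f"&{prev};" * fanout
        lines.append(f'<!ENTITY lol{n} "{refs}">')
        prev = f"lol{n}"
    return "\n".join(lines)


def expanded_length(dtd, name):
    """Length of entity `name` after full expansion, computed without expanding it."""
    entities = dict(ENTITY_DECL.findall(dtd))

    @lru_cache(maxsize=None)
    def length(entity):
        value = entities[entity]
        # Characters in the replacement text that are not entity references.
        literal = len(ENTITY_REF.sub("", value))
        # Each reference contributes the full expanded length of its target.
        return literal + sum(length(ref) for ref in ENTITY_REF.findall(value))

    return length(name)


chars = expanded_length(build_dtd(), "lol9")
print(chars)       # 300000000 characters
print(chars // 3)  # 100000000 copies of "lol"
```

The declarations themselves are under a kilobyte, while the expanded text would be roughly 300 million characters, which is the asymmetry the attack relies on.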

A more extensive explanation exists on my blog.

Answered by Andrey Taptunov

One of the XML bombs - http://msdn.microsoft.com/en-us/magazine/ee335713.aspx

An attacker can now take advantage of these three properties of XML (substitution entities, nested entities, and inline DTDs) to craft a malicious XML bomb. The attacker writes an XML document with nested entities just like the previous example, but instead of nesting just one level deep, he nests his entities many levels deep...

There is also code to protect from these "bombs" (in .NET world):

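// Requires: using System.Xml; `stream` is the input stream to parse.
XmlReaderSettings settings = new XmlReaderSettings();
// DTDs stay enabled here. On .NET 4.0 and later ProhibitDtd is obsolete; use DtdProcessing = DtdProcessing.Parse instead.
settings.ProhibitDtd = false;
// Cap the characters produced by entity expansion at 1024; reading past the cap throws an XmlException.
settings.MaxCharactersFromEntities = 1024;
XmlReader reader = XmlReader.Create(stream, settings);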
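
For readers outside .NET, here is a rough Python sketch of a related defense. Note that it is a different technique from the snippet above: instead of capping entity expansion, it refuses any document that carries a DOCTYPE at all (closer in spirit to ProhibitDtd = true). The names DtdForbidden and parse_without_dtd are hypothetical; the handler and parser calls come from the standard library module xml.parsers.expat.

```python
import xml.parsers.expat


class DtdForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration (hypothetical name)."""


def parse_without_dtd(data):
    """Parse `data` (str or bytes) and raise DtdForbidden on any DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called at the start of the DOCTYPE, before any entity is declared.
        raise DtdForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.Parse(data, True)  # exceptions raised in handlers propagate to the caller


bomb = '<!DOCTYPE lolz [<!ENTITY lol "lol">]><lolz>&lol;</lolz>'
try:
    parse_without_dtd(bomb)
except DtdForbidden as exc:
    print("rejected:", exc)

parse_without_dtd("<lolz>plain</lolz>")  # no DOCTYPE, parses without error
```

Rejecting DTDs outright is only an option when the application never needs inline DTDs; otherwise a cap on expansion, as in the .NET settings, is the more suitable approach.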

Answered by Matthew Crumley

<!ENTITY ha "Ha !"> defines an entity, &ha;, that expands to "Ha !". The next line defines another entity, &ha2;, that expands to "&ha; &ha;" and eventually to "Ha ! Ha !".

&ha3; turns into "Ha ! Ha ! Ha ! Ha !", and so on, doubling the number each time. If you follow the pattern, &haN; is "Ha !" repeated 2^(N-1) times, so &ha128; expands to 2^127 "Ha !"s (about 1.7 × 10^38), which is too big for any computer to handle.
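
The doubling pattern is easy to verify numerically. This is a small Python sketch with hypothetical helper names (ha_count, ha_length); it treats the first entity &ha; as level 1 and counts the single space that separates the two copies at each level.

```python
def ha_count(n):
    """Number of "Ha !" strings in the expanded &haN; (doubles with every level)."""
    return 2 ** (n - 1)


def ha_length(n):
    """Characters in the expanded &haN;, including the spaces between copies."""
    length = len("Ha !")
    for _ in range(n - 1):
        length = 2 * length + 1  # two copies of the previous entity plus one space
    return length


print(ha_count(3), ha_length(3))   # 4 19
print(f"{ha_count(128):.3e}")      # 1.701e+38
print(ha_length(128) == 5 * 2 ** 127 - 1)  # True
```

Even at one byte per character, the expanded &ha128; would need on the order of 10^38 bytes, far beyond any storage that exists.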
