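Recently, besides the plaintext-password security incidents in China, there has been another fairly big story: Hash Collision DoS (a denial-of-service attack based on hash collisions). A malicious party can exploit this weakness to make your server run unbearably slowly. The weakness relies on the "non-randomness" of the hash algorithms used by various languages: an attacker can manufacture a huge number of entries whose keys differ but whose hash values are identical, turning your hash table into a singly linked list. The performance of your whole site or program then drops by orders of magnitude (it is easy to push your CPU to 100%).
At present the problem appears in Java, JRuby, PHP, Python, Rubinius and Ruby. The main affected versions and products are:
- Java, all versions
- JRuby <= 1.6.5 (fixed in 1.6.5.1)
- PHP <= 5.3.8, <= 5.4.0RC3 (fixed in 5.3.9, 5.4.0RC4)
- Python, all versions
- Rubinius, all versions
- Ruby <= 1.8.7-p356 (fixed in 1.8.7-p357, 1.9.x)
- Apache Geronimo, all versions
- Apache Tomcat <= 5.5.34, <= 6.0.34, <= 7.0.22 (fixed in 5.5.35, 6.0.35, 7.0.23)
- Oracle Glassfish <= 3.1.1 (fixed in mainline)
- Jetty, all versions
- Plone, all versions
- Rack <= 1.3.5, <= 1.2.4, <= 1.1.2 (fixed in 1.4.0, 1.3.6, 1.2.5, 1.1.3)
- V8 JavaScript Engine, all versions
- ASP.NET without the MS11-100 patch
Note that Perl does not have this problem, because Perl fixed it years ago. For updates to this list, see oCERT advisory 2011-003. The frustrating part is that this problem was reported as early as 2003, in the paper "Denial of Service via Algorithmic Complexity Attacks", yet it seems to have attracted little attention, especially from Java.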
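How the attack works
You may think this is no big deal, because an attacker cannot see your hash algorithm. If that is what you think, you are wrong, and it suggests your understanding of web programming does not yet go deep enough.
Whether you write your back-end pages in JSP, PHP, Python or Ruby, when you handle HTTP POST data your program can easily look up a form value by its field name, as in this snippet: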
- $usrname = $_POST['username'];
- $passwd = $_POST['password'];
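How is this implemented? Behind the scenes it is a hash map. So I can submit to your back end a form with 10K fields whose names I have carefully crafted so that they all collide. When your web server or language runtime processes the form it builds this hash map, and every time it inserts a field it first has to walk through all the fields already inserted. Your server's CPU goes straight to 100%. You may think 10K fields is nothing; in that case I simply send many such requests, and your server goes down.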
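To make this concrete, here is a minimal sketch in Java of what a framework does with a POST body before your code ever sees it. The class name `FormParser` and its `parse` method are invented for illustration and are not any real server's API; real parsers also deal with repeated field names, character sets and size limits.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of any real framework.
final class FormParser {
    /** Parses an application/x-www-form-urlencoded body into a hash map. */
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new HashMap<>();
        if (body.isEmpty()) {
            return fields;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Every field name chosen by the client becomes a hash map key.
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> post = parse("username=alice&password=s3cret");
        System.out.println(post.get("username"));
        System.out.println(post.get("password"));
    }
}
```

The point of the sketch is that the keys of the map come straight from the request, so whoever writes the request decides which buckets get filled.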
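An example may make this easier to understand:
Suppose you have n values v1, v2, v3, …, vn. When they are put into a hash table they should be scattered well across the buckets; that is what gives good performance:
- 0 -> v2
- 1 -> v4
- 2 -> v1
- …
- …
- n -> v(x)
But this attack lets me craft N values dos1, dos2, …, dosn whose hash keys are all identical (that is, they collide), so your hash table ends up looking like this:
- 0 -> dos1 -> dos2 -> dos3 -> … -> dosn
- 1 -> null
- 2 -> null
- …
- …
- n -> null
And so a singly linked list appears. The O(1) lookup becomes O(n), and inserting N items becomes O(n^2): the i-th colliding key has to be compared with the i-1 keys already in the chain, which adds up to about N^2/2 comparisons. Imagine what that does to performance.
(If you have forgotten how hash tables are implemented, dig out your college Data Structures textbook.)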
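The quadratic cost can be seen by counting key comparisons in a toy hash table. The following is a simplified sketch, not how any particular runtime implements its tables: `ChainedTableDemo`, `CountingTable` and the helper methods are names invented here, the table uses plain separate chaining with a fixed number of buckets, and the colliding keys are built from the Java strings "Aa" and "BB", which share a hash code.

```java
import java.util.ArrayList;
import java.util.List;

final class ChainedTableDemo {
    /** Minimal separate-chaining set that counts equals() calls. */
    static final class CountingTable {
        private final List<List<String>> buckets = new ArrayList<>();
        long comparisons = 0;

        CountingTable(int bucketCount) {
            for (int i = 0; i < bucketCount; i++) {
                buckets.add(new ArrayList<String>());
            }
        }

        void insert(String key) {
            int index = Math.floorMod(key.hashCode(), buckets.size());
            List<String> chain = buckets.get(index);
            for (String existing : chain) {
                comparisons++;               // walk the chain before inserting
                if (existing.equals(key)) {
                    return;
                }
            }
            chain.add(key);
        }
    }

    /** 2^bits distinct strings made of "Aa"/"BB" blocks; all share one hash code. */
    static List<String> collidingKeys(int bits) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < (1 << bits); i++) {
            StringBuilder sb = new StringBuilder();
            for (int b = 0; b < bits; b++) {
                sb.append(((i >> b) & 1) == 0 ? "Aa" : "BB");
            }
            keys.add(sb.toString());
        }
        return keys;
    }

    static List<String> ordinaryKeys(int n) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            keys.add("key" + i);
        }
        return keys;
    }

    static long comparisonsFor(List<String> keys, int bucketCount) {
        CountingTable table = new CountingTable(bucketCount);
        for (String key : keys) {
            table.insert(key);
        }
        return table.comparisons;
    }

    public static void main(String[] args) {
        System.out.println("ordinary:  " + comparisonsFor(ordinaryKeys(1024), 4096));
        System.out.println("colliding: " + comparisonsFor(collidingKeys(10), 4096));
    }
}
```

With 1024 colliding keys every insert walks the whole chain, so the count is 1024 * 1023 / 2 = 523776 comparisons, while the ordinary keys need only a small fraction of that.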
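Hash Collision DoS in detail
StackOverflow.com is a good site that every competent programmer should know. A quick search there turns up the post “Application vulnerability due to Non Random Hash Functions”. I will excerpt some of it here.
First, the hash algorithms these languages use are all “non-random”. For example, this is the function that Java's (Oracle's) HashMap applies to a key's hashCode() before choosing a bucket:
- static int hash(int h)
- {
- // Folds the high bits of h into the low bits; no secret or random input is involved.
- h ^= (h >>> 20) ^ (h >>> 12);
- return h ^ (h >>> 7) ^ (h >>> 4);
- }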
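As a rough illustration of why this function offers no protection, the sketch below feeds the hash codes of "Aa" and "BB" (two Java strings with the same hashCode()) through it. The `hash` method is copied from the snippet above; `indexFor` assumes a power-of-two table indexed by the low bits, and `BucketIndex` and `bucketOf` are names invented for this example.

```java
final class BucketIndex {
    // Same bit-mixing function as quoted above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Assumption: the table length is a power of two, so masking picks the bucket.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    static int bucketOf(String key, int tableLength) {
        return indexFor(hash(key.hashCode()), tableLength);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        System.out.println(bucketOf("Aa", 16) + " " + bucketOf("BB", 16));
    }
}
```

Because the mixing step is a fixed function of the hash code, equal hash codes always land in the same bucket, whatever the table size.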
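A “non-random” hash algorithm is one whose output can be predicted. For example:
1) In Java, the strings Aa and BB have the same hash code (or hash key); in other words, they collide.
2) We can therefore use these two seeds to generate more strings with the same hash key, such as "AaAa", "AaBB", "BBAa" and "BBBB". That is the first iteration. It is just a matter of enumerating combinations, and a small program takes care of it.
3) Next, we can take these four strings of length 4 and build strings of length 8 from them, as follows:
"AaAaAaAa", "AaAaBBBB", "AaAaAaBB", "AaAaBBAa", "BBBBAaAa", "BBBBBBBB", "BBBBAaBB", "BBBBBBAa", "AaBBAaAa", "AaBBBBBB", "AaBBAaBB", "AaBBBBAa", "BBAaAaAa", "BBAaBBBB", "BBAaAaBB", "BBAaBBAa"
4) In the same way we can go on to strings of length 16, and all the way up to length 256. In short, it is easy to generate as many such values as you like.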
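Steps 1) to 4) can be sketched in a few lines of Java. The construction works because String.hashCode() is a polynomial in the characters: hashCode(x + y) equals hashCode(x) * 31^len(y) + hashCode(y) in 32-bit arithmetic, so concatenating equal-length blocks that share a hash code yields strings that share a hash code. The class and method names below are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CollisionSeeds {
    /** One iteration: every pair (left, right) is concatenated, so n strings become n * n. */
    static List<String> nextRound(List<String> current) {
        List<String> next = new ArrayList<>(current.size() * current.size());
        for (String left : current) {
            for (String right : current) {
                next.add(left + right);
            }
        }
        return next;
    }

    /** Starts from the seeds "Aa" and "BB" and applies the given number of iterations. */
    static List<String> generate(int rounds) {
        List<String> result = new ArrayList<>();
        result.add("Aa");
        result.add("BB");
        for (int i = 0; i < rounds; i++) {
            result = nextRound(result);
        }
        return result;
    }

    /** Number of different hashCode() values among the given strings. */
    static int distinctHashCodes(List<String> strings) {
        Set<Integer> codes = new HashSet<>();
        for (String s : strings) {
            codes.add(s.hashCode());
        }
        return codes.size();
    }

    public static void main(String[] args) {
        System.out.println(generate(1));
        List<String> third = generate(3);
        System.out.println(third.size() + " strings, "
                + distinctHashCodes(third) + " distinct hash code(s)");
    }
}
```

One round gives the four strings of step 2), two rounds give the sixteen strings of step 3), and each further round squares the count.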
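To attack, I only need to put this data into an HTTP POST form and write a program that submits it in an endless loop; your browser alone is enough for that. Someone more cunning could turn the form into a cross-site script, find websites with cross-site scripting holes and plant it there, and then rely on the reach of social networks to get many users attacking a server from different IPs.
Defense
There are several ways to defend against this kind of attack:
- Apply the patches, which change the hash algorithm.
- Limit the number of POST parameters and the length of POST requests.
- Ideally, also have a firewall detect abnormal requests.
However, for lower-level attacks or attacks in other forms, things may be more troublesome.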
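The second defense in the list, limiting the number and size of POST parameters, can be sketched as a cheap check that runs before any hash table is built. The class `PostLimits`, its method names and the limit values passed in are illustrative assumptions, not settings of any particular server; real servers expose their own configuration options for this.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check; names and limits are made up for illustration.
final class PostLimits {
    /** Counts '&'-separated pairs with a linear scan and no hashing. */
    static int countParameters(String body) {
        if (body.isEmpty()) {
            return 0;
        }
        int count = 1;
        for (int i = 0; i < body.length(); i++) {
            if (body.charAt(i) == '&') {
                count++;
            }
        }
        return count;
    }

    /** Returns true only if the body is within both the size and the parameter limit. */
    static boolean isAcceptable(String body, int maxBytes, int maxParams) {
        if (body.getBytes(StandardCharsets.UTF_8).length > maxBytes) {
            return false;
        }
        return countParameters(body) <= maxParams;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("username=alice&password=s3cret", 64 * 1024, 1000));
    }
}
```

A request that fails the check would be rejected outright, so the worst case an attacker can force is bounded by the limits rather than by the size of the request.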
------------------------------------------------------------------------------------------------------------------------------------------------------
- US-CERT is aware of reports stating that multiple programming language implementations, including web platforms, are vulnerable to hash table collision attacks. This vulnerability could be used by an attacker to launch a denial-of-service attack against websites using affected products.
-
- The Ruby Security Team has updated Ruby 1.8.7. The Ruby 1.9 series is not affected by this attack. Additional information can be found in the ruby 1.8.7 patchlevel 357 release notes.
-
- Microsoft has released a security advisory for ASP.NET containing a workaround. Additional information can be found in Microsoft Security Advisory 2659883.
-
- More information regarding this vulnerability can be found in US-CERT Vulnerability Note VU#903934 and n.runs Security Advisory n.runs-SA-2011.004.
-
- ---
- sebug
-
- Languages and versions currently known to be affected:
- Java, all versions
- JRuby <= 1.6.5
- PHP <= 5.3.8, <= 5.4.0RC3
- Python, all versions
- Rubinius, all versions
- Ruby <= 1.8.7-p356
- Apache Geronimo, all versions
- Apache Tomcat <= 5.5.34, <= 6.0.34, <= 7.0.22
- Oracle Glassfish <= 3.1.1
- Jetty, all versions
- Plone, all versions
- Rack, all versions
- V8 JavaScript Engine, all versions
-
- Languages that are not affected, or versions in which the issue is fixed:
- PHP >= 5.3.9, >= 5.4.0RC4
- JRuby >= 1.6.5.1
- Ruby >= 1.8.7-p357, 1.9.x
- Apache Tomcat >= 5.5.35, >= 6.0.35, >= 7.0.23
- Oracle Glassfish, N/A (Oracle reports that the issue is fixed in the main codeline and scheduled for a future CPU)
-
- CVE: CVE-2011-4885 (PHP), CVE-2011-4461 (Jetty), CVE-2011-4838 (JRuby), CVE-2011-4462 (Plone), CVE-2011-4815 (Ruby)
-
- ---
-
- ===================================
-
- n.runs AG
- http://www.nruns.com/ security(at)nruns.com
- n.runs-SA-2011.004 28-Dec-2011
- ________________________________________________________________________
- Vendors: PHP, http://www.php.net
- Oracle, http://www.oracle.com
- Microsoft, http://www.microsoft.com
- Python, http://www.python.org
- Ruby, http://www.ruby.org
- Google, http://www.google.com
- Affected Products: PHP 4 and 5
- Java
- Apache Tomcat
- Apache Geronimo
- Jetty
- Oracle Glassfish
- ASP.NET
- Python
- Plone
- CRuby 1.8, JRuby, Rubinius
- v8
- Vulnerability: Denial of Service through hash table
- multi-collisions
- Tracking IDs: oCERT-2011-003
- CERT VU#903934
- ________________________________________________________________________
- Vendor communication:
- 2011/11/01 Coordinated notification to PHP, Oracle, Python, Ruby, Google
- via oCERT
- 2011/11/29 Coordinated notification to Microsoft via CERT
-
- Various communication with the vendors for clarifications, distribution
- of PoC code, discussion of fixes, etc.
- ___________________________________________________________________________
- Overview:
-
- Hash tables are a commonly used data structure in most programming
- languages. Web application servers or platforms commonly parse
- attacker-controlled POST form data into hash tables automatically, so
- that they can be accessed by application developers.
-
- If the language does not provide a randomized hash function or the
- application server does not recognize attacks using multi-collisions, an
- attacker can degenerate the hash table by sending lots of colliding
- keys. The algorithmic complexity of inserting n elements into the table
- then goes to O(n**2), making it possible to exhaust hours of CPU time
- using a single HTTP request.
-
- This issue has been known since at least 2003 and has influenced Perl
- and CRuby 1.9 to change their hash functions to include randomization.
-
- We show that PHP 5, Java, ASP.NET as well as v8 are fully vulnerable to
- this issue and PHP 4, Python and Ruby are partially vulnerable,
- depending on version or whether the server running the code is a 32 bit
- or 64 bit machine.
-
- Description:
-
- = Theory =
-
- Most hash functions used in hash table implementations can be broken
- faster than by using brute-force techniques (which is feasible for hash
- functions with 32 bit output, but very expensive for 64 bit functions)
- by using one of two “tricks”: equivalent substrings or a
- meet-in-the-middle attack.
-
- == Equivalent substrings ==
-
- Some hash functions have the property that if two strings collide, e.g.
- hash('string1') = hash('string2'), then hashes having this substring at
- the same position collide as well, e.g. hash('prefixstring1postfix') =
- hash('prefixstring2postfix'). If for example 'Ez' and 'FY' collide under
- a hash function with this property, then 'EzEz', 'EzFY', 'FYEz', 'FYFY'
- collide as well. An observant reader may notice that this is very
- similar to binary counting from zero to three. Using this knowledge, an
- attacker can construct arbitrary numbers of collisions (2^n for
- 2*n-sized strings in this example).
-
- == Meet-in-the-middle attack ==
-
- If equivalent substrings are not present in a given hash function, then
- brute-force seems to be the only solution. The obvious way to best use
- brute-force would be to choose a target value and hash random
- (fixed-size) strings and store those which hash to the target value. For
- a non-biased hash function with 32 bit output length, the probability of
- hitting a target in this way is 1/(2^32).
-
- A meet-in-the-middle attack now tries to hit more than one target at a
- time. If the hash function can be inverted and the internal state of the
- hash function has the same size as the output, one can split the string
- into two parts, a prefix (of size n) and a postfix (of size m). One can
- now iterate over all possible m-sized postfix strings and calculate the
- intermediate value under which the hash function maps to a certain
- target. If one stores these strings and corresponding intermediate value
- in a lookup table, one can now generate random n-sized prefix strings
- and see if they map to one of the intermediate values in the lookup
- table. If this is the case, the complete string will map to the target
- value.
-
- Splitting in the middle reduces the complexity of this attack by the
- square root, which gives us the probability of 1/(2^16) for a collision,
- thus enabling an attacker to generate multi-collisions much faster.
-
- The hash functions we looked at which were vulnerable to an equivalent
- substring attack were all vulnerable to a meet-in-the-middle attack as
- well. In this case, the meet-in-the-middle attack provides more
- collisions for strings of a fixed size than the equivalent substring
- attack.
-
- = The real world =
-
- The different languages use different hash functions which suffer from
- different problems. They also differ in how they use hash tables in
- storing POST form data.
-
- == PHP 5 ==
-
- PHP 5 uses the DJBX33A (Dan Bernstein's times 33, addition) hash
- function and parses POST form data into the $_POST hash table. Because
- of the structure of the hash function, it is vulnerable to an equivalent
- substring attack.
-
- The maximal POST request size is typically limited to 8 MB, which when
- filled with a set of multi-collisions would consume about four hours of
- CPU time on an i7 core. Luckily, this time can not be exhausted because
- it is limited by the max_input_time (default configuration: -1,
- unlimited; Ubuntu and several BSDs: 60 seconds) configuration
- parameter. If the max_input_time parameter is set to -1 (theoretically:
- unlimited), it is bound by the max_execution_time configuration
- parameter (default value: 30).
-
- On an i7 core, the 60 seconds take a string of multi-collisions of about
- 500k. 30 seconds of CPU time can be generated using a string of about
- 300k. This means that an attacker needs about 70-100kbit/s to keep one
- i7 core constantly busy. An attacker with a Gigabit connection can keep
- about 10.000 i7 cores busy.
-
- == ASP.NET ==
-
- ASP.NET uses the Request.Form object to provide POST data to a web
- application developer. This object is of class NameValueCollection. This
- uses a different hash function than the standard .NET one, namely
- CaseInsensitiveHashProvider.getHashCode(). This is the DJBX33X (Dan
- Bernstein's times 33, XOR) hash function on the uppercase version of the
- key, which is breakable using a meet-in-the-middle attack.
-
- CPU time is limited by the IIS webserver to a value of typically 90
- seconds. This allows an attacker with about 30kbit/s to keep one Core2
- core constantly busy. An attacker with a Gigabit connection can keep
- about 30.000 Core2 cores busy.
-
- == Java ==
-
- Java offers the HashMap and Hashtable classes, which use the
- String.hashCode() hash function. It is very similar to DJBX33A (instead
- of 33, it uses the multiplication constant 31 and instead of the start
- value 5381 it uses 0). Thus it is also vulnerable to an equivalent
- substring attack. When hashing a string, Java also caches the hash value
- in the hash attribute, but only if the result is different from zero.
- Thus, the target value zero is particularly interesting for an attacker
- as it prevents caching and forces re-hashing.
-
- Different web applications parse the POST data differently, but the ones
- tested (Tomcat, Geronimo, Jetty, Glassfish) all put the POST form data
- into either a Hashtable or HashMap object. The maximal POST sizes also
- differ from server to server, with 2 MB being the most common.
-
- A Tomcat 6.0.32 server parses a 2 MB string of colliding keys in about
- 44 minutes of i7 CPU time, so an attacker with about 6 kbit/s can keep
- one i7 core constantly busy. If the attacker has a Gigabit connection,
- he can keep about 100.000 i7 cores busy.
-
- == Python ==
-
- Python uses a hash function which is very similar to DJBX33X, which can
- be broken using a meet-in-the-middle attack. It operates on register
- size and is thus different for 64 and 32 bit machines. While generating
- multi-collisions efficiently is also possible for the 64 bit version of
- the function, the resulting colliding strings are too large to be
- relevant for anything more than an academic attack.
-
- Plone as the most prominent Python web framework accepts 1 MB of POST
- data, which it parses in about 7 minutes of CPU time in the worst case.
- This gives an attacker with about 20 kbit/s the possibility to keep one
- Core Duo core constantly busy. If the attacker is in the position to
- have a Gigabit line available, he can keep about 50.000 Core Duo cores
- busy.
-
- == Ruby ==
-
- The Ruby language consists of several implementations which do not share
- the same hash functions. It also differs in versions (1.8, 1.9), which −
- depending on the implementation − also do not necessarily share the same
- hash function.
-
- The hash function of CRuby 1.9 has been using randomization since 2008
- (a result of the algorithmic complexity attacks disclosed in 2003). The
- CRuby 1.8 function is very similar to DJBX33A, but the large
- multiplication constant of 65599 prevents an effective equivalent
- substring attack. The hash function can be easily broken using a meet-
- in-the-middle attack, though. JRuby uses the CRuby 1.8 hash function for
- both 1.8 and 1.9. Rubinius uses a different hash function but also does
- not randomize it.
-
- A typical POST size limit in Ruby frameworks is 2 MB, which takes about
- 6 hours of i7 CPU time to parse. Thus, an attacker with a single 850
- bits/s line can keep one i7 core busy. The other way around, an attacker
- with a Gigabit connection can keep about 1.000.000 (one million!) i7
- cores busy.
-
- == v8 ==
-
- Google's Javascript implementation v8 uses a hash function which looks
- different from the ones seen before, but can be broken using a meet-in-
- the-middle attack, too.
-
- Node.js uses v8 to run Javascript-based web applications. The
- querystring module parses POST data into a hash table structure.
-
- As node.js does not limit the POST size by default (we assume this would
- typically be the job of a framework), no effectiveness/efficiency
- measurements were performed.
-
- Impact:
-
- Any website running one of the above technologies which provides the
- option to perform a POST request is vulnerable to very effective DoS
- attacks.
-
- As the attack is just a POST request, it could also be triggered from
- within a (third-party) website. This means that a cross-site-scripting
- vulnerability on a popular website could lead to a very effective DDoS
- attack (not necessarily against the same website).
-
- Fixes:
-
- The Ruby Security Team was very helpful in addressing this issue and
- both CRuby and JRuby provide updates for this issue with a randomized
- hash function (CRuby 1.8.7-p357, JRuby 1.6.5.1, CVE-2011-4815).
-
- Oracle has decided there is nothing that needs to be fixed within Java
- itself, but will release an updated version of Glassfish in a future CPU
- (Oracle Security ticket S0104869).
-
- Tomcat has released updates (7.0.23, 6.0.35) for this issue which limit
- the number of request parameters using a configuration parameter. The
- default value of 10.000 should provide sufficient protection.
-
- Workarounds:
-
- For languages where no fixes have been issued (yet?), there are a number
- of workarounds.
-
- = Limiting CPU time =
-
- The easiest way to reduce the impact of such an attack is to reduce the
- CPU time that a request is allowed to take. For PHP, this can be
- configured using the max_input_time parameter. On IIS (for ASP.NET),
- this can be configured using the “shutdown time limit for processes”
- parameter.
-
- = Limiting maximal POST size =
-
- If you can live with the fact that users can not put megabytes of data
- into your forms, limiting the form size to a small value (in the 10s of
- kilobytes rather than the usual megabytes) can drastically reduce the
- impact of the attack as well.
-
- = Limiting maximal number of parameters =
-
- The updated Tomcat versions offer an option to reduce the amount of
- parameters accepted independent from the maximal POST size. Configuring
- this is also possible using the Suhosin version of PHP using the
- suhosin.{post|request}.max_vars parameters.
-
- ________________________________________________________________________
- Credits:
- Alexander Klink, n.runs AG
- Julian Wälde, Technische Universität Darmstadt
-
- The original theory behind this attack vector is described in the 2003
- Usenix Security paper “Denial of Service via Algorithmic Complexity
- Attacks” by Scott A. Crosby and Dan S. Wallach, Rice University
- ________________________________________________________________________
- References:
- This advisory and upcoming advisories:
- http://www.nruns.com/security_advisory.php
- ________________________________________________________________________
- About n.runs:
- n.runs AG is a vendor-independent consulting company specialising in the
- areas of: IT Infrastructure, IT Security and IT Business Consulting.
-
- Copyright Notice:
- Unaltered electronic reproduction of this advisory is permitted. For all
- other reproduction or publication, in printing or otherwise, contact
- security(at)nruns.com for permission. Use of the advisory constitutes
- acceptance for use in an “as is” condition. All warranties are excluded.
- In no event shall n.runs be liable for any damages whatsoever including
- direct, indirect, incidental, consequential, loss of business profits or
- special damages, even if n.runs has been advised of the possibility of
- such damages.
- Copyright 2011 n.runs AG. All rights reserved. Terms of use apply.
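The "Equivalent substrings" property described in the advisory can be checked with a small sketch. It assumes a 32-bit DJBX33A (start value 5381, multiply by 33, add the next character) written in Java; PHP's own implementation may use a different integer width, but the 'Ez' / 'FY' collision does not depend on the width because the two blocks contribute exactly the same amount (69 * 33 + 122 = 70 * 33 + 89 = 2399). The class name is invented for illustration.

```java
final class Djbx33aDemo {
    /** DJBX33A as described in the advisory: h = h * 33 + c, starting from 5381. */
    static int djbx33a(String s) {
        int h = 5381;
        for (int i = 0; i < s.length(); i++) {
            h = h * 33 + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String[] samples = {"Ez", "FY", "EzEz", "EzFY", "FYEz", "FYFY",
                            "prefixEzpostfix", "prefixFYpostfix"};
        for (String s : samples) {
            System.out.println(s + " -> " + djbx33a(s));
        }
    }
}
```

Strings of the same length built from the two blocks collide with each other, and swapping one block for the other inside a longer string leaves the hash unchanged, which is what lets an attacker produce 2^n colliding keys of length 2*n.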