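[Analysis] Hash Collision Denial-of-Service Attacks

Recently, besides the plaintext-password security incidents in China, there has been another major event: Hash Collision DoS (denial-of-service attacks through hash collisions). A malicious party can exploit this weakness to make your server run extremely slowly. The weakness relies on the "non-randomness" of the hash algorithms in various languages: an attacker can craft a large number of keys that differ from one another but all produce the same hash value. That turns your hash table into a singly linked list and degrades the performance of your whole site or program by orders of magnitude (it can easily push your CPU to 100%).

At present the problem affects Java, JRuby, PHP, Python, Rubinius, Ruby and others. Mainly: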

  • Java, all versions
  • JRuby <= 1.6.5 (fixed in 1.6.5.1)
  • PHP <= 5.3.8, <= 5.4.0RC3 (fixed in 5.3.9, 5.4.0RC4)
  • Python, all versions
  • Rubinius, all versions
  • Ruby <= 1.8.7-p356 (fixed in 1.8.7-p357, 1.9.x)
  • Apache Geronimo, all versions
  • Apache Tomcat <= 5.5.34, <= 6.0.34, <= 7.0.22 (fixed in 5.5.35, 6.0.35, 7.0.23)
  • Oracle Glassfish <= 3.1.1 (fixed in mainline)
  • Jetty, all versions
  • Plone, all versions
  • Rack <= 1.3.5, <= 1.2.4, <= 1.1.2 (fixed in 1.4.0, 1.3.6, 1.2.5, 1.1.3)
  • V8 JavaScript Engine, all versions
  • ASP.NET without the MS11-100 patch

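Note that Perl does not have this problem, because Perl fixed it years ago. For updates to this list, see oCERT advisory 2011-003. Frustratingly, the issue was already reported in 2003 in the paper "Denial of Service via Algorithmic Complexity Attacks", but it does not seem to have drawn much attention, least of all from Java.

Explaining the attack

You may think this is no big deal because an attacker cannot see the hash algorithm. If so, you are mistaken, and it suggests your understanding of web programming does not yet go deep enough.

Whether you write back-end pages in JSP, PHP, Python or Ruby, when you handle HTTP POST data your back-end code can easily look up a form value by its field name, as in the following snippet: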

  
  
  
  
  $usrname = $_POST['username'];
  $passwd = $_POST['password'];

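How is this implemented? Behind it sits a hash map. So I can submit to your back end a form with 10K fields whose names I have carefully designed so that they all collide. When your web server or language runtime processes the form, it builds this hash map, and every time it inserts a field it first has to walk through all the fields already inserted, so your server's CPU jumps to 100%. If you think 10K fields are nothing, I simply send many such requests and your server quickly goes down.

An example may make this easier to understand:

If you have n values (v1, v2, v3, ... vn), they should be spread evenly across the buckets of the hash table; that is what makes it fast: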

  
  
  
  
  0 -> v2
  1 -> v4
  2 -> v1
  ...
  n -> v(x)

However, this attack lets me construct N values (dos1, dos2, ..., dosn) whose hash keys are all the same (that is, hash collisions), so your hash table ends up looking like this:

  
  
  
  
  0 -> dos1 -> dos2 -> dos3 -> ... -> dosn
  1 -> null
  2 -> null
  ...
  n -> null

And so a singly linked list appears. The O(1) lookup becomes O(n), and inserting n items becomes O(n^2). Think about what that does to performance.
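
The quadratic cost is easy to see with a toy chained hash table. The following is a minimal sketch, not how any particular runtime implements its tables: `ChainCostDemo` and `insertAll` are hypothetical names, the table has a fixed number of buckets and never resizes, and it simply counts how many key comparisons the inserts perform.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy table for illustration only; real hash maps resize and
// use different node structures.
class ChainCostDemo {

    /** Inserts every key into a chained table and returns the number of key comparisons made. */
    static long insertAll(List<String> keys, int buckets) {
        List<List<String>> table = new ArrayList<>();
        for (int i = 0; i < buckets; i++) {
            table.add(new ArrayList<>());
        }
        long comparisons = 0;
        for (String key : keys) {
            // Pick the bucket from the key's hash code.
            List<String> chain = table.get(Math.floorMod(key.hashCode(), buckets));
            boolean present = false;
            // Before inserting, the chain is scanned for a duplicate key.
            for (String existing : chain) {
                comparisons++;
                if (existing.equals(key)) {
                    present = true;
                    break;
                }
            }
            if (!present) {
                chain.add(key);
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // "a".."d" land in four different buckets: no comparisons are needed.
        System.out.println(insertAll(List.of("a", "b", "c", "d"), 16));
        // These four Java strings share one hash code, so they share one chain.
        System.out.println(insertAll(List.of("AaAa", "AaBB", "BBAa", "BBBB"), 16));
    }
}
```

With n distinct keys that all collide, the count is 0 + 1 + ... + (n-1) = n(n-1)/2, which is the O(n^2) behaviour described above.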

(If you have forgotten how hash tables are implemented, dig out your university data structures textbook.)

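Hash Collision DoS in detail

StackOverflow.com is a great site that every competent programmer should know. A quick search there turns up the post "Application vulnerability due to Non Random Hash Functions". I have excerpted some of it here.

First, the hash algorithms these languages use are "non-random". Shown below is the hash function that Java (Oracle) uses in HashMap to re-mix a key's hashCode():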

  
  
  
  
  static int hash(int h)
  {
      h ^= (h >>> 20) ^ (h >>> 12);
      return h ^ (h >>> 7) ^ (h >>> 4);
  }
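
Note that the function above is deterministic: it only re-mixes the bits of the hash code it is given, so two keys with equal `hashCode()` values still end up in the same bucket. A small sketch of that consequence follows. The class and method names are made up for illustration, the bucket selection `h & (buckets - 1)` for a power-of-two table size is an assumption of this sketch, and the strings "Aa" and "BB" are used because their `hashCode()` values are equal.

```java
// Hypothetical demo class; only hash() is taken from the snippet above.
class SpreadDemo {

    // The bit-mixing function quoted above, copied unchanged.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Assumed bucket selection for a table whose size is a power of two.
    static int bucketIndex(String key, int buckets) {
        return hash(key.hashCode()) & (buckets - 1);
    }

    public static void main(String[] args) {
        // Equal hashCode() values always give equal bucket indexes.
        System.out.println(bucketIndex("Aa", 16) == bucketIndex("BB", 16)); // prints true
    }
}
```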

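A "non-random" hash algorithm is one whose output can be predicted. For example:

1) In Java, the strings "Aa" and "BB" have the same hash code (or hash key); in other words, they collide.

2) So we can use these two seeds to generate more strings that share one hash key, such as "AaAa", "AaBB", "BBAa", "BBBB". This is the first iteration. It is just a matter of enumerating combinations, which a small program can do.

3) Then we can use these four strings of length 4 to build strings of length 8, as shown below: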

 "AaAaAaAa", "AaAaBBBB", "AaAaAaBB", "AaAaBBAa",

"BBBBAaAa", "BBBBBBBB", "BBBBAaBB", "BBBBBBAa",

"AaBBAaAa", "AaBBBBBB", "AaBBAaBB", "AaBBBBAa",

"BBAaAaAa", "BBAaBBBB", "BBAaAaBB", "BBAaBBAa",

4) In the same way we can generate strings of length 16, or of length 256. In short, it is easy to produce as many such values as we like.
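
The steps above can be sketched in a few lines of Java. This is only an illustration: `CollisionGenerator` and `generate` are hypothetical names. It relies on `String.hashCode()` being computed character by character as h = 31*h + c, which is why equal-length blocks with equal hash codes ("Aa" and "BB") can be swapped freely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration.
class CollisionGenerator {

    /** Returns all 2^rounds strings made of `rounds` blocks, each block being "Aa" or "BB". */
    static List<String> generate(int rounds) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < rounds; i++) {
            // Every existing string is extended with each of the two seeds.
            List<String> next = new ArrayList<>(result.size() * 2);
            for (String s : result) {
                next.add(s + "Aa");
                next.add(s + "BB");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = generate(4); // 16 strings of length 8
        int expected = keys.get(0).hashCode();
        for (String key : keys) {
            if (key.hashCode() != expected) {
                throw new AssertionError("unexpected hash for " + key);
            }
        }
        System.out.println(keys.size() + " distinct keys share hashCode " + expected);
    }
}
```

Each extra round doubles the number of colliding keys, so k rounds give 2^k keys of length 2k.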

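To attack, I only need to put this data into an HTTP POST form and write a program that submits the form in an endless loop. A browser is all you need. To be more cunning, turn the form into a cross-site script, find sites with cross-site scripting holes and plant it there; then, through the reach of social networks, many users will end up attacking a given server for you from many different IPs.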

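Defense

There are a few ways to defend against such attacks:

  • Apply the patches that change the hash algorithm.
  • Limit the number of POST parameters and the length of POST requests.
  • Ideally, also have a firewall detect abnormal requests.

For lower-level or other forms of the attack, though, things may be a bit more troublesome.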
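
As an illustration of the second measure, the sketch below parses a URL-encoded form body but refuses to continue once a parameter cap is exceeded, so a hostile request is rejected before its keys pile up in the map. `BoundedFormParser`, `parse` and the cap values are hypothetical and are not any server's actual API; real servers typically expose such a limit as a configuration setting. The sketch assumes Java 10 or later for `URLDecoder.decode(String, Charset)`.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not taken from any server's code base.
class BoundedFormParser {

    /** Parses "a=1&b=2" style data, rejecting bodies with more than maxParams pairs. */
    static Map<String, String> parse(String body, int maxParams) {
        Map<String, String> params = new HashMap<>();
        int count = 0;
        int start = 0;
        while (start < body.length()) {
            int amp = body.indexOf('&', start);
            int end = (amp == -1) ? body.length() : amp;
            if (end > start) { // skip empty segments such as "&&"
                // Enforce the cap before the pair is put into the map.
                if (++count > maxParams) {
                    throw new IllegalArgumentException("too many form parameters");
                }
                String pair = body.substring(start, end);
                int eq = pair.indexOf('=');
                String name = (eq == -1) ? pair : pair.substring(0, eq);
                String value = (eq == -1) ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                           URLDecoder.decode(value, StandardCharsets.UTF_8));
            }
            start = end + 1;
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("username=alice&password=s3cret", 100));
        try {
            parse("a=1&b=2&c=3", 2);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A cap like this does not remove the collisions; it only bounds n, and with it the n(n-1)/2 worst-case comparisons per request.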


------------------------------------------------------------------------------------------------------------------------------------------------------


US-CERT is aware of reports stating that multiple programming language implementations, including web platforms, are vulnerable to hash table collision attacks. This vulnerability could be used by an attacker to launch a denial-of-service attack against websites using affected products.

The Ruby Security Team has updated Ruby 1.8.7. The Ruby 1.9 series is not affected by this attack. Additional information can be found in the ruby 1.8.7 patchlevel 357 release notes.

Microsoft has released a security advisory for ASP.NET containing a workaround. Additional information can be found in Microsoft Security Advisory 2659883.

More information regarding this vulnerability can be found in US-CERT Vulnerability Note VU#903934 and n.runs Security Advisory n.runs-SA-2011.004.

---
sebug

Languages and versions currently known to be affected:
Java, all versions
JRuby <= 1.6.5
PHP <= 5.3.8, <= 5.4.0RC3
Python, all versions
Rubinius, all versions
Ruby <= 1.8.7-p356
Apache Geronimo, all versions
Apache Tomcat <= 5.5.34, <= 6.0.34, <= 7.0.22
Oracle Glassfish <= 3.1.1
Jetty, all versions
Plone, all versions
Rack, all versions
V8 JavaScript Engine, all versions

Languages that are unaffected or have fixed versions:
PHP >= 5.3.9, >= 5.4.0RC4
JRuby >= 1.6.5.1
Ruby >= 1.8.7-p357, 1.9.x
Apache Tomcat >= 5.5.35, >= 6.0.35, >= 7.0.23
Oracle Glassfish, N/A (Oracle reports that the issue is fixed in the main codeline and scheduled for a future CPU)

CVE: CVE-2011-4885 (PHP), CVE-2011-4461 (Jetty), CVE-2011-4838 (JRuby), CVE-2011-4462 (Plone), CVE-2011-4815 (Ruby)

---

===================================

n.runs AG
http://www.nruns.com/                            security(at)nruns.com
n.runs-SA-2011.004                                         28-Dec-2011
________________________________________________________________________
Vendors:           PHP, http://www.php.net
                   Oracle, http://www.oracle.com
                   Microsoft, http://www.microsoft.com
                   Python, http://www.python.org
                   Ruby, http://www.ruby.org
                   Google, http://www.google.com
Affected Products: PHP 4 and 5
                   Java
                   Apache Tomcat
                   Apache Geronimo
                   Jetty
                   Oracle Glassfish
                   ASP.NET
                   Python
                   Plone
                   CRuby 1.8, JRuby, Rubinius
                   v8
Vulnerability:     Denial of Service through hash table
                   multi-collisions
Tracking IDs:      oCERT-2011-003
                   CERT VU#903934
________________________________________________________________________
Vendor communication:
2011/11/01 Coordinated notification to PHP, Oracle, Python, Ruby, Google
           via oCERT
2011/11/29 Coordinated notification to Microsoft via CERT

Various communication with the vendors for clarifications, distribution
of PoC code, discussion of fixes, etc.
___________________________________________________________________________
Overview:

Hash tables are a commonly used data structure in most programming
languages. Web application servers or platforms commonly parse
attacker-controlled POST form data into hash tables automatically, so
that they can be accessed by application developers.

If the language does not provide a randomized hash function or the
application server does not recognize attacks using multi-collisions, an
attacker can degenerate the hash table by sending lots of colliding
keys. The algorithmic complexity of inserting n elements into the table
then goes to O(n**2), making it possible to exhaust hours of CPU time
using a single HTTP request.

This issue has been known since at least 2003 and has influenced Perl
and CRuby 1.9 to change their hash functions to include randomization.

We show that PHP 5, Java, ASP.NET as well as v8 are fully vulnerable to
this issue and PHP 4, Python and Ruby are partially vulnerable,
depending on version or whether the server running the code is a 32 bit
or 64 bit machine.

Description:

= Theory =

Most hash functions used in hash table implementations can be broken
faster than by using brute-force techniques (which is feasible for hash
functions with 32 bit output, but very expensive for 64 bit functions)
by using one of two "tricks": equivalent substrings or a
meet-in-the-middle attack.

== Equivalent substrings ==

Some hash functions have the property that if two strings collide, e.g.
hash('string1') = hash('string2'), then hashes having this substring at
the same position collide as well, e.g. hash('prefixstring1postfix') =
hash('prefixstring2postfix'). If for example 'Ez' and 'FY' collide under
a hash function with this property, then 'EzEz', 'EzFY', 'FYEz', 'FYFY'
collide as well. An observing reader may notice that this is very
similar to binary counting from zero to four. Using this knowledge, an
attacker can construct arbitrary numbers of collisions (2^n for
2*n-sized strings in this example).

== Meet-in-the-middle attack ==

If equivalent substrings are not present in a given hash function, then
brute-force seems to be the only solution. The obvious way to best use
brute-force would be to choose a target value and hash random
(fixed-size) strings and store those which hash to the target value. For
a non-biased hash function with 32 bit output length, the probability of
hitting a target in this way is 1/(2^32).

A meet-in-the-middle attack now tries to hit more than one target at a
time. If the hash function can be inverted and the internal state of the
hash function has the same size as the output, one can split the string
into two parts, a prefix (of size n) and a postfix (of size m). One can
now iterate over all possible m-sized postfix strings and calculate the
intermediate value under which the hash function maps to a certain
target. If one stores these strings and corresponding intermediate value
in a lookup table, one can now generate random n-sized prefix strings
and see if they map to one of the intermediate values in the lookup
table. If this is the case, the complete string will map to the target
value.

Splitting in the middle reduces the complexity of this attack by the
square root, which gives us the probability of 1/(2^16) for a collision,
thus enabling an attacker to generate multi-collisions much faster.

The hash functions we looked at which were vulnerable to an equivalent
substring attack were all vulnerable to a meet-in-the-middle attack as
well. In this case, the meet-in-the-middle attack provides more
collisions for strings of a fixed size than the equivalent substring
attack.

= The real world =

The different languages use different hash functions which suffer from
different problems. They also differ in how they use hash tables in
storing POST form data.

== PHP 5 ==

PHP 5 uses the DJBX33A (Dan Bernstein's times 33, addition) hash
function and parses POST form data into the $_POST hash table. Because
of the structure of the hash function, it is vulnerable to an equivalent
substring attack.

The maximal POST request size is typically limited to 8 MB, which when
filled with a set of multi-collisions would consume about four hours of
CPU time on an i7 core. Luckily, this time can not be exhausted because
it is limited by the max_input_time (default configuration: -1,
unlimited; Ubuntu and several BSDs: 60 seconds) configuration
parameter. If the max_input_time parameter is set to -1 (theoretically:
unlimited), it is bound by the max_execution_time configuration
parameter (default value: 30).

On an i7 core, the 60 seconds take a string of multi-collisions of about
500k. 30 seconds of CPU time can be generated using a string of about
300k. This means that an attacker needs about 70-100kbit/s to keep one
i7 core constantly busy. An attacker with a Gigabit connection can keep
about 10.000 i7 cores busy.

== ASP.NET ==

ASP.NET uses the Request.Form object to provide POST data to a web
application developer. This object is of class NameValueCollection. This
uses a different hash function than the standard .NET one, namely
CaseInsensitiveHashProvider.getHashCode(). This is the DJBX33X (Dan
Bernstein's times 33, XOR) hash function on the uppercase version of the
key, which is breakable using a meet-in-the-middle attack.

CPU time is limited by the IIS webserver to a value of typically 90
seconds. This allows an attacker with about 30kbit/s to keep one Core2
core constantly busy. An attacker with a Gigabit connection can keep
about 30.000 Core2 cores busy.

== Java ==

Java offers the HashMap and Hashtable classes, which use the
String.hashCode() hash function. It is very similar to DJBX33A (instead
of 33, it uses the multiplication constant 31 and instead of the start
value 5381 it uses 0). Thus it is also vulnerable to an equivalent
substring attack. When hashing a string, Java also caches the hash value
in the hash attribute, but only if the result is different from zero.
Thus, the target value zero is particularly interesting for an attacker
as it prevents caching and forces re-hashing.

Different web applications parse the POST data differently, but the ones
tested (Tomcat, Geronimo, Jetty, Glassfish) all put the POST form data
into either a Hashtable or HashMap object. The maximal POST sizes also
differ from server to server, with 2 MB being the most common.

A Tomcat 6.0.32 server parses a 2 MB string of colliding keys in about
44 minutes of i7 CPU time, so an attacker with about 6 kbit/s can keep
one i7 core constantly busy. If the attacker has a Gigabit connection,
he can keep about 100.000 i7 cores busy.

== Python ==

Python uses a hash function which is very similar to DJBX33X, which can
be broken using a meet-in-the-middle attack. It operates on register
size and is thus different for 64 and 32 bit machines. While generating
multi-collisions efficiently is also possible for the 64 bit version of
the function, the resulting colliding strings are too large to be
relevant for anything more than an academic attack.

Plone as the most prominent Python web framework accepts 1 MB of POST
data, which it parses in about 7 minutes of CPU time in the worst case.
This gives an attacker with about 20 kbit/s the possibility to keep one
Core Duo core constantly busy. If the attacker is in the position to
have a Gigabit line available, he can keep about 50.000 Core Duo cores
busy.

== Ruby ==

The Ruby language consists of several implementations which do not share
the same hash functions. It also differs in versions (1.8, 1.9), which
depending on the implementation also do not necessarily share the same
hash function.

The hash function of CRuby 1.9 has been using randomization since 2008
(a result of the algorithmic complexity attacks disclosed in 2003). The
CRuby 1.8 function is very similar to DJBX33A, but the large
multiplication constant of 65599 prevents an effective equivalent
substring attack. The hash function can be easily broken using a
meet-in-the-middle attack, though. JRuby uses the CRuby 1.8 hash
function for both 1.8 and 1.9. Rubinius uses a different hash function
but also does not randomize it.

A typical POST size limit in Ruby frameworks is 2 MB, which takes about
6 hours of i7 CPU time to parse. Thus, an attacker with a single 850
bits/s line can keep one i7 core busy. The other way around, an attacker
with a Gigabit connection can keep about 1.000.000 (one million!) i7
cores busy.

== v8 ==

Google's Javascript implementation v8 uses a hash function which looks
different from the ones seen before, but can be broken using a
meet-in-the-middle attack, too.

Node.js uses v8 to run Javascript-based web applications. The
querystring module parses POST data into a hash table structure.

As node.js does not limit the POST size by default (we assume this would
typically be the job of a framework), no effectiveness/efficiency
measurements were performed.

Impact:

Any website running one of the above technologies which provides the
option to perform a POST request is vulnerable to very effective DoS
attacks.

As the attack is just a POST request, it could also be triggered from
within a (third-party) website. This means that a cross-site-scripting
vulnerability on a popular website could lead to a very effective DDoS
attack (not necessarily against the same website).

Fixes:

The Ruby Security Team was very helpful in addressing this issue and
both CRuby and JRuby provide updates for this issue with a randomized
hash function (CRuby 1.8.7-p357, JRuby 1.6.5.1, CVE-2011-4815).

Oracle has decided there is nothing that needs to be fixed within Java
itself, but will release an updated version of Glassfish in a future CPU
(Oracle Security ticket S0104869).

Tomcat has released updates (7.0.23, 6.0.35) for this issue which limit
the number of request parameters using a configuration parameter. The
default value of 10.000 should provide sufficient protection.

Workarounds:

For languages where no fixes have been issued (yet?), there are a number
of workarounds.

= Limiting CPU time =

The easiest way to reduce the impact of such an attack is to reduce the
CPU time that a request is allowed to take. For PHP, this can be
configured using the max_input_time parameter. On IIS (for ASP.NET),
this can be configured using the "shutdown time limit for processes"
parameter.

= Limiting maximal POST size =

If you can live with the fact that users can not put megabytes of data
into your forms, limiting the form size to a small value (in the 10s of
kilobytes rather than the usual megabytes) can drastically reduce the
impact of the attack as well.

= Limiting maximal number of parameters =

The updated Tomcat versions offer an option to reduce the amount of
parameters accepted independent from the maximal POST size. Configuring
this is also possible using the Suhosin version of PHP using the
suhosin.{post|request}.max_vars parameters.

________________________________________________________________________
Credits:
Alexander Klink, n.runs AG
Julian Wälde, Technische Universität Darmstadt

The original theory behind this attack vector is described in the 2003
Usenix Security paper "Denial of Service via Algorithmic Complexity
Attacks" by Scott A. Crosby and Dan S. Wallach, Rice University
________________________________________________________________________
References:
This advisory and upcoming advisories:
http://www.nruns.com/security_advisory.php
________________________________________________________________________
About n.runs:
n.runs AG is a vendor-independent consulting company specialising in the
areas of: IT Infrastructure, IT Security and IT Business Consulting.

Copyright Notice:
Unaltered electronic reproduction of this advisory is permitted. For all
other reproduction or publication, in printing or otherwise, contact
security(at)nruns.com for permission. Use of the advisory constitutes
acceptance for use in an "as is" condition. All warranties are excluded.
In no event shall n.runs be liable for any damages whatsoever including
direct, indirect, incidental, consequential, loss of business profits or
special damages, even if n.runs has been advised of the possibility of
such damages.
Copyright 2011 n.runs AG. All rights reserved. Terms of use apply.
