Last Week's Tech Watch: The Power of Software

  • [Software; Competitions; Chess; Intel] The Power of Software #
    Deep Junior has in fact already won the championship four times. Running on its original 4-way AMD Opteron system it searched roughly 6 million nodes per second (MNPS); on a 2-way Intel Woodcrest it reaches 8.2 MNPS -- Woodcrest really is strong. Simply rebuilding with the Intel Compiler lifted that to 8.4 MNPS, and after applying the Intel Compiler's profile-guided optimizations this Woodcrest monster hit 9 MNPS, a genuine 50% gain over the AMD baseline. (A sketch of a typical profile-guided optimization build follows this list.)
  • [Security] Whitepapers #
    These papers are the output of the Honeynet Project. They discuss the tools, tactics, and motives of intruder groups.
  • [Computer Science] Classic Texts In Computer Science #
    Classic papers in computer science.
  • [dojo; ajax] Dojo.Book Chinese Edition (Chapter 1) #
    Riding the web2.0 wave, the big web vendors -- Google, Yahoo, and the rest -- are racing to release their own Ajax development kits, each hoping to own the Ajax standard. Who will prevail? Some favor YUI, some like GWT, and others turn up their noses at all of them, insisting that hand-written JavaScript is the only true way. Setting aside who wins that argument, what I'm offering here is my translation of the opening chapter of the developer book for the IBM-backed Dojo toolkit.
  • [web2.0; Database Technology] Database War Stories #9 (finis): Brian Aker of MySQL Responds #
    I didn't hear that flat files don't scale. What I heard is that some very big sites are saying that traditional databases don't scale, and that the evolution isn't from flat files to SQL databases, but from flat files to sophisticated custom file systems. Brian acknowledges that SQL vendors haven't solved the problem, but doesn't seem to think that anyone else has either.
  • [web2.0; Database Technology] Database War Stories #8: Findory and Amazon #
    Our read-only databases are flat files -- Berkeley DB to be specific -- and are replicated out using our own replication management tools to our webservers. This strategy gives us extremely fast access from the local filesystem. We make thousands of random accesses to this read-only data on each page serve; Berkeley DB offers the performance necessary to be able to still serve our personalized pages rapidly under this load. (A minimal read-only Berkeley DB lookup sketch follows this list.)
  • [web2.0; Database Technology] Database War Stories #7: Google File System and BigTable #
    Greg Linden of Findory wrote: 'I've been enjoying your series on O'Reilly Radar about database war stories at popular startups. I was thinking that it would be fantastic if you could get Jeff Dean or Adam Bosworth at Google to chat a little bit about their database issues. As you probably know, Jeff Dean was involved designing BigTable and the Google File System. Adam Bosworth wrote a much discussed post about the need for better, large scale, distributed databases.'
  • [web2.0; Database Technology] Database War Stories #6: O'Reilly Research #
    In building our Research data mart, which includes data on book sales trends, job postings, blog postings, and other data sources, Roger Magoulas has had to deal with a lot of very messy textual data, transforming it into something with enough structure to put it into a database. In this entry, he describes some of the problems, solutions, and the skills that are needed for dealing with unstructured data.
  • [web2.0; Database Technology] Database War Stories #5: craigslist #
    Do not expect FullText indexing to work on a very large table. It's just not fast enough for what users expect on the web, and updating rows will make bad things happen. We want forward-facing queries to be measured in a few 100ths of a second.
  • [web2.0; Database Technology] Database War Stories #4: NASA World Wind #
    Patrick Hogan of NASA World Wind, an open source program that does many of the same things as Google Earth, uses both flat files and SQL databases in his application. Flat files are used for quick response on the client side, while on the server side, SQL databases store both imagery (and, soon to come, vector files). However, he admits that 'using file stores, especially when a large number of files are present (millions), has proven to be fairly inconsistent across multiple OS and hardware platforms.'
  • [web2.0; Database Technology] Database War Stories #3: Flickr #
    I also asked for any information on the scale of data Flickr manages and its growth rates. Cal answered: total stored unique data: 935 GB; total stored duplicated data: ~3 TB.
  • [web2.0; Database Technology] Database War Stories #2: bloglines and memeorandum #
    Bloglines has several data stores, only a couple of which are managed by 'traditional' database tools (which in our case is Sleepycat). User information, including email address, password, and subscription data, is stored in one database. Feed information, including the name of the feed, description of the feed, and the various URLs associated with the feed, is stored in another database. The vast majority of data within Bloglines, however -- the 1.4 billion blog posts we've archived since we went on-line -- is stored in a data storage system that we wrote ourselves. This system is based on flat files that are replicated across multiple machines, somewhat like the system outlined in the Google File System paper, but much more specific to just our application. To round things out, we make extensive use of memcached to try to keep as much data in memory as possible to keep performance as snappy as possible. (A minimal memcached cache-aside sketch follows this list.)
  • [web2.0; Database Technology] Web 2.0 and Databases Part 1: Second Life #
    In this first installment, a few thoughts from Cory Ondrejka and Ian Wilkes of Linden Lab, creators of Second Life.
  • [Mathematics; Algorithms; People] The Beauty of Mathematics, Part 8 -- The Story of Jelinek and Modern Language Processing #
    Another major contribution that Jelinek, together with Bahl, Cocke, and Raviv, made to humanity is the BCJR algorithm, one of the two most widely used algorithms in digital communications today (the other being the Viterbi algorithm). Interestingly, the algorithm only came into wide use some twenty years after it was invented. IBM has since listed it as one of the company's greatest contributions to humanity and posted it on the wall of its Almaden lab in California. Regrettably, all four of the BCJR authors have long since left IBM; on one occasion when IBM's communications division needed the algorithm, they had to invite an expert from Stanford University to explain it. Seeing the roll of achievements in IBM's display case, that expert was filled with emotion.
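
On the Deep Junior item above: as a rough illustration of the profile-guided optimization step mentioned there, here is a minimal sketch of the classic three-step PGO build with the Intel C compiler (icc). The -prof-gen/-prof-use flags are the classic icc spellings; the toy node-counting search and its training input are stand-ins I made up, not anything from the actual engine.

    /* pgo_demo.c -- stand-in for a search kernel whose hot paths PGO can learn.
     *
     * Typical classic-icc PGO workflow:
     *   icc -O3 -prof-gen pgo_demo.c -o demo   # step 1: instrumented build
     *   ./demo                                 # step 2: run a training workload, dump profile
     *   icc -O3 -prof-use pgo_demo.c -o demo   # step 3: rebuild guided by the profile
     */
    #include <stdio.h>

    /* Toy "search": count nodes in a fixed-depth tree with a data-dependent
     * branching factor, so there is branch behavior for the profile to capture. */
    static long search(int depth, int branch)
    {
        int i;
        long nodes = 1;
        if (depth == 0)
            return 1;
        for (i = 0; i < branch; i++)
            nodes += search(depth - 1, (branch + i) % 3 + 1);
        return nodes;
    }

    int main(void)
    {
        long nodes = search(18, 3);   /* training workload for the -prof-gen run */
        printf("nodes visited: %ld\n", nodes);
        return 0;
    }

The win comes from the compiler seeing real branch and call frequencies during the training run, so it can lay out hot paths and make inlining decisions accordingly.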
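
On the Findory item above: a minimal sketch of the kind of read-only Berkeley DB point lookup described there, using the Berkeley DB C API. The file name readonly.db and the key are hypothetical, and error handling is abbreviated.

    /* bdb_lookup.c -- open a Berkeley DB file read-only and do one point lookup.
     * Build (assuming libdb is installed): cc bdb_lookup.c -ldb
     * The database file name and key below are illustrative only. */
    #include <stdio.h>
    #include <string.h>
    #include <db.h>

    int main(void)
    {
        DB *dbp;
        DBT key, data;
        int ret;

        if ((ret = db_create(&dbp, NULL, 0)) != 0) {
            fprintf(stderr, "db_create: %s\n", db_strerror(ret));
            return 1;
        }

        /* Open the locally replicated file read-only; every webserver process
         * can share it from the local filesystem. */
        if ((ret = dbp->open(dbp, NULL, "readonly.db", NULL,
                             DB_BTREE, DB_RDONLY, 0)) != 0) {
            dbp->err(dbp, ret, "open readonly.db");
            dbp->close(dbp, 0);
            return 1;
        }

        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        key.data = "user:12345";               /* hypothetical key */
        key.size = strlen("user:12345");

        ret = dbp->get(dbp, NULL, &key, &data, 0);
        if (ret == 0)
            printf("found %u bytes for key\n", (unsigned)data.size);
        else if (ret == DB_NOTFOUND)
            printf("key not found\n");
        else
            dbp->err(dbp, ret, "get");

        dbp->close(dbp, 0);
        return 0;
    }

The point of the pattern is that lookups never leave the local box, so thousands of them per page render stay cheap.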
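
On the Bloglines item above: a minimal cache-aside sketch with memcached, using the libmemcached C client. The server address, key, TTL, and load_from_store() fallback are assumptions for illustration; Bloglines' real client and flat-file store are their own.

    /* cache_aside.c -- check memcached first; on a miss, read the backing store
     * and populate the cache. Build (assuming libmemcached): cc cache_aside.c -lmemcached */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <libmemcached/memcached.h>

    /* Hypothetical slow path: a real system would read its flat-file store here. */
    static char *load_from_store(const char *key)
    {
        (void)key;
        return strdup("post body loaded from the flat-file store");
    }

    int main(void)
    {
        memcached_return_t rc;
        memcached_st *memc = memcached_create(NULL);
        memcached_server_st *servers =
            memcached_server_list_append(NULL, "127.0.0.1", 11211, &rc);
        memcached_server_push(memc, servers);
        memcached_server_list_free(servers);

        const char *key = "post:42";           /* hypothetical key */
        size_t value_len = 0;
        uint32_t flags = 0;

        char *value = memcached_get(memc, key, strlen(key), &value_len, &flags, &rc);
        if (rc != MEMCACHED_SUCCESS) {
            /* Cache miss: hit the backing store, then populate the cache
             * so the next read is served from memory. */
            value = load_from_store(key);
            value_len = strlen(value);
            memcached_set(memc, key, strlen(key), value, value_len,
                          (time_t)300 /* TTL in seconds */, (uint32_t)0);
        }

        printf("%.*s\n", (int)value_len, value);
        free(value);
        memcached_free(memc);
        return 0;
    }

The pattern keeps the hot working set in RAM while the replicated flat-file system remains the source of truth.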

For more tech updates, please visit my 365Key (RSS); you can subscribe via 365Key.
