The Gold Mine of Ideas in eBay's Architecture

Thursday, January 24, 2008, 11:53 P.M.

English source: http://www.manageability.org/blog/stuff/about-ebays-architecture

A reliable way of knowing what really works is to look at what works in practice. The software industry is plagued with ideas that are, for all intents and purposes, purely theoretical. Compounding the problem, software vendors continue to praise and sell these ideas as best practices.

 

Massively scalable architecture is one area that few practitioners have truly witnessed firsthand. Fortunately, information is sometimes graciously released for all to see and hear. I gained a lot of wisdom reading about Google's design of its hardware infrastructure and even Yahoo's page rendering patent. Now another internet behemoth, eBay, has given us some insight into its own architecture.

There are many pieces of information in this presentation. I'll try to highlight and comment on the ones that are unusual or interesting.

The impressive part is that eBay served 380M page views a day with a site availability of 99.92%. On top of that, nearly 30K lines of code changed per week. That is simply enviable and, beyond that, incontrovertible evidence of the scalability of Java.

Now for the details on how it was achieved using J2EE technologies. The highlights of eBay's scalability are as follows:

* Judicious use of server-side state
* No server affinity
* Functional server pools
* Horizontal and vertical database partitioning

What's interesting is how eBay enables data access scalability. They mention the use of "custom O-R mapping" with support for features like caching (local and global), lazy loading, fetch sets (deep and shallow), and retrieval and submission of update subsets. Furthermore, they use bean-managed transactions exclusively, autocommitted to the database, and use the O-R mapping to route to different data sources.
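To make two of those features concrete, here is a minimal sketch of lazy loading behind a local cache. It is not eBay's mapping layer, which the presentation does not show. The names `LazyItem`, `ItemMapper` and the loader function are hypothetical, and the loader stands in for a real database query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// An item whose description is fetched only on first access (lazy loading).
class LazyItem {
    private final long id;
    private final LongFunction<String> loader; // hypothetical stand-in for a database query
    private String description;
    private boolean loaded;

    LazyItem(long id, LongFunction<String> loader) {
        this.id = id;
        this.loader = loader;
    }

    long getId() { return id; }

    String getDescription() {
        if (!loaded) {                 // the query runs once, and only if someone asks
            description = loader.apply(id);
            loaded = true;
        }
        return description;
    }
}

// Hypothetical mapper with a local (per-process) cache keyed by item id.
class ItemMapper {
    private final Map<Long, LazyItem> localCache = new HashMap<>();
    private final LongFunction<String> loader;

    ItemMapper(LongFunction<String> loader) { this.loader = loader; }

    // Returns the cached object if present; otherwise creates an unloaded one.
    LazyItem find(long id) {
        return localCache.computeIfAbsent(id, key -> new LazyItem(key, loader));
    }

    public static void main(String[] args) {
        ItemMapper mapper = new ItemMapper(id -> "description of " + id);
        System.out.println(mapper.find(7L).getDescription()); // prints "description of 7"
    }
}
```

Calling `find` twice with the same id returns the same object, so the loader runs at most once per item in this process. A global cache shared between servers would need a separate component and is left out here.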

A few things are quite striking. The first is the complete absence of Entity Beans in favor of eBay's own O-R mapping solution (Hibernate anyone?). The second is the partitioning of application servers based on use cases. The third is that the databases are also partitioned based on use cases. The last is the stateless nature of the system and the conspicuous absence of clustering technologies.

Here's the quote about server state:

This basically means that right now we are not really using server-side state. We may use it; right now we have not found a good reason to use it. [snip] if there is something that needs to be stateful, then we put in the database; we go back and get it, if we need to. We just take the hit. We do not have to do clustering; we do not have to do any of that stuff.
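The idea in that quote can be sketched in a few lines. This is an illustration under assumptions, not eBay's code: `StateStore`, `InMemoryStateStore` and `StatelessServer` are hypothetical names, and the in-memory map stands in for the database only so that the sketch runs on its own.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the database that holds all conversational state.
interface StateStore {
    String load(String userId);
    void save(String userId, String value);
}

// In-memory implementation, used only so the sketch runs without a real database.
class InMemoryStateStore implements StateStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    public String load(String userId) { return rows.getOrDefault(userId, ""); }

    public void save(String userId, String value) { rows.put(userId, value); }
}

// A server with no per-user fields: every request reads and writes the store.
class StatelessServer {
    private final StateStore store;

    StatelessServer(StateStore store) { this.store = store; }

    // Appends an item to the user's watch list and returns the updated list.
    String addToWatchList(String userId, String itemId) {
        String current = store.load(userId);   // "take the hit" and go back to the database
        String updated = current.isEmpty() ? itemId : current + "," + itemId;
        store.save(userId, updated);
        return updated;
    }

    public static void main(String[] args) {
        StateStore db = new InMemoryStateStore();
        StatelessServer a = new StatelessServer(db);
        StatelessServer b = new StatelessServer(db);
        a.addToWatchList("alice", "item-1");
        // A different server handles the next request and still sees the state.
        System.out.println(b.addToWatchList("alice", "item-2")); // prints "item-1,item-2"
    }
}
```

Because neither server keeps anything between calls, consecutive requests from the same user can land on different machines, which is what makes server affinity and session clustering unnecessary.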

In short, save yourself the trouble of building stateful servers. Furthermore, forget about clustering; you simply may not need it. Now, read this about functional partitioning:

So we have a pool or a farm of machines that are dedicated to a specific use case; like search will have its own farm of machines, and we can tune those much differently because the footprint and the replay of those are much different than viewing an item, which is essentially a read-only use case, versus selling an item, which is read-mostly type of use case. [snip] Horizontal database partitioning is something that we have adopted in the last probably four or five years to really get the availability, and also scalability, that we need.

In short, forget about placing your application and database on one giant machine. Just use pools of servers that are dedicated on a use-case basis. Doesn't that sound awfully similar to Google's strategy?
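A dispatcher for such pools might look like the sketch below. The `UseCase` values, the host names and the round-robin policy are all assumptions made for illustration. The presentation says only that each use case gets its own farm of machines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical use cases, each served by its own pool of machines.
enum UseCase { SEARCH, VIEW_ITEM, SELL_ITEM }

class FunctionalPools {
    private final Map<UseCase, List<String>> pools = new EnumMap<>(UseCase.class);
    private final Map<UseCase, Integer> next = new EnumMap<>(UseCase.class);

    // Dedicates a list of hosts to one use case.
    synchronized void register(UseCase useCase, List<String> hosts) {
        pools.put(useCase, new ArrayList<>(hosts));
        next.put(useCase, 0);
    }

    // Round-robin inside the pool; any host will do because servers hold no state.
    synchronized String pick(UseCase useCase) {
        List<String> hosts = pools.get(useCase);
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalStateException("no pool registered for " + useCase);
        }
        int index = next.get(useCase);
        next.put(useCase, (index + 1) % hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        FunctionalPools pools = new FunctionalPools();
        pools.register(UseCase.SEARCH, Arrays.asList("search-1", "search-2"));
        pools.register(UseCase.VIEW_ITEM, Arrays.asList("view-1"));
        System.out.println(pools.pick(UseCase.SEARCH));    // prints "search-1"
        System.out.println(pools.pick(UseCase.SEARCH));    // prints "search-2"
        System.out.println(pools.pick(UseCase.VIEW_ITEM)); // prints "view-1"
    }
}
```

Each pool can then be sized and tuned for its own workload, and a surge in search traffic never competes with the selling flow for the same machines.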

A little bit more about horizontal partitioning:

What enables our horizontal scalability is content based routing. So, if imagine eBay has on any given day 60 million items. We do not want to store that in one behemoth Sun machine. [snip] let us scale it across; may be, many Sun machines, but how you get to the right one? There is the content-based routing idea that comes in play. So, the idea was that given some hint, find out which of my 20 physical database hosts do I need to go to. The other cool thing about this is that failover could be defined.
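Here is one way content-based routing with failover could look. It is a sketch under stated assumptions: the hint is a numeric item id, the primary host is chosen by modulo, and failover moves to the next host in the list. The quote does not say which hint, hash or failover policy eBay uses, and the host names are made up.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ContentRouter {
    private final List<String> hosts = new ArrayList<>();
    private final Set<String> down = new HashSet<>();

    // Creates hostCount hypothetical hosts named db-0, db-1, and so on.
    ContentRouter(int hostCount) {
        for (int i = 0; i < hostCount; i++) {
            hosts.add("db-" + i);
        }
    }

    void markDown(String host) { down.add(host); }

    // Maps the hint (an item id) to a primary host, then skips hosts that are down.
    // Assumes the next host in the list holds a replica of its neighbor's data.
    String hostFor(long itemId) {
        int primary = (int) Math.floorMod(itemId, (long) hosts.size());
        for (int step = 0; step < hosts.size(); step++) {
            String candidate = hosts.get((primary + step) % hosts.size());
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("all database hosts are down");
    }

    public static void main(String[] args) {
        ContentRouter router = new ContentRouter(20);
        System.out.println(router.hostFor(60_000_042L)); // prints "db-2"
        router.markDown("db-2");
        System.out.println(router.hostFor(60_000_042L)); // prints "db-3"
    }
}
```

The caller never names a database host. It supplies the hint and the router decides, so hosts can be added or a failover rule changed in one place.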

Finally, a word about using a more loosely coupled architecture in the future:

Using messaging to actually decouple disparate use cases is something that we are investigating.
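As a rough illustration of what decoupling through messaging means, the sketch below lets a selling flow announce new items without calling the search flow directly. An in-process `BlockingQueue` stands in for a real message broker, and the class and method names are hypothetical. The quote says only that eBay was investigating the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical channel between the selling use case and the search use case.
class ItemListedEvents {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Selling side: record the event and return at once, with no call into search.
    void publishItemListed(String itemId) {
        queue.add(itemId);
    }

    // Search side: collect whatever has arrived since the last call, in order.
    List<String> drainForIndexing() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        ItemListedEvents events = new ItemListedEvents();
        events.publishItemListed("item-1");
        events.publishItemListed("item-2");
        System.out.println(events.drainForIndexing()); // prints "[item-1, item-2]"
    }
}
```

The seller never waits for the indexer, and the indexer can fall behind or restart without failing a listing. That independence is the point of decoupling the two use cases.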

Isn't it strange that the original presentation was about J2EE Design Patterns? The key scalability ideas are only tangentially related to the patterns. Yes, eBay does use patterns to structure its code, but focusing on the patterns misses the entire picture. The key nuggets of wisdom are a stateless design, a flexible and highly tuned OR-mapping layer, and the partitioning of servers based on use cases. The design patterns are nice, but don't expect blind application of them to lead to scalability.

In general, the approach that eBay is alluding to (and Google has confirmed) is that architectures that consist of pools or farms of machines dedicated on a use-case basis will provide better scalability and availability as compared to a few behemoth machines. The vendors, of course, are gripped in fear about this conclusion for obvious reasons. Nevertheless, the biggest technical hurdle in deploying a large number of servers is, of course, none other than the need for manageability ;-)

Translated by 杨争 (Yang Zheng)

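A sound way to learn how something is done is to look at how it is done in practice. The software industry has long been troubled by ideas that exist only in theory. Meanwhile, software vendors keep selling these ideas to everyone as best practices.
Few software developers have witnessed a massively scalable architecture firsthand. Fortunately, material on the subject is sometimes made public for us to see and hear. I have read some good material on the design of Google's hardware infrastructure and on Yahoo's page rendering patent. Now another internet giant, eBay, has given us some information about its architecture (translator's note: this refers to the presentation whose title translates as "A Billion Hits a Day: A J2EE System Built on the Core J2EE Patterns Architecture").
The presentation contains a lot of information. However, I will comment only on the parts that are unusual or that interest me.
What impressed me is eBay's 99.92% site availability at 380M page views a day. On top of that, nearly 30,000 lines of code change every week, which clearly shows how scalable eBay's Java code is.
How did eBay achieve this with J2EE technologies? The highlights of eBay's scalability are as follows: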

* Judicious use of server-side state
* No server affinity
* Functional server pools
* Horizontal and vertical database partitioning

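How eBay achieves linear scalability of data access is very interesting. They mention using a "custom O-R mapping" to support local and global caching, lazy loading, fetch sets (deep and shallow), and retrieving and submitting update subsets. Moreover, they use only bean-managed transactions, with database autocommit, and use the O-R mapping to route to different data sources.
A few things are quite striking. First, eBay uses no Entity Beans at all, only its own O-R mapping tool (Hibernate anyone?). Second, application servers are partitioned by use case. Third, databases are also partitioned by use case. Last is the stateless nature of the system and the conspicuous absence of clustering technologies.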

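Here is the quote about server state:
Basically, we are not really using server-side state. We may use it, but right now we have not found a reason to use it. [snip] If something needs to be stateful, we put the state in the database and fetch it from the database when we need it. We do not have to use clustering, and so we do not have to do any work for clustering.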

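In short, you need not trouble yourself with building stateful servers. Furthermore, forget about clustering; you may not need it. Now look at functional partitioning: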

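We have a group or farm of machines running the application for one specific use case. Search, for example, has its own server farm, and we can tune each farm differently, because viewing an item, which is essentially a read-only use case, behaves differently at run time from selling an item, which is a read-mostly use case. [snip] For the past four or five years we have used horizontal database partitioning to get the availability and linear scalability we need.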

In short, do not put your application and database on one giant machine. Just use server pools, with each pool dedicated to one use case. Doesn't that sound like Google's strategy?

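Here is a bit more about horizontal partitioning:
Content-based routing is what makes the system's horizontal, linear scalability possible. Imagine that eBay has 60 million items on a given day. We do not want to store all of that data on one giant Sun server. [snip] Perhaps we can spread the data across many Sun servers, but how do we get to the data we need? This is where eBay's content-based routing comes in. Given some rule, it finds which of the 20 physical servers holds the data I need. Even cooler, a failover policy can be defined as well.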

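Finally, a word about moving to a more loosely coupled architecture in the future:

Using a messaging system to decouple different use cases is something we are investigating.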

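Isn't it strange that the original presentation was about J2EE design patterns? The key ideas behind linear scalability have almost nothing to do with the patterns. Yes, eBay uses design patterns to organize its code. However, overemphasizing the patterns loses sight of the whole picture. The key ideas in eBay's architecture are a stateless design, a flexible and highly tuned OR-mapping layer, and the partitioning of servers by use case. Design patterns are good, but do not expect them to make an application linearly scalable.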

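In short, the eBay and Google examples show that architectures built from server pools organized by use case prove more linearly scalable and more available than a few large machines. The vendors, of course, are afraid to hear this conclusion. Still, the biggest headache in deploying so many servers is how to manage them ;-)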

 

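My summary:
eBay uses design patterns to layer its architecture. The layers (presentation, business logic, and data access) are loosely coupled and have clear responsibilities, and the layering improves the extensibility of the code and the efficiency of development.
eBay uses a stateless design, a flexible and highly tuned OR-mapping layer, and the partitioning of servers by use case to keep applications loosely coupled and to improve the system's linear scalability.
Why require a system to scale linearly? So that when traffic to the site grows, we can raise the capacity of the whole site just by adding servers, without changing any of the system's code.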
