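I have been studying scalability and performance lately and spending my spare time on InfoQ. Here are some points, taken from the articles I read there, that I agree with:
Performance and Scalability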
Performance is about the resources used to service a single request. Scalability is about how resource consumption grows when you have to service more (or larger) requests.
We define performance as how rapidly an operation (or set of operations) completes, e.g. response time or number of events processed per second, whilst scalability is how well the application can be scaled up to handle greater usage demands (e.g. number of users, request rates, volume of data).
Decrease processing time
Collocation : reduce any overheads associated with fetching data required for a piece of work, by collocating the data and the code.
Caching : if the data and the code can't be collocated, cache the data to reduce the overhead of fetching it over and over again.
Pooling : reduce the overhead associated with using expensive resources by pooling them.
Parallelization : decrease the time taken to complete a unit of work by decomposing the problem and parallelizing the individual steps.
Partitioning : concentrate related processing as close together as possible, by partitioning the code and collocating related partitions.
Remoting : reduce the amount of time spent accessing remote services by, for example, making the interfaces more coarse-grained. It's also worth remembering that remote vs local is an explicit design decision not a switch and to consider the first law of distributed computing - do not distribute your objects.
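As an illustration of the caching point above, here is a minimal sketch using Python's standard `functools.lru_cache`. The function `fetch_from_remote` and the key names are hypothetical stand-ins for whatever expensive lookup the application really performs; they are not taken from the original articles.

```python
from functools import lru_cache

fetch_count = 0  # counts how often the slow path actually runs


def fetch_from_remote(key: str) -> str:
    """Hypothetical stand-in for an expensive remote lookup."""
    global fetch_count
    fetch_count += 1
    return f"value-for-{key}"


@lru_cache(maxsize=1024)
def get_value(key: str) -> str:
    # Repeated calls with the same key are answered from the in-process cache.
    return fetch_from_remote(key)


get_value("item-1")
get_value("item-1")  # served from the cache, no second fetch
get_value("item-2")
```

After these three calls the slow path has run only twice. The trade-off is staleness: this kind of cache suits data that changes rarely, and it needs an invalidation strategy otherwise.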
Requirements must be known
Target average and peak performance (e.g. response time, latency).
Target average and peak load (e.g. concurrent users, message volumes).
Acceptable limits for performance and scalability.
Partition by Function
The more decoupled unrelated functionality is, the more flexibility you will have to scale each piece independently of the others.
At the database tier, we follow much the same approach. This approach allows us to scale the database infrastructure for each type of data independently of the others.
Split Horizontally
Different use cases use different schemes for partitioning the data: some are based on a simple modulo of a key (item ids ending in 1 go to one host, those ending in 2 go to the next, etc.), some on a range of ids (0-1M, 1-2M, etc.), some on a lookup table, some on a combination of these strategies. Regardless of the details of the partitioning scheme, though, the general idea is that an infrastructure which supports partitioning and repartitioning of data will be far more scalable than one which does not.
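The three schemes mentioned above (modulo of a key, id ranges, lookup table) can be sketched roughly as follows. The host names, range boundaries, and lookup entries are made-up values for illustration only.

```python
import bisect

# Hypothetical database hosts; a real deployment would load these from config.
HOSTS = ["db0", "db1", "db2", "db3"]


def by_modulo(item_id: int) -> str:
    """Simple modulo of the key picks the host."""
    return HOSTS[item_id % len(HOSTS)]


# Exclusive upper bound of each host's id range: 0-1M, 1M-2M, 2M-3M, 3M-4M.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000, 4_000_000]


def by_range(item_id: int) -> str:
    """Range of ids picks the host."""
    if item_id < 0:
        raise ValueError("item_id must be non-negative")
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, item_id)
    if index >= len(HOSTS):
        raise ValueError(f"no partition configured for id {item_id}")
    return HOSTS[index]


# Explicit lookup table, useful when individual keys must be moved by hand.
LOOKUP = {"seller-42": "db2"}


def by_lookup(key: str, default: str = "db0") -> str:
    """Lookup table picks the host, falling back to a default."""
    return LOOKUP.get(key, default)
```

Note that plain modulo remaps most keys when the number of hosts changes, which is one reason the ability to repartition matters as much as the scheme itself.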
Avoid Distributed Transactions
The pragmatic answer is to relax your transactional guarantees across unrelated systems.
It turns out that you can't have everything. In particular, guaranteeing immediate consistency across multiple systems or partitions is typically neither required nor possible. The CAP theorem, postulated almost 10 years ago by Inktomi's Eric Brewer, states that of three highly desirable properties of distributed systems - consistency (C), availability (A), and partition-tolerance (P) - you can only choose two at any one time. For a high-traffic web site, we have to choose partition-tolerance, since it is fundamental to scaling. For a 24x7 web site, we typically choose availability. So immediate consistency has to give way.
We do employ various techniques to help the system reach eventual consistency: careful ordering of database operations, asynchronous recovery events, and reconciliation or settlement batches. We choose the technique according to the consistency demands of the particular use case.
Decouple Functions Asynchronously
The next key element to scaling is the aggressive use of asynchrony. If component A calls component B synchronously, A and B are tightly coupled, and that coupled system has a single scalability characteristic -- to scale A, you must also scale B. Equally problematic is its effect on availability. Going back to Logic 101, if A implies B, then not-B implies not-A. In other words, if B is down then A is down. By contrast, if A and B integrate asynchronously, whether through a queue, multicast messaging, a batch process, or some other means, each can be scaled independently of the other. Moreover, A and B now have independent availability characteristics - A can continue to move forward even if B is down or distressed.
At every level, decomposing the processing into stages or phases, and connecting them up asynchronously, is critical to scaling.
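A small sketch of the queue-based variant, using only the standard library. The names `component_a`, `component_b`, and the `order-N` events are illustrative assumptions; in production the in-process queue would typically be a durable message broker.

```python
import queue
import threading

events = queue.Queue()  # the only thing A and B share
processed = []


def component_a(count: int) -> None:
    """Producer: enqueues work and returns without waiting for B."""
    for i in range(count):
        events.put(f"order-{i}")


def component_b() -> None:
    """Consumer: drains the queue at its own pace."""
    while True:
        event = events.get()
        if event is None:  # sentinel value tells the consumer to stop
            break
        processed.append(event.upper())


component_a(3)  # A completes even though B has not been started yet

worker = threading.Thread(target=component_b)
worker.start()
events.put(None)
worker.join()
```

Because A only talks to the queue, B can be stopped, restarted, or run as several workers without A noticing, which is the independent availability described above.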
Virtualize At All Levels
Virtualization and abstraction are everywhere, following the old computer science aphorism that the solution to every problem is another level of indirection. The operating system abstracts the hardware. The virtual machine in many modern languages abstracts the operating system. Object-relational mapping layers abstract the database. Load-balancers and virtual IPs abstract network endpoints. As we scale our infrastructure through partitioning by function and data, an additional level of virtualization of those partitions becomes critical.
The motivation here is not only programmer convenience, but also operational flexibility. Hardware and software systems fail, and requests need to be re-routed. Components, machines, and partitions are added, moved, and removed. With judicious use of virtualization, higher levels of your infrastructure are blissfully unaware of these changes, and you are therefore free to make them. Virtualization makes scaling the infrastructure possible because it makes scaling manageable.
Cache Appropriately
The most obvious opportunities for caching come with slow-changing, read-mostly data - metadata, configuration, and static data.
More challenging is rapidly-changing, read-write data. For the most part, we intentionally sidestep these challenges at eBay.
Scalability Worst Practices
The Golden Hammer
Forcing a particular technology to work in ways it was not intended is sometimes counter-productive.
Resource Abuse
Dependencies
Dependencies are a necessary evil in most systems and failure to manage dependencies and their versions diligently can inhibit agility and scalability.
Dependency management for code has different flavors:
1) Compile the entire codebase together
2) Pick and choose components and services based on known versions
3) Publish models and services comprised of only backwards-compatible changes
Forgetting to check the time
To properly scale a system it is imperative to manage the time allotted for requests to be handled.
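One way to manage the time allotted to a request is to give it an explicit budget and hand the remaining time to each downstream step. The sketch below assumes each step is a callable that accepts its own timeout in seconds; the `Deadline` class and `handle_request` function are hypothetical names, not an API from the original article.

```python
import time


class Deadline:
    """Tracks how much of a request's time budget is left."""

    def __init__(self, budget_seconds: float) -> None:
        self._expires_at = time.monotonic() + budget_seconds

    def remaining(self) -> float:
        return max(0.0, self._expires_at - time.monotonic())

    def expired(self) -> bool:
        return self.remaining() == 0.0


def handle_request(steps, budget_seconds: float):
    """Runs each step in order, failing fast once the budget is spent."""
    deadline = Deadline(budget_seconds)
    results = []
    for step in steps:
        if deadline.expired():
            raise TimeoutError("request exceeded its time budget")
        # Each step gets the remaining time to use as its own timeout.
        results.append(step(deadline.remaining()))
    return results
```

Failing fast frees threads and connections for requests that can still succeed, instead of letting slow calls pile up behind a struggling dependency.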
Runtime
The ability to easily deploy and operate the system in a production environment must be held in equal regard. There are a number of worst practices which jeopardize the scalability of a system.
1) Hero Pattern
2) Not Automating
3) Not Monitoring
NOSQL (Not Only SQL) Alternatives
Partition the Data
By partitioning the data, we minimize the impact of a failure, and we distribute the load for both write and read operations. If only one node fails, the data belonging to that node is impacted, but not the entire data store.
Keep Multiple Replicas of the Same Data
Most of the NOSQL implementations rely on hot-backup copies of the data to ensure continuous high availability.
The most common configuration with GigaSpaces is synchronous replication to the backup and asynchronous replication to the backend storage.
Dynamic Scaling
In order to handle the continuous growth of data, most NOSQL alternatives provide a way of growing your data cluster without bringing the cluster down or forcing a complete re-partitioning.
One algorithm notifies the neighbors of a certain partition that a node joined or failed. Only those neighbor nodes are impacted by that change, not the entire cluster.
Another (and significantly simpler) algorithm uses logical partitions. With logical partitions, the number of partitions is fixed, but the distribution of partitions between machines is dynamic.
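The logical-partition idea can be sketched as below. The partition count, the node names, and the rebalancing rule are assumptions chosen for illustration; real products use their own placement algorithms.

```python
import zlib

NUM_LOGICAL_PARTITIONS = 12  # fixed for the lifetime of the cluster


def logical_partition(key: str) -> int:
    """A key always maps to the same logical partition."""
    return zlib.crc32(key.encode("utf-8")) % NUM_LOGICAL_PARTITIONS


def assign(nodes):
    """Initial round-robin placement of logical partitions on machines."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_LOGICAL_PARTITIONS)}


def add_node(assignment, new_node):
    """Moves just enough partitions to the new node to even out the load."""
    updated = dict(assignment)
    nodes = sorted(set(updated.values())) + [new_node]
    target = NUM_LOGICAL_PARTITIONS // len(nodes)
    counts = {node: 0 for node in nodes}
    for owner in updated.values():
        counts[owner] += 1
    for partition in sorted(updated):
        if counts[new_node] >= target:
            break
        owner = updated[partition]
        if counts[owner] > target:  # only take from overloaded nodes
            updated[partition] = new_node
            counts[owner] -= 1
            counts[new_node] += 1
    return updated


before = assign(["n1", "n2", "n3"])
after = add_node(before, "n4")
moved = sum(1 for p in before if before[p] != after[p])
```

With three nodes growing to four, only 3 of the 12 logical partitions change owner, and no key ever changes its logical partition, so the rest of the data stays where it is.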
Use Map/Reduce to Handle Aggregation
Map/Reduce is a model that is often used to perform complex analytics and is often associated with Hadoop. Having said that, it is important to note that map/reduce is often referred to as a pattern for parallel aggregated queries.
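Seen as a pattern for parallel aggregated queries, map/reduce can be sketched in a few lines without Hadoop. The sample data and function names are invented for the example: each partition computes a partial aggregate (map), and the partial results are then merged (reduce).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical sales rows (category, amount), already split across partitions.
PARTITIONS = [
    [("books", 12.0), ("toys", 5.0)],
    [("books", 8.0)],
    [("toys", 7.5), ("games", 20.0)],
]


def map_partition(rows):
    """Map step: aggregate locally, next to the partition's own data."""
    totals = {}
    for category, amount in rows:
        totals[category] = totals.get(category, 0.0) + amount
    return totals


def merge(left, right):
    """Reduce step: combine two partial aggregates into one."""
    merged = dict(left)
    for category, amount in right.items():
        merged[category] = merged.get(category, 0.0) + amount
    return merged


def total_by_category(partitions):
    # Run the map step on all partitions in parallel, then merge the results.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce(merge, partials, {})
```

Only the small partial totals travel between nodes, not the raw rows, which is what makes the pattern a good fit for partitioned data stores.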
Using Processing Units for Scaling
Washing Your Car The Tier-Based Way: