I. Two approaches to scaling
Reposted from: http://hideto.javaeye.com/blog/133138
1,Vertical partitioning: the resulting segments are called partitions
2,Horizontal federation: the resulting segments are called shards
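II. Storage engines
1,MyISAM
An extension of ISAM (Indexed Sequential Access Method), which IBM developed; it was MySQL's default storage engine until MySQL 5.5 made InnoDB the default
A MyISAM table consists of three files: the .frm file stores the table definition, the .MYD file stores the row data, and the .MYI file stores the indexes
MyISAM uses table-level locking, with three lock types: READ LOCAL, READ, and WRITE
MyISAM does not support transactions
One distinctive MyISAM feature is the FULLTEXT index: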
- mysql> SELECT * FROM articles WHERE MATCH (title,body)
- -> AGAINST ('+foo -"bar baz"' IN BOOLEAN MODE);
MyISAM supports spatial (GIS) data using R-tree indexes
2,InnoDB
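InnoDB is fully ACID-compliant (atomicity, consistency, isolation, durability)
InnoDB supports transactions
InnoDB combines row-level locking with MVCC (multi-version concurrency control), so it handles concurrent workloads well
InnoDB supports foreign keys
InnoDB indexes are B-trees, and each table is organized around a clustered primary key
Row data in an InnoDB table is stored in primary-key order, so retrieving rows in primary-key order is fast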
3,Berkeley DB
A BDB database consists of key/value pairs; each row is stored as one pair with a unique key, so lookups by key are fast
BDB supports transactions and page-level locking
4,MEMORY
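Data lives in memory and is not persisted to disk, so rows are lost when the server restarts
It is fast, which makes it very useful for temporary tables
III. MySQL replication
MySQL supports replication, which helps us scale reads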
1,Master-Slave Replication
- Master: Reads and writes
- |
- | Replication
- |
- Slave, Slave, Slave... : Reads
This scales read capacity well, but it cannot scale write capacity
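The read/write split in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names are hypothetical, the `ReplicatedPool` class is not part of any MySQL client library, and classifying a statement by its `SELECT` prefix is a deliberately naive stand-in for real query routing.

```python
import itertools


class ReplicatedPool:
    """Send writes to the master and spread reads across the slaves (round-robin)."""

    def __init__(self, master, slaves):
        self.master = master
        # If no slaves are configured, reads fall back to the master.
        self._read_cycle = itertools.cycle(slaves or [master])

    def server_for(self, sql):
        # Naive assumption: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.master


# Hypothetical server names, used only for illustration.
pool = ReplicatedPool("master-db", ["slave-1", "slave-2", "slave-3"])
print(pool.server_for("SELECT * FROM users WHERE id = 1"))          # slave-1
print(pool.server_for("UPDATE users SET name = 'x' WHERE id = 1"))  # master-db
```

Adding slaves lengthens the read cycle, but every write still lands on the single master, which is why this layout scales reads and not writes.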
2,Tree Replication
- Master: Reads and writes
- |
- | Replication
- |
- Slave/Master, Slave, Slave... : Reads
- |
- | Replication
- |
- Slave, Slave, Slave... : Reads
A subset of the data can be replicated to the Slave/Master, adding read capacity for the data that is read most often
3,Master-Master Replication
- Master: Reads and writes
- |
- | Replication
- |
- Master: Reads and writes
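This kind of replication can be chained into a ring, and each Master can be given its own Slaves for more read capacity
All of the replication setups above can suffer from replication lag and stale reads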
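One common way to soften stale reads is to send a user's reads to the master for a short window after that user writes. The sketch below is an assumption, not something the original notes describe: `LagAwareRouter`, the 5-second window, and the injectable clock are all hypothetical, and the window only helps if it is longer than the actual replication lag.

```python
import time


class LagAwareRouter:
    """Pin a user's reads to the master for a short window after that user writes."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds   # assumed to exceed typical replication lag
        self.clock = clock               # injectable so the logic can be tested
        self._last_write = {}            # user_id -> time of that user's last write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "master"   # a slave may not have replayed the write yet
        return "slave"


router = LagAwareRouter()
router.record_write(user_id=7)
print(router.target_for_read(7))   # master: user 7 has just written
print(router.target_for_read(8))   # slave: user 8 has no recent write
```

Other users can still see stale data for a moment; the sketch only guarantees that a user reads back their own recent write.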
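IV. Database partitioning
Replication only improves read performance; it does little for write performance, which is why database partitioning comes in
There are two approaches: vertical (Clustering) and horizontal (Federation)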
1,Clustering
- Large database with 6 tables
- | |
- X
- Cluster with 2 tables Cluster with 2 tables
The drawbacks are that it is hard to maintain and increases the number of connections; this kind of partitioning has limited scaling ability
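A minimal sketch of vertical partitioning (Clustering) is a static map from table name to cluster. The table and cluster names below are hypothetical; the point is only to show why a query that touches tables in different clusters needs more than one connection.

```python
# Hypothetical assignment of tables to clusters.
TABLE_TO_CLUSTER = {
    "users": "cluster_a",
    "profiles": "cluster_a",
    "comments": "cluster_b",
    "posts": "cluster_b",
}


def cluster_for(table):
    """Return the cluster that owns a table, or fail loudly if it is unmapped."""
    try:
        return TABLE_TO_CLUSTER[table]
    except KeyError:
        raise ValueError(f"no cluster configured for table {table!r}")


def clusters_for_query(tables):
    """Return the set of clusters a query over these tables has to connect to."""
    return {cluster_for(t) for t in tables}


print(clusters_for_query(["users", "profiles"]))        # one cluster, one connection
print(len(clusters_for_query(["users", "comments"])))   # 2: tables live in different clusters
```

Every table added to the map is one more placement to maintain, and a single hot table still cannot be split this way, which is the scaling limit mentioned above.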
2,Federation
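The NDB storage engine in MySQL 5 tries to implement horizontal partitioning internally, so that we do not have to change application logic
Oracle's RAC (Real Application Clusters) does the same thing, but it is expensive, at $25,000/processor
SQL Server has a similar implementation, but besides being slower it ties you to Windows, and it is also expensive, at $30,000/processor
The key to avoiding cross-shard queries is to federate your data in such a way that all the records you need to fetch together reside on the same shard.
For example, if one page needs to show a User's Profile and Comments, we can put the related rows of the User table and the Comments table in the same shard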
- Application logic
- |
- |
- Federation logic(Middleware)
- |
- |
- Shard Shard Shard
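When we know a User's ID and want that User's Profile and Comments, we pass the ID to the Middleware; the Middleware knows where that data lives, and it takes care of returning the right data to the application layer
The application layer does not need to know how many shards there are, how the data is divided among them, or which shard a given User's data was assigned to; all of this is transparent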
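The federation middleware described above can be sketched as follows. Everything here is an assumption made for illustration: in-memory dictionaries stand in for MySQL shards, `FederationMiddleware` is a hypothetical class, and placing users by user ID modulo the shard count is just one possible rule (the notes do not say how users are assigned).

```python
class FederationMiddleware:
    """Hide shard placement from the application (in-memory stand-in for MySQL shards)."""

    def __init__(self, shard_count):
        # Each shard holds both the profile and the comments of its users.
        self.shards = [{"profiles": {}, "comments": {}} for _ in range(shard_count)]

    def _shard_for(self, user_id):
        # Assumed placement rule: user ID modulo the number of shards.
        return self.shards[user_id % len(self.shards)]

    def save_profile(self, user_id, profile):
        self._shard_for(user_id)["profiles"][user_id] = profile

    def add_comment(self, user_id, text):
        self._shard_for(user_id)["comments"].setdefault(user_id, []).append(text)

    def load_user_page(self, user_id):
        # One shard answers the whole page, so no cross-shard query is needed.
        shard = self._shard_for(user_id)
        return {
            "profile": shard["profiles"].get(user_id),
            "comments": list(shard["comments"].get(user_id, [])),
        }


middleware = FederationMiddleware(shard_count=3)
middleware.save_profile(42, {"name": "alice"})
middleware.add_comment(42, "first!")
print(middleware.load_user_page(42))
# {'profile': {'name': 'alice'}, 'comments': ['first!']}
```

The application only ever calls `load_user_page` with a user ID. A modulo rule does mean that changing the shard count moves users between shards, so a lookup table from user ID to shard is a frequent alternative.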