分区(partitioning )就是把表分割存储在不同的物理单元(文件)中,表面上看数据还是在同一张表中。mysql分区无法应用在 MERGE
, CSV
, FEDERATED
引擎中。
查看mysql server是否支持分区使用命令SHOW PLUGINS,
mysql> SHOW PLUGINS; +------------+----------+----------------+---------+---------+ | Name | Status | Type | Library | License | +------------+----------+----------------+---------+---------+ | binlog | ACTIVE | STORAGE ENGINE | NULL | GPL | | partition | ACTIVE | STORAGE ENGINE | NULL | GPL | | ARCHIVE | ACTIVE | STORAGE ENGINE | NULL | GPL | | BLACKHOLE | ACTIVE | STORAGE ENGINE | NULL | GPL | | CSV | ACTIVE | STORAGE ENGINE | NULL | GPL | | FEDERATED | DISABLED | STORAGE ENGINE | NULL | GPL | | MEMORY | ACTIVE | STORAGE ENGINE | NULL | GPL | | InnoDB | ACTIVE | STORAGE ENGINE | NULL | GPL | | MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | GPL | | MyISAM | ACTIVE | STORAGE ENGINE | NULL | GPL | | ndbcluster | DISABLED | STORAGE ENGINE | NULL | GPL | +------------+----------+----------------+---------+---------+ 11 rows in set (0.00 sec)
如果要禁用分区,启动mysql时增加参数 --skip-partition
源码安装时启用分区需要加入参数 -DWITH_PARTITION_STORAGE_ENGINE
mysql5.7以后支持显示的操作指定分区的数据,比如:SELECT * FROM t PARTITION (p0,p1) WHERE c < 5
同理DELETE, INSERT, REPLACE, UPDATE
例子1:
CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT NOT NULL, store_id INT NOT NULL ) PARTITION BY RANGE (store_id) ( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN (11), PARTITION p2 VALUES LESS THAN (16), PARTITION p3 VALUES LESS THAN (21) );
partition表达式中的store_id对应数据库中的字段,p0、p1、p2、p3为分区名。
该表按照store_id字段的数值分为4个分区,如果store_id小于6,被分到p0分区;大于等于6小于11,被分到p1分区;以此类推。如果store_id不在定义范围内则会报错。所以,可以把上面的partition定义修改为:
PARTITION BY RANGE (store_id) ( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN (11), PARTITION p2 VALUES LESS THAN (16), PARTITION p3 VALUES LESS THAN MAXVALUE );
例子2:
CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ) PARTITION BY RANGE ( YEAR(separated) ) ( PARTITION p0 VALUES LESS THAN (1991), PARTITION p1 VALUES LESS THAN (1996), PARTITION p2 VALUES LESS THAN (2001), PARTITION p3 VALUES LESS THAN MAXVALUE );
可以根据Date类型来为表进行分区,所有在1991年以前(不包括)离开公司(separeted)的将被分到p0分区;在1991年到1995年离开的分到p1;1996年到2000年的p2;2001年以后的p3
关于表达式 YEAR(separated)
例子3:
CREATE TABLE quarterly_report_status ( report_id INT NOT NULL, report_status VARCHAR(20) NOT NULL, report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ) PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) ( PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ), PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ), PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ), PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ), PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ), PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ), PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ), PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ), PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ), PARTITION p9 VALUES LESS THAN (MAXVALUE) );
按照TIMESTAMP分区的例子。
语法:
PARTITION BY LIST(expr) ( PARTITION pName1 VALUES IN (intVal1,intVal2), PARTITION pName2 VALUES IN (intVal3,intVal4,intVal5) );
expr指的是表中的某个字段,该字段必须返回一个整数;
pName1、pName2指的是分区的名字;
intVal表示具体的整数数值,如果expr的值为intVal1,则该条数据被分到pName1区。如果expr的值不在intVal1~intVal5中,则数据库会报错。在一个批量操作中,如果想忽略错误继续执行其余的可以使用IGNORE关键字。
例子:
mysql> CREATE TABLE h2 ( -> c1 INT, -> c2 INT -> ) -> PARTITION BY LIST(c1) ( -> PARTITION p0 VALUES IN (1, 4, 7), -> PARTITION p1 VALUES IN (2, 5, 8) -> );Query OK, 0 rows affected (0.11 sec) mysql> INSERT INTO h2 VALUES (3, 5);ERROR 1525 (HY000): Table has no partition for value 3
类似与Range和List partition ,唯一不同的就是可以使用非整数作为partition key并且可以有多个key。允许的类型为:
1、所有的整数类型:TINYINT,SMALLINT,MEDIUMINT,INT (INTEGER),BIGINT。
不支持其它的数字类型(比如 DECIMAL、FLOAT)。
2、日期类型:DATE、DATETIME
不支持其它的日期类型。
3、字符类型:CHAR,VARCHAR,BINARY,VARBINARY。
不支持TEXT、BLOB类型。
例子:
CREATE TABLE rc1 ( a INT, b INT ) PARTITION BY RANGE COLUMNS(a, b) ( PARTITION p0 VALUES LESS THAN (5, 12), PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE) );
mysql> INSERT INTO rc1 VALUES (5,10), (5,11), (5,12); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> SELECT PARTITION_NAME,TABLE_ROWS -> FROM INFORMATION_SCHEMA.PARTITIONS -> WHERE TABLE_NAME = 'rc1'; +--------------+----------------+------------+ | TABLE_SCHEMA | PARTITION_NAME | TABLE_ROWS | +--------------+----------------+------------+ | p | p0 | 2 | | p | p1 | 1 | +--------------+----------------+------------+ 2 rows in set (0.00 sec)
为什么(5,10),(5,11)被分到p0了?只有(5,12)分到了p1
官方给出了这个一个测试的例子:
mysql> SELECT (5,10) < (5,12), (5,11) < (5,12), (5,12) < (5,12); +-----------------+-----------------+-----------------+ | (5,10) < (5,12) | (5,11) < (5,12) | (5,12) < (5,12) | +-----------------+-----------------+-----------------+ | 1 | 1 | 0 | +-----------------+-----------------+-----------------+ 1 row in set (0.00 sec)
例子:
CREATE TABLE customers_1 ( first_name VARCHAR(25), last_name VARCHAR(25), street_1 VARCHAR(30), street_2 VARCHAR(30), city VARCHAR(15), renewal DATE ) PARTITION BY LIST COLUMNS(city) ( PARTITION pRegion_1 VALUES IN('Oskarshamn', 'Högsby', 'Mönsterås'), PARTITION pRegion_2 VALUES IN('Vimmerby', 'Hultsfred', 'Västervik'), PARTITION pRegion_3 VALUES IN('Nässjö', 'Eksjö', 'Vetlanda'), PARTITION pRegion_4 VALUES IN('Uppvidinge', 'Alvesta', 'Växjo') );
例子:
CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ) PARTITION BY HASH(store_id) PARTITIONS 4;
使用哈希分区,只需要指定根据哪个字段进行分区(该字段必须为整数类型),共有几个分区。mysql会帮你自动的把数据均匀的分布到各个分区。
例子:
CREATE TABLE k1 ( id INT NOT NULL PRIMARY KEY, name VARCHAR(20) ) PARTITION BY KEY() PARTITIONS 2;
和hash partition类似,不同的是key不能使用YEAR(expr)这类的表达式,如果是空默认使用primary key作分区,如果没有primary key 则使用unique key做分区,primary key和unique key必须设置为not null。字段不限制为整数类型。
CREATE TABLE tm1 ( s1 CHAR(32) PRIMARY KEY ) PARTITION BY KEY(s1) PARTITIONS 10;
如果设置了key partition 且按照主键分区,则无法删除主键,运行ALTER TABLE DROP PRIMARY KEY将会报错。
对分区再分区叫做Subpartition或者叫做composite partition。
例子1:
CREATE TABLE ts (id INT, purchased DATE) PARTITION BY RANGE( YEAR(purchased) ) SUBPARTITION BY HASH( TO_DAYS(purchased) ) SUBPARTITIONS 2 ( PARTITION p0 VALUES LESS THAN (1990), PARTITION p1 VALUES LESS THAN (2000), PARTITION p2 VALUES LESS THAN MAXVALUE );
表ts有3个Range分区p0、p1和p2,每个分区又被分了2个HASH子分区,则总共有3*2=6个分区;
另外一种更灵活的写法:
CREATE TABLE ts (id INT, purchased DATE) PARTITION BY RANGE( YEAR(purchased) ) SUBPARTITION BY HASH( TO_DAYS(purchased) ) ( PARTITION p0 VALUES LESS THAN (1990) ( SUBPARTITION s0, SUBPARTITION s1 ), PARTITION p1 VALUES LESS THAN (2000) ( SUBPARTITION s2, SUBPARTITION s3 ), PARTITION p2 VALUES LESS THAN MAXVALUE ( SUBPARTITION s4, SUBPARTITION s5 ) );
注意:每个分区必须有相同数目的子分区;如果显示的定义了一个子分区,则其它分区也必须显示的定义;SUBPARTITION后一定要有子分区的名字;子分区名不能重复;
在数据比较大的情况下,如果说你想把数据放到多块硬盘中,可以这样:
CREATE TABLE ts (id INT, purchased DATE) PARTITION BY RANGE( YEAR(purchased) ) SUBPARTITION BY HASH( TO_DAYS(purchased) ) ( PARTITION p0 VALUES LESS THAN (1990) ( SUBPARTITION s0 DATA DIRECTORY = '/disk0/data' INDEX DIRECTORY = '/disk0/idx', SUBPARTITION s1 DATA DIRECTORY = '/disk1/data' INDEX DIRECTORY = '/disk1/idx' ), PARTITION p1 VALUES LESS THAN (2000) ( SUBPARTITION s2 DATA DIRECTORY = '/disk2/data' INDEX DIRECTORY = '/disk2/idx', SUBPARTITION s3 DATA DIRECTORY = '/disk3/data' INDEX DIRECTORY = '/disk3/idx' ), PARTITION p2 VALUES LESS THAN MAXVALUE ( SUBPARTITION s4 DATA DIRECTORY = '/disk4/data' INDEX DIRECTORY = '/disk4/idx', SUBPARTITION s5 DATA DIRECTORY = '/disk5/data' INDEX DIRECTORY = '/disk5/idx' ) );
在Range partition中,mysql认为null永远小于非null值,和order by是一样的。
mysql> CREATE TABLE t1 ( -> c1 INT, -> c2 VARCHAR(20) -> ) -> PARTITION BY RANGE(c1) ( -> PARTITION p0 VALUES LESS THAN (0), -> PARTITION p1 VALUES LESS THAN (10), -> PARTITION p2 VALUES LESS THAN MAXVALUE -> );Query OK, 0 rows affected (0.09 sec)
如果c1是null,则被分到p0分区,也就是最小的那个分区。
在List partition中,
mysql> CREATE TABLE ts2 ( -> c1 INT, -> c2 VARCHAR(20) -> ) -> PARTITION BY LIST(c1) ( -> PARTITION p0 VALUES IN (0, 3, 6), -> PARTITION p1 VALUES IN (1, 4, 7), -> PARTITION p2 VALUES IN (2, 5, 8), -> PARTITION p3 VALUES IN (NULL) -> ); Query OK, 0 rows affected (0.01 sec)
可以定义以NULL为key的分区,NULL自然被分到该分区,如果么有定义NULL分区,则插入时遇到NULL会报错。
在Hash 和Key partition中,NULL被认为是0。
参考文档:http://dev.mysql.com/doc/refman/5.7/en/partitioning.html