MySQL Partitioning Notes

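The material here comes from Alibaba Cloud's database kernel monthly report, other blogs, and my own typing. Read it with confidence: I have verified even the parts I did not write myself.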

1. Mock data

DELIMITER $$
CREATE PROCEDURE generate_data()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 200000 DO
    INSERT INTO `data` (`datetime`,`value`,`channel`) VALUES (
      FROM_UNIXTIME(UNIX_TIMESTAMP('2014-01-01 01:00:00')+FLOOR(RAND()*31536000)),
      ROUND(RAND()*100,2),
      1
    );
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

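Then run CALL generate_data(); to insert 200,000 rows with random timestamps spread over the year starting 2014-01-01 01:00:00.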
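The procedure expects a table named `data` to exist before it is called, but the notes do not show its definition. A minimal sketch that fits the INSERT above and the later RANGE (id) partitioning might look like the following. The column types and the `id` primary key are assumptions, not the original schema.

```sql
-- Hypothetical definition of `data`; types are assumed, not from the original notes.
-- `id` is the primary key so that it can later serve as the partitioning column
-- (every unique key on a partitioned table must include the partitioning columns).
CREATE TABLE `data` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `datetime` DATETIME NOT NULL,
  `value` DECIMAL(5,2) NOT NULL,   -- holds ROUND(RAND()*100, 2)
  `channel` INT NOT NULL,
  PRIMARY KEY (`id`)
);
```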

2. Creating partitions

2.1 RANGE partitioning (rows are assigned by value range)

alter table `data`
PARTITION BY RANGE (id)
(
PARTITION p1 VALUES LESS THAN (50000)
DATA DIRECTORY = '/usr/mysql_data1'
INDEX DIRECTORY = '/usr/mysql_index1',
PARTITION p2  VALUES LESS THAN (100000)
DATA DIRECTORY = '/usr/mysql_data2'
INDEX DIRECTORY = '/usr/mysql_index2',
PARTITION p3  VALUES LESS THAN (120000)
DATA DIRECTORY = '/usr/mysql_data3'
INDEX DIRECTORY = '/usr/mysql_index3',
PARTITION p4  VALUES LESS THAN MAXVALUE
DATA DIRECTORY = '/usr/mysql_data4'
INDEX DIRECTORY = '/usr/mysql_index4'
);

DATA DIRECTORY: where the partition's data is stored

INDEX DIRECTORY: where the partition's indexes are stored

2.2 LIST partitioning

# Partition the table at creation time
create table t_list(
a int(11),
b int(11)
)
partition by list (b) (
partition p0 values in (1,3,5,7,9),
partition p1 values in (2,4,6,8,0)
);

2.3 HASH partitioning

CREATE TABLE my_member (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    created DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(id)
PARTITIONS 4;
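  1. The PARTITIONS clause (PARTITIONS 4 above) is optional for HASH partitioning. If it is omitted, the table gets 1 partition.
  2. Writing PARTITIONS without a number is a syntax error.
  3. As with RANGE and LIST partitioning, expr in PARTITION BY HASH (expr) must return an integer.
  4. Under the hood, HASH partitioning uses the MOD function: a row is stored in partition MOD(expr, N), where N is the number of partitions. For my_member above, the row with id = 10 goes to partition MOD(10, 4) = 2, that is, p2.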
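One way to check the MOD rule by hand is to insert a couple of rows and compare the computed partition number with what each partition actually contains. This is only a sketch against the my_member table above. The sample rows are made up, and the partition names p0 to p3 are the defaults MySQL assigns when none are given.

```sql
-- Sample rows (made up for illustration).
INSERT INTO my_member (id, fname, lname) VALUES (10, 'a', 'b'), (11, 'c', 'd');

-- Partition number each row is expected to land in: MOD(id, 4).
SELECT id, MOD(id, 4) AS expected_partition FROM my_member;

-- Read one partition directly (explicit partition selection, MySQL 5.6 and later).
SELECT id FROM my_member PARTITION (p2);

-- Per-partition row counts; for InnoDB these are estimates.
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'my_member';
```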

2.4 LINEAR HASH partitioning

CREATE TABLE my_members (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY LINEAR HASH( id )
PARTITIONS 4;

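LINEAR HASH is a special type of HASH partitioning. Where HASH partitioning is based on the MOD function, LINEAR HASH uses a linear powers-of-two algorithm.

Note: its advantage shows up with very large data volumes (terabytes, say), where adding, dropping, merging, and splitting partitions is faster. Its drawback is that data is more likely to be unevenly distributed than with HASH partitioning.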
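The powers-of-two rule, as described in the MySQL manual, can be worked through in plain SQL. The sketch below assumes a hypothetical table with 6 partitions (with 4 partitions, as in my_members, the result happens to equal MOD) and two sample values. V is the smallest power of two that is at least the number of partitions. The partition number is value & (V - 1), and if that is not a valid partition number, V is halved and the mask is applied again.

```sql
-- Hypothetical setup: 6 partitions, so V = 8.
SET @num = 6;
SET @v = CAST(POWER(2, CEILING(LOG(2, @num))) AS UNSIGNED);

SELECT val,
       IF((val & (@v - 1)) >= @num,
          val & (@v DIV 2 - 1),      -- fall back to the next smaller power of two
          val & (@v - 1)) AS part_no
FROM (SELECT 2003 AS val UNION ALL SELECT 1998) AS t;
-- 2003 & 7 = 3, which is below 6, so the row goes to partition 3
-- 1998 & 7 = 6, which is not below 6, so 6 & 3 = 2 and the row goes to partition 2
```

One fallback step is enough here because V/2 is always smaller than the number of partitions.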

2.5 KEY partitioning

CREATE TABLE k1 (
    id INT NOT NULL PRIMARY KEY,    
    name VARCHAR(20)
)
PARTITION BY KEY()
PARTITIONS 2;

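KEY partitioning is much like HASH partitioning, with these differences:

  1. KEY takes a list of zero or more columns, whereas HASH takes a single integer expression.
  2. If the table has a primary key, or failing that a unique key on NOT NULL columns, the column list may be left empty and that key is used. Otherwise the columns must be named explicitly.
  3. KEY partitions on columns only, not on expressions over columns.
  4. The algorithms differ. With PARTITION BY HASH (expr), MOD is applied to the value of expr. With PARTITION BY KEY (column_list), the server hashes the column values itself: NDB Cluster uses MD5(), while other storage engines such as InnoDB use the server's internal hashing function.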
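The first two differences can be illustrated with a pair of hypothetical tables (k2, k3, and their columns are invented for this sketch). The first lists several partitioning columns. The second has no primary or unique key, so the column must be named, and it also shows that KEY accepts a non-integer column.

```sql
-- Several partitioning columns; both are part of the primary key,
-- as required for any unique key on a partitioned table.
CREATE TABLE k2 (
    id INT NOT NULL,
    region CHAR(2) NOT NULL,
    name VARCHAR(20),
    PRIMARY KEY (id, region)
)
PARTITION BY KEY (id, region)
PARTITIONS 4;

-- No primary or unique key: PARTITION BY KEY() would fail here,
-- so the column is named explicitly.
CREATE TABLE k3 (
    id INT NOT NULL,
    name VARCHAR(20)
)
PARTITION BY KEY (name)
PARTITIONS 2;
```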

2.6 HASH subpartitioning on top of RANGE/LIST

CREATE TABLE users (
     uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(30) NOT NULL DEFAULT '',
     email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY RANGE (uid) SUBPARTITION BY HASH (uid % 4) SUBPARTITIONS 2(
     PARTITION p0 VALUES LESS THAN (3000000),
     PARTITION p1 VALUES LESS THAN (6000000)
);

Copied from elsewhere, but I have verified it. LIST works the same way.
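For the LIST case, a sketch along the same lines could look like this. The table users_by_region and its region_id values are hypothetical. Because both uid and region_id take part in partitioning, both are included in the primary key.

```sql
CREATE TABLE users_by_region (
     uid INT UNSIGNED NOT NULL,
     region_id INT NOT NULL,
     name VARCHAR(30) NOT NULL DEFAULT '',
     PRIMARY KEY (uid, region_id)
)
PARTITION BY LIST (region_id)
SUBPARTITION BY HASH (uid)
SUBPARTITIONS 2 (
     PARTITION p_north VALUES IN (1, 2, 3),
     PARTITION p_south VALUES IN (4, 5, 6)
);
-- A row whose region_id is not in any list (for example 7) is rejected on INSERT.
```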

3. Partitioning on other column types

3.1 Partitioning on a timestamp

CREATE TABLE my_range_timestamp (
    id INT,
    hiredate TIMESTAMP
)
PARTITION BY RANGE ( UNIX_TIMESTAMP(hiredate) ) (
    PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-02 00:00:00') ),
    PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-03 00:00:00') ),
    PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-04 00:00:00') ),
    PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-05 00:00:00') ),
    PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-06 00:00:00') ),
    PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-07 00:00:00') ),
    PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-08 00:00:00') ),
    PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-09 00:00:00') ),
    PARTITION p9 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-10 00:00:00') ),
    PARTITION p10 VALUES LESS THAN (UNIX_TIMESTAMP('2017-12-11 00:00:00') )
);
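To see whether a query benefits from these daily partitions, one option is to look at the partitions column of the EXPLAIN output. The query and the extra partition p11 below are illustrative only. On versions before MySQL 5.7 the statement would be EXPLAIN PARTITIONS instead.

```sql
-- The "partitions" column of the plan lists the partitions the query will read;
-- with a narrow time range it should be a small subset of p1..p10.
EXPLAIN SELECT * FROM my_range_timestamp
WHERE hiredate >= '2017-12-03 00:00:00'
  AND hiredate <  '2017-12-04 00:00:00';

-- There is no MAXVALUE partition, so rows dated 2017-12-11 or later are rejected
-- until a new partition is appended at the upper end.
ALTER TABLE my_range_timestamp
ADD PARTITION (
    PARTITION p11 VALUES LESS THAN ( UNIX_TIMESTAMP('2017-12-12 00:00:00') )
);
```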

4. Dropping a partition

alter table `data` drop partition p1;
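Dropping a RANGE or LIST partition also deletes every row stored in it, so it can be worth checking what a partition holds first. The sketch below assumes the `data` table partitioned as in section 2.1. TRUNCATE PARTITION is shown as an alternative that removes the rows but keeps the partition definition.

```sql
-- List the partitions with their upper bounds and approximate row counts.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'data';

-- Empty a partition without removing it from the table definition.
ALTER TABLE `data` TRUNCATE PARTITION p2;
```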

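5. Reorganizing partitions

alter table `data` reorganize partition p2, p3 into (partition p2 values less than (120000));

This merges the original p2 and p3 into a new p2. The partitions being merged must be adjacent, and the new upper bound must equal the upper bound of the last merged partition (120000 for p3 in section 2.1), because REORGANIZE cannot change the total range covered.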
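REORGANIZE PARTITION also works in the other direction, splitting one partition into several. A minimal sketch, assuming the `data` table from section 2.1 and a made-up boundary of 200000:

```sql
-- Split the catch-all partition p4 into a bounded partition and a new catch-all.
-- Together the new partitions cover the same range as the old p4.
ALTER TABLE `data` REORGANIZE PARTITION p4 INTO (
    PARTITION p4 VALUES LESS THAN (200000),
    PARTITION p5 VALUES LESS THAN MAXVALUE
);
```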
