[ClickHouse新版本]Atomic database

背景

  • 避免CREATE/DROP/RENAME TABLE锁
  • 实现RENAME数据库操作
  • 正在执行的query发现表不存在异常
  • rename errors(rename table A to B, C to A is not atomic)
  • Atomic数据库支持non-blocking DROP/RENAME表,支持原子交换表名EXCHANGE TABLES t1 AND t2

准备

  • 启动ClickHouse server
docker run -d --name test-clickhouse-server-20-12 --ulimit nofile=262144:262144 yandex/clickhouse-server:20.12.4.5
  • 启动ClickHouse client
docker run -it --rm --link test-clickhouse-server-20-12:clickhouse-server yandex/clickhouse-client:20.12.4.5 --host clickhouse-server
  • 查看ClickHouse server文件系统
docker exec -ti test-clickhouse-server-20-12 /bin/bash
  • 创建数据库
CREATE DATABASE atom ENGINE = Atomic
select * from system.databases where name = 'atom' format Vertical;
Row 1:
──────
name:          atom
engine:        Atomic
data_path:     /var/lib/clickhouse/store/
metadata_path: /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/
uuid:          5db78374-217a-42bf-97d2-9ceaddc176af
  • 创建表
create table atom.test(`id` Int64, `name` String, `birthday` DateTime DEFAULT toDateTime('0000-00-00 00:00:00')) ENGINE = MergeTree() PRIMARY KEY(id) ORDER BY(id);
insert into atom.test (id, name, birthday) VALUES (1, 'test1', '2019-10-02 01:00:00'), (2, 'test2', '2019-10-02 02:00:00'),(3, 'test3', '2019-10-02 03:00:00'),(4, 'test4', '2019-10-02 04:00:00');

Atomic数据库的结构

数据库有唯一的uuid,数据库中的每个表有唯一的uuid

metadata存储

root@0cd2cde6e19a:/var/lib/clickhouse/metadata# tree
.
├── atom -> /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af
├── atom.sql
├── default -> /var/lib/clickhouse/store/0c9/0c9063bd-26d8-438b-b3e9-5d8cb5745e3d
├── default.sql
├── system -> /var/lib/clickhouse/store/d39/d39d0db0-08e8-4752-9661-5034256821e9
└── system.sql

根目录和老版本相同,仍然是/var/lib/clickhouse/metadata,其下目录结构也相同,只不过每个数据库名目录实际上是一个超链接,指向实际的存储目录/var/lib/clickhouse/store/(prefix of database_uuid)/(database_uuid)

数据存储

root@0cd2cde6e19a:/var/lib/clickhouse/data# tree
.
├── atom
│   ├── test -> /var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0
│   └── test2 -> /var/lib/clickhouse/store/8c2/8c284554-1125-4bcf-8af0-5c697a735de4
├── default
└── system
    ├── asynchronous_metric_log -> /var/lib/clickhouse/store/7ed/7ed107ce-00c0-4080-96ac-8264d82a7e7d
    ├── metric_log -> /var/lib/clickhouse/store/b7f/b7ff5e00-b567-4e14-a89c-177d80808583
    ├── query_log -> /var/lib/clickhouse/store/50b/50b0ed02-df82-4109-a550-13d559cc28e3
    ├── query_thread_log -> /var/lib/clickhouse/store/6b5/6b58386c-da69-4136-baad-b3911a49fcc8
    └── trace_log -> /var/lib/clickhouse/store/91a/91ad6abd-1e9f-44b6-a8e9-012d31dbb165

根目录和老版本相同,仍然是/var/lib/clickhouse/data,其下目录结构也相同,只不过每个表名目录实际上是一个超链接,指向实际的存储目录/var/lib/clickhouse/store/(prefix of table_uuid)/(table_uuid)

以test表为例

以test表为例,其基本信息如下:

select * from system.tables where database = 'atom' and name = 'test' format Vertical;
Row 1:
──────
database:                   atom
name:                       test
uuid:                       6bb77284-2a07-4fff-aa84-932baba39fc0
engine:                     MergeTree
is_temporary:               0
data_paths:                 ['/var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0/']
metadata_path:              /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/test.sql
metadata_modification_time: 2021-03-16 12:23:11
dependencies_database:      []
dependencies_table:         []
create_table_query:         CREATE TABLE atom.test (`id` Int64, `name` String, `birthday` DateTime DEFAULT toDateTime('0000-00-00 00:00:00')) ENGINE = MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
engine_full:                MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
partition_key:
sorting_key:                id
primary_key:                id
sampling_key:
storage_policy:             default
total_rows:                 4
total_bytes:                266
lifetime_rows:              ᴺᵁᴸᴸ
lifetime_bytes:             ᴺᵁᴸᴸ

设计的好处

ClickHouse将逻辑目录和数据存储目录分开,只是使用超链接地方式进行链接,这样使得表逻辑结构地更改独立与数据目录,比如说重新命名数据库,重新命名表等变得非常简单。

基础操作

重命名数据库

0cd2cde6e19a :) rename database atom to atom2;

RENAME DATABASE atom TO atom2

Query id: 15f6dba4-1485-487a-91c7-447fa1f13155

Ok.

0 rows in set. Elapsed: 0.002 sec.

重命名数据库,操作非常简单,并且其底层uuid并不发生变化。

0cd2cde6e19a :) select * from system.databases where name = 'atom2' format Vertical;

SELECT *
FROM system.databases
WHERE name = 'atom2'
FORMAT Vertical

Query id: de4f0867-61ab-4fe4-9e1b-7394ec04da3c

Row 1:
──────
name:          atom2
engine:        Atomic
data_path:     /var/lib/clickhouse/store/
metadata_path: /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/
uuid:          5db78374-217a-42bf-97d2-9ceaddc176af

相当于重新创建了atom2的超链接,同时将文件atom.sql重命名为atom2。

root@0cd2cde6e19a:/var/lib/clickhouse/metadata# tree
.
├── atom2 -> /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af
├── atom2.sql
├── default -> /var/lib/clickhouse/store/0c9/0c9063bd-26d8-438b-b3e9-5d8cb5745e3d
├── default.sql
├── system -> /var/lib/clickhouse/store/d39/d39d0db0-08e8-4752-9661-5034256821e9
└── system.sql

重命名表

0cd2cde6e19a :) rename table test to new;

RENAME TABLE test TO new

Query id: f2faf21a-5190-4cae-9127-30941a1bf35d

Ok.

0 rows in set. Elapsed: 0.002 sec.

uuid不会改变,不会发生数据移动。

0cd2cde6e19a :) select * from system.tables where database = 'atom2' and name = 'new' format Vertical;

SELECT *
FROM system.tables
WHERE (database = 'atom2') AND (name = 'new')
FORMAT Vertical

Query id: 21ce2574-009a-409b-81a1-b80a036cf722

Row 1:
──────
database:                   atom2
name:                       new
uuid:                       6bb77284-2a07-4fff-aa84-932baba39fc0
engine:                     MergeTree
is_temporary:               0
data_paths:                 ['/var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0/']
metadata_path:              /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/new.sql
metadata_modification_time: 2021-03-16 12:23:11
dependencies_database:      []
dependencies_table:         []
create_table_query:         CREATE TABLE atom2.new (`id` Int64, `name` String, `birthday` DateTime DEFAULT toDateTime('0000-00-00 00:00:00')) ENGINE = MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
engine_full:                MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
partition_key:
sorting_key:                id
primary_key:                id
sampling_key:
storage_policy:             default
total_rows:                 4
total_bytes:                266
lifetime_rows:              ᴺᵁᴸᴸ
lifetime_bytes:             ᴺᵁᴸᴸ

1 rows in set. Elapsed: 0.005 sec.

相当于修改了表的超链接,同时修改表的metedata信息

root@0cd2cde6e19a:/var/lib/clickhouse/data# tree
.
├── atom2
│   ├── new -> /var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0
│   └── test2 -> /var/lib/clickhouse/store/8c2/8c284554-1125-4bcf-8af0-5c697a735de4
├── default
└── system
    ├── asynchronous_metric_log -> /var/lib/clickhouse/store/7ed/7ed107ce-00c0-4080-96ac-8264d82a7e7d
    ├── metric_log -> /var/lib/clickhouse/store/b7f/b7ff5e00-b567-4e14-a89c-177d80808583
    ├── query_log -> /var/lib/clickhouse/store/50b/50b0ed02-df82-4109-a550-13d559cc28e3
    ├── query_thread_log -> /var/lib/clickhouse/store/6b5/6b58386c-da69-4136-baad-b3911a49fcc8
    └── trace_log -> /var/lib/clickhouse/store/91a/91ad6abd-1e9f-44b6-a8e9-012d31dbb165

删除表

  • 删除表不会发生数据移动,通过把metadata信息移动到/var/lib/clickhouse/metadata_dropped标记删除
  • 真正运行的query可以继续使用被删除表,指导表不被使用,才会被真正删除
  • 删除表的超链接
  • 删除表的metadata信息
0cd2cde6e19a :) drop table test2;

DROP TABLE test2

Query id: ec8aa65f-b4de-41ff-8e51-6b0ad4a59c76

Ok.

0 rows in set. Elapsed: 0.002 sec.
root@0cd2cde6e19a:/var/lib/clickhouse/metadata_dropped# ll
total 12
drwxr-x---  2 clickhouse clickhouse 4096 Mar 16 13:35 ./
drwx------ 13 clickhouse clickhouse 4096 Mar 16 11:58 ../
-rw-r-----  1 clickhouse clickhouse  246 Mar 16 12:39 atom2.test2.8c284554-1125-4bcf-8af0-5c697a735de4.sql

参考

避免CREATE/DROP/RENAME表时的锁,数据库重命名
Atomic数据库和Ordinary数据库的不同

你可能感兴趣的:([ClickHouse新版本]Atomic database)