背景
- 避免CREATE/DROP/RENAME TABLE锁
- 实现RENAME数据库操作
- 正在执行的query发现表不存在异常
- rename errors(rename table A to B, C to A is not atomic)
- Atomic数据库支持non-blocking DROP/RENAME表,支持原子交换表名EXCHANGE TABLES t1 AND t2
准备
- 启动ClickHouse server
docker run -d --name test-clickhouse-server-20-12 --ulimit nofile=262144:262144 yandex/clickhouse-server:20.12.4.5
- 启动ClickHouse client
docker run -it --rm --link test-clickhouse-server-20-12:clickhouse-server yandex/clickhouse-client:20.12.4.5 --host clickhouse-server
- 查看ClickHouse server文件系统
docker exec -ti test-clickhouse-server-20-12 /bin/bash
- 创建数据库
CREATE DATABASE atom ENGINE = Atomic
select * from system.databases where name = 'atom' format Vertical;
Row 1:
──────
name: atom
engine: Atomic
data_path: /var/lib/clickhouse/store/
metadata_path: /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/
uuid: 5db78374-217a-42bf-97d2-9ceaddc176af
- 创建表
create table atom.test(`id` Int64, `name` String, `birthday` DateTime DEFAULT toDateTime('0000-00-00 00:00:00')) ENGINE = MergeTree() PRIMARY KEY(id) ORDER BY(id);
insert into atom.test (id, name, birthday) VALUES (1, 'test1', '2019-10-02 01:00:00'), (2, 'test2', '2019-10-02 02:00:00'),(3, 'test3', '2019-10-02 03:00:00'),(4, 'test4', '2019-10-02 04:00:00');
Atomic数据库的结构
数据库有唯一的uuid,数据库中的每个表有唯一的uuid
metadata存储
root@0cd2cde6e19a:/var/lib/clickhouse/metadata# tree
.
├── atom -> /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af
├── atom.sql
├── default -> /var/lib/clickhouse/store/0c9/0c9063bd-26d8-438b-b3e9-5d8cb5745e3d
├── default.sql
├── system -> /var/lib/clickhouse/store/d39/d39d0db0-08e8-4752-9661-5034256821e9
└── system.sql
根目录和老版本相同,仍然是/var/lib/clickhouse/metadata
,其下目录结构也相同,只不过每个数据库名目录实际上是一个超链接,指向实际的存储目录/var/lib/clickhouse/store/(prefix of database_uuid)/(database_uuid)
数据存储
root@0cd2cde6e19a:/var/lib/clickhouse/data# tree
.
├── atom
│ ├── test -> /var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0
│ └── test2 -> /var/lib/clickhouse/store/8c2/8c284554-1125-4bcf-8af0-5c697a735de4
├── default
└── system
├── asynchronous_metric_log -> /var/lib/clickhouse/store/7ed/7ed107ce-00c0-4080-96ac-8264d82a7e7d
├── metric_log -> /var/lib/clickhouse/store/b7f/b7ff5e00-b567-4e14-a89c-177d80808583
├── query_log -> /var/lib/clickhouse/store/50b/50b0ed02-df82-4109-a550-13d559cc28e3
├── query_thread_log -> /var/lib/clickhouse/store/6b5/6b58386c-da69-4136-baad-b3911a49fcc8
└── trace_log -> /var/lib/clickhouse/store/91a/91ad6abd-1e9f-44b6-a8e9-012d31dbb165
根目录和老版本相同,仍然是/var/lib/clickhouse/data
,其下目录结构也相同,只不过每个表名目录实际上是一个超链接,指向实际的存储目录/var/lib/clickhouse/store/(prefix of table_uuid)/(table_uuid)
以test表为例
以test表为例,其基本信息如下:
select * from system.tables where database = 'atom' and name = 'test' format Vertical;
Row 1:
──────
database: atom
name: test
uuid: 6bb77284-2a07-4fff-aa84-932baba39fc0
engine: MergeTree
is_temporary: 0
data_paths: ['/var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0/']
metadata_path: /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/test.sql
metadata_modification_time: 2021-03-16 12:23:11
dependencies_database: []
dependencies_table: []
create_table_query: CREATE TABLE atom.test (`id` Int64, `name` String, `birthday` DateTime DEFAULT toDateTime('0000-00-00 00:00:00')) ENGINE = MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
engine_full: MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
partition_key:
sorting_key: id
primary_key: id
sampling_key:
storage_policy: default
total_rows: 4
total_bytes: 266
lifetime_rows: ᴺᵁᴸᴸ
lifetime_bytes: ᴺᵁᴸᴸ
设计的好处
ClickHouse将逻辑目录和数据存储目录分开,只是使用超链接地方式进行链接,这样使得表逻辑结构地更改独立与数据目录,比如说重新命名数据库,重新命名表等变得非常简单。
基础操作
重命名数据库
0cd2cde6e19a :) rename database atom to atom2;
RENAME DATABASE atom TO atom2
Query id: 15f6dba4-1485-487a-91c7-447fa1f13155
Ok.
0 rows in set. Elapsed: 0.002 sec.
重命名数据库,操作非常简单,并且其底层uuid并不发生变化。
0cd2cde6e19a :) select * from system.databases where name = 'atom2' format Vertical;
SELECT *
FROM system.databases
WHERE name = 'atom2'
FORMAT Vertical
Query id: de4f0867-61ab-4fe4-9e1b-7394ec04da3c
Row 1:
──────
name: atom2
engine: Atomic
data_path: /var/lib/clickhouse/store/
metadata_path: /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/
uuid: 5db78374-217a-42bf-97d2-9ceaddc176af
相当于重新创建了atom2的超链接,同时将文件atom.sql重命名为atom2。
root@0cd2cde6e19a:/var/lib/clickhouse/metadata# tree
.
├── atom2 -> /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af
├── atom2.sql
├── default -> /var/lib/clickhouse/store/0c9/0c9063bd-26d8-438b-b3e9-5d8cb5745e3d
├── default.sql
├── system -> /var/lib/clickhouse/store/d39/d39d0db0-08e8-4752-9661-5034256821e9
└── system.sql
重命名表
0cd2cde6e19a :) rename table test to new;
RENAME TABLE test TO new
Query id: f2faf21a-5190-4cae-9127-30941a1bf35d
Ok.
0 rows in set. Elapsed: 0.002 sec.
uuid不会改变,不会发生数据移动。
0cd2cde6e19a :) select * from system.tables where database = 'atom2' and name = 'new' format Vertical;
SELECT *
FROM system.tables
WHERE (database = 'atom2') AND (name = 'new')
FORMAT Vertical
Query id: 21ce2574-009a-409b-81a1-b80a036cf722
Row 1:
──────
database: atom2
name: new
uuid: 6bb77284-2a07-4fff-aa84-932baba39fc0
engine: MergeTree
is_temporary: 0
data_paths: ['/var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0/']
metadata_path: /var/lib/clickhouse/store/5db/5db78374-217a-42bf-97d2-9ceaddc176af/new.sql
metadata_modification_time: 2021-03-16 12:23:11
dependencies_database: []
dependencies_table: []
create_table_query: CREATE TABLE atom2.new (`id` Int64, `name` String, `birthday` DateTime DEFAULT toDateTime('0000-00-00 00:00:00')) ENGINE = MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
engine_full: MergeTree() PRIMARY KEY id ORDER BY id SETTINGS index_granularity = 8192
partition_key:
sorting_key: id
primary_key: id
sampling_key:
storage_policy: default
total_rows: 4
total_bytes: 266
lifetime_rows: ᴺᵁᴸᴸ
lifetime_bytes: ᴺᵁᴸᴸ
1 rows in set. Elapsed: 0.005 sec.
相当于修改了表的超链接,同时修改表的metedata信息
root@0cd2cde6e19a:/var/lib/clickhouse/data# tree
.
├── atom2
│ ├── new -> /var/lib/clickhouse/store/6bb/6bb77284-2a07-4fff-aa84-932baba39fc0
│ └── test2 -> /var/lib/clickhouse/store/8c2/8c284554-1125-4bcf-8af0-5c697a735de4
├── default
└── system
├── asynchronous_metric_log -> /var/lib/clickhouse/store/7ed/7ed107ce-00c0-4080-96ac-8264d82a7e7d
├── metric_log -> /var/lib/clickhouse/store/b7f/b7ff5e00-b567-4e14-a89c-177d80808583
├── query_log -> /var/lib/clickhouse/store/50b/50b0ed02-df82-4109-a550-13d559cc28e3
├── query_thread_log -> /var/lib/clickhouse/store/6b5/6b58386c-da69-4136-baad-b3911a49fcc8
└── trace_log -> /var/lib/clickhouse/store/91a/91ad6abd-1e9f-44b6-a8e9-012d31dbb165
删除表
- 删除表不会发生数据移动,通过把metadata信息移动到
/var/lib/clickhouse/metadata_dropped
标记删除 - 真正运行的query可以继续使用被删除表,指导表不被使用,才会被真正删除
- 删除表的超链接
- 删除表的metadata信息
0cd2cde6e19a :) drop table test2;
DROP TABLE test2
Query id: ec8aa65f-b4de-41ff-8e51-6b0ad4a59c76
Ok.
0 rows in set. Elapsed: 0.002 sec.
root@0cd2cde6e19a:/var/lib/clickhouse/metadata_dropped# ll
total 12
drwxr-x--- 2 clickhouse clickhouse 4096 Mar 16 13:35 ./
drwx------ 13 clickhouse clickhouse 4096 Mar 16 11:58 ../
-rw-r----- 1 clickhouse clickhouse 246 Mar 16 12:39 atom2.test2.8c284554-1125-4bcf-8af0-5c697a735de4.sql
参考
避免CREATE/DROP/RENAME表时的锁,数据库重命名
Atomic数据库和Ordinary数据库的不同