5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等

ClickHouse系列文章

1、ClickHouse介绍
2、clickhouse安装与简单验证(centos)
3、ClickHouse表引擎-MergeTree引擎
4、clickhouse的Log系列表引擎、外部集成表引擎和其他特殊的表引擎介绍及使用
5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等


文章目录

  • ClickHouse系列文章
  • 一、查看数据库容量
  • 二、查看表的各个指标
  • 三、查看表分区
  • 四、跟踪分区
  • 五、检查数据大小
  • 六、查看表中列的数据大小


本文主要介绍ClickHouse查看数据库容量、表的指标、表分区、数据大小等,每种都有具体的使用示例。
本文使用前提参考该系列文章中的部署与验证。
本文主要分为六部分,即查看数据库容量、查看表的各个指标查看表分区、跟踪分区、检查数据大小和查看表中列的数据大小。

Clickhouse是一个高性能且开源的数据库管理系统,主要用于在线分析处理(OLAP)业务。它采用列式存储结构,可使用SQL语句实时生成数据分析报告,另外它还支持索引,分布式查询以及近似计算等特性,凭借其优异的表现,ClickHouse在各大互联网公司均有广泛地应用。

官网:https://clickhouse.com/
中文官网:https://clickhouse.com/docs/zh

clickhouse有system.parts系统表记录表相关元数据,可以通过该表对clickhouse上所有表进行查询表大小、行数等操作。

一、查看数据库容量

select
    sum(rows) as row,--总行数
    formatReadableSize(sum(data_uncompressed_bytes)) as ysq,--原始大小
    formatReadableSize(sum(data_compressed_bytes)) as ysh,--压缩大小
    round(sum(data_compressed_bytes) / sum(data_uncompressed_bytes) * 100, 0) ys_rate--压缩率
from system.parts

5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第1张图片

二、查看表的各个指标

select database,
       table,
       sum(bytes) as size,
       sum(rows) as rows,
       min(min_date) as min_date,
       max(max_date) as max_date,
       sum(bytes_on_disk) as bytes_on_disk,
       sum(data_uncompressed_bytes) as data_uncompressed_bytes,
       sum(data_compressed_bytes) as data_compressed_bytes,
       (data_compressed_bytes / data_uncompressed_bytes) * 100 as compress_rate,
       max_date - min_date as days,
       size / (max_date - min_date) as avgDaySize
  from system.parts
 where active
   and database = 'default'
   and table = 'user'
 group by database, table

5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第2张图片
这种结果显示的大小size是字节,我们如何转换为常见的MB和GB呢?


select
    database,
    table,
    formatReadableSize(size) as size,
    formatReadableSize(bytes_on_disk) as bytes_on_disk,
    formatReadableSize(data_uncompressed_bytes) as data_uncompressed_bytes,
    formatReadableSize(data_compressed_bytes) as data_compressed_bytes,
    compress_rate,
    rows,
    days,
    formatReadableSize(avgDaySize) as avgDaySize
from
(
    select
        database,
        table,
        sum(bytes) as size,
        sum(rows) as rows,
        min(min_date) as min_date,
        max(max_date) as max_date,
        sum(bytes_on_disk) as bytes_on_disk,
        sum(data_uncompressed_bytes) as data_uncompressed_bytes,
        sum(data_compressed_bytes) as data_compressed_bytes,
        (data_compressed_bytes / data_uncompressed_bytes) * 100 as compress_rate,
        max_date - min_date as days,
        size / (max_date - min_date) as avgDaySize
    from system.parts
    where active 
     and database = 'default'
     and table = 'user'
    group by
        database,
        table
)

这就转换为常见的单位了
5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第3张图片

三、查看表分区

select partition
  from system.parts
 where active
   and database = 'default'
   and table = 'user'

5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第4张图片

四、跟踪分区

SELECT database,
       table,
       count() AS parts,
       uniq(partition) AS partitions,
       sum(marks) AS marks,
       sum(rows) AS rows,
       formatReadableSize(sum(data_compressed_bytes)) AS compressed,
       formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
       round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100.,2) AS percentage
  FROM system.parts
 WHERE active
   and database = 'default'
   and table = 'user'
 GROUP BY database, table

5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第5张图片

五、检查数据大小

SELECT table,
       formatReadableSize(sum(data_compressed_bytes)) AS tc,
       formatReadableSize(sum(data_uncompressed_bytes)) AS tu,
       round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100,2) AS ratio
  FROM system.columns
 WHERE database = 'default'
   and table = 'user'
 GROUP BY table
 ORDER BY sum(data_compressed_bytes) ASC

5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第6张图片

六、查看表中列的数据大小

SELECT column,
       any(type),
       sum(column_data_compressed_bytes) AS compressed,
       sum(column_data_uncompressed_bytes) AS uncompressed,
       sum(rows)
  FROM system.parts_columns
 WHERE database = 'default'
   and table = 'user'
   AND active
 GROUP BY column
 ORDER BY column ASC

5、ClickHouse查看数据库容量、表的指标、表分区、数据大小等_第7张图片

你可能感兴趣的:(大数据相关组件介绍,数据库,clickhouse,java,大数据,bigdata)