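A detailed guide to writing SQL in ClickHouse:
ClickHouse SQL queries are largely the same as those in MySQL and Presto, with a few differences. This post only records the ClickHouse SQL I write day to day, covering calculation functions, aggregate functions, join syntax, and some points to watch out for.
Official ClickHouse website: https://clickhouse.yandex/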
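I: Simple queries
Simple queries are no different from those in MySQL and similar databases:
select * from database_name.table_name where condition
e.g.: select sno,sname,sage from student where sno = 1
Note: ClickHouse is an engine for real-time analysis of big data, so its tables are usually very large and split into many partitions. When a query would scan a lot of data and the where clause filters out very little of it, add a limit to cap the query; otherwise ClickHouse reports an error saying the query read more than XX GB of data.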
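As a minimal sketch of the advice above, the query below narrows the scan with a date filter and then caps the result with limit. The table and columns are borrowed from the real-world example at the end of this post. Treating the_day as the partition column is an assumption, and the limit value of 100 is arbitrary.

```sql
-- Sketch only: the_day is assumed to be the partition column of this table.
SELECT
    account_id,
    keyword_id,
    click
FROM marketing.sem_keyword_report
WHERE the_day >= '2018-12-12'   -- narrow the scan to a few days
  AND the_day <= '2018-12-15'
LIMIT 100                       -- cap the number of rows returned
```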
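II: Join queries
First: in older ClickHouse versions such as the one these notes are based on, a query cannot directly join more than two tables, so the chained multi-table join style common in MySQL (shown further below) does not work.
Second: the join condition is written with using instead of on. The using columns must have the same name in every table; if they differ, unify them with select column as alias.
Third: the commonly used join keywords are the following: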
1.ALL LEFT JOIN
2.ANY LEFT JOIN
3.ALL FULL JOIN
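ALL and ANY differ in how they treat several matching rows on the right-hand side. ALL keeps every match, as a standard SQL join does, while ANY keeps only one match per left-hand row. Here is a small sketch with inline data; the column names id and v are made up for illustration.

```sql
-- Left side: one row with id = 1. Right side: two rows with id = 1.

-- ALL LEFT JOIN keeps every matching right-hand row: 2 rows come back.
SELECT id, v
FROM (SELECT 1 AS id)
ALL LEFT JOIN
(
    SELECT 1 AS id, 'x' AS v
    UNION ALL
    SELECT 1 AS id, 'y' AS v
) USING id;

-- ANY LEFT JOIN keeps one match per left-hand row: 1 row comes back.
SELECT id, v
FROM (SELECT 1 AS id)
ANY LEFT JOIN
(
    SELECT 1 AS id, 'x' AS v
    UNION ALL
    SELECT 1 AS id, 'y' AS v
) USING id;
```

ANY is the cheaper choice when the right-hand side is a lookup table that should contribute one row per key, which is how the real-world example later in this post uses it.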
A typical SQL join query looks like this:
e.g.: select * from table as a left join table as b on a.id=b.aid left join table as c on a.id=c.aid
This form raises an error in ClickHouse. To join more than two tables, use subqueries instead.
e.g.: first left join TABLEA with TABLEB, then left join that result with TABLEC, with aid as the shared column name in all three tables:
SELECT
    *
FROM
(
    -- step 1: TABLEA left join TABLEB
    SELECT
        *
    FROM
        (SELECT * FROM TABLEA)
    ALL LEFT JOIN
        (SELECT * FROM TABLEB) USING aid
)
-- step 2: left join the result of step 1 with TABLEC
ALL LEFT JOIN
    (SELECT * FROM TABLEC) USING aid
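When the key column is named differently in the two tables, it can be renamed inside the subquery so that using sees the same name on both sides. The sketch below assumes a hypothetical schema in which TABLEB stores the key as a_id and has a column called value; neither name comes from a real table.

```sql
-- Hypothetical schema: TABLEA(aid, ...), TABLEB(a_id, value, ...).
SELECT
    aid,
    b_value
FROM
    (SELECT aid FROM TABLEA)
ALL LEFT JOIN
    (
        SELECT
            a_id AS aid,        -- rename so both sides expose "aid"
            value AS b_value
        FROM TABLEB
    ) USING aid
```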
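III: Common functions and expressions
1. sum(column): sum
2. avg(column): average
3. round(column/sum(column)/(formula a/b), 2): round to 2 decimal places
4. case when column_b = 0 then null (or 0) else round(column_a/column_b, 4) end: conditional expression; if the divisor is 0, return either null or 0
5. toString(): convert to a string
6. concat(column_value, 'text to append, such as %'): string concatenation
Example: concat(toString(round(round(a/b,4) * 100 ,2)),'%')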
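The pieces above can be combined into one expression that formats a ratio as a percentage and guards against a zero divisor. This is only a sketch: the columns a and b and the inline rows are invented for illustration.

```sql
-- Inline test data: the second row has a zero divisor.
SELECT
    a,
    b,
    CASE
        WHEN b = 0 THEN NULL    -- no ratio when the divisor is 0
        ELSE concat(toString(round(round(a / b, 4) * 100, 2)), '%')
    END AS pct
FROM
(
    SELECT 1 AS a, 3 AS b
    UNION ALL
    SELECT 5 AS a, 0 AS b
)
```

The result of round has to go through toString first because concat only accepts strings.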
IV: What stays the same
where conditions, group by, order by, and limit are used the same way as in MySQL.
V: A real-world example:
SELECT
account_id,
'2018-12-12~2018-12-15' AS date,
account,
ad_click
FROM
(
SELECT
account_id,
fr,
fr_name,
account,
account_balance,
account_budget,
account_exclude_ip,
account_budget_offline_time,
account_status
FROM
marketing.sem_account_type
WHERE
1 = 1
AND lower(fr) IN ('bd_sem')
AND account_id IN (
'18091503',
'18091505',
'18091501'
)
)
ALL LEFT JOIN (
SELECT
account_id,
fr,
round(sum(ad_cost) / 3, 2) AS ad_cost,
round(sum(ad_cost_real) / 3, 2) AS ad_cost_real,
round(sum(ad_impression) / 3, 2) AS ad_impression,
round(sum(ad_click) / 3, 2) AS ad_click,
round(sum(clue_all) / 3, 2) AS clue_all,
round(sum(clue_all_new) / 3, 2) AS clue_all_new,
round(
sum(
c1_kpi_daily_new_customer_amount
) / 3,
2
) AS c1_kpi_daily_new_customer_amount,
round(
sum(c1_kpi_new_customer_amount) / 3,
2
) AS c1_kpi_new_customer_amount,
round(
sum(
c2_kpi_daily_new_customer_amount
) / 3,
2
) AS c2_kpi_daily_new_customer_amount,
round(
sum(c2_kpi_new_customer_amount) / 3,
2
) AS c2_kpi_new_customer_amount,
round(sum(c2c_c1_create) / 3, 2) AS c2c_c1_create,
round(sum(c2b_c1_create) / 3, 2) AS c2b_c1_create,
round(sum(c2c_c1_onsite) / 3, 2) AS c2c_c1_onsite,
round(sum(c2b_c1_onsite) / 3, 2) AS c2b_c1_onsite,
round(sum(c2c_c1_onsale) / 3, 2) AS c2c_c1_onsale,
round(sum(c2b_c1_onsale) / 3, 2) AS c2b_c1_onsale,
round(sum(c2c_c2_appoint) / 3, 2) AS c2c_c2_appoint,
round(sum(b2c_c2_appoint) / 3, 2) AS b2c_c2_appoint,
round(sum(ssss_c2_appoint) / 3, 2) AS ssss_c2_appoint,
round(
sum(c2c_c2_finish_appoint) / 3,
2
) AS c2c_c2_finish_appoint,
round(
sum(b2c_c2_finish_appoint) / 3,
2
) AS b2c_c2_finish_appoint,
round(
sum(ssss_c2_finish_appoint) / 3,
2
) AS ssss_c2_finish_appoint,
round(sum(c2c_c2_order) / 3, 2) AS c2c_c2_order,
round(sum(b2c_c2_order) / 3, 2) AS b2c_c2_order,
round(sum(weighting_number) / 3, 2) AS weighting_number,
round(sum(ssss_c2_order) / 3, 2) AS ssss_c2_order
FROM
(
SELECT
fr,
keyword_id,
account_id,
cost AS ad_cost,
cost_real AS ad_cost_real,
impression AS ad_impression,
click AS ad_click
FROM
marketing.sem_keyword_report
WHERE
1 = 1
AND the_day >= '2018-12-12'
AND the_day <= '2018-12-15'
AND lower(fr) IN ('bd_sem')
AND account_id IN (
'18091503',
'18091505',
'18091501'
)
AND (
campaign_city IN (
'上海',
'东莞',
'中山',
'临沂',
'乌鲁木齐',
'伊犁',
'佛山',
'保定',
'全国',
'兰州',
'包头',
'北京',
'南京',
'南宁',
'南昌',
'南通',
'南阳',
'厦门',
'合肥',
'呼和浩特',
'咸阳',
'哈尔滨',
'唐山',
'嘉兴',
'大同',
'大连',
'天津',
'太原',
'宁波',
'宜昌',
'宿迁',
'常州',
'广州',
'廊坊',
'徐州',
'惠州',
'成都',
'扬州',
'新乡',
'无锡',
'昆明',
'杭州',
'武汉',
'沈阳',
'泉州',
'泰州',
'泸州',
'洛阳',
'济南',
'济宁',
'淮安',
'深圳',
'温州',
'澳门',
'烟台',
'珠海',
'盐城',
'石家庄',
'福州',
'绵阳',
'芜湖',
'苏州',
'襄阳',
'西安',
'许昌',
'贵阳',
'达州',
'郑州',
'重庆',
'金华',
'银川',
'镇江',
'长春',
'长沙',
'青岛'
)
)
)
ALL FULL JOIN (
SELECT
fr,
keyword_id,
account_id,
clue_all,
clue_all_new,
c1_kpi_daily_new_customer_amount,
c1_kpi_new_customer_amount,
c2_kpi_daily_new_customer_amount,
c2_kpi_new_customer_amount,
c2c_c1_create,
c2b_c1_create,
c2c_c1_onsite,
c2b_c1_onsite,
c2c_c1_onsale,
c2b_c1_onsale,
c2c_c2_appoint,
b2c_c2_appoint,
ssss_c2_appoint,
c2c_c2_finish_appoint,
b2c_c2_finish_appoint,
ssss_c2_finish_appoint,
c2c_c2_order,
b2c_c2_order,
weighting_number,
ssss_c2_order
FROM
(
SELECT
fr,
kid AS keyword_id,
sum(clue_all) AS clue_all,
sum(clue_all_new) AS clue_all_new,
sum(
c1_kpi_daily_new_customer_amount
) AS c1_kpi_daily_new_customer_amount,
sum(c1_kpi_new_customer_amount) AS c1_kpi_new_customer_amount,
sum(
c2_kpi_daily_new_customer_amount
) AS c2_kpi_daily_new_customer_amount,
sum(c2_kpi_new_customer_amount) AS c2_kpi_new_customer_amount,
sum(c2c_c1_create) AS c2c_c1_create,
sum(c2b_c1_create) AS c2b_c1_create,
sum(c2c_c1_onsite) AS c2c_c1_onsite,
sum(c2b_c1_onsite) AS c2b_c1_onsite,
sum(c2c_c1_onsale) AS c2c_c1_onsale,
sum(c2b_c1_onsale) AS c2b_c1_onsale,
sum(c2c_c2_appoint) AS c2c_c2_appoint,
sum(b2c_c2_appoint) AS b2c_c2_appoint,
sum(ssss_c2_appoint) AS ssss_c2_appoint,
sum(c2c_c2_finish_appoint) AS c2c_c2_finish_appoint,
sum(b2c_c2_finish_appoint) AS b2c_c2_finish_appoint,
sum(ssss_c2_finish_appoint) AS ssss_c2_finish_appoint,
sum(c2c_c2_order) AS c2c_c2_order,
sum(b2c_c2_order) AS b2c_c2_order,
sum(weighting_number) AS weighting_number,
sum(ssss_c2_order) AS ssss_c2_order
FROM
marketing.market_kid_stat_new_v5
WHERE
1 = 1
AND dts >= '2018-12-12'
AND dts <= '2018-12-15'
AND lower(fr) IN ('bd_sem')
AND (
city IN (
'上海',
'东莞',
'中山',
'临沂',
'乌鲁木齐',
'伊犁',
'佛山',
'保定',
'全国',
'兰州',
'包头',
'北京',
'南京',
'南宁',
'南昌',
'南通',
'南阳',
'厦门',
'合肥',
'呼和浩特',
'咸阳',
'哈尔滨',
'唐山',
'嘉兴',
'大同',
'大连',
'天津',
'太原',
'宁波',
'宜昌',
'宿迁',
'常州',
'广州',
'廊坊',
'徐州',
'惠州',
'成都',
'扬州',
'新乡',
'无锡',
'昆明',
'杭州',
'武汉',
'沈阳',
'泉州',
'泰州',
'泸州',
'洛阳',
'济南',
'济宁',
'淮安',
'深圳',
'温州',
'澳门',
'烟台',
'珠海',
'盐城',
'石家庄',
'福州',
'绵阳',
'芜湖',
'苏州',
'襄阳',
'西安',
'许昌',
'贵阳',
'达州',
'郑州',
'重庆',
'金华',
'银川',
'镇江',
'长春',
'长沙',
'青岛'
)
)
GROUP BY
keyword_id,
fr
)
ANY LEFT JOIN (
SELECT
fr,
keyword_id,
account_id
FROM
marketing.sem_keyword_type
) USING keyword_id,
fr
WHERE
1 = 1
AND lower(fr) IN ('bd_sem')
AND account_id IN (
'18091503',
'18091505',
'18091501'
)
) USING keyword_id,
fr
GROUP BY
account_id,
fr
) USING account_id,
fr
ORDER BY
account_id
LIMIT 0, 50