Common Hive Syntax

1. concat_ws

Concatenates strings using the specified separator.

SELECT CONCAT_WS('-','a','b','c');
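concat_ws also accepts an array of strings, which makes it handy for collapsing a group into one delimited value. A minimal sketch, assuming a hypothetical table app_installs with columns user_id and package_name (neither comes from the original notes):

```sql
-- Hypothetical table: app_installs(user_id, package_name)
-- COLLECT_SET gathers the distinct package names per user into an array,
-- and CONCAT_WS joins that array with commas.
SELECT user_id,
       CONCAT_WS(',', COLLECT_SET(package_name)) AS packages
FROM app_installs
GROUP BY user_id;
```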

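2. row_number()

SELECT a.*
FROM
(
  -- Number the rows within each package_name group, ordered by `row`.
  -- `row` is quoted with backticks because ROW is a reserved keyword in recent Hive versions.
  SELECT t.*, row_number() OVER (PARTITION BY package_name ORDER BY `row`) AS row_num
  FROM t
) a
WHERE a.row_num = 1;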

3. Load a function from a JAR

add jar path/xxxxx.jar;
create temporary function function_name as 'com.company.function_name';

4. Decode URL-encoded Chinese text (UTF-8)

reflect('java.net.URLDecoder', 'decode',XXXXXX ,"UTF-8")
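As a concrete illustration, the call can be tried on a literal before applying it to a column. The percent-encoded string below is made up for this example; it is the UTF-8 encoding of the two characters 中文:

```sql
-- reflect() invokes the static Java method URLDecoder.decode(String, String).
-- '%E4%B8%AD%E6%96%87' is an illustrative input, not taken from the original notes.
SELECT reflect('java.net.URLDecoder', 'decode', '%E4%B8%AD%E6%96%87', 'UTF-8');
-- returns 中文
```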

5. Load data into a Hive table

LOAD DATA LOCAL INPATH ''
OVERWRITE INTO TABLE table_name;

6. Avoid scientific notation for floating-point numbers

cast(column as decimal(a, b))

a is the precision (total number of digits) and b is the scale (number of digits kept after the decimal point).
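A short sketch of the cast in a query, assuming a hypothetical table orders with a DOUBLE column amount (both names are invented for illustration). The values 20 and 4 are arbitrary choices for a and b:

```sql
-- Hypothetical table: orders(amount DOUBLE)
-- Very large or very small DOUBLE values may print in scientific notation;
-- casting to DECIMAL(20, 4) prints them as plain digits with 4 decimal places.
SELECT amount,
       CAST(amount AS DECIMAL(20, 4)) AS amount_plain
FROM orders;
```

Choose a large enough that a - b covers the integer digits of the largest expected value. To my understanding, Hive returns NULL when a value does not fit the declared precision, so it is worth checking the result for unexpected NULLs.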

7. Rename a Hive partition

alter table hue_xingqi5_users partition(group_key='未匹配机型') rename to partition(group_key='-1');

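8. Extract a decimal or integer number

-- Returns the first number found in the string, which is '4' here.
SELECT regexp_extract('4个2.0GHz', '[0-9]+([.]{1}[0-9]+){0,1}', 0);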
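Because regexp_extract returns only the first match, the pattern above yields 4 rather than 2.0 for this input. If the frequency is what is wanted, one option is to anchor the number to its unit and read a capture group. This is a sketch of that variant, not part of the original notes:

```sql
-- Group 1 captures an integer or decimal that is immediately followed by 'GHz'.
-- The leading '4' is skipped because it is not followed by 'GHz'.
SELECT regexp_extract('4个2.0GHz', '([0-9]+([.][0-9]+)?)GHz', 1);
-- returns 2.0
```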

9. Strip parameters from a URL

-- Keep everything before the first '?' or '#'.
select regexp_extract('http://tool.chinaz.com/regex', '^([^?#]*)', 1);
or
select CONCAT(parse_url(wap_url, 'PROTOCOL'), '://', parse_url(wap_url, 'HOST'), parse_url(wap_url, 'PATH'))

10. Extract a version number

-- [.] matches a literal dot; an unescaped . would match any character.
select regexp_extract('android 9.1.0', '[0-9]+([.][0-9]+)*', 0);

11. Validate an IP address with a regular expression

Explanation of the regex: https://xingqijiang.blog.csdn.net/article/details/102478927

select '61.135.152.137' regexp '^((2(5[0-5]|[0-4]\\d))|[0-1]?\\d{1,2})(\\.((2(5[0-5]|[0-4]\\d))|[0-1]?\\d{1,2})){3}$';
-- Matches only the full eight-group IPv6 form; compressed addresses using :: are not accepted.
select '2400:89c0:6000:2:643f:b740:28e3:df35' regexp '^[0-9a-fA-F]{1,4}(:[0-9a-fA-F]{1,4}){7}$';

 
