19-Hive函数

explode

create table hive_wc(sentence string);

load data local inpath '/home/hadoop/data/hive-wc.txt' into table hive_wc;

hello,world,welcome

hello,welcome

求每个单词出现的个数

1) 获取每个单词  split(sentence,",")

["hello","world","welcome"]

["hello","welcome"]

"hello"

"world"

"welcome"

"hello"

"welcome"

select word, count(1) as c

from (select explode(split(sentence,",")) as word from hive_wc) t

group by word ;

json

create table rating_json(json string);

load data local inpath '/home/hadoop/data/rating.json' into table rating_json;

select

json_tuple(json,"movie","rate","time","userid") as (movie,rate,time,userid)

from rating_json limit 10;

create table hive_rownumber(id int,age int, name string, sex string)

row format delimited fields terminated by ',';

load data local inpath '/home/hadoop/data/hive_row_number.txt' into table hive_rownumber;

查询出每种性别中年龄最大的2条数据 ==> topn

select * from hive_rownumber

group by sex order by age limit 2;

order by是全局的排序,是做不到分组内的排序的!!!

分析函数

select id,age,name,sex

from

(select id,age,name,sex,

row_number() over(partition by sex order by age desc) as rank

from hive_rownumber) t

where rank<=2;

User-Defined Functions : UDF

UDF: 一进一出  upper  lower substring

UDAF:Aggregation  多进一出  count max min sum ...

UDTF: Table-Generation  一进多出

IDEA + Maven

功能:输入xxx,输出Hello xxx

自定义UDF函数的步骤

1) 定义一个类 extends UDF

add jar /home/hadoop/lib/hive-1.0.jar;

CREATE TEMPORARY FUNCTION sayHello AS 'com.ruozedata.bigdata.HelloUDF';

TEMPORARY:仅对当前session(黑窗口)有效

CREATE FUNCTION sayRuozeHello AS 'com.ruozedata.bigdata.HelloUDF'

USING JAR 'hdfs://hadoop000:8020/lib/hive-1.0.jar';

源码面前,了无秘密

你可能感兴趣的:(19-Hive函数)