UDF: User-Defined Function, one row in, one row out
UDAF: User-Defined Aggregation Function, many rows in, one row out
UDTF: User-Defined Table-Generating Function, one row in, many rows out
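The three input/output shapes can be illustrated with a plain-Java analogy. This is only a sketch: the class FunctionKinds and its methods are hypothetical names chosen for illustration, and no Hive classes are involved.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy for the three function kinds (hypothetical names, no Hive API).
class FunctionKinds {
    // UDF-like: one value in, one value out.
    static String upper(String value) {
        return value.toUpperCase();
    }

    // UDAF-like: many values in, one value out.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // UDTF-like: one value in, many values (rows) out.
    static List<String> explode(String csv) {
        return Arrays.asList(csv.split(","));
    }

    public static void main(String[] args) {
        System.out.println(upper("hello"));              // HELLO
        System.out.println(sum(Arrays.asList(1, 2, 3))); // 6
        System.out.println(explode("a,b,c"));            // [a, b, c]
    }
}
```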
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive.version}</version>
</dependency>
A custom function must extend the org.apache.hadoop.hive.ql.exec.UDF class and define an evaluate method.
Here we write a simple function that adds a random digit prefix to a field.
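package com.suddev.hadoop.hive;

import org.apache.hadoop.hive.ql.exec.UDF;

import java.util.Random;

/**
 * Adds a random single-digit prefix (0-9) to the input string.
 *
 * @author Rand
 * @date 2019/10/4 0004
 */
public class AddPrefixUDF extends UDF {
    // Reuse one generator instead of creating a new one for every row.
    private final Random random = new Random();

    public String evaluate(String input) {
        // Pass NULL through instead of returning "<digit>_null".
        if (input == null) {
            return null;
        }
        int num = random.nextInt(10);
        return num + "_" + input;
    }
}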
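Because AddPrefixUDF depends on hive-exec, it can be awkward to try out locally. The following is a minimal sketch of the same prefix logic in plain Java. The helper class PrefixLogic is hypothetical (it is not part of the original code), and passing NULL through unchanged is an assumption about the desired behavior. The Random is a parameter so that it can be seeded.

```java
import java.util.Random;

// Hypothetical helper, not part of the original post: the same prefix logic
// without the Hive base class, so it runs without hive-exec on the classpath.
class PrefixLogic {
    // Returns "<digit>_<input>", where digit is in the range 0..9.
    static String addPrefix(String input, Random random) {
        if (input == null) {
            return null; // assumption: pass NULL through unchanged
        }
        return random.nextInt(10) + "_" + input;
    }

    public static void main(String[] args) {
        // Two generators with the same seed produce the same first draw.
        String first = addPrefix("hello", new Random(42L));
        String again = addPrefix("hello", new Random(42L));
        System.out.println(first.equals(again));          // true
        System.out.println(first.matches("[0-9]_hello")); // true
    }
}
```

With an unseeded Random, as in the UDF itself, the prefix varies from call to call, so the same query may print a different digit on each run.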
This step is omitted.
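A function can be registered in two ways: temporarily or permanently. Temporary registration comes first:
1. Add the jar to Hive's classpath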
hive> add jar /home/hadoop/lib/hadoop-learn-1.0.jar;
2. Create a temporary function
Syntax: CREATE TEMPORARY FUNCTION function_name AS "fully qualified UDF class name";
hive> CREATE TEMPORARY FUNCTION add_prefix AS "com.suddev.hadoop.hive.AddPrefixUDF";
3. Test
hive> select add_prefix('hello');
OK
2_hello
Time taken: 0.054 seconds, Fetched: 1 row(s)
# show functions also lists the add_prefix function
hive> show functions;
OK
...
add_prefix
...
Quit Hive and start it again
hive> quit;
hive> show functions;
OK
# add_prefix is no longer listed, which shows that the registration was only temporary
...
hive> drop temporary function add_prefix;
OK
Time taken: 0.003 seconds
1. Upload the jar to HDFS
[hadoop@hadoop001 lib]$ hdfs dfs -put hadoop-learn-1.0.jar /lib/
2. Register the function permanently
Syntax: CREATE FUNCTION function_name AS "fully qualified UDF class name" USING JAR "HDFS path of the jar";
hive> CREATE FUNCTION add_prefix AS "com.suddev.hadoop.hive.AddPrefixUDF" USING JAR 'hdfs://hadoop001:9000/lib/hadoop-learn-1.0.jar';
converting to local hdfs://hadoop001:9000/lib/hadoop-learn-1.0.jar
Added [/tmp/965db9b7-670e-4109-8891-e919c7d9a5aa_resources/hadoop-learn-1.0.jar] to class path
Added resources: [hdfs://hadoop001:9000/lib/hadoop-learn-1.0.jar]
OK
Time taken: 0.169 seconds
3. Test
hive> select add_prefix('hello');
OK
5_hello
Time taken: 0.566 seconds, Fetched: 1 row(s)
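The temporary and permanent registration statements used above differ only in the TEMPORARY keyword and the USING JAR clause. The sketch below makes that shape explicit by formatting both statements. RegisterStatements is a hypothetical helper, not a Hive API. It only builds strings and does not connect to Hive.

```java
// Hypothetical helper, not part of Hive: it only formats the two statements.
class RegisterStatements {
    // Session-scoped registration; the jar must already be on the classpath.
    static String temporary(String functionName, String className) {
        return String.format("CREATE TEMPORARY FUNCTION %s AS '%s'",
                functionName, className);
    }

    // Metastore-backed registration; the jar is referenced by its URI.
    static String permanent(String functionName, String className, String jarUri) {
        return String.format("CREATE FUNCTION %s AS '%s' USING JAR '%s'",
                functionName, className, jarUri);
    }

    public static void main(String[] args) {
        String cls = "com.suddev.hadoop.hive.AddPrefixUDF";
        System.out.println(temporary("add_prefix", cls));
        System.out.println(permanent("add_prefix", cls,
                "hdfs://hadoop001:9000/lib/hadoop-learn-1.0.jar"));
    }
}
```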
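Note that in this session show functions did not list the permanently registered add_prefix function. (Depending on the Hive version, it may instead be listed under a database-qualified name such as default.add_prefix.)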
hive> show functions;
OK
...
# there is no add_prefix line
...
However, if we log in to MySQL and look at the Hive metadata, the FUNCS table contains the information about the function we defined.
mysql> select * from FUNCS;
+---------+-------------------------------------+-------------+-------+------------+-----------+------------+------------+
| FUNC_ID | CLASS_NAME                          | CREATE_TIME | DB_ID | FUNC_NAME  | FUNC_TYPE | OWNER_NAME | OWNER_TYPE |
+---------+-------------------------------------+-------------+-------+------------+-----------+------------+------------+
|       1 | com.suddev.hadoop.hive.AddPrefixUDF |  1570196465 |     1 | add_prefix |         1 | NULL       | USER       |
+---------+-------------------------------------+-------------+-------+------------+-----------+------------+------------+
1 row in set (0.00 sec)
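The CREATE_TIME value in the row above can be made readable with a short conversion. This sketch assumes the column holds seconds since the Unix epoch, which is consistent with the date in the class's Javadoc. CreateTime is a hypothetical class name used only for illustration.

```java
import java.time.Instant;

// Hypothetical helper: converts a metastore CREATE_TIME value to a UTC timestamp,
// assuming the value is in seconds since the Unix epoch.
class CreateTime {
    static String toUtc(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).toString();
    }

    public static void main(String[] args) {
        System.out.println(toUtc(1570196465L)); // 2019-10-04T13:41:05Z
    }
}
```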
hive> drop function add_prefix;
converting to local hdfs://hadoop001:9000/lib/hadoop-learn-1.0.jar
Added [/tmp/e2820e2f-7852-4186-8f5e-5542b9bfe5ac_resources/hadoop-learn-1.0.jar] to class path
Added resources: [hdfs://hadoop001:9000/lib/hadoop-learn-1.0.jar]
OK
Time taken: 3.14 seconds