3、Apache Kudu集成impala(shell和java操作)的详细操作

Apache Kudu 系列文章

1、Apache Kudu介绍及架构、工作原理、两种部署方式、使用限制详解
2、Apache Kudu-java api操作kudu详细示例以及kudu的三种实现示例
3、Apache Kudu集成impala(shell和java操作)的详细操作


文章目录

  • Apache Kudu 系列文章
  • 一、kudu集成impala
    • 1、impala配置修改
  • 二、通过impala操作kudu
    • 1、创建kudu表
    • 2、创建内部表
    • 3、创建外部表
      • 1)、使用java api 创建kudu表:
      • 2)、使用impala创建外部表 , 将kudu的表映射到impala上
    • 4、使用impala对kudu进行DML
      • 1)、插入数据
        • 1 、此示例插入单行
        • 2、此示例插入3行
        • 3、批量导入数据
      • 2)、更新数据
      • 3)、删除数据
    • 5、更改表属性
      • 1)、重命名impala表
      • 2)、重新命名内部表的基础kudu表
      • 2)、将外部表重新映射kudu表
    • 6、更改kudu master地址
    • 7、将内部表改为外部表
    • 8、impala使用java操作kudu
      • 1)、maven依赖
      • 2)、jdbc连接impala操作kudu


本文简单的介绍了通过impala操作kudu,包括通过shell和java api。
本文依赖impala和kudu的环境好用。
本文分为2个部分,即集成impala和通过impala操作kudu。

一、kudu集成impala

最佳实践我认为是CDH环境中安装kudu、impala、hue一并使用,或在cdh环境中安装kudu和impala。

1、impala配置修改

在每一个服务器的impala的配置文件中添加如下配置。

vim /etc/default/impala
在IMPALA_SERVER_ARGS下添加:
-kudu_master_hosts=server-1:7051,server-2:7051,server-3:7051

二、通过impala操作kudu

以下示例是kudu在cdh环境中创建的,即kudu的服务与之前系列文章中的地址不一致,实际使用的时候以实际环境为准。

1、创建kudu表

创建表时,经常遇到的问题及解决办法,具体示例如下

ERROR: AnalysisException: Columns cannot be specified with an external Kudu table.
#原因:创建外部表,不能指定表字段
ERROR: AnalysisException: Not allowed to set 'kudu.table_name' manually for managed Kudu tables .
#原因:创建内部表的时候,加了TBLPROPERTIE的kudu.table_name属性
ERROR: ImpalaRuntimeException: Error loading schema of table 'parquet_test.tank_test_2'
#原因:创建外部表时,找不到kudu表,impala::parquet_test.tank_test_2 
#需要先启动hdfs、hive、kudu、impala。使用impala的shell控制台。

以下是创建表示例

[root@server8 kudu]# impala-shell
Starting Impala Shell without Kerberos authentication
Opened TCP connection to server8:21000
Connected to server8:21000
Server version: impalad version 3.2.0-cdh6.2.1 RELEASE (build 525e372410dd2ce206e2ad0f21f57cae7380c0cb)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v3.2.0-cdh6.2.1 (525e372) built on Wed Sep 11 01:30:44 PDT 2019)

Run the PROFILE command after a query has finished to see a comprehensive summary
of all the performance and diagnostic information that Impala gathered for that
query. Be warned, it can be very long!
***********************************************************************************
[server8:21000] default> show databases;
Query: show databases
+------------------+----------------------------------------------+
| name             | comment                                      |
+------------------+----------------------------------------------+
| _impala_builtins | System database for Impala builtin functions |
| default          | Default Hive database                        |
| test             |                                              |
+------------------+----------------------------------------------+
Fetched 3 row(s) in 0.01s
[server8:21000] default> use test;

2、创建内部表

内部表由Impala管理,当您从Impala中删除时,数据和表确实被删除。
当您使用Impala创建新表时,它通常是内部表。
java api创建内部表详见该系列文章的2、Apache Kudu-java api操作kudu详细示例以及kudu的三种实现示例

CREATE TABLE kudu_table1
(
    id BIGINT,
    name STRING,
    PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES (
    'kudu.master_addresses' = 'server6:7051,server7:7051'
);

在 CREATE TABLE 语句中,必须首先列出构成主键的列。
3、Apache Kudu集成impala(shell和java操作)的详细操作_第1张图片

3、Apache Kudu集成impala(shell和java操作)的详细操作_第2张图片
3、Apache Kudu集成impala(shell和java操作)的详细操作_第3张图片

3、创建外部表

外部表(创建者CREATE EXTERNAL TABLE)不受Impala管理,并且删除此表不会将表从其源位置(此处为Kudu)丢弃。
它只会去除Impala和Kudu之间的映射。
这是Kudu提供的用于将现有表映射到Impala的语法。

1)、使用java api 创建kudu表:

import java.util.LinkedList;
import java.util.List;

import org.apache.kudu.ColumnSchema;
import org.apache.kudu.Schema;
import org.apache.kudu.Type;
import org.apache.kudu.client.CreateTableOptions;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;

/**
 * @author alanchan
 *
 */
public class CreateTable {
	private static ColumnSchema newColumn(String name, Type type, boolean iskey) {
		ColumnSchema.ColumnSchemaBuilder column = new ColumnSchema.ColumnSchemaBuilder(name, type);
		column.key(iskey);
		return column.build();
	}

	/**
	 * @param args
	 * @throws KuduException
	 */
	public static void main(String[] args) throws KuduException {
		String tableName = "person";
		// master地址,如果不设置端口,则使用默认端口
		final String masteraddr = "server6:7051,server7";
		// 创建kudu的数据库链接
		KuduClient client = new KuduClient.KuduClientBuilder(masteraddr).defaultSocketReadTimeoutMs(6000).build();

		// 设置表的schema
		List<ColumnSchema> columns = new LinkedList<ColumnSchema>();
		columns.add(newColumn("CompanyId", Type.INT32, true));
		columns.add(newColumn("WorkId", Type.INT32, false));
		columns.add(newColumn("Name", Type.STRING, false));
		columns.add(newColumn("Gender", Type.STRING, false));
		columns.add(newColumn("Photo", Type.STRING, false));
		Schema schema = new Schema(columns);
		// 创建表时提供的所有选项
		CreateTableOptions options = new CreateTableOptions();

		// 设置表的replica备份和分区规则
		List<String> parcols = new LinkedList<String>();

		parcols.add("CompanyId");
		// 设置表的备份数
		options.setNumReplicas(1);
		// 设置range分区
		options.setRangePartitionColumns(parcols);

		// 设置hash分区和数量
		options.addHashPartitions(parcols, 3);
		try {
			if (client.tableExists(tableName)) {
				client.deleteTable(tableName);
			}
			client.createTable(tableName, schema, options);
		} catch (KuduException e) {
			e.printStackTrace();
		}
		client.close();
	}

}

2)、使用impala创建外部表 , 将kudu的表映射到impala上

CREATE EXTERNAL TABLE `person` STORED AS KUDU
TBLPROPERTIES(
    'kudu.table_name' = 'person',
    'kudu.master_addresses' = 'server6:7051,server7:7051'
    );

3、Apache Kudu集成impala(shell和java操作)的详细操作_第4张图片

4、使用impala对kudu进行DML

-- 建表
CREATE TABLE kudu_table1
(
    id BIGINT,
    name STRING,
    PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES (
    'kudu.master_addresses' = 'server6:7051,server7:7051'
);

1)、插入数据

impala 允许使用标准 SQL 语句将数据插入 Kudu 。

1 、此示例插入单行

INSERT INTO kudu_table1 VALUES (50, "zhangsan");

[server8:21000] test> INSERT INTO kudu_table1 VALUES (50, "zhangsan");
Query: INSERT INTO kudu_table1 VALUES (50, "zhangsan")
Query submitted at: 2022-11-11 14:30:56 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=e449581d898ee83f:2ddf06a300000000
Modified 1 row(s), 0 row error(s) in 0.14s
[server8:21000] test> select * from kudu_table1;
Query: select * from kudu_table1
Query submitted at: 2022-11-11 14:31:12 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=1f4cd70eb363b6bf:92def44a00000000
+----+----------+
| id | name     |
+----+----------+
| 50 | zhangsan |
+----+----------+
Fetched 1 row(s) in 0.13s

2、此示例插入3行

INSERT INTO kudu_table1  VALUES (1, "john"), (2, "jane"), (3, "jim");

[server8:21000] test> INSERT INTO kudu_table1  VALUES (1, "john"), (2, "jane"), (3, "jim");
Query: INSERT INTO kudu_table1  VALUES (1, "john"), (2, "jane"), (3, "jim")
Query submitted at: 2022-11-11 14:31:50 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=cd4d697217e97363:381fb56400000000
Modified 3 row(s), 0 row error(s) in 0.11s
[server8:21000] test> select * from kudu_table1;
Query: select * from kudu_table1
Query submitted at: 2022-11-11 14:31:55 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=4e4afd2b4b2e2a89:99c642fe00000000
+----+----------+
| id | name     |
+----+----------+
| 50 | zhangsan |
| 2  | jane     |
| 3  | jim      |
| 1  | john     |
+----+----------+
Fetched 4 row(s) in 0.13s

3、批量导入数据

从 Impala 和 Kudu 的角度来看,通常表现最好的方法通常是使用 Impala 中的 SELECT FROM 语句导入数据。
INSERT INTO kudu_table1 SELECT id,user_name FROM users;


[server8:21000] test> select * from users limit 10;
Query: select * from users limit 10
Query submitted at: 2022-11-11 14:34:15 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=fe4624da33e365e2:88eb9d5500000000
+-----------+------------+-----------+-------------+-----------------------+-----------------------+
| id        | user_name  | pass_word | phone       | email                 | create_day            |
+-----------+------------+-----------+-------------+-----------------------+-----------------------+
| 303222524 | alan92602  | 702026    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 303222550 | alan29109  | 576817    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 303222671 | alan85191  | 202457    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 303222821 | alan154030 | 26640     | 13977776789 | alan.chan.chn@163.com | 2021-12-25 00:00:00.0 |
| 303222824 | alan84839  | 174201    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 303222861 | alan814791 | 524257    | 13977776789 | alan.chan.chn@163.com | 2021-12-28 00:00:00.0 |
| 303223003 | alan84570  | 603142    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 303223132 | alan430534 | 365717    | 13977776789 | alan.chan.chn@163.com | 2021-12-28 00:00:00.0 |
| 303223275 | alan97120  | 728773    | 13977776789 | alan.chan.chn@163.com | 2021-12-28 00:00:00.0 |
| 303223421 | alan282347 | 80452     | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
+-----------+------------+-----------+-------------+-----------------------+-----------------------+
Fetched 10 row(s) in 3.78s

[server8:21000] test> INSERT INTO kudu_table1  SELECT id,user_name FROM users;
Query: INSERT INTO kudu_table1  SELECT id,user_name FROM users
Query submitted at: 2022-11-11 14:53:40 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=92460edd11d62e92:967f08ea00000000
Modified 12606948 row(s), 0 row error(s) in 272.03s
[server8:21000] test> select count(*) from kudu_table1;
Query: select count(*) from kudu_table1
Query submitted at: 2022-11-11 15:03:46 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=194d9a11af5b15b9:1779d64d00000000
+----------+
| count(*) |
+----------+
| 12606948 |
+----------+
Fetched 1 row(s) in 0.24s

[server8:21000] test> select * from kudu_table1 where name = 'alan97120';
Query: select * from kudu_table1 where name = 'alan97120'
Query submitted at: 2022-11-11 15:05:23 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=d64b6b0831fb7f11:ac6676ce00000000
+------------+-----------+
| id         | name      |
+------------+-----------+
| 271918676  | alan97120 |
| 736341555  | alan97120 |
| 337357284  | alan97120 |
| 1210838326 | alan97120 |
| 1215865751 | alan97120 |
| 1216782033 | alan97120 |
| 281544285  | alan97120 |
| 355536558  | alan97120 |
| 714570045  | alan97120 |
| 1119152879 | alan97120 |
| 141570028  | alan97120 |
| 855540844  | alan97120 |
| 1036835142 | alan97120 |
| 438568568  | alan97120 |
| 446164366  | alan97120 |
| 270182462  | alan97120 |
| 732285721  | alan97120 |
| 1273525173 | alan97120 |
| 804762168  | alan97120 |
| 1260341260 | alan97120 |
| 761634175  | alan97120 |
| 105433884  | alan97120 |
| 282301624  | alan97120 |
| 268439775  | alan97120 |
| 290796160  | alan97120 |
| 402267344  | alan97120 |
| 915687259  | alan97120 |
| 126474806  | alan97120 |
| 303223275  | alan97120 |
+------------+-----------+
Fetched 29 row(s) in 0.24s

[server8:21000] test> select u.* from users u join kudu_table1 k on u.id = k.id limit 10;
Query: select u.* from users u join kudu_table1 k on u.id = k.id limit 10
Query submitted at: 2022-11-11 15:08:17 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=9846b2d60bc8b552:2fa57b3f00000000
+-----------+------------+-----------+-------------+-----------------------+-----------------------+
| id        | user_name  | pass_word | phone       | email                 | create_day            |
+-----------+------------+-----------+-------------+-----------------------+-----------------------+
| 265710012 | alan975912 | 851707    | 13977776789 | alan.chan.chn@163.com | 2021-12-25 00:00:00.0 |
| 265710060 | alan496138 | 562512    | 13977776789 | alan.chan.chn@163.com | 2021-12-28 00:00:00.0 |
| 265710102 | alan4132   | 242234    | 13977776789 | alan.chan.chn@163.com | 2021-12-25 00:00:00.0 |
| 265710182 | alan895680 | 286871    | 13977776789 | alan.chan.chn@163.com | 2021-12-25 00:00:00.0 |
| 265710207 | alan118225 | 603765    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 265710208 | alan5987   | 113283    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 265710277 | alan6786   | 650406    | 13977776789 | alan.chan.chn@163.com | 2021-12-27 00:00:00.0 |
| 265710334 | alan894394 | 708884    | 13977776789 | alan.chan.chn@163.com | 2021-12-25 00:00:00.0 |
| 265710376 | alan596645 | 188770    | 13977776789 | alan.chan.chn@163.com | 2021-12-28 00:00:00.0 |
| 265710402 | alan913881 | 447452    | 13977776789 | alan.chan.chn@163.com | 2021-12-28 00:00:00.0 |
+-----------+------------+-----------+-------------+-----------------------+-----------------------+
Fetched 10 row(s) in 12.10s

2)、更新数据

UPDATE kudu_table1 SET name="xiaowang" where id =265710402 ;

[server8:21000] test> UPDATE kudu_table1 SET name="xiaowang" where id =265710402 ;
Query: UPDATE kudu_table1 SET name="xiaowang" where id =265710402
Query submitted at: 2022-11-11 15:10:01 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=11401f685f23ec73:9ad837ed00000000
Modified 1 row(s), 0 row error(s) in 0.14s

[server8:21000] test> select * from kudu_table1 where id =265710402 ;
Query: select * from kudu_table1 where id =265710402
Query submitted at: 2022-11-11 15:10:48 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=d4451118cea9d3e9:f1e5f67d00000000
+-----------+----------+
| id        | name     |
+-----------+----------+
| 265710402 | xiaowang |
+-----------+----------+
Fetched 1 row(s) in 0.13s

3)、删除数据

delete from kudu_table1 where id =265710402 ;

[server8:21000] test> delete from kudu_table1 where id =265710402 ;
Query: delete from kudu_table1 where id =265710402
Query submitted at: 2022-11-11 15:11:59 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=9b462d35e713e823:d14ffc6e00000000
Modified 1 row(s), 0 row error(s) in 0.12s
[server8:21000] test> select * from kudu_table1 where id =265710402 ;
Query: select * from kudu_table1 where id =265710402
Query submitted at: 2022-11-11 15:12:18 (Coordinator: http://server8:25000)
Query progress can be monitored at: http://server8:25000/query_plan?query_id=d348b59b1455e646:7ba432dd00000000
Fetched 0 row(s) in 0.13s

5、更改表属性

1)、重命名impala表

ALTER TABLE PERSON RENAME TO person_temp;

[server8:21000] test> ALTER TABLE PERSON RENAME TO person_temp;
Query: ALTER TABLE PERSON RENAME TO person_temp
+--------------------------+
| summary                  |
+--------------------------+
| Renaming was successful. |
+--------------------------+
Fetched 1 row(s) in 3.69s
[server8:21000] test> show tables;
Query: show tables
+----------------------+
| name                 |
+----------------------+
| author               |
| dx_user              |
| emp_add              |
| employee             |
| kudu_table1          |
| person_temp          |
| t_user               |
| tb_login             |
| the_nba_championship |
| users                |
+----------------------+
Fetched 10 row(s) in 0.00s

2)、重新命名内部表的基础kudu表

--创建内部表
CREATE TABLE kudu_student
(
    CompanyId INT,
    WorkId INT,
    Name STRING,
    Gender STRING,
    Photo STRING,
    PRIMARY KEY(CompanyId)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES (
    'kudu.master_addresses' = 'server6:7051,server7:7051'
);



------------------------------好像执行出错---------------------------------
如果表是内部表,则可以通过更改 kudu.table_name 属性重命名底层的 Kudu 表。
ALTER TABLE kudu_student SET TBLPROPERTIES('kudu.table_name' = 'kudu_student1');
------------------------------好像执行出错---------------------------------

---------------------------------以下操作,表修改成功-------------------------
-- 1、将内部表更改为外部表(impala-shell操作)
alter table kudu_student set tblproperties('EXTERNAL'='true');
[server8:21000] test> alter table kudu_student set tblproperties('EXTERNAL'='true');
Query: alter table kudu_student set tblproperties('EXTERNAL'='true')
+----------------+
| summary        |
+----------------+
| Updated table. |
+----------------+
Fetched 1 row(s) in 0.40s

-- 2、更改表名(impala-shell操作)
ALTER TABLE kudu_student rename to kudu_student1 ;
[server8:21000] test> ALTER TABLE kudu_student rename to kudu_student1 ;
Query: ALTER TABLE kudu_student rename to kudu_student1
+--------------------------+
| summary                  |
+--------------------------+
| Renaming was successful. |
+--------------------------+
Fetched 1 row(s) in 0.20s
[server8:21000] test> show tables;
Query: show tables
+----------------------+
| name                 |
+----------------------+
| kudu_student1        |
| kudu_table1          |
| person_temp          |
+----------------------+
Fetched 11 row(s) in 0.00s

-- 3、查看新表的建表语句,必须执行(impala-shell操作)
show create table kudu_student1;
[server8:21000] test> show create table kudu_student1;
Query: show create table kudu_student1
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| result                                                                                                                                                                          |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE EXTERNAL TABLE test.kudu_student1                                                                                                                                        |
| STORED AS KUDU                                                                                                                                                                  |
| TBLPROPERTIES ('kudu.master_addresses'='server6:7051,server7:7051', 'kudu.table_name'='impala::test.kudu_student', 'numFiles'='0', 'numFilesErasureCoded'='0', 'totalSize'='0') |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Fetched 1 row(s) in 4.60s

-- 4、更改kudu 里面imala::kudu表名(kudu master服务器操作)
kudu table rename_table master2 impala::test.kudu_student impala::test.kudu_student1
[root@server6 data]# kudu table rename_table server6 impala::test.kudu_student impala::test.kudu_student1
[root@server6 data]#

-- 5、更改impala 建表语句中的kudu.name(impala-shell操作)
alter table kudu_student1 set tblproperties('kudu.table_name'='impala::test.kudu_student1');
[server8:21000] test> alter table kudu_student1 set tblproperties('kudu.table_name'='impala::test.kudu_student1');
Query: alter table kudu_student1 set tblproperties('kudu.table_name'='impala::test.kudu_student1')
+----------------+
| summary        |
+----------------+
| Updated table. |
+----------------+
Fetched 1 row(s) in 0.10s

-- 6、更改回表为内部表(impala-shell操作)
alter table kudu_student1 set tblproperties('EXTERNAL'='false');
[server8:21000] test> alter table kudu_student1 set tblproperties('EXTERNAL'='false');
Query: alter table kudu_student1 set tblproperties('EXTERNAL'='false')
+----------------+
| summary        |
+----------------+
| Updated table. |
+----------------+
Fetched 1 row(s) in 0.10s

2)、将外部表重新映射kudu表

如果用户在使用过程中发现其他应用程序重新命名了kudu表,那么此时的外部表需要重新映射到kudu上。

-- 创建一个外部表:
CREATE EXTERNAL TABLE external_table
    STORED AS KUDU
    TBLPROPERTIES (
    'kudu.master_addresses' = 'server6:7051,server7:7051',
    'kudu.table_name' = 'person'
);
-- 重新映射外部表,指向不同的kudu表
ALTER TABLE external_table SET TBLPROPERTIES('kudu.table_name' = 'student')

上面的操作是:将external_table映射的PERSON表重新指向hashTable表。

6、更改kudu master地址

ALTER TABLE my_table SET TBLPROPERTIES('kudu.master_addresses' = 'kudu-new-master.example.com:7051');

7、将内部表改为外部表

ALTER TABLE my_table SET TBLPROPERTIES('EXTERNAL' = 'TRUE');

8、impala使用java操作kudu

对于impala而言,开发人员是可以通过JDBC连接impala的,有了JDBC,开发人员可以通过impala来间接操作 kudu。

1)、maven依赖

<dependencies>
		<dependency>
			<groupId>org.apache.kudugroupId>
			<artifactId>kudu-clientartifactId>
			<version>1.9.0version>
		dependency>
		<dependency>
			<groupId>junitgroupId>
			<artifactId>junitartifactId>
			<version>4.12version>
		dependency>
		

		
		<dependency>
			<groupId>org.apache.thriftgroupId>
			<artifactId>libfb303artifactId>
			<version>0.9.3version>
			<type>pomtype>
		dependency>
		
		<dependency>
			<groupId>org.apache.thriftgroupId>
			<artifactId>libthriftartifactId>
			<version>0.9.3version>
			<type>pomtype>
		dependency>
		<dependency>
			<groupId>org.apache.hivegroupId>
			<artifactId>hive-jdbcartifactId>
			<exclusions>
				<exclusion>
					<groupId>org.apache.hivegroupId>
					<artifactId>hive-service-rpcartifactId>
				exclusion>
				<exclusion>
					<groupId>org.apache.hivegroupId>
					<artifactId>hive-serviceartifactId>
				exclusion>
			exclusions>
			<version>1.1.0version>
		dependency>
		
		<dependency>
			<groupId>org.apache.hivegroupId>
			<artifactId>hive-serviceartifactId>
			<version>1.1.0version>
		dependency>
	dependencies>

2)、jdbc连接impala操作kudu

使用JDBC连接impala操作kudu。

  • 创建实体类
import lombok.Data;

/**
 * @author alanchan
 *
 */
@Data
public class Person {
	private int companyId;
	private int workId;
	private String name;
	private String gender;
	private String photo;

	public Person(int companyId, int workId, String name, String gender, String photo) {
		this.companyId = companyId;
		this.workId = workId;
		this.name = name;
		this.gender = gender;
		this.photo = photo;
	}

	public Person() {
	}
}
  • JDBC连接impala对kudu进行增删改查
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * @author alanchan
 *
 */
public class TestImpalaOpKudu {
	private static String JDBC_DRIVER = "com.cloudera.impala.jdbc41.Driver";
	private static String CONNECTION_URL = "jdbc:impala://server7:21050/test;auth=noSasl";
	// 定义数据库连接
	static Connection conn = null;
	// 定义PreparedStatement对象
	static PreparedStatement ps = null;
	// 定义查询的结果集
	static ResultSet rs = null;

	// 数据库连接
	public static Connection getConn() {
		try {
			Class.forName(JDBC_DRIVER);
			conn = DriverManager.getConnection(CONNECTION_URL);
		} catch (Exception e) {
			e.printStackTrace();
		}
		return conn;
	}

	// 创建一个表
	public static void createTable() {
		conn = getConn();
		StringBuilder sql = new StringBuilder();
		sql.append("CREATE TABLE impala_kudu_test").append("( companyId BIGINT,").append("workId BIGINT,").append("name STRING,").append("gender STRING,").append("photo STRING,")
				.append("PRIMARY KEY(companyId) ) ").append("PARTITION BY HASH PARTITIONS 16 ").append("STORED AS KUDU ").append("TBLPROPERTIES ( ")
				.append("'kudu.master_addresses' = 'server6:7051,server7:7051' )");
//		.append("'kudu.table_name' = 'impala_kudu_test' "))

		try {
			ps = conn.prepareStatement(sql.toString());
			ps.execute();
		} catch (SQLException e) {
			e.printStackTrace();
		}
	}

	// 查询数据
	public static ResultSet queryRows() {
		try {
			String sql = "select * from impala_kudu_test";
			ps = getConn().prepareStatement(sql);
			rs = ps.executeQuery();
		} catch (SQLException e) {
			e.printStackTrace();
		}
		return rs;
	}

	// 插入数据
	public static void insertRows(Person person) {
		conn = getConn();
		String sql = "insert into table impala_kudu_test(companyId,workId,name,gender,photo) values(?,?,?,?,?)";
		try {
			ps = conn.prepareStatement(sql);
			ps.setInt(1, person.getCompanyId());
			ps.setInt(2, person.getWorkId());
			ps.setString(3, person.getName());
			ps.setString(4, person.getGender());
			ps.setString(5, person.getPhoto());
			ps.execute();
		} catch (SQLException e) {
			e.printStackTrace();
		} finally {
			if (ps != null) {
				try {
					ps.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}
			if (conn != null) {
				try {
					conn.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}
		}
	}

	// 打印结果
	public static void printRows(ResultSet rs) {
		try {
			while (rs.next()) {
				// 获取表的每一行字段信息
				int companyId = rs.getInt("companyId");
				int workId = rs.getInt("workId");
				String name = rs.getString("name");
				String gender = rs.getString("gender");
				String photo = rs.getString("photo");

				System.out.print("companyId:" + companyId + " ");
				System.out.print("workId:" + workId + " ");
				System.out.print("name:" + name + " ");
				System.out.print("gender:" + gender + " ");
				System.out.println("photo:" + photo);

			}
		} catch (SQLException e) {
			e.printStackTrace();
		} finally {
			if (ps != null) {
				try {
					ps.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}

			if (conn != null) {
				try {
					conn.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}
		}
	}

	// 更新数据
	public static void updateRows(Person person) {
		// 定义执行的sql语句
		String sql = "update impala_kudu_test set workId=" + person.getWorkId() + ",name='" + person.getName() + "' ," + "gender='" + person.getGender() + "' ," + "photo='"
				+ person.getPhoto() + "' where companyId=" + person.getCompanyId();

		try {
			ps = getConn().prepareStatement(sql);
			ps.execute();
		} catch (SQLException e) {
			e.printStackTrace();
		} finally {
			if (ps != null) {
				try {
					ps.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}

			if (conn != null) {
				try {
					conn.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}
		}
	}

	// 删除数据
	public static void deleteRows(int companyId) {
		String sql = "delete from impala_kudu_test where companyId=" + companyId;
		try {
			ps = getConn().prepareStatement(sql);
			ps.execute();
		} catch (SQLException e) {
			e.printStackTrace();

		}
	}

	// 删除表
	public static void dropTable() {
		String sql = "drop table if exists impala_kudu_test";
		try {
			ps = getConn().prepareStatement(sql);
			ps.execute();
		} catch (SQLException e) {
			e.printStackTrace();
		}
	}

	public static void main(String[] args) {
		Connection conn = TestImpalaOpKudu.getConn();
		// 创建一个表
		TestImpalaOpKudu.createTable();
		// 插入数据
		TestImpalaOpKudu.insertRows(new Person(1, 100, "lisi", "male", "lisi-photo"));
		// 查询表的数据
		ResultSet rs = TestImpalaOpKudu.queryRows();
		TestImpalaOpKudu.printRows(rs);
		// 更新数据
		TestImpalaOpKudu.updateRows(new Person(1, 200, "zhangsan", "male", "zhangsan-photo"));
		// 删除数据
		TestImpalaOpKudu.deleteRows(1);
		// 删除表
		TestImpalaOpKudu.dropTable();
	}
}

以上,简单的介绍了通过impala操作kudu,包括通过shell和java api。

你可能感兴趣的:(#,kudu专栏,java,apache,kudu,impala,kudu,impala_shell,impala,java,kudu,java,kudu集成impala)