配置hadoop HIVE元数据保存在mysql中

先确保已经成功安装了HIVEMYSQL


hive-site.xml中添加如下内容,指定METASTORE的地址以及连接方式

刚安装好hive,conf下是没有hive-site.xml文件的,需要复制 

hive-default.xml为hive-site.xml。然后再进行修改。

     
javax.jdo.option.ConnectionURL     
jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8     
JDBC connect string for a JDBC metastore  
  
     
javax.jdo.option.ConnectionDriverName     
com.mysql.jdbc.Driver     
Driver class name for a JDBC metastore  
  
     
javax.jdo.option.ConnectionUserName     
root     
username to use against metastore database  
  
     
javax.jdo.option.ConnectionPassword     
123     
password to use against metastore database  

 

然后登陆到HIVE客户端,创建一个表试试

[gpadmin1@hadoop5 hive-0.6.0]$ bin/hive
Hive history file=/tmp/gpadmin1/hive_job_log_gpadmin1_201106081130_1156785421.txt
hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Unknown database 'hive'
NestedThrowables:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown database 'hive'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

 

报错了,提示很明显,识别不到名称为hivedatabase,需要手动在mysql中创建相应的数据库

[Intranet root@hadoop6 /var/lib/mysql]
#mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or /g.
Your MySQL connection id is 41
Server version: 5.5.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '/h' for help. Type '/c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

mysql> create database hive;
Query OK, 1 row affected (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)

 

再登陆到HIVE里看看

[gpadmin1@hadoop5 hive-0.6.0]$ bin/hive
Hive history file=/tmp/gpadmin1/hive_job_log_gpadmin1_201106081130_544334815.txt
hive> show table;                       
FAILED: Parse Error: line 0:-1 mismatched input '' expecting EXTENDED in show statement

hive> show tables;
OK
Time taken: 5.173 seconds
hive>         CREATE TABLE cite (id1 INT,
    >            id2 int
    >            )
    >          ROW FORMAT DELIMITED
    >          FIELDS TERMINATED BY ',';
OK
Time taken: 0.266 seconds
hive> show tables;                         
OK
cite
Time taken: 0.197 seconds
hive>

 

OK了,果然是这个问题

另外,HIVE会在数据库中创建一些保存元数据的表,我们可以看下都有哪些

mysql> use hive;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+----------------+
| Tables_in_hive |
+----------------+
| BUCKETING_COLS |
| COLUMNS        |
| DBS            |
| PARTITION_KEYS |
| SDS            |
| SD_PARAMS      |
| SEQUENCE_TABLE |
| SERDES         |
| SERDE_PARAMS   |
| SORT_COLS      |
| TABLE_PARAMS   |
| TBLS           |
+----------------+
12 rows in set (0.00 sec)
mysql> select * from TBLS;     
+--------+-------------+-------+------------------+----------+-----------+-------+----------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER    | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+----------+-----------+-------+----------+---------------+--------------------+--------------------+
|      1 |  1307504073 |     1 |                0 | gpadmin1 |         0 |     1 | cite | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+----------+-----------+-------+----------+---------------+--------------------+--------------------+
1 row in set (0.00 sec)

mysql>

 

刚才我们创建的表cite也可以查到

 

另外有一个需要注意的地方是,使用mysql存储元数据,hive需要能够访问到mysql,需要mysql jdbc的驱动包,需要把一个jarmysql-connector-java-5.1.15-bin.jar拷贝到hivelib目录下才行,否则执行语句的时候会报错,类似下面这样

hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
NestedThrowables:
java.lang.reflect.InvocationTargetException
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

拷贝相应的jar包既可以

你可能感兴趣的:(Hadoop,Linux,大数据)