hive操作特别慢,报错,卡顿
现象:
查看所有库拨错
hive> show databases;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketException: Broken pipe)
查看hiveserver2.log的日志
2016-03-09 11:13:26,865 ERROR [BoneCP-pool-watch-thread]: bonecp.BoneCP (BoneCP.java:obtainInternalConnection(292)) -
Failed to acquire connection to jdbc:mysql://nn02/hive?createDatabaseIfNotExist=true. Sleeping for 7000 ms. Attempts left: 5
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.GeneratedConstructorAccessor67.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1116)
at com.mysql.jdbc.MysqlIO.(MysqlIO.java:344)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2336)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2370)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2154)
at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:792)
at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:49)
at sun.reflect.GeneratedConstructorAccessor36.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
pl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:381)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
at com.jolbox.bonecp.BoneCP.obtainInternalConnection(BoneCP.java:269)
at com.jolbox.bonecp.ConnectionHandle.(ConnectionHandle.java:242)
at com.jolbox.bonecp.PoolWatchThread.fillConnections(PoolWatchThread.java:115)
at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.NoRouteToHostException: Cannot assign requested address
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.(Socket.java:425)
at java.net.Socket.(Socket.java:241)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:258)
at com.mysql.jdbc.MysqlIO.(MysqlIO.java:295)
... 21 more
查看hivemetastore.log日志报错
2016-03-09 00:00:06,466 ERROR [BoneCP-pool-watch-thread]: bonecp.BoneCP (BoneCP.java:obtainInternalConnection(292)) -
Failed to acquire connection to jdbc:mysql://nn02/hive?createDatabaseIfNotExist=true. Sleeping for 7000 ms. Attempts left: 1
java.sql.SQLException: null, message from server: "Host 'nn02' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'"
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1120)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2337)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2370)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2154)
at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:792)
at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:49)
at sun.reflect.GeneratedConstructorAccessor64.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:381)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
at com.jolbox.bonecp.BoneCP.obtainInternalConnection(BoneCP.java:269)
at com.jolbox.bonecp.ConnectionHandle.(ConnectionHandle.java:242)
at com.jolbox.bonecp.PoolWatchThread.fillConnections(PoolWatchThread.java:115)
at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
从日志中判断mysql的max_connect_errors超出了,导致数据库无法进行访问,
在短时间内容访问mysql通信中断多次后,会导致这样的问题,可以将max_connect_errors设置大些,
但不能解决根本问题,需要查看哪些机器访问mysql
设置max_connect_errors为1000,默认是10
set global max_connect_errors=1000;
查看下当前max_connect_errors
show variables like '%connect%';
最后建议在重启下hive服务和mysql库
相关命令:
1、mysql> show status like '%connect%';
Connections,试图连接到(不管是否成功)MySQL服务器的连接数。
Max_used_connections,服务器启动后已经同时使用的连接的最大数量。
Threads_connected,当前的连接数。
2、mysql> show variables like '%connect%';
max_connections,最大连接数。
3、修改max_connections
在配置文件(my.cnf或my.ini)在最下面,天加一句:
max_connections=32000
然后,用命令重启:/etc/init.d/mysqld restart
虽然这里写的32000,实际MySQL服务器允许的最大连接数16384;
添加了最大允许连接数,对系统消耗增加不大。
4、mysql> show processlist;
显示当前正在执行的MySQL连接。
------------------------------------------------
之后又出现了这个问题,
hive中的20160308_men 这个表删不了,进行删除后报错:
hive> drop table 20160308_men;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
最后的想到的方法是将mysql中的hive表metastore信息删除,并将hfds数据删除,
这个风险比较大,很容易删除其他有用的数据,操作前建议备份数据库。
下面罗列出主要的命令:
20160308_men表的ID为TBL_ID = 303
删除的过程中有很多的外部表关联的信息,需要先删除其他表中内容,在进行TBLS表内容删除,
其中mysql中hive主要涉及到的表如下:
表名 |
说明 |
关联键 |
TBLS |
所有hive表的基本信息(表名,创建时间,所属者等) |
TBL_ID,SD_ID |
TABLE_PARAM |
表级属性,(如是否外部表,表注释,最后修改时间等) |
TBL_ID |
COLUMNS |
Hive表字段信息(字段注释,字段名,字段类型,字段序号) |
SD_ID |
SDS |
所有hive表、表分区所对应的hdfs数据目录和数据格式 |
SD_ID,SERDE_ID |
SERDE_PARAM |
序列化反序列化信息,如行分隔符、列分隔符、NULL的表示字符等 |
SERDE_ID |
PARTITIONS |
Hive表分区信息(所属表,分区值) |
PART_ID,SD_ID,TBL_ID |
PARTITION_KEYS |
Hive分区表分区键(即分区字段) |
TBL_ID |
PARTITION_KEY_VALS |
Hive表分区名(键值) |
PART_ID |
>delete from TBLS where TBL_ID = 303;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`hive`.`PARTITIONS`, CONSTRAINT `PARTITIONS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`))
mysql> select * from PARTITIONS where TBL_ID = 303;
+---------+-------------+------------------+----------------+-------+--------+----------------+
| PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID | LINK_TARGET_ID |
+---------+-------------+------------------+----------------+-------+--------+----------------+
| 66982 | 1457426207 | 0 | keyword=??? | 67280 | 303 | NULL |
| 66989 | 1457494791 | 0 | keyword=funv | 67289 | 303 | NULL |
| 66990 | 1457494791 | 0 | keyword=nvren | 67290 | 303 | NULL |
| 66991 | 1457494792 | 0 | keyword=nvshen | 67291 | 303 | NULL |
+---------+-------------+------------------+----------------+-------+--------+----------------+
4 rows in set (0.00 sec)
mysql> delete from PARTITIONS where TBL_ID = 303;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`hive`.`PARTITION_KEY_VALS`, CONSTRAINT `PARTITION_KEY_VALS_FK1` FOREIGN KEY
(`PART_ID`) REFERENCES `PARTITIONS` (`PART_ID`))
mysql> select * from PARTITION_KEY_VALS where PART_ID=66982;
+---------+--------------+-------------+
| PART_ID | PART_KEY_VAL | INTEGER_IDX |
+---------+--------------+-------------+
| 66982 | ??? | 0 |
+---------+--------------+-------------+
1 row in set (0.00 sec)
mysql> delete from PARTITION_KEY_VALS where PART_ID=66982;
delete from PARTITION_KEY_VALS where PART_ID=66989;
delete from PARTITION_KEY_VALS where PART_ID=66990;
delete from PARTITION_KEY_VALS where PART_ID=66991;
mysql> delete from PARTITIONS where TBL_ID = 303;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`hive`.`PARTITION_PARAMS`, CONSTRAINT `PARTITION_PARAMS_FK1` FOREIGN KEY (`PART_ID`) REFERENCES `PARTITIONS` (`PART_ID`))
mysql> select * from PARTITION_PARAMS where PART_ID=66982;
+---------+-----------------------+-------------+
| PART_ID | PARAM_KEY | PARAM_VALUE |
+---------+-----------------------+-------------+
| 66982 | COLUMN_STATS_ACCURATE | false |
| 66982 | numFiles | 507 |
| 66982 | numRows | -1 |
| 66982 | rawDataSize | -1 |
| 66982 | totalSize | 838378 |
| 66982 | transient_lastDdlTime | 1457426207 |
+---------+-----------------------+-------------+
6 rows in set (0.00 sec)
mysql> delete from PARTITION_PARAMS where PART_ID=66982;
Query OK, 6 rows affected (0.06 sec)
mysql> delete from TBLS where TBL_ID = 303;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`hive`.`PARTITION_KEYS`, CONSTRAINT `PARTITION_KEYS_FK1` FOREIGN KEY
(`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`))
mysql> select * from PARTITION_KEYS where TBL_ID = 303;
+--------+--------------+-----------+-----------+-------------+
| TBL_ID | PKEY_COMMENT | PKEY_NAME | PKEY_TYPE | INTEGER_IDX |
+--------+--------------+-----------+-----------+-------------+
| 303 | NULL | keyword | string | 0 |
+--------+--------------+-----------+-----------+-------------+
1 row in set (0.00 sec)
mysql> delete from TBLS where TBL_ID = 303;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`hive`.`TABLE_PARAMS`, CONSTRAINT `TABLE_PARAMS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`))
mysql> select * from TABLE_PARAMS where TBL_ID = 303;
+--------+-----------------------+-------------+
| TBL_ID | PARAM_KEY | PARAM_VALUE |
+--------+-----------------------+-------------+
| 303 | transient_lastDdlTime | 1457425902 |
+--------+-----------------------+-------------+
1 row in set (0.00 sec)
mysql> delete from TABLE_PARAMS where TBL_ID = 303;
Query OK, 1 row affected (0.05 sec)
mysql> delete from TBLS where TBL_ID = 303;
Query OK, 1 row affected (0.06 sec)
这个问题可能是因为在partitioned分区的时候插入了 keyword=??? 这样的数据,导致出现这样的问题,处理起来比较繁琐。