Connecting to Impala/Hive from Python

Test environment: Python 3.5, Impala 2.10.0, Impyla 0.15.0

Impyla is a Python client for HiveServer2 implementations of distributed query engines (such as Impala and Hive).

1. Install Impyla
Install the dependencies:

sudo pip install six
sudo pip install bitarray
sudo pip install thriftpy

Install Impyla:

sudo pip install impyla
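
Before moving on, it can help to confirm that the packages installed above are actually importable by the interpreter you intend to use. The following is a minimal sketch; the helper name missing_modules is hypothetical, and the list of import names assumes the Python 3 setup described above (note that impyla is imported as impala).

```python
import importlib.util


def missing_modules(names):
    """Return the names in `names` that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # These are import names, not pip package names: impyla is imported as `impala`.
    required = ["six", "bitarray", "thriftpy", "impala"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported missing, check that pip installed into the same Python 3 environment that runs the script.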

2. Test the connection to Impala

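# -*- coding: utf-8 -*-

from impala.dbapi import connect

# 21050 is the port on which Impala accepts HiveServer2-protocol clients.
conn = connect(host='192.168.1.188', port=21050)
# If the cluster requires authentication, pass the credentials as well:
# conn = connect(host=host, port=port_impala, user='', password='', auth_mechanism='')
cur = conn.cursor()
cur.execute('select name from user limit 10')
data_list = cur.fetchall()

# Each row is a tuple; the name column is at index 0.
for data in data_list:
    print("User name: " + str(data[0]))

cur.close()
conn.close()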
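
Accessing columns by position, as in data[0] above, becomes fragile once a query returns several columns. Since Impyla exposes a DB-API 2.0 (PEP 249) cursor, one option is to pair each row with the column names in cursor.description. The sketch below is illustrative only: rows_as_dicts is a hypothetical helper, and sqlite3 stands in for Impala so the example can run without a cluster.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Convert the pending result set of a DB-API cursor into a list of dicts."""
    # cursor.description has one entry per column; item 0 of each entry is the name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    # sqlite3 is only a stand-in here; an Impyla cursor would be used the same way.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("create table user (name text)")
    cur.execute("insert into user values ('alice')")
    cur.execute("select name from user limit 10")
    for row in rows_as_dicts(cur):
        print("User name: " + str(row["name"]))
    conn.close()
```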

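3. Install thrift_sasl
thrift_sasl is a dependency for connecting to Hive. Version 0.2.1 is required here (the version installed by default, 0.3.0, fails with the error 'TSocket' object has no attribute 'isOpen').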

sudo pip install thrift-sasl==0.2.1

4. Test the connection to Hive

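# -*- coding: utf-8 -*-

from impala.dbapi import connect

# 10000 is the default HiveServer2 port.
conn = connect(host='192.168.1.188', port=10000)
# If the server requires authentication, pass the credentials as well:
# conn = connect(host=host, port=port_hive, user='', password='', auth_mechanism='')
cur = conn.cursor()

cur.execute("select * from abc where date='2019-05-28'")
data_list = cur.fetchall()

# Each row is a tuple of column values.
for data in data_list:
    print(data)

cur.close()
conn.close()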
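
The Impala and Hive examples differ only in the port passed to connect (21050 and 10000 respectively). If a script needs to talk to both engines, that choice can be kept in one place. This is a rough sketch under that assumption; DEFAULT_PORTS and connection_kwargs are hypothetical names, not part of Impyla.

```python
# Ports taken from the two examples in this article.
DEFAULT_PORTS = {"impala": 21050, "hive": 10000}


def connection_kwargs(engine, host, **extra):
    """Build keyword arguments for impala.dbapi.connect for the given engine."""
    try:
        port = DEFAULT_PORTS[engine.lower()]
    except KeyError:
        raise ValueError("unknown engine: %r" % engine)
    kwargs = {"host": host, "port": port}
    # Extra options such as user, password or auth_mechanism are passed through.
    kwargs.update(extra)
    return kwargs


if __name__ == "__main__":
    print(connection_kwargs("hive", "192.168.1.188"))
    # With Impyla installed and a server reachable, this could be used as:
    # from impala.dbapi import connect
    # conn = connect(**connection_kwargs("hive", "192.168.1.188"))
```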
