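(This connection attempt did not succeed. You can skip it and go straight to connecting to Hive with impala.)
Install the packages needed to connect to Hive: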
pip install pyhive
pip install thrift
pip install sasl
pip install thrift_sasl
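If installing sasl fails, you can download a prebuilt wheel from https://www.lfd.uci.edu/~gohlke/pythonlibs/#sasl
Pick the file that matches your Python version; for Python 3.7, choose cp37.
Put the downloaded file in the Scripts folder under your Python installation directory.
Mine is C:\Users\mgxx\AppData\Local\Programs\Python\Python37\Scripts
Then install it from cmd: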
pip install sasl-0.2.1-cp37-cp37m-win_amd64.whl
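If you are not sure which wheel file matches your interpreter, a small check like the one below may help. It is only a sketch: the helper name wheel_tags is made up for illustration, and it reports just the CPython tag (such as cp37) and whether the interpreter is 32-bit or 64-bit, which are the two parts of the file name you need to match.

```python
import struct
import sys


def wheel_tags():
    """Return the CPython tag (e.g. 'cp37') and the interpreter's bit width."""
    py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # The size of a C pointer tells us whether this is a 32-bit or 64-bit build.
    bits = struct.calcsize("P") * 8
    return py_tag, bits


py_tag, bits = wheel_tags()
print(f"Python tag: {py_tag}, interpreter: {bits}-bit")
```

A 64-bit interpreter on Windows goes with the win_amd64 files, a 32-bit one with win32.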
Once that is installed, connect to Hive:
from pyhive import hive
conn = hive.Connection(host='localhost', port=10000, username='root', database='default')
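Running this raises an error:
thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'
I searched everywhere for an answer and could not resolve it. This route is a dead end, so I switched to connecting to Hive with impala.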
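My version is Python 3.7.
Before installing, fully uninstall the packages installed earlier, then reinstall the matching versions.
pip uninstall sasl  # if you later see "module 'sasl' has no attribute 'Client'" at runtime, the package was not fully removed and its files must be deleted by hand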
pip install impyla
pip install pure-sasl
pip install thrift_sasl==0.2.1 --no-deps
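To confirm that the old sasl package is really gone and the new ones are visible, you can ask the interpreter which modules it can find. This is a minimal sketch; the helper name installed_modules is made up for illustration, and it assumes the usual import names (pure-sasl imports as puresasl, impyla as impala).

```python
import importlib.util


def installed_modules(names):
    """Map each top-level module name to True if the interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# After the steps above, 'sasl' should ideally be False and the rest True.
report = installed_modules(["sasl", "puresasl", "thrift_sasl", "impala"])
for name, found in report.items():
    print(f"{name}: {'found' if found else 'not found'}")
```

If sasl still shows up as found, leftover files are probably still sitting in site-packages.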
Try connecting:
from impala.dbapi import connect
conn = connect(host='localhost', port=10000, auth_mechanism='PLAIN', user='root', password='simon123')
This raises: TypeError: can't concat str to bytes. The fix goes in the file "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\thrift_sasl\__init__.py",
at line 94.
Edit that __init__.py at line 94. Originally it looks like this:
def _send_message(self, status, body):
    header = struct.pack(">BI", status, len(body))
    self._trans.write(header + body)
    self._trans.flush()
Change it to:
def _send_message(self, status, body):
    # Encode first so that len(body) counts bytes, not characters
    if isinstance(body, str):
        body = body.encode()
    header = struct.pack(">BI", status, len(body))
    self._trans.write(header + body)
    self._trans.flush()
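Why the patch works can be seen without any Hive server. struct.pack returns bytes, and in Python 3 bytes and str cannot be concatenated, so a str body has to be encoded first. The snippet below is a standalone sketch of the same idea; frame_message is a made-up name, not part of thrift_sasl.

```python
import struct


def frame_message(status, body):
    """Build a 1-byte status + 4-byte big-endian length header, then the body."""
    if isinstance(body, str):
        body = body.encode()  # UTF-8 by default
    header = struct.pack(">BI", status, len(body))
    return header + body


# Reproduce the original failure: a bytes header plus a str body.
try:
    struct.pack(">BI", 1, 2) + "ab"
except TypeError as exc:
    print("unpatched:", exc)

print("patched:", frame_message(1, "ab"))
```

Encoding before calling len() keeps the length field correct even when the body contains non-ASCII characters.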
After that, the connection works. The complete test code is:
from impala.dbapi import connect
conn = connect(host='localhost', port=10000, auth_mechanism='PLAIN', user='root', password='simon123')
cur = conn.cursor()
cur.execute('SHOW databases')
for result in cur.fetchall():
    print(result)
Output:
>>>
('default',)
('hive_db',)
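The test code above never closes the cursor or the connection. One possible way to tidy that up is a small helper that works with any DB-API style connection. This is a sketch only: fetch_all is a made-up name, and the impyla usage is shown as comments because it needs a running HiveServer2 and the credentials used earlier.

```python
from contextlib import closing


def fetch_all(conn, sql):
    """Run one statement on a DB-API connection and return all rows."""
    # closing() calls cursor.close() even if execute() raises.
    with closing(conn.cursor()) as cur:
        cur.execute(sql)
        return cur.fetchall()


# Usage with the connection from above (requires impyla and a reachable server):
#   from impala.dbapi import connect
#   conn = connect(host='localhost', port=10000, auth_mechanism='PLAIN',
#                  user='root', password='simon123')
#   try:
#       for row in fetch_all(conn, 'SHOW databases'):
#           print(row)
#   finally:
#       conn.close()
```

The helper only relies on cursor(), execute(), fetchall() and close(), so it can be tried out against any other DB-API driver as well.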