Connecting to Hive from Python 3 on Windows (pitfalls and fixes)

Connecting to Hive from Python 3

1. Connecting to Hive with pyhive (did not work (╥╯^╰╥) )

(This attempt failed, so you can skip it and go straight to section 2, connecting to Hive with impala.)

Install the packages needed to connect to Hive:

pip install pyhive
pip install thrift
pip install sasl
pip install thrift_sasl

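If installing sasl fails, you can download a prebuilt Windows wheel from https://www.lfd.uci.edu/~gohlke/pythonlibs/#sasl
Pick the file that matches your Python version; for Python 3.7, choose the one tagged cp37.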
Put the downloaded file in the Scripts folder under your Python installation path.
Mine is C:\Users\mgxx\AppData\Local\Programs\Python\Python37\Scripts
Then install it from cmd, running the command from that folder:

pip install sasl-0.2.1-cp37-cp37m-win_amd64.whl
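If you are not sure which wheel file matches your interpreter, a small check like the one below can help. It is only a sketch: the helper name wheel_tags is made up for illustration. It relies on the fact that the cpXY part of the wheel filename is the CPython major and minor version, and that win_amd64 wheels are for 64-bit Python while win32 wheels are for 32-bit Python.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, bits) for the running interpreter.

    python_tag is e.g. 'cp37' for CPython 3.7; bits is 32 or 64.
    (Hypothetical helper, for illustration only.)
    """
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # Pointer size in bytes * 8 tells us whether this is a 32- or 64-bit build.
    bits = struct.calcsize("P") * 8
    return python_tag, bits


if __name__ == "__main__":
    tag, bits = wheel_tags()
    print("Python tag:", tag)
    print("Interpreter bits:", bits)
```

On a 64-bit Python, choose the win_amd64 file with the printed tag; on a 32-bit Python, choose the win32 one.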

With that installed, connect to Hive:

from pyhive import hive
conn = hive.Connection(host='localhost', port=10000, username='root', database='default')

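Running it raises:
thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'
I searched all over for an answer and could not solve it. This route is a dead end, so I switched to connecting to Hive with impala instead.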

2. Connecting to Hive with impala (the impyla package)

My version is Python 3.7.
Before installing, completely uninstall the packages installed earlier, then reinstall the matching versions:

pip uninstall sasl  # if you later see "module 'sasl' has no attribute 'Client'" at runtime, the package was not fully removed and its files must be deleted manually
pip install impyla      
pip install pure-sasl
pip install thrift_sasl==0.2.1 --no-deps
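To check whether a leftover sasl package is still on the import path after uninstalling, something like the following may be useful. It is a minimal sketch: find_module_location is a hypothetical helper name, and it only reports where Python would load the module from, so you know which files to delete by hand.

```python
import importlib.util


def find_module_location(name):
    """Return the file Python would import `name` from, or None if not found.

    (Hypothetical helper, for illustration only.)
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace-style leftovers have no single file; report their folders instead.
    return ", ".join(spec.submodule_search_locations or []) or None


if __name__ == "__main__":
    location = find_module_location("sasl")
    if location is None:
        print("sasl is not importable; nothing left to clean up")
    else:
        print("sasl is still importable from:", location)
```

If a path is printed after pip uninstall sasl, that is the leftover to remove manually.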

Try connecting:

from impala.dbapi import connect
conn = connect(host='localhost', port=10000, auth_mechanism='PLAIN', user='root', password='simon123')

This fails with TypeError: can't concat str to bytes. The traceback points to line 94 of
"C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\thrift_sasl\__init__.py"
Edit the code at line 94 of that __init__.py. Before the change it looks like this:

  def _send_message(self, status, body):
    header = struct.pack(">BI", status, len(body))
    self._trans.write(header + body)
    self._trans.flush()

Change it to:

  def _send_message(self, status, body):
    header = struct.pack(">BI", status, len(body))
    # body can arrive as str under Python 3, but the transport needs bytes
    if isinstance(body, str):
        body = body.encode()
    self._trans.write(header + body)
    self._trans.flush()
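The reason for the error can be reproduced without Hive at all: struct.pack returns bytes, and Python 3 refuses to concatenate bytes with str. The standalone sketch below mirrors the patched method. frame_message is a hypothetical name, not part of thrift_sasl, and unlike the patch above it encodes the body before measuring its length, which is an assumption on my part about what the receiver expects.

```python
import struct


def frame_message(status, body):
    """Build a 1-byte status + 4-byte big-endian length header, followed by body.

    (Hypothetical helper mirroring the patched _send_message, for illustration.)
    """
    if isinstance(body, str):
        # Encode first so the length in the header counts bytes, not characters.
        body = body.encode()
    header = struct.pack(">BI", status, len(body))
    return header + body


if __name__ == "__main__":
    # The unpatched behaviour: bytes + str raises TypeError on Python 3.
    try:
        struct.pack(">BI", 1, 5) + "PLAIN"
    except TypeError as exc:
        print("unpatched:", exc)

    # The patched behaviour: str is encoded, so concatenation works.
    print("patched:", frame_message(1, "PLAIN"))
```

For pure-ASCII bodies such as the mechanism name PLAIN, the character count and byte count are the same, so both orderings give the same header.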

After that the connection works. The complete test code:

from impala.dbapi import connect
conn = connect(host='localhost', port=10000, auth_mechanism='PLAIN', user='root', password='simon123')
cur = conn.cursor()
cur.execute('SHOW databases')
for result in cur.fetchall():
    print(result)

Output:

>>>
('default',)
('hive_db',)
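Each row comes back as a one-element tuple, as the output shows. If you would rather have a plain list of names and be sure the cursor gets closed, a small wrapper along these lines should work with any DB-API style connection, including the conn object created above. list_databases is a hypothetical helper name, not part of impyla.

```python
from contextlib import closing


def list_databases(conn):
    """Return database names as a flat list of strings.

    `conn` is any DB-API style connection whose cursor supports
    execute(), fetchall() and close(). (Hypothetical helper.)
    """
    # closing() calls cur.close() even if execute() or fetchall() raises.
    with closing(conn.cursor()) as cur:
        cur.execute('SHOW DATABASES')
        # Each row is a 1-tuple such as ('default',); keep only the name.
        return [row[0] for row in cur.fetchall()]


# Usage, assuming `conn` was created with impala.dbapi.connect as shown earlier:
#     for name in list_databases(conn):
#         print(name)
```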
