Some problems I ran into when connecting to Hive from Python 3 using impyla. For the errors, I mainly referred to:
https://blog.csdn.net/Xiblade/article/details/82318294
https://blog.csdn.net/wx0628/article/details/86550582
https://blog.csdn.net/woay2008/article/details/79905627
The code is simple:
from impala.dbapi import connect
# Note: auth_mechanism is required here, but database is optional
conn = connect(host='172.26.xxx.xxx', port=10000, auth_mechanism='PLAIN')
cur = conn.cursor()
cur.execute('SHOW DATABASES')
print(cur.fetchall())
cur.execute('SHOW Tables')
print(cur.fetchall())
Packages to install:
pip install pure-sasl
pip install thrift_sasl==0.2.1 --no-deps
pip install thrift==0.9.3
pip install impyla
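Since two of the packages above are pinned to specific versions, it can help to confirm what actually ended up installed. The following is a minimal sketch, not part of the original setup: the names PINS, installed_version and check_pins are made up for illustration, and the pins are copied from the pip commands above (None means no version was pinned).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above; None means "any version".
PINS = {"pure-sasl": None, "thrift_sasl": "0.2.1", "thrift": "0.9.3", "impyla": None}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return a list of human-readable problems; an empty list means all is fine."""
    problems = []
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed")
        elif wanted is not None and found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_pins(PINS) or ["all packages match"]:
        print(line)
```

The lookup parameter exists only so the check can be exercised without touching the real environment.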
1. Error when installing impyla
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools":
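Some people suggest installing from a binary wheel: pip install impyla-0.14.1-py3-none-any.whl. I tried that and it did not work; the same error appeared.
Solution:
Install Visual Studio 2015 (drive C needs at least 6 GB of free space! There is no way around it, it has to be installed).
The download link is here.
On the installer screen, select only "Common Tools for Visual C++ 2015".
After installing Visual Studio, install impyla again; this time the installation succeeds.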
Error:
ThriftParserError: ThriftPy does not support generating module with path in protocol ‘c’
Solution:
Locate the following code in \Lib\site-packages\thriftpy\parser\parser.py:
if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
else:
    raise ThriftParserError('ThriftPy does not support generating module '
                            'with path in protocol \'{}\''.format(
                                url_scheme))
Change it to:
if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
# A Windows drive letter (C:, D:, ...) gets parsed as the URL scheme,
# so treat these as local files too.
elif url_scheme in ('c', 'd', 'e', 'f'):
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
else:
    raise ThriftParserError('ThriftPy does not support generating module '
                            'with path in protocol \'{}\''.format(
                                url_scheme))
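The reason a drive letter shows up as a "protocol" is that Python's urlparse treats everything before the first colon as the URL scheme, so a path like C:\... yields the scheme 'c'. The sketch below only illustrates that behaviour and a slightly more general check than listing c, d, e and f by hand. The helpers is_drive_letter and classify_path are hypothetical names for this example, not thriftpy functions, and the sample paths are invented.

```python
from urllib.parse import urlparse


def is_drive_letter(scheme):
    """True for a single-letter scheme, which on Windows is really a drive letter."""
    return len(scheme) == 1 and scheme.isalpha()


def classify_path(path):
    """Classify a path the way the patched branch logic would treat it."""
    scheme = urlparse(path).scheme
    if scheme == "" or is_drive_letter(scheme):
        return "local"        # read with open()
    if scheme in ("http", "https"):
        return "remote"       # read with urlopen()
    return "unsupported"      # would raise ThriftParserError


if __name__ == "__main__":
    for p in (r"C:\Python36\Lib\site-packages\demo.thrift",
              "demo.thrift",
              "https://example.com/demo.thrift"):
        print(p, "->", classify_path(p))
```

Checking for any single-letter scheme covers drives beyond F: as well, at the cost of being a broader match than the hand-written tuple.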
Error:
thriftpy.transport.TTransportException: TTransportException(type=1, message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'")
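I was stuck on this for a long time; I remember hitting the same error earlier when installing pyhive.
The main cause is a conflict between the sasl and pure-sasl packages. In that case, simply uninstalling the sasl package fixes it.
Solution: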
pip uninstall SASL
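To see whether this conflict applies to a given environment, one option is to check which of the two modules Python can import. This is only a sketch: it assumes the C-based package is imported as sasl and pure-sasl as puresasl, and the helper names module_available and describe_sasl_setup are invented for this example.

```python
from importlib.util import find_spec


def module_available(name):
    """Return True if a top-level module with this name can be found."""
    return find_spec(name) is not None


def describe_sasl_setup(has_sasl, has_puresasl):
    """Turn the two availability flags into a short diagnosis."""
    if has_sasl and has_puresasl:
        return "both installed: possible conflict, consider 'pip uninstall sasl'"
    if has_puresasl:
        return "only pure-sasl installed: matches the setup described here"
    if has_sasl:
        return "only sasl installed: pure-sasl is missing"
    return "neither installed"


if __name__ == "__main__":
    print(describe_sasl_setup(module_available("sasl"), module_available("puresasl")))
```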
Error: TypeError: can't concat str to bytes
Solution:
Go to the last entry in the traceback, line 94 of __init__.py (mind the code indentation):
header = struct.pack(">BI", status, len(body))
self._trans.write(header + body)
Change it to:
# Encode first so that len(body) counts bytes, not characters
if isinstance(body, str):
    body = body.encode()
header = struct.pack(">BI", status, len(body))
self._trans.write(header + body)
Be very careful with indentation when editing this code, otherwise you will end up with errors that make no sense.
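The underlying issue is that struct.pack returns bytes, and in Python 3 bytes cannot be concatenated with str. Below is a small standalone sketch of the same framing logic, so the behaviour can be tried outside the library. frame_message is a hypothetical helper written for this example, not a function of the patched package.

```python
import struct


def frame_message(status, body):
    """Build a frame: 1 status byte, 4-byte big-endian length, then the body."""
    if isinstance(body, str):
        # Encode to bytes first (UTF-8 by default) so the length counts bytes.
        body = body.encode()
    header = struct.pack(">BI", status, len(body))
    return header + body


if __name__ == "__main__":
    print(frame_message(1, "PLAIN"))
    print(frame_message(1, b"PLAIN"))
```

Encoding before measuring the length keeps the length field equal to the number of bytes actually written, which only matters when the body contains non-ASCII characters.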
With that, the Hive database can be accessed.