Hive Python Client Query Example

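Hive exposes its external interface through Thrift, so it supports multiple languages out of the box, including Python and Perl. This article walks through a query example against Hive 0.9.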

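The Python library is generated when Hive is built. Download the tar package from the Hive community site, extract it, and you will find the Python library in the py directory under lib.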

First, install Python. Here we use Python 2.7. For development and testing in Eclipse, the PyDev plugin is recommended.

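The directory names in the bundled library differ slightly from those used in the community sample code, so a small change is needed before it can be used: rename the hive_service directory to hive. The overall directory structure is shown in the figure below: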


Here, pytools is the directory that holds the Hive test code.
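
The rename step described above can also be scripted. The following is a minimal sketch, not part of the Hive distribution: prepare_hive_package is a hypothetical helper, and py_lib_dir is assumed to be the extracted lib/py directory that contains hive_service. It is written to run under both Python 2.7 and Python 3.

```python
import os
import shutil


def prepare_hive_package(py_lib_dir):
    """Rename <py_lib_dir>/hive_service to <py_lib_dir>/hive if needed.

    Hypothetical helper; returns the path of the resulting 'hive' directory.
    """
    src = os.path.join(py_lib_dir, "hive_service")
    dst = os.path.join(py_lib_dir, "hive")
    if os.path.isdir(dst):
        # Already renamed on an earlier run, nothing to do.
        return dst
    if not os.path.isdir(src):
        raise IOError("expected directory not found: %s" % src)
    shutil.move(src, dst)
    return dst
```

Calling the helper a second time is harmless, because it returns early once the hive directory exists.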

The sample code is pasted below and can be run as is.

'''
Created on 2012-12-18
 
@author: Ransom
'''
 
from hive import ThriftHive
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
class HiveClient:
    """
    hive thrift client
    """
   
    transport = None
    client = None
    def __init__(self):
        pass
   
    def openTranslation(self, host="localhost", port=10000):
        """
            open the thrift transport and create the hive client
        """
        try:
            # wrap the raw socket in a buffered transport and use the binary protocol
            self.transport = TSocket.TSocket(host, port)
            self.transport = TTransport.TBufferedTransport(self.transport)
            protocol = TBinaryProtocol.TBinaryProtocol(self.transport)
 
            self.client = ThriftHive.Client(protocol)
            self.transport.open()
            print "success connect to " + host + " " + str(port)
        except Exception, tx:
            self.reset()
            print '%s' % (tx.message)
           
    def closeTranslation(self):
        """
            close the thrift transport
        """
       
        try:
            if self.client is not None:
                self.client.clean()
            if self.transport is not None:
                self.transport.close()
        except Exception, tx:
            self.reset()
            print '%s' % (tx.message)
           
    def excuteAndPrint(self, sql):
        """
            execute sql and print the result
        """
       
        if self.client is None:
            print "client is None, the transport should be opened first."
            return
           
        try:  
            self.client.execute(sql)
            self.resultPrint()  
        except Exception, tx:
            print '%s' % (tx.message)
        finally:
            self.client.clean()
  
  
    def resultPrint(self):
        try: 
            cols = self.getColums()
            if cols is not None:
                print cols
               
            while True:
                # fetch the result in batches of 10 rows until nothing is left
                rows = self.client.fetchN(10)
                if not rows:
                    break
                for row in rows:
                    if row is not None:
                        print row
        except Exception, tx:
            print '%s' % (tx.message)
        finally:
            self.client.clean()           
                          
    def getColums(self):
        """
            return the column names joined by tabs, or None if there is no schema
        """
        schemas = self.client.getSchema().fieldSchemas

        if schemas is None:
            return None

        return '\t'.join([schema.name for schema in schemas])
           
    def reset(self):
        self.client = None
        self.transport = None
       
    def isOpen(self):
        try:
            if self.client is not None and self.transport is not None:
                return self.transport.isOpen()
        except Exception, tx:
            print '%s' % (tx.message)
        # not connected, or the check itself failed
        return False
       
if __name__ == '__main__':
    client = HiveClient()
    client.openTranslation("localhost",10000)
    client.excuteAndPrint("show tables")
    client.closeTranslation()
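
The client above prints each fetched row as a raw string. If structured rows are more convenient, something like the following sketch could pair column names with values. It assumes that every row returned by fetchN is a single string whose columns are separated by tabs, matching the tab-joined header that getColums builds. That assumption should be checked against your Hive version. rows_to_dicts is a hypothetical helper, not part of the Hive library, and it runs under both Python 2.7 and Python 3.

```python
def rows_to_dicts(header, rows, sep="\t"):
    """Pair a tab-joined header with tab-joined rows (hypothetical helper).

    Assumes each row is one string with columns separated by `sep`.
    Rows that are None are skipped, as in resultPrint.
    """
    names = header.split(sep)
    records = []
    for row in rows:
        if row is None:
            continue
        values = row.split(sep)
        # zip stops at the shorter sequence, so surplus values are dropped
        records.append(dict(zip(names, values)))
    return records
```

Inside resultPrint, the header from getColums and each batch from fetchN could be passed to this helper instead of being printed directly.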

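To run the script from the command line on Windows or Linux, add the Python installation directory to PATH, and add the project root directory to the PYTHONPATH environment variable.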

Below is an example bat script:

mode con cols=120 lines=30
set PYTHON_HOME=C:\Python27
set PATH=%PYTHON_HOME%;%PATH%
set PYTHONPATH=%PYTHONPATH%;%~dp0
echo will execute cmd: 'python %~dp0pytools\HiveClient.py'
python "%~dp0pytools\HiveClient.py"
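
As an alternative to setting PYTHONPATH in a wrapper script, the project root can be put on sys.path at runtime. This is a minimal sketch under the assumption that the script lives one directory below the project root, as pytools/HiveClient.py does here. add_project_root is a hypothetical helper name, and the code runs under both Python 2.7 and Python 3.

```python
import os
import sys


def add_project_root(script_path, levels_up=1):
    """Put the directory `levels_up` levels above the script's folder on sys.path.

    Hypothetical helper; returns the directory that was added.
    """
    root = os.path.dirname(os.path.abspath(script_path))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        # Insert at the front so the local 'hive' package is found first.
        sys.path.insert(0, root)
    return root
```

For this to take effect, add_project_root(__file__) would have to be called at the top of HiveClient.py, before the line that imports ThriftHive from hive.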



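With this basic client as a starting point, you can build a simple client tool, so that you no longer need to connect with PuTTY or write JDBC code every time you test, which is much more convenient.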
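
One possible first step toward such a tool is a small runner that feeds several statements to the client in one go. The sketch below is only an illustration: split_statements and run_script are hypothetical names, the only thing it relies on is the excuteAndPrint method of the HiveClient class shown above, and the splitting is deliberately naive (a semicolon inside a quoted string would be split incorrectly). It runs under both Python 2.7 and Python 3.

```python
def split_statements(script_text):
    """Split a script on ';' and drop empty pieces (naive: ignores quoting)."""
    parts = [part.strip() for part in script_text.split(";")]
    return [part for part in parts if part]


def run_script(client, script_text):
    """Run each statement through client.excuteAndPrint.

    `client` is any object with an excuteAndPrint(sql) method, such as the
    HiveClient above. Returns the number of statements that were run.
    """
    statements = split_statements(script_text)
    for sql in statements:
        client.excuteAndPrint(sql)
    return len(statements)
```

A typical use would be to call openTranslation on a HiveClient, pass it to run_script together with the text of a script file, and then call closeTranslation.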
