Installation guide: accessing a Greenplum (4.3.8.2) database from Python scripts through the unixODBC driver

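This document describes:

1.    Setting up the unixODBC driver, so that the database can be accessed through ODBC

2.    Setting up the pyodbc driver, so that Python scripts can operate on the database through the system's ODBC layer

3.    The benefits of this architecture:

-  ODBC access to the database performs better than JDBC access

-  Because the work is done in Python scripts, a scheduling platform can later run them for data analysis, for example as scheduled jobs that execute the scripts periodically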

 

Pre-installation preparation

(1) Operating system (essential development tools such as gcc must be installed on it)

[root@CDHA ~]# cat /etc/redhat-release

CentOS release 6.7 (Final)

 

[root@CDHA ~]$ python
Python 2.6.2 (r262:71600, May 12 2009, 15:34:31)
[GCC 4.1.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

 

(2) Required software packages

greenplum-connectivity-4.3.8.2-build-1-RHEL5-x86_64.zip

-- Downloaded from the Greenplum website; contains the Greenplum JDBC and ODBC drivers

 

pyodbc-3.0.10.tar.gz                        

-- The pyodbc driver package, required for Python to connect to Greenplum

 

unixODBC-2.2.12.tar.gz               

-- The unixODBC driver manager

 

(3) Upload the packages above to the CDHA server on which the environment is being built, for example to /software/

 

Installing the Greenplum driver package

1.    Unzip greenplum-connectivity-4.3.8.2-build-1-RHEL5-x86_64.zip

unzip greenplum-connectivity-4.3.8.2-build-1-RHEL5-x86_64.zip

 

2.    Run the executable greenplum-connectivity-4.3.8.2-build-1-RHEL5-x86_64.bin obtained from the archive

bash greenplum-connectivity-4.3.8.2-build-1-RHEL5-x86_64.bin

(some output omitted)

********************************************************************

    Do you accept the EMC Connectivity license agreement? [yes | no]

********************************************************************

 

yes     --------- accept the license

********************************************************************

Provide the installation path for Greenplum Connectivity or press ENTER to

Accept the default installation path:greenplum-connectivity-4.3.8.2-build-1

********************************************************************

********************************************************************

Install Greenplum Connectivity into </usr/local/greenplum-connectivity-4.3.8.2-build-1>? [yes | no]

********************************************************************

 

yes      ---------------- keep the default installation path (you may also specify a different one)

 

********************************************************************

/usr/local/greenplum-connectivity-4.3.8.2-build-1 does not exist.

Create /usr/local/greenplum-connectivity-4.3.8.2-build-1 ? [ yes | no ]

(Selecting no will exit the installer)

********************************************************************

 

yes      ---------------- create the installation directory

 

Extracting product to /usr/local/greenplum-connectivity-4.3.8.2-build-1

 

********************************************************************

Installation complete.

Greenplum Connectivity is installed in /usr/local/greenplum-connectivity-4.3.8.2-build-1

************************************************************************

 

3.    Configure the Greenplum DB database driver

The installation directory contains the following:

[root@CDHA software]# ll /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc

total 24

drwxr-xr-x 3 gpadmin gpadmin 4096 May 10 13:34 psqlodbc-08.02.0400

drwxr-xr-x 6 gpadmin gpadmin 4096 May 10 13:37 psqlodbc-08.02.0500

drwxr-xr-x 3 gpadmin gpadmin 4096 May 10 13:37 psqlodbc-08.03.0400

drwxr-xr-x 3 gpadmin gpadmin 4096 May 10 13:38 psqlodbc-08.04.0200

drwxr-xr-x 3 gpadmin gpadmin 4096 May 10 13:38 psqlodbc-09.00.0200

drwxr-xr-x 3 gpadmin gpadmin 4096 May 10 13:38 psqlodbc-09.02.0100

[root@CDHA software]#

Several driver versions are available; we choose psqlodbc-08.02.0500 and list its directory:

[root@CDHA software]# ll /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/

total 48

drwxr-xr-x 2 gpadmin gpadmin  4096 May 10 13:37 datadirect-51sp2_64

drwxr-xr-x 2 gpadmin gpadmin  4096 May 10 13:37 datadirect-52_64

drwxr-xr-x 2 gpadmin gpadmin  4096 May 10 13:37 datadirect-53sp2_64

-rwxr-xr-x 1 gpadmin gpadmin 25746 May 10 13:36 license.txt

-rwxr-xr-x 1 gpadmin gpadmin  1383 May 10 13:36 readme.txt

drwxr-xr-x 3 gpadmin gpadmin  4096 Jul  6 11:38 unixodbc-2.2.12

 

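The driver managers are visible here as well.

Because Greenplum is based on PostgreSQL 8.2, we choose psqlodbc-08.02.0500 as the driver and datadirect-52_64 as the driver manager.

So we edit greenplum_connectivity_path.sh as follows:

GP_ODBC_DRIVER=psqlodbc-08.02.0500          -- value must match the actual directory name

GP_ODBC_DRIVER_MANAGER=datadirect-52_64    -- value must match the actual directory name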

 

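Note: the file's default permission is 444, so it cannot be edited. You can either change the permission of that file alone, or set the whole installation directory to 755, as follows: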

chmod -R 755 /usr/local/greenplum-connectivity-4.3.8.2-build-1

 

After saving greenplum_connectivity_path.sh, remember to source it so that the environment variables take effect:

source greenplum_connectivity_path.sh
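Since both values must match real directory names, a quick check before sourcing the file can catch a typo. The following is only a sketch: the function names read_driver_settings and missing_driver_dirs are made up for this example, and the layout drivers/odbc/<driver>/<driver manager> is taken from the directory listings shown earlier.

```python
import os
import re


def read_driver_settings(script_text):
    """Collect the GP_ODBC_* assignments found in the text of the path script."""
    settings = {}
    for line in script_text.splitlines():
        match = re.match(r'\s*(?:export\s+)?(GP_ODBC_\w+)=(\S+)', line)
        if match:
            # Drop optional surrounding quotes from the value.
            settings[match.group(1)] = match.group(2).strip('"\'')
    return settings


def missing_driver_dirs(install_dir, settings):
    """Return the directories named by the settings that do not exist on disk."""
    driver_dir = os.path.join(install_dir, 'drivers', 'odbc',
                              settings['GP_ODBC_DRIVER'])
    manager_dir = os.path.join(driver_dir, settings['GP_ODBC_DRIVER_MANAGER'])
    return [d for d in (driver_dir, manager_dir) if not os.path.isdir(d)]
```

Reading the script with open(...).read(), passing the text to read_driver_settings, and then calling missing_driver_dirs with /usr/local/greenplum-connectivity-4.3.8.2-build-1 should give an empty list when both names are correct.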

 

Installing the unixODBC driver

1.    Compile and install the unixODBC package

tar -zxvf unixODBC-2.2.12.tar.gz

cd unixODBC-2.2.12

./configure --prefix=/etc/unixODBC --enable-fdb  --disable-gui

make

make install

2.    Check the unixODBC installation directory

[root@CDHA software]# ll /etc/unixODBC/

total 16

drwxr-xr-x 2 root root 4096 Jul  5 23:19 bin

drwxr-xr-x 3 root root 4096 Jul  6 10:09 etc

drwxr-xr-x 2 root root 4096 Jul  5 23:19 include

drwxr-xr-x 2 root root 4096 Jul  6 11:47 lib

 

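3.    Edit the two configuration files under unixODBC's etc directory as shown here (the text after ---- is an annotation for this guide and must not be typed into the files):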

[root@CDHA unixODBC]# cat /etc/unixODBC/etc/odbc.ini

[GreenplumDSN]

Driver =Greenplum   ---- must match the section name in /etc/unixODBC/etc/odbcinst.ini

Trace = 1

Debug=1

Database = zhangyun_db_safe    ---- Greenplum database name

Servername = 192.168.1.24   ---- Greenplum host IP address

UserName = zhangyun    ---- Greenplum user name

Password = xxxxxx       ---- Greenplum user password

Port = 5432                ---- Greenplum port

ReadOnly = No

RowVersioning = No

DisallowPremature = No

ShowSystemTables = Yes

ShowOidColumn = No

FakeOidIndex = No

useDeclareFetch = 1

Fetch = 4096

UpdatableCursors = Yes

Protocol = 7.4-1

 

[root@CDHA unixODBC]# cat /etc/unixODBC/etc/odbcinst.ini

[Greenplum]

Description=PostgreSQL driver for Greenplum

Driver=/usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/psqlodbcw.so   ---- Greenplum ODBC driver

UsageCount=1

FileUsage=1
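The two files are linked by name: the Driver value of the DSN in odbc.ini has to be a section in odbcinst.ini, and that section's Driver has to point at a library that exists. A small cross-check could look like the sketch below. The function name check_dsn is invented for this example, and it assumes the files contain plain key = value lines without the ---- annotations used in this guide.

```python
import os

try:
    import configparser                   # Python 3
except ImportError:
    import ConfigParser as configparser   # Python 2


def check_dsn(odbc_ini, odbcinst_ini, dsn):
    """Return a list of problems for one DSN; an empty list means none found."""
    dsns = configparser.RawConfigParser()
    dsns.read(odbc_ini)
    drivers = configparser.RawConfigParser()
    drivers.read(odbcinst_ini)

    if not dsns.has_section(dsn):
        return ['DSN [%s] is not defined in %s' % (dsn, odbc_ini)]
    if not dsns.has_option(dsn, 'Driver'):
        return ['DSN [%s] has no Driver entry' % dsn]

    driver_name = dsns.get(dsn, 'Driver').strip()
    if not drivers.has_section(driver_name):
        return ['driver [%s] is not defined in %s' % (driver_name, odbcinst_ini)]
    if not drivers.has_option(driver_name, 'Driver'):
        return ['driver [%s] has no Driver entry' % driver_name]

    library = drivers.get(driver_name, 'Driver').strip()
    if not os.path.isfile(library):
        return ['driver library does not exist: %s' % library]
    return []
```

For the setup in this guide the call would be check_dsn('/etc/unixODBC/etc/odbc.ini', '/etc/unixODBC/etc/odbcinst.ini', 'GreenplumDSN'). The check only confirms that the library file exists; it does not confirm that the library can be loaded.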

 

4.    Test with isql

[root@CDHA unixODBC]# isql GreenplumDSN zhangyun xxxxxx

+---------------------------------------+

| Connected!                            |

|                                       |

| sql-statement                         |

| help [tablename]                      |

| quit                                  |

|                                       |

+---------------------------------------+

SQL> select count(1) from test_bigtable;

+---------------------+

| count               |

+---------------------+

| 1791144834          |

+---------------------+

SQLRowCount returns -1

1 rows fetched

SQL> select user;  

+-----------------------------------------------------------------+

| current_user                                                    |

+-----------------------------------------------------------------+

| zhangyun                                                      |

+-----------------------------------------------------------------+

SQLRowCount returns -1

1 rows fetched

SQL>

 

Note: if isql fails like this:

[root@CDHA unixODBC]# isql GreenplumDSN

[ISQL]ERROR: Could not SQLConnect

 

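In most cases this happens because greenplum_connectivity_path.sh has not been sourced. Source it and run isql again. The most reliable fix is to source this file from the system-wide environment setup, so that it is loaded automatically.

If it still fails, add the -v option to see the detailed error, and see the troubleshooting section at the end.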
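To see whether the current shell has picked up the driver environment, the relevant variables can be inspected from Python. This is a sketch under assumptions: this guide does not show which variables greenplum_connectivity_path.sh exports, so the names used here (LD_LIBRARY_PATH for the dynamic loader, ODBCINI and ODBCSYSINI for unixODBC) are the conventional ones rather than a confirmed list, and both function names are invented for the example.

```python
import os


def odbc_settings(env=None):
    """Return the ODBC-related environment variables (None when unset)."""
    env = os.environ if env is None else env
    names = ('ODBCINI', 'ODBCSYSINI', 'LD_LIBRARY_PATH')
    return dict((name, env.get(name)) for name in names)


def library_path_contains(directory, env=None):
    """Return True if directory is one of the LD_LIBRARY_PATH entries."""
    env = os.environ if env is None else env
    wanted = os.path.normpath(directory)
    entries = env.get('LD_LIBRARY_PATH', '').split(':')
    # normpath makes '/a/b/' and '/a/b' compare equal.
    return any(os.path.normpath(entry) == wanted for entry in entries if entry)
```

If library_path_contains returns False for the driver directory in a fresh shell, that would be consistent with the path script not having been sourced.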

 

Installing the pyodbc driver

1.    Compile and install the pyodbc driver

Before compiling, make sure the following packages are installed:

[root@CDHA pyodbc-3.0.10]# yum install gcc-c*

[root@CDHA pyodbc-3.0.10]# yum install compat-gcc-34-c++

[root@CDHA pyodbc-3.0.10]# yum install unixODBC-devel

[root@CDHA pyodbc-3.0.10]# yum install python-devel

If any other libraries turn out to be missing, install them with yum as well.

Now build pyodbc:

tar -zxvf pyodbc-3.0.10.tar.gz

cd pyodbc-3.0.10

python setup.py build

python setup.py install

 

2.    Check the pyodbc installation directory

[root@CDHA pyodbc-3.0.10]# ll /usr/lib64/python2.6/site-packages/pyodbc*

-rwxr-xr-x 1 root root    913 Jul  6 14:34 /usr/lib64/python2.6/site-packages/pyodbc-3.0.10-py2.6.egg-info

-rwxr-xr-x 1 root root 391676 Jul  6 14:34 /usr/lib64/python2.6/site-packages/pyodbc.so
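Besides listing the files, it helps to confirm that the module actually imports, because a missing ODBC shared library also surfaces as an ImportError. A minimal sketch follows; the helper name pyodbc_version is made up for this example.

```python
def pyodbc_version():
    """Return the pyodbc version string, or None if the module cannot be imported."""
    try:
        import pyodbc
    except ImportError:
        # Raised when pyodbc is not installed, and also when a shared
        # library it links against cannot be loaded.
        return None
    return pyodbc.version
```

On the machine set up in this guide the function would be expected to return the version of the package that was just built; a None result means the import itself needs attention before any DSN can be tested.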

 

Testing with a Python script

1.    Prepare the Python test script:

[zhangyun@CDHA ~]$ cat helloworld.py

#!/usr/bin/python

#-*- encoding: utf-8 -*-

####################################################################

# name: helloworld.py   

# describe: test Python access to a Greenplum database

########################################################################

import pyodbc

import sys

reload(sys)

sys.setdefaultencoding('utf8')

 

class GreenplumTest:

    debug = 1

    def __init__(self,dbinfo):

        self.UID = dbinfo[1]

        self.PWD = dbinfo[2]

        odbcinfo    ='DSN=%s;UID=%s;PWD=%s'%(dbinfo[0],dbinfo[1],dbinfo[2])

        self.cnxn   =pyodbc.connect(odbcinfo,autocommit=True,ansi=True)

        self.cursor =self.cnxn.cursor()

 

    def __del__(self):

        if self.cursor:

            self.cursor.close()

        if self.cnxn:

            self.cnxn.close()

 

    def _printinfo(self,msg):

        print"%s"%(msg)

        print "\n"

 

    def testsql(self):

        # Business logic like this can be expressed in SQL and executed here

        # Example: create a table and insert data

        sql_1 = '''

          drop table if exists helloworld;

          create table helloworld(id int,name text) distributed by (id);

          insert into helloworld values(1,'Spark'),(2,'Hadoop'),(3,'Apache');

          '''

        self.cursor.execute(sql_1.strip())

       

        # Query the results and return them

        sql_2 = '''

          select * from helloworld;

          ''' 

        self.cursor.execute(sql_2.strip())             

        row = self.cursor.fetchall()

        return row 

 

#Main

def main():

    # Check the number of arguments

    if len(sys.argv) < 4 :

        print 'usage: python helloworld.py GreenplumDSN username password\n'

        sys.exit(1)

 

    # Connection information for Greenplum

    dbinfo = []

    dbinfo.append(sys.argv[1])

    dbinfo.append(sys.argv[2])

    dbinfo.append(sys.argv[3])

 

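    GPT = GreenplumTest(dbinfo)

    ret = GPT.testsql()

    # Print the rows here: returning them would make sys.exit() write them to stderr and exit with status 1

    print ret

    return 0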

 

if __name__ == '__main__':

    sys.exit(main())
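The script builds the connection string by plain string formatting, so a value containing a semicolon would be read as an extra attribute. The sketch below guards against that. The function name build_connection_string is invented for this example, and rejecting such values outright (rather than quoting them) is a choice made here for simplicity, not something taken from the script above.

```python
def build_connection_string(dsn, uid, pwd):
    """Build 'DSN=...;UID=...;PWD=...' and refuse values with special characters."""
    parts = (('DSN', dsn), ('UID', uid), ('PWD', pwd))
    for key, value in parts:
        # ';' separates attributes and braces are quoting characters
        # in ODBC connection strings.
        if any(ch in value for ch in ';{}'):
            raise ValueError('%s contains a character that is special '
                             'in ODBC connection strings' % key)
    return ';'.join('%s=%s' % (key, value) for key, value in parts)
```

The result can be passed to pyodbc.connect in place of the odbcinfo string; for the sample arguments used in this guide it produces the same string as the original formatting.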

 

2.    Run the test:

python helloworld.py GreenplumDSN zhangyun xxxxxx

The run produces the following output:

[(3, 'Apache'), (1, 'Spark'), (2, 'Hadoop')]

 

3.    Log in to the Greenplum database and check the data

[gpadmin@CDHA ~]$ psql -d zhangyun_db_safe -U zhangyun -h CDHA

psql (8.2.15)

Type "help" for help.

 

zhangyun_db_safe=# select * from helloworld ;

 id |  name 

----+--------

  2 | Hadoop

  1 | Spark

  3 | Apache

(3 rows)

 

zhangyun_db_safe=# 
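The Python run and the psql session list the same three rows in different orders, which is expected because neither query has an ORDER BY clause. The sketch below shows the same steps with bound parameters and an explicit ordering. It uses the standard-library sqlite3 module purely as a stand-in so that it runs without a Greenplum server; sqlite3 and pyodbc both use the ? parameter marker, but the Greenplum-specific "distributed by (id)" clause is left out, and the function name load_and_read is invented for this example.

```python
import sqlite3


def load_and_read(connection):
    """Recreate the helloworld table and return its rows ordered by id."""
    cursor = connection.cursor()
    cursor.execute('drop table if exists helloworld')
    cursor.execute('create table helloworld(id int, name text)')
    rows = [(1, 'Spark'), (2, 'Hadoop'), (3, 'Apache')]
    # Bound parameters instead of values pasted into the SQL text.
    cursor.executemany('insert into helloworld values(?, ?)', rows)
    connection.commit()
    # ORDER BY makes the row order deterministic.
    cursor.execute('select id, name from helloworld order by id')
    return [tuple(row) for row in cursor.fetchall()]
```

Called with sqlite3.connect(':memory:') this returns the rows sorted by id. With a pyodbc connection the same calls should apply once the create table statement is given its distribution clause back.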

 

 

 

Troubleshooting:

1.       After unixODBC is installed, isql reports that /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/psqlodbcw.so cannot be found, for example:

 

isql GreenplumDSN -v

 

[01000][unixODBC][Driver Manager]Can't open lib '/usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/psqlodbcw.so' : file not found

[ISQL]ERROR: Could not SQLConnect

 

Solution:

First use ldd to see which shared libraries the driver cannot resolve

ldd /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/psqlodbcw.so

 

         linux-vdso.so.1 =>  (0x00007ffdd9106000)

         libpq.so.5 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libpq.so.5 (0x00002addb8e3f000)

         libpthread.so.0 => /lib64/libpthread.so.0 (0x00002addb908c000)

         libodbcinst.so.1 => not found

         libodbc.so.1 => not found

         libc.so.6 => /lib64/libc.so.6 (0x00002addb92aa000)

         libssl.so.0.9.8 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libssl.so.0.9.8 (0x00002addb963e000)

         libcrypto.so.0.9.8 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libcrypto.so.0.9.8 (0x00002addb9893000)

         libgssapi_krb5.so.2 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libgssapi_krb5.so.2 (0x00002addb9c26000)

         libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00002addb9e50000)

         libldap_r-2.3.so.0 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libldap_r-2.3.so.0 (0x00002addba088000)

         /lib64/ld-linux-x86-64.so.2 (0x0000003f8bc00000)

         libdl.so.2 => /lib64/libdl.so.2 (0x00002addba2dc000)

         libkrb5.so.3 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libkrb5.so.3 (0x00002addba4e1000)

         libk5crypto.so.3 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libk5crypto.so.3 (0x00002addba76b000)

         libcom_err.so.3 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libcom_err.so.3 (0x00002addba990000)

         libkrb5support.so.0 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libkrb5support.so.0 (0x00002addbab95000)

         libresolv.so.2 => /lib64/libresolv.so.2 (0x00002addbad9c000)

         libfreebl3.so => /lib64/libfreebl3.so (0x00002addbafb7000)

         liblber-2.3.so.0 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/liblber-2.3.so.0 (0x00002addbb1ba000)

 

So we create the missing files as symbolic links:

ln -s /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/libodbcinst.so.1 /lib64/libodbcinst.so.1 

ln -s /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/libodbc.so.1 /lib64/libodbc.so.1

 

 

Checking again, nothing is missing any more:

ldd /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/psqlodbcw.so

 

         linux-vdso.so.1 =>  (0x00007ffcc97d4000)

         libpq.so.5 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libpq.so.5 (0x00002b10ed9ee000)

         libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b10edc3b000)

         libodbcinst.so.1 => /lib64/libodbcinst.so.1 (0x00002b10ede58000)

         libodbc.so.1 => /lib64/libodbc.so.1 (0x00002b10ee071000)

         libc.so.6 => /lib64/libc.so.6 (0x00002b10ee300000)

         libssl.so.0.9.8 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libssl.so.0.9.8 (0x00002b10ee694000)

         libcrypto.so.0.9.8 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libcrypto.so.0.9.8 (0x00002b10ee8e9000)

         libgssapi_krb5.so.2 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libgssapi_krb5.so.2 (0x00002b10eec7c000)

         libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00002b10eeea6000)

         libldap_r-2.3.so.0 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libldap_r-2.3.so.0 (0x00002b10ef0de000)

         /lib64/ld-linux-x86-64.so.2 (0x0000003f8bc00000)

         libdl.so.2 => /lib64/libdl.so.2 (0x00002b10ef332000)

         libkrb5.so.3 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libkrb5.so.3 (0x00002b10ef537000)

         libk5crypto.so.3 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libk5crypto.so.3 (0x00002b10ef7c1000)

         libcom_err.so.3 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libcom_err.so.3 (0x00002b10ef9e6000)

         libkrb5support.so.0 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/libkrb5support.so.0 (0x00002b10efbeb000)

         libresolv.so.2 => /lib64/libresolv.so.2 (0x00002b10efdf2000)

         libfreebl3.so => /lib64/libfreebl3.so (0x00002b10f000d000)

         liblber-2.3.so.0 => /usr/local/greenplum-connectivity-4.3.8.2-build-1/drivers/odbc/psqlodbc-08.02.0500/datadirect-52_64/liblber-2.3.so.0 (0x00002b10f0210000)
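Scanning a long ldd listing by eye is error-prone, so the "not found" entries can be extracted with a few lines of Python. This is a sketch: the function names missing_libraries and run_ldd are invented for the example, and run_ldd assumes that ldd is on the PATH.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if '=>' in line and line.strip().endswith('not found'):
            missing.append(line.split('=>')[0].strip())
    return missing


def run_ldd(path):
    """Run ldd on a shared library and return its output as text."""
    process = subprocess.Popen(['ldd', path], stdout=subprocess.PIPE)
    output = process.communicate()[0]
    return output.decode('utf-8', 'replace')
```

Calling missing_libraries(run_ldd(path)) on psqlodbcw.so before the symbolic links exist would be expected to name libodbcinst.so.1 and libodbc.so.1, and to return an empty list afterwards.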
