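Reading and writing Hive with Python

I've recently been working on a project that needs to persist the results of an algorithm model to Hive.

I'm currently using pyhive. Note that it cannot be used on Windows; I'm running it on CentOS 6.5, and the official documentation says it works on macOS and Linux.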

 

from pyhive import hive
import pandas as pd
# from sqlalchemy import create_engine

# from pyspark.sql import sqlContext

conn = hive.Connection(host='xxx', port=10000, username='xxx', database='default')
cur = conn.cursor()

# Read from Hive

dftt=pd.read_sql("select * from dw.ml_catalog limit 10",con=conn)
print(dftt)

# test data
listpandas=[[456,'test456'],[789,'test456'],[123,'test123'],[110,'test110']]
# engine=create_engine('hive://xxx@xxx:10000/default')
df=pd.DataFrame(listpandas,columns=['id','name'])
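# Write to Hive row by row as below: to_sql currently has a bug and only a single statement gets stored, see https://github.com/dropbox/PyHive/issues/50
for index, row in df.iterrows():
    # Pass the values as parameters so PyHive escapes them, instead of concatenating strings into the SQL
    cur.execute("insert into default.test100(id,name) values(%s, %s)", (int(row['id']), str(row['name'])))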


# with engine.connect() as conn, conn.begin():
#     for index, row in df.iterrows():
#         row.to_sql('default.test100', engine, if_exists='append',index=False, index_label=None, chunksize=None, dtype=None)
#     # df.to_sql('default.test100', engine, if_exists='append',index=False, index_label=None, chunksize=None, dtype=None)

# print(df)

# connect=hive.Connection(host='10.15.4.161', port=10000, username='zhouzhou', database='default')

# df.to_sql("default.test100", con=conn)


# for index, row in df.iterrows():
#     row.to_sql('default.test100', con=connect, if_exists='append',index=False, index_label=None, chunksize=None, dtype=None)

# cursor = conn.cursor()
# cursor.execute('select * from dw.ml_catalog limit 10')
# for result in cursor.fetchall():
#     print(result)
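
The loop above sends one INSERT per row, and each statement may run as a separate Hive job. As a rough sketch of one alternative, the rows can be packed into a single multi-row INSERT. The helper name `build_multirow_insert` is hypothetical and not part of pyhive; the sketch assumes a Hive version that accepts multi-row `VALUES` lists (Hive 0.14 or later) and relies on the cursor's `execute` accepting a parameter sequence for `%s` placeholders.

```python
from typing import Iterable, List, Sequence, Tuple


def build_multirow_insert(
    table: str,
    columns: Sequence[str],
    rows: Iterable[Sequence[object]],
) -> Tuple[str, List[object]]:
    """Return (sql, params) for one INSERT ... VALUES statement.

    Hypothetical helper for illustration: the SQL uses %s placeholders,
    and params is the flat list of values in row order.
    """
    materialized = [tuple(r) for r in rows]
    if not materialized:
        raise ValueError("rows must not be empty")
    width = len(columns)
    if any(len(r) != width for r in materialized):
        raise ValueError("every row must have one value per column")

    # One "(%s, %s, ...)" group per row, joined with commas.
    row_placeholder = "(" + ", ".join(["%s"] * width) + ")"
    sql = "insert into {}({}) values {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(materialized)),
    )
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in materialized for value in row]
    return sql, params


sql, params = build_multirow_insert(
    "default.test100",
    ["id", "name"],
    [[456, "test456"], [123, "test123"]],
)
# With an open cursor this could then be sent as: cur.execute(sql, params)
```

With the DataFrame from the script, the rows could be passed as `df.values.tolist()`. For large frames it is probably safer to split the rows into chunks, since a single very long statement may hit size limits.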
 

Zhihu: https://zhuanlan.zhihu.com/albertwang

WeChat official account: AI-Research-Studio
