Accessing HBase through Hive with PySpark SQL on CDH 6.3.2

1. Add the required JARs to the Spark classpath

cp /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-2.1.0-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-common-2.1.0-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-2.1.0-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-server-2.1.0-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/
cp /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-mapreduce-2.1.0-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/
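The six copy commands above can be collapsed into one loop. A minimal sketch, assuming the CDH 6.3.2 parcel layout and jar versions shown in the commands (any jar that is absent on the host is reported and skipped rather than failing the script):

```shell
# Copy the Hive HBase handler and HBase client jars into Spark's jars dir.
# Paths and versions match CDH 6.3.2; adjust for your parcel version.
SPARK_JARS=/opt/cloudera/parcels/CDH/lib/spark/jars
copied=0; skipped=0
for jar in \
  /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.3.2.jar \
  /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-{client,common,protocol,server,mapreduce}-2.1.0-cdh6.3.2.jar
do
  if [ -f "$jar" ]; then
    cp "$jar" "$SPARK_JARS/" && copied=$((copied + 1))
  else
    # Harmless on hosts without the CDH parcel installed.
    echo "missing: $jar"
    skipped=$((skipped + 1))
  fi
done
echo "copied=$copied skipped=$skipped"
```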

2. Test code

import findspark

# Point findspark at the CDH Spark install and the Python interpreter to use.
findspark.init(spark_home='/opt/cloudera/parcels/CDH/lib/spark',
               python_path='/opt/cloudera/anaconda3/bin/python')

from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark SQL read Hive tables, including Hive tables
# that are backed by HBase via the hive-hbase-handler jars added in step 1.
spark = SparkSession.builder.appName('feat-eng') \
    .enableHiveSupport().getOrCreate()

# Query the Hive table that maps to an HBase table.
spark.sql('select * from hive_hbase_emp_table').show()
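The query assumes `hive_hbase_emp_table` already exists in Hive as a table backed by HBase. The source does not show its definition; a typical mapping created with the hive-hbase-handler looks like the following (the column names, the HBase column family `info`, and the HBase table name `emp` are hypothetical):

```sql
-- Hypothetical DDL: maps Hive columns onto an existing HBase table 'emp'.
CREATE EXTERNAL TABLE hive_hbase_emp_table (
  key    STRING,
  ename  STRING,
  sal    DOUBLE
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- ':key' binds to the HBase row key; the rest are family:qualifier pairs.
  "hbase.columns.mapping" = ":key,info:ename,info:sal"
)
TBLPROPERTIES ("hbase.table.name" = "emp");
```

Once such a table exists in the Hive metastore, the `spark.sql(...)` call above reads it like any other Hive table.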
