Reading a MySQL database with PySpark

Create a SparkContext and SQLContext

from pyspark import SparkContext
from pyspark.sql import SQLContext

url = "jdbc:mysql://172.20.51.134:3308/test"
table = "backend_dataset"
properties = {"user": "root", "password": "123456"}

sc = SparkContext()  # create the Spark context
sqlContext = SQLContext(sc)  # create the SQLContext on top of it
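The JDBC URL above follows the pattern `jdbc:mysql://host:port/database`. As a small illustration (the helper function below is hypothetical, not part of PySpark), the URL can be assembled from its parts:

```python
def mysql_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a MySQL JDBC URL of the form jdbc:mysql://host:port/db."""
    return f"jdbc:mysql://{host}:{port}/{database}"

# Reproduces the URL used in this post.
print(mysql_jdbc_url("172.20.51.134", 3308, "test"))
# → jdbc:mysql://172.20.51.134:3308/test
```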

Running a SQL query with Spark

# Get a DataFrame over the target table
df = sqlContext.read.jdbc(url, table, properties=properties)
query_sql = "select name, create_date from backend_dataset order by name"
# Register the DataFrame as a temporary table; the name must match
# the table referenced in the SQL above
df.registerTempTable("backend_dataset")
# Run the SQL query
df2 = sqlContext.sql(query_sql)
# Convert the result to a pandas DataFrame
pd_df = df2.toPandas()
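Note that `toPandas()` collects the entire result set onto the driver, so it only suits results that fit in driver memory. As a rough sketch of the shape of the returned frame, here is an in-memory pandas stand-in rather than a live Spark connection (the rows are made up for illustration):

```python
import pandas as pd

# Hypothetical rows standing in for what toPandas() would return for
# the query above; a real run pulls them from backend_dataset via JDBC.
pd_df = pd.DataFrame({
    "name": ["dataset_a", "dataset_b"],
    "create_date": ["2021-01-01", "2021-02-01"],
})

# One row per result record, one column per selected field.
print(pd_df.shape)
```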

Errors encountered

  • pyspark java.sql.SQLException: No suitable driver
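This error usually means the MySQL JDBC driver is not on Spark's classpath. A common fix (an assumption about your setup, not something stated in the original post) is to launch with the Connector/J jar, e.g. `pyspark --jars /path/to/mysql-connector-java.jar`, and to name the driver class explicitly in the connection properties:

```python
# Connection settings from the example above, plus an explicit driver class.
url = "jdbc:mysql://172.20.51.134:3308/test"
properties = {
    "user": "root",
    "password": "123456",
    # Older Connector/J releases use "com.mysql.jdbc.Driver";
    # Connector/J 8.x renamed it to "com.mysql.cj.jdbc.Driver".
    "driver": "com.mysql.jdbc.Driver",
}

# With the jar on the classpath, the original read then works unchanged:
# df = sqlContext.read.jdbc(url, "backend_dataset", properties=properties)
```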
