Connecting to MySQL from PySpark

1: Download the mysql-connector JAR and place it under the jars directory.
2: In spark-env.sh, configure the SPARK_CLASSPATH environment variable:
3: export SPARK_CLASSPATH=/opt/spark/spark-2.4.0-bin-hadoop2.7/jars
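As an alternative to editing spark-env.sh (SPARK_CLASSPATH is deprecated in recent Spark releases), the connector JAR can be passed at launch time with spark-submit. This is a sketch; the connector filename and script name below are examples, so adjust them to your installation:

```shell
# Ship the MySQL JDBC driver to executors (--jars) and put it on the
# driver's classpath (--driver-class-path); both are needed for JDBC reads.
spark-submit \
  --jars /opt/spark/jars/mysql-connector-java-5.1.47.jar \
  --driver-class-path /opt/spark/jars/mysql-connector-java-5.1.47.jar \
  your_script.py
```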

from pyspark.sql import SparkSession

# Build (or reuse) a SparkSession; the config key below is just a placeholder.
spark = SparkSession.builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()

# Read from MySQL over JDBC. A subquery passed as dbtable must be
# aliased (here as "tmp").
jdbcDf = spark.read.format("jdbc").options(
    url="jdbc:mysql://localhost:3306/stock",
    driver="com.mysql.jdbc.Driver",
    dbtable="(SELECT * FROM stock) tmp",
    user="root",
    password="0845",
).load()

# printSchema() and show() print directly and return None,
# so there is no need to wrap them in print().
jdbcDf.printSchema()
jdbcDf.show()
root
 |-- open: double (nullable = true)
 |-- high: double (nullable = true)
 |-- close: double (nullable = true)
 |-- low: double (nullable = true)
 |-- volume: double (nullable = true)
 |-- price_change: double (nullable = true)
 |-- p_change: double (nullable = true)
 |-- ma5: double (nullable = true)
 |-- ma10: double (nullable = true)
 |-- ma20: double (nullable = true)
 |-- v_ma5: double (nullable = true)
 |-- v_ma10: double (nullable = true)
 |-- v_ma20: double (nullable = true)

+-----+-----+-----+-----+----------+------------+--------+------+------+------+----------+----------+----------+
| open| high|close|  low|    volume|price_change|p_change|   ma5|  ma10|  ma20|     v_ma5|    v_ma10|    v_ma20|
+-----+-----+-----+-----+----------+------------+--------+------+------+------+----------+----------+----------+
|13.06|13.15|13.08|12.84|1245760.12|        0.02|    0.15| 12.85| 12.43|11.831| 1587358.4|1669658.01|1338502.95|
|12.91|13.29|13.06|12.84|1408422.25|        0.07|    0.54|12.714|12.263|11.725|1734009.15|1628344.98| 1311815.0|
| 12.7|13.38|12.99|12.62| 2456830.5|        0.23|     1.8|12.542|12.084|11.622| 1855833.7|1614590.97|1282725.44|
|12.48|12.78|12.76| 12.2|1712679.25|         0.4|    3.24|12.454|11.921|11.519| 1929454.6|1494460.14|1211679.41|
|12.44|12.61|12.36|12.27|1113099.88|       -0.04|   -0.32| 12.21| 11.74|11.431| 1823728.3|1433204.97|1231463.54|
|12.24|12.66| 12.4|12.15|1979013.88|         0.2|    1.64| 12.01|11.629|11.339|1751957.63| 1408813.5|1209770.59|
|12.55|12.57| 12.2|12.15| 2017545.0|       -0.35|   -2.79|11.812|11.527|11.237|1522680.81|1324648.69|1137713.73|
| 11.7|12.64|12.55|11.57| 2824935.0|        1.01|    8.75|11.626|11.426|11.141|1373348.24|1218887.88|1058057.16|
|11.35|11.57|11.54|11.15|1184047.75|        0.18|    1.58|11.388|11.292| 11.03|1059465.69|1016588.04|  949778.2|
|11.38|11.55|11.36|11.28|  754246.5|       -0.05|   -0.44| 11.27|11.258|10.975|1042681.64| 999610.27| 927515.47|
|11.33|11.53|11.41|11.25| 832629.81|        0.14|    1.24|11.248|11.232|10.919|1065669.37|1007347.89| 933943.74|
|11.36|11.62|11.27|11.24|1270882.12|       -0.09|   -0.79|11.242|11.186|10.873|1126616.56| 995285.02| 941197.22|
|11.05|11.36|11.36|11.04|1255522.25|        0.41|    3.74|11.226|11.159|10.821|1064427.51| 950859.92| 904761.14|
|11.23|11.23|10.95| 10.9| 1100127.5|        -0.3|   -2.67|11.196|11.117|10.759| 973710.39| 928898.67| 867007.21|
| 11.3|11.33|11.25|11.19| 869185.19|       -0.13|   -1.14|11.246|11.122|10.721|  956538.9|1029722.12| 846819.06|
|11.19|11.41|11.38|11.12|1137365.75|        0.19|     1.7|11.216|11.049|10.664| 949026.41|1010727.69| 856950.69|
| 11.2|11.31|11.19|11.03| 959936.88|       -0.02|   -0.18| 11.13|10.946|10.592| 863953.47| 950778.77| 861756.72|
|11.08|11.21|11.21|10.97| 801936.62|        0.01|    0.09|11.092|10.855|10.515| 837292.32| 897226.44| 833879.28|
| 11.2|11.25| 11.2|10.96|1014270.06|         0.1|     0.9|11.038|10.768|10.442| 884086.95| 882968.36| 837066.83|
|10.98| 11.2| 11.1|10.94| 831622.75|        0.15|    1.37|10.998|10.691|10.369|1102905.34| 855420.67| 860411.28|
+-----+-----+-----+-----+----------+------------+--------+------+------+------+----------+----------+----------+
only showing top 20 rows

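The same JDBC options work in the other direction. Below is a minimal sketch of writing the DataFrame back to MySQL, continuing from the jdbcDf loaded above; the target table name stock_copy is hypothetical, and it assumes the same local server and credentials as the read example:

```python
# Sketch: write jdbcDf back to MySQL over JDBC.
# mode("append") adds rows to the target table;
# mode("overwrite") would drop and recreate it instead.
jdbcDf.write.format("jdbc").options(
    url="jdbc:mysql://localhost:3306/stock",
    driver="com.mysql.jdbc.Driver",
    dbtable="stock_copy",  # hypothetical target table
    user="root",
    password="0845",
).mode("append").save()
```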

posted on 2019-03-07 09:44 by luoganttcc
