2018 Beijing points-based household registration (积分落户) data: big-data visualization with pyspark and pyecharts, analyzed by users' home province

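This post uses pyspark to count the users in the 2018 Beijing points-based household registration (积分落户) data by home province, then plots the counts with pyecharts, first as a bar chart and then on a map of China.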

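# Load the list of people approved for points-based household registration.
# `spark` is assumed to be the SparkSession provided by the pyspark shell or notebook.
df = spark.read.csv('jifenluohu.csv', header=True, inferSchema=True)
df.cache()
df.createOrReplaceTempView("jflh")
#df.show()
# Count users per province, largest count first.
spCount = spark.sql("select provincename as name, count(*) as ct from jflh group by provincename order by ct desc").collect()
name = [row.name for row in spCount]
count = [row.ct for row in spCount]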

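# Display the chart.
# This is the pyecharts 0.x import style; 1.0+ moved chart classes to pyecharts.charts.
from pyecharts import Bar
bar = Bar("2018 Beijing points-based household registration: user data analysis", "Number of users by home province")
bar.add("User count", name, count)
# In a Jupyter notebook, the bare expression displays the chart inline.
bar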

[Figure 1: bar chart of the number of users by home province, 2018 Beijing points-based household registration data]
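To make clear what the query above computes, here is a minimal pure-Python sketch of the same "group by province, order by count descending" step. It is not the post's method, only an illustration: `sample_provinces` is hypothetical data standing in for the `provincename` column, and `count_by_province` is a helper name invented for this example.

```python
from collections import Counter

# Hypothetical sample values; the real ones come from the provincename column.
sample_provinces = ["山东", "河北", "山东", "河南", "山东", "河北"]


def count_by_province(provinces):
    """Return (province, count) pairs sorted by count, largest first."""
    return Counter(provinces).most_common()


pairs = count_by_province(sample_provinces)
sample_names = [province for province, _ in pairs]
sample_counts = [ct for _, ct in pairs]
```

The two lists play the same role as `name` and `count` in the listing: parallel sequences of category labels and values that the chart's `add` call consumes.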

# -*- coding: utf-8 -*-
# Load the list of people approved for points-based household registration.
# `spark` is assumed to be the SparkSession provided by the pyspark shell or notebook.
df = spark.read.csv('jifenluohu.csv', header=True, inferSchema=True)
df.createOrReplaceTempView("jflh")
#sqlContext.clearCache()
df.cache()
#df.show()
# Count users per province, largest count first.
spCount = spark.sql("select provincename as name, count(*) as ct from jflh group by provincename order by ct desc").collect()
name = [row.name for row in spCount]
count = [row.ct for row in spCount]

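# Display the chart.
# This is the pyecharts 0.x import style; 1.0+ moved chart classes to pyecharts.charts.
from pyecharts import Map
# Named province_map so the builtin map() is not shadowed.
province_map = Map("2018 Beijing points-based household registration: user data analysis", width=800, height=600)
province_map.add("User count", name, count, maptype='china', is_visualmap=True,
                 visual_text_color='#000')
# With no path argument, render() writes render.html to the current directory.
province_map.render()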

[Figure 2: map of China shaded by the number of users per home province, 2018 Beijing points-based household registration data]
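One thing to check when a province stays blank on the map: with `maptype='china'`, the labels passed to `add` have to match the region names used by the map, which are the short forms such as 广东 or 内蒙古. The source does not show what the `provincename` column actually contains, so the following is only a sketch under the assumption that it might hold full official names such as 广东省. `SUFFIXES` and `short_province_name` are hypothetical names introduced for this example.

```python
# Administrative suffixes, longest first so that "新疆维吾尔自治区" is
# reduced to "新疆" rather than "新疆维吾尔".
SUFFIXES = ("维吾尔自治区", "壮族自治区", "回族自治区",
            "特别行政区", "自治区", "省", "市")


def short_province_name(full_name):
    """Strip one administrative suffix; names already short pass through."""
    stripped = full_name.strip()
    for suffix in SUFFIXES:
        if stripped.endswith(suffix) and len(stripped) > len(suffix):
            return stripped[: -len(suffix)]
    return stripped


# Hypothetical labels, as they might appear in the provincename column.
raw_names = ["广东省", "北京市", "内蒙古自治区", "新疆维吾尔自治区", "河北"]
short_names = [short_province_name(n) for n in raw_names]
```

If the column does need this, applying `name = [short_province_name(n) for n in name]` before the `add` call would be one way to do it. If the data already uses short names, the function leaves them unchanged.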
