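Apache Superset is an open-source, modern, lightweight BI analysis tool. It connects to many data sources, offers a rich set of chart types, supports custom dashboards, and has a friendly, easy-to-use interface.
Because Superset can connect to common big-data analysis engines such as Hive, Kylin, and Druid, and supports custom dashboards, it works well as the visualization tool for a data warehouse.
Superset website: https://superset.apache.org/
GitHub source: https://github.com/apache/superset
Superset is a web application written in Python. The project team developed it on Python 3.6, so a Python 3.6 environment is the most stable choice.
conda is an open-source package and environment manager. It can install packages and their dependencies for different Python versions on the same machine and switch between those Python environments. Anaconda bundles conda, Python, and a large set of preinstalled packages such as numpy and pandas; Miniconda contains only conda and Python.
Download the latest Miniconda3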
wangting@ops04:/opt/software >wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Install Miniconda
wangting@ops04:/opt/software >bash Miniconda3-latest-Linux-x86_64.sh
In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue # [continue?] press Enter
>>>
Do you accept the license terms? [yes|no]
[no] >>>
Please answer 'yes' or 'no':' # [accept the license terms?] type yes
>>> yes
Miniconda3 will now be installed into this location:
/home/wangting/miniconda3
- Press ENTER to confirm the location
- Press CTRL-C to abort the installation
- Or specify a different location below
[/home/wangting/miniconda3] >>> /opt/module/miniconda3 # [custom install path; the default is the home directory]
Do you wish the installer to initialize Miniconda3
by running conda init? [yes|no] # run conda init: yes
[no] >>> yes
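Thank you for installing Miniconda3! # this message means the installation is complete
While it runs, the installer script automatically appends the following block to the .bashrc file in the user's home directory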
__conda_setup="$('/opt/module/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/opt/module/miniconda3/etc/profile.d/conda.sh" ]; then
. "/opt/module/miniconda3/etc/profile.d/conda.sh"
else
export PATH="/opt/module/miniconda3/bin:$PATH"
fi
fi
unset __conda_setup
Source the ~/.bashrc file that the script modified
wangting@ops04:/opt/module/miniconda3 >source ~/.bashrc
(base) wangting@ops04:/opt/module/miniconda3 >
Exit the base environment
(base) wangting@ops04:/opt/module/miniconda3 >conda deactivate
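Disable automatic activation of the base environment at login (activate environments manually from the command line instead)
After Miniconda is installed, every new terminal activates the default base environment. The following command turns that off.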
wangting@ops04:/opt/module/miniconda3 >conda config --set auto_activate_base false
Configure a conda mirror in China
wangting@ops04:/opt/module/miniconda3 >conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
wangting@ops04:/opt/module/miniconda3 >conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
wangting@ops04:/opt/module/miniconda3 >conda config --set show_channel_urls yes
Create a Python 3.6 environment. --name (-n) takes a name of your choice for the virtual environment; python= takes the Python version you want.
wangting@ops04:/home/wangting >conda create --name superset python=3.6
Proceed ([y]/n)? y # wait for creation to finish; packages and dependencies are downloaded along the way
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate superset
#
# To deactivate an active environment, use
#
# $ conda deactivate
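# the output above means the environment was created
[Note] If conda create fails after printing WARNING: A newer version of conda exists. (as in the run below, which ended in a segmentation fault), update conda and try again. If the environment was created successfully, no update is needed.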
wangting@ops04:/home/wangting >conda create --name superset python=3.6
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.7.12
latest version: 4.10.1
Please update conda by running
$ conda update -n base -c defaults conda
Segmentation fault (core dumped)
wangting@ops04:/home/wangting >conda update -n base -c defaults conda
Common conda environment management commands
List all conda environments
wangting@ops04:/opt/module/miniconda3/pkgs >conda info --envs
# conda environments:
#
base * /opt/module/miniconda3
superset /opt/module/miniconda3/envs/superset
Activate a conda environment
wangting@ops04:/opt/module/miniconda3/pkgs >conda activate superset
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >python --version
Python 3.6.13 :: Anaconda, Inc.
Exit the current conda environment
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >conda deactivate
wangting@ops04:/opt/module/miniconda3/pkgs >
Verify that it works
# activate the environment again
wangting@ops04:/opt/module/miniconda3/pkgs >conda activate superset
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >
# start the Python interpreter inside the conda environment
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >python
Python 3.6.13 |Anaconda, Inc.| (default, Jun 4 2021, 14:25:59)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
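# install a package with pip inside the conda environment; -i sets the package index URL (the official PyPI index by default); this also verifies that pip works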
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >pip install gunicorn -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Collecting gunicorn
Downloading https://pypi.doubanio.com/packages/e4/dd/5b190393e6066286773a67dfcc2f9492058e9b57c4867a95f1ba5caf0a83/gunicorn-20.1.0-py3-none-any.whl (79 kB)
|████████████████████████████████| 79 kB 2.3 MB/s
Requirement already satisfied: setuptools>=3.0 in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (from gunicorn) (52.0.0.post20210125)
Installing collected packages: gunicorn
Successfully installed gunicorn-20.1.0
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >
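Add a second conda environment as a check. --name can be shortened to -n. If python= is omitted, conda creates the environment without its own Python, so python falls back to whatever is on the PATH (for example a system Python 2.7) rather than a conda-managed interpreter.
# the steps are the same as for the superset environment above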
wangting@ops04:/opt/module/miniconda3/pkgs >conda create -n wangting python=3.6
wangting@ops04:/opt/module/miniconda3/pkgs >
wangting@ops04:/opt/module/miniconda3/pkgs >conda info --envs
# conda environments:
#
base * /opt/module/miniconda3
superset /opt/module/miniconda3/envs/superset
wangting /opt/module/miniconda3/envs/wangting
wangting@ops04:/opt/module/miniconda3/pkgs >conda activate wangting
(wangting) wangting@ops04:/opt/module/miniconda3/pkgs >
Before installing Superset, install the following required dependencies
wangting@ops04:/opt/software >sudo yum install -y python-setuptools
wangting@ops04:/opt/software >sudo yum install -y gcc gcc-c++ libffi-devel python-devel python-pip python-wheel openssl-devel cyrus-sasl-devel openldap-devel
Activate the superset conda environment and do the installation there
wangting@ops04:/opt/software >conda activate superset
(superset) wangting@ops04:/opt/software >
Install (upgrade) setuptools and pip
(superset) wangting@ops04:/opt/software >pip install --upgrade setuptools pip -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Requirement already satisfied: setuptools in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (52.0.0.post20210125)
Collecting setuptools
Downloading https://pypi.doubanio.com/packages/4e/78/56aa1b5f4d8ac548755ae767d84f0be54fdd9d404197a3d9e4659d272348/setuptools-57.0.0-py3-none-any.whl (821 kB)
|████████████████████████████████| 821 kB 2.4 MB/s
Requirement already satisfied: pip in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (21.1.2)
Installing collected packages: setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 52.0.0.post20210125
Uninstalling setuptools-52.0.0.post20210125:
Successfully uninstalled setuptools-52.0.0.post20210125
Successfully installed setuptools-57.0.0
(superset) wangting@ops04:/opt/software >
Install Superset
# apache-superset pulls in a long list of dependencies; wait for the installation to finish
(superset) wangting@ops04:/opt/software >pip install apache-superset -i https://pypi.douban.com/simple/
Initialize the Superset database
(superset) wangting@ops04:/opt/software >superset db upgrade
Traceback (most recent call last):
File "/opt/module/miniconda3/envs/superset/bin/superset", line 5, in <module>
from superset.cli import superset
File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/__init__.py", line 21, in <module>
from superset.app import create_app
File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/app.py", line 45, in <module>
from superset.security import SupersetSecurityManager
File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/security/__init__.py", line 17, in <module>
from superset.security.manager import SupersetSecurityManager # noqa: F401
File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/security/manager.py", line 44, in <module>
from superset import sql_parse
File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/sql_parse.py", line 18, in <module>
from dataclasses import dataclass
ModuleNotFoundError: No module named 'dataclasses'
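# the command fails because the dataclasses module cannot be found (it only joined the standard library in Python 3.7); install it as the error suggests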
(superset) wangting@ops04:/opt/software >pip install dataclasses
Collecting dataclasses
Downloading dataclasses-0.8-py3-none-any.whl (19 kB)
Installing collected packages: dataclasses
Successfully installed dataclasses-0.8
# run the initialization again; this time it completes
(superset) wangting@ops04:/opt/software >superset db upgrade
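The failure above comes from the Python version rather than from Superset itself: dataclasses was added to the standard library in Python 3.7, so a 3.6 environment needs the backport package from PyPI. A minimal sketch of that check follows; the function names are made up for illustration and are not part of Superset.

```python
import importlib.util
import sys


def needs_dataclasses_backport(version_info=sys.version_info):
    """Return True when the interpreter predates the stdlib dataclasses module."""
    # dataclasses joined the standard library in Python 3.7.
    return tuple(version_info[:2]) < (3, 7)


def dataclasses_importable():
    """Return True if a dataclasses module (stdlib or backport) can be found."""
    return importlib.util.find_spec("dataclasses") is not None


if __name__ == "__main__":
    if needs_dataclasses_backport() and not dataclasses_importable():
        print("run: pip install dataclasses")
    else:
        print("dataclasses is available")
```

Run inside the superset environment before `superset db upgrade`, this would point at the missing backport without having to read the traceback.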
Create an admin user
(superset) wangting@ops04:/opt/software >export FLASK_APP=superset
(superset) wangting@ops04:/opt/software >flask fab create-admin
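Username [admin]: # press Enter; the admin user for logging in to the web UI
User first name [admin]: # press Enter; user details
User last name [user]: # press Enter; user details
Email [admin@fab.org]: # press Enter; email address
Password: # set a password (123456 here) for the admin user
Repeat for confirmation: # repeat the password (123456)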
logging was configured successfully
INFO:superset.utils.logging_configurator:logging was configured successfully
/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/flask_caching/__init__.py:202: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
"Flask-Caching: CACHE_TYPE is set to null, "
No PIL installation found
INFO:superset.utils.screenshots:No PIL installation found
Recognized Database Authentications.
Admin User admin created.
Initialize Superset
(superset) wangting@ops04:/opt/software >superset init
logging was configured successfully
INFO:superset.utils.logging_configurator:logging was configured successfully
...
...
INFO:superset.security.manager:Creating missing metrics permissions
Cleaning faulty perms
INFO:superset.security.manager:Cleaning faulty perms
(superset) wangting@ops04:/opt/software >
Install gunicorn to serve HTTP
(superset) wangting@ops04:/opt/software >pip install gunicorn -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Requirement already satisfied: gunicorn in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (20.0.4)
Requirement already satisfied: setuptools>=3.0 in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (from gunicorn) (57.0.0)
Start Superset
(superset) wangting@ops04:/opt/software >gunicorn --workers 5 --timeout 120 --bind ops04:8787 "superset.app:create_app()" --daemon
(superset) wangting@ops04:/opt/software >
[Note] ops04 is the hostname; /etc/hosts maps it to an IP address.
Check that Superset is running
(superset) wangting@ops04:/opt/software >netstat -tnlpu|grep 8787
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 11.8.38.86:8787 0.0.0.0:* LISTEN 18884/python
(superset) wangting@ops04:/opt/software >ps -ef | grep 8787 | grep -v grep
wangting 18884 1 0 11:32 ? 00:00:00 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting 18887 18884 5 11:32 ? 00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting 18888 18884 5 11:32 ? 00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting 18890 18884 5 11:32 ? 00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting 18892 18884 5 11:32 ? 00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting 18893 18884 5 11:32 ? 00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
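The same check can be done from any machine that can reach the server, without netstat. The sketch below simply tries to open a TCP connection; the function name is_listening and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostnames that do not resolve.
        return False
```

With the setup in this article, `is_listening("ops04", 8787)` should return True once gunicorn has started. It only proves that the port accepts connections, not that Superset answers requests correctly.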
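Stop the Superset service (only when you actually need to)
# this kills each of the processes listed above by PID; the service itself provides no command-line stop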
(superset) wangting@ops04:/opt/software >ps -ef | awk '/gunicorn/ && !/awk/{print $2}' | xargs kill -9
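kill -9 ends every process immediately. A gentler option is to send SIGTERM to the gunicorn master only, which gunicorn treats as a graceful shutdown and passes on to its workers. The sketch below assumes `ps -ef` style text as input (for example captured with subprocess); both function names are hypothetical.

```python
import os
import signal


def gunicorn_master_pids(ps_output):
    """Pick the gunicorn master PIDs out of `ps -ef` text.

    A master is a gunicorn process whose parent is not itself a gunicorn process.
    """
    procs = {}
    for line in ps_output.splitlines():
        fields = line.split()
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 8 or "gunicorn" not in line:
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # skip the header or malformed lines
        procs[int(fields[1])] = int(fields[2])
    return sorted(pid for pid, ppid in procs.items() if ppid not in procs)


def stop_gracefully(pids):
    """Send SIGTERM so each gunicorn master can shut its workers down cleanly."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
```

Fed the ps output shown earlier, gunicorn_master_pids would return only the parent process (18884 in that run). Filter out any grep lines first, since they also contain the word gunicorn.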
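Log in to Superset: http://ops04:8787/login/
[Note]
The username and password are the ones defined with create-admin above.
The URL http://ops04:8787/login/ works because ops04 is mapped to its IP address in C:\Windows\System32\drivers\etc\hosts; you can also use ip:8787 directly.
Superset needs a different driver for each kind of data source it connects to. The official documentation lists them at the following address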
https://superset.apache.org/docs/databases/installing-database-drivers
pip install commands and connection string formats for common data sources
Database | PyPI package | Connection String |
---|---|---|
Apache Hive | pip install pyhive | hive://hive@{hostname}:{port}/{database} |
Apache Impala | pip install impyla | impala://{hostname}:{port}/{database} |
Apache Kylin | pip install kylinpy | kylin:// |
Apache Spark SQL | pip install pyhive | hive://hive@{hostname}:{port}/{database} |
Big Query | pip install pybigquery | bigquery://{project_id} |
Elasticsearch | pip install elasticsearch-dbapi | elasticsearch+http://{user}:{password}@{host}:9200/ |
MySQL | pip install mysqlclient | mysql:// |
Oracle | pip install cx_Oracle | oracle:// |
PostgreSQL | pip install psycopg2 | postgresql:// |
Presto | pip install pyhive | presto:// |
SQLite | | sqlite:// |
SQL Server | pip install pymssql | mssql:// |
Install the mysqlclient dependency
(superset) wangting@ops04:/opt/software >conda install mysqlclient
Proceed ([y]/n)? # y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
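Before restarting, it can help to confirm that the driver is actually importable from the superset environment. Note that the package name and the module name can differ: mysqlclient is imported as MySQLdb and impyla as impala. The sketch below is an illustration only; the mapping covers a few of the packages from the table and missing_drivers is a made-up helper name.

```python
import importlib.util

# PyPI package name -> top-level module it provides (the two can differ).
DRIVER_MODULES = {
    "pyhive": "pyhive",
    "impyla": "impala",
    "mysqlclient": "MySQLdb",
    "cx_Oracle": "cx_Oracle",
    "psycopg2": "psycopg2",
    "pymssql": "pymssql",
}


def missing_drivers(packages, modules=DRIVER_MODULES):
    """Return the entries of `packages` whose module cannot be found."""
    missing = []
    for package in packages:
        # Fall back to the package name when no mapping is known.
        module = modules.get(package, package)
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing
```

Calling `missing_drivers(["mysqlclient"])` with the superset environment's Python should return an empty list once the install above has succeeded.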
Restart Superset after the installation
(superset) wangting@ops04:/opt/software >ps -ef | awk '/gunicorn/ && !/awk/{print $2}' | xargs kill -9
(superset) wangting@ops04:/opt/software >netstat -tnlpu|grep 8787
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
(superset) wangting@ops04:/opt/software >gunicorn --workers 5 --timeout 120 --bind ops04:8787 "superset.app:create_app()" --daemon
(superset) wangting@ops04:/opt/software >netstat -tnlpu|grep 8787
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 11.8.38.86:8787 0.0.0.0:* LISTEN 44883/python
(superset) wangting@ops04:/opt/software >
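Database configuration
Add a database and save it
Data - Databases
gmall # the display name of the database in Superset; change as needed
mysql://root:123456@11.8.38.86/gmall?charset=utf8
root # username
123456 # database password
11.8.38.86 # database IP
gmall # database name
charset=utf8 # character set
The port defaults to 3306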
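To double-check a connection string before pasting it into Superset, the pieces listed above can be read back with the standard library. This is a sketch under assumptions: describe_mysql_uri is a hypothetical helper, and the default port of 3306 is filled in only when the URI does not name one.

```python
from urllib.parse import parse_qs, urlsplit


def describe_mysql_uri(uri, default_port=3306):
    """Split a SQLAlchemy-style URI into the parts explained above."""
    parts = urlsplit(uri)
    return {
        "driver": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        # The URI may omit the port; MySQL then listens on 3306 by default.
        "port": parts.port or default_port,
        "database": parts.path.lstrip("/"),
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }
```

For `mysql://root:123456@11.8.38.86/gmall?charset=utf8` this yields user root, host 11.8.38.86, port 3306, database gmall, and the option charset=utf8. Passwords containing characters such as @ or / would need URL-encoding before they go into the URI.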
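Add a test table
table: supersetwt
Table configuration
Data - Datasets, click + to add
Once the database and table are added, the data source is available to Superset
Add a dashboard: Dashboards, click + to add
Create a chart: click the table and choose a chart template
Edit the chart
Click SAVE, then open the dashboard to see the first version
The chart is not very telling because the weight values are too close together; edit the MySQL data to widen the spread
The small menu at the top right of a chart lets you refresh its data
Add another chart (reading statistics)
Add another chart (highest body temperature over the past week)
The dashboard layout can be edited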