【pandas】【可视化】一 SQL 数据可视化简单实例 - 使用 pandas.io.sql, MySQL, matplotlib

Overview

    • 1 depends
    • 2 Prepare
      • 2.1 SQL
      • 2.2 matplotlib
    • 3 pandas 实现简单可视化
    • Reference

1 depends

$ pip install numpy, pandas
$
$ sudo apt install python3.6-tk  # maybe need
$ pip install matplotlib  # x
$ pip uninstall matplotlib
$ sudo apt install python3-matplotlib 
$
$ sudo apt install -y python3-dev libpython3.<?>-minimal  # maybe need
$ sudo apt install -y libmysqlclient-dev  # maybe need
$ pip install mysqlclient
$ 

注意上面 matplotlib 使用 APT 安装1



2 Prepare

2.1 SQL

$ mysql -u <the username> -p
: <enter the password>

mysql> USE <the database>;
mysql> SELECT i.yuanchuan, i.zhuanzai, i.visitor, i.rank, i.date, u.username
       FROM userinfors AS i
       INNER JOIN users AS u
       ON i.user_id =  u.id WHERE u.id = 1;
+-----------+----------+---------+-------+------------+----------+
| yuanchuan | zhuanzai | visitor | rank  | date       | username |
+-----------+----------+---------+-------+------------+----------+
|       103 |       19 |   36176 | 41440 | 2019-05-07 | RDpWTeHM |
|       103 |       20 |   36402 | 41138 | 2019-05-08 | RDpWTeHM |
|       103 |       20 |   36607 | 40748 | 2019-05-09 | RDpWTeHM |
|       103 |       20 |   37205 | 40478 | 2019-05-13 | RDpWTeHM |
|       103 |       20 |   37205 | 40423 | 2019-05-14 | RDpWTeHM |
|       103 |       20 |   36776 | 40423 | 2019-05-15 | RDpWTeHM |
|       104 |       20 |   36325 | 40090 | 2019-05-16 | RDpWTeHM |
|       104 |       20 |   36325 | 40062 | 2019-05-17 | RDpWTeHM |
|       104 |       20 |   38371 | 40015 | 2019-05-18 | RDpWTeHM |
|       105 |       20 |   38652 | 39437 | 2019-05-20 | RDpWTeHM |
|       105 |       20 |   38921 | 39403 | 2019-05-21 | RDpWTeHM |
|       105 |       20 |   39068 | 39420 | 2019-05-22 | RDpWTeHM |
+-----------+----------+---------+-------+------------+----------+
12 rows in set (0.00 sec)

mysql>
mysql> quit;
$ 

2.2 matplotlib

运行下面脚本测试 plt.show() 能否工作2

#!/usr/bin/env python3
"""test_plt_show.py

Environment:
    - Ubuntu 18.04 + Gnome(Mac Themes)
    - sudo apt install python3-matplotlib
    - $ export DISPLAY=:0

Usage:
    $ ipython
    >>> from test_plt_show import *
    >>> plt.show()
"""

import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2*np.pi*t)
plt.plot(t, s)
plt.title('About as simple as it gets, folks')

if __name__ == '__main__':
    plt.show()

如果不能工作则考虑使用 plt.save("") 代替。


如果出现如下错误

Unable to init server: Could not connect: Connection refused
Unable to init server: 无法连接: Connection refused

(ipython:31480): Gdk-CRITICAL **: 16:39:40.005: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed

则需要注意上面代码中给出的方案3

$ export DISPLAY=:0


3 pandas 实现简单可视化

$ export THE_USERNAME=<the username>
$ export THE_PASSWORD=<the password>
$ export THE_DB = <the database name>
$ ipython

>>> import pandas as pd
>>> import pandas.io.sql as sql
>>> import matplotlib.pyplot as plt
>>> import os, sys
>>> USERNAME = os.environ.get("THE_USERNAME")
>>> PASSWORD = os.environ.get("THE_PASSWORD")
>>> DB_NAME = os.environ.get("THE_DB")
>>> import MySQLdb
>>> conn = MySQLdb.connect( 
   ...:     'localhost', 
   ...:     USERNAME, PASSWORD, DB_NAME, 
   ...:     charset='utf8')
>>> 
>>> dataframe = sql.read_sql_query( 
    ...:     "SELECT i.visitor, i.rank, i.date " 
    ...:     "FROM userinfors AS i " 
    ...:     "INNER JOIN users AS u " 
    ...:     "ON i.user_id =  u.id WHERE u.id = 1;", conn)
>>> # 注意上面每行 "..." 后面没有逗号(,),引号("")里面末尾有空格( )
>>> dataframe                                                                                                                     
    visitor   rank        date
0     36176  41440  2019-05-07
1     36402  41138  2019-05-08
2     36607  40748  2019-05-09
3     37205  40478  2019-05-13
4     37205  40423  2019-05-14
5     36776  40423  2019-05-15
6     36325  40090  2019-05-16
7     36325  40062  2019-05-17
8     38371  40015  2019-05-18
9     38652  39437  2019-05-20
10    38921  39403  2019-05-21
11    39068  39420  2019-05-22

>>> df = dataframe.set_index('date')
>>> df.plot()                                                                                                                    
<matplotlib.axes._subplots.AxesSubplot at 0x7f02eb914c88>

>>> plt.show()
>>> # if .show() not work(open Figure window)
>>> plt.savefig('/home//
.png') >>> exit $

注意 .set_index.plot 的使用4.

最后使用 matplotlib 绘出的图形:

【pandas】【可视化】一 SQL 数据可视化简单实例 - 使用 pandas.io.sql, MySQL, matplotlib_第1张图片



Reference



  1. https://github.com/NVIDIA/DIGITS/issues/187 ↩︎

  2. https://stackoverflow.com/questions/43397162/show-matplotlib-plots-in-ubuntu-windows-subsystem-for-linux ↩︎

  3. https://blog.csdn.net/qq_29757283/article/details/89463085 > SSH 远程启动 xterm ↩︎

  4. https://stackoverflow.com/questions/38274187/plotting-dataframe-column-datetime ↩︎

你可能感兴趣的:(#,Python)