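A restaurant's order data for August is stored in three tables, order1, order2, and order3, and the manager wants to understand the restaurant's revenue for the month.
Read the three tables one by one with the pandas library: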
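import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Open each file in a with-block so the handle is closed after reading
with open('order1.csv') as order1:
    data1 = pd.read_csv(order1)
with open('order2.csv') as order2:
    data2 = pd.read_csv(order2)
with open('order3.csv') as order3:
    data3 = pd.read_csv(order3)
# Merge the three tables order1, order2, order3 by stacking their rows
data = pd.concat([data1, data2, data3], axis=0)
data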
# Print the column labels
data.columns
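As a side note on the merge step, the sketch below shows how `pd.concat` treats row labels when tables are stacked. The two small frames are hypothetical stand-ins for the CSV files, reusing the column names `order_id`, `counts`, and `amounts` with made-up values. It is only meant to illustrate that, by default, each table keeps its own row labels, so labels repeat in the merged table unless `ignore_index=True` is passed.

```python
import pandas as pd

# Hypothetical miniature stand-ins for two of the order tables (values are made up)
part1 = pd.DataFrame({"order_id": [101, 101], "counts": [1, 2], "amounts": [30, 12]})
part2 = pd.DataFrame({"order_id": [102], "counts": [3], "amounts": [8]})

# Default behaviour: each table keeps its own row labels, so labels repeat
stacked = pd.concat([part1, part2], axis=0)

# ignore_index=True renumbers the rows 0..n-1 in the merged table
renumbered = pd.concat([part1, part2], axis=0, ignore_index=True)

print(list(stacked.index))
print(list(renumbered.index))
```

Repeated row labels do not affect the grouping and summing done later, but they can be surprising when selecting rows by label with `.loc`.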
plt.rcParams['font.sans-serif'] = ['SimHei']  # use a font that can render Chinese characters in figure text
# Compute the sales amount of each row
data['price'] = data['counts'] * data['amounts']
data['price']
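# Map each order timestamp to its day of the week
week = pd.DatetimeIndex(data['place_order_time'])
# day_name() replaces the weekday_name attribute, which was removed in pandas 1.0
data['weekday_name'] = week.day_name()
# Map each order timestamp to its day of the month
data['day'] = week.day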
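The date handling above can be tried in isolation. The following is a minimal sketch on three made-up timestamps (the column name `place_order_time` matches the data, the values do not come from it). It uses the `.dt` accessor on a Series, which is an alternative to building a `DatetimeIndex` and should give the same values.

```python
import pandas as pd

# Hypothetical timestamps, only for illustration
orders = pd.DataFrame({
    "place_order_time": [
        "2016-08-01 11:05:36",
        "2016-08-06 18:30:00",
        "2016-08-07 12:00:00",
    ]
})

# Parse the strings into datetimes once, then derive both features
ts = pd.to_datetime(orders["place_order_time"])
orders["weekday_name"] = ts.dt.day_name()  # English weekday names
orders["day"] = ts.dt.day                  # day of the month, 1..31

print(orders)
```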
Compute the restaurant's daily sales for August:
# Group the rows by day of the month
data_gb = data[['day', 'price']].groupby(by='day')
# Aggregate: sum the sales within each day
number = data_gb.sum()
number
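One thing to keep in mind with a per-day sum is that days without any orders simply do not appear in the result. The sketch below, on made-up numbers, shows the grouping step and one possible way (an assumption of this example, not part of the original analysis) to fill such gaps with zeros using `reindex`, so that the result lines up with a fixed range of days.

```python
import pandas as pd

# Hypothetical sales rows: day 3 and day 5 have no orders
sales = pd.DataFrame({"day": [1, 1, 2, 4], "price": [30, 24, 10, 50]})

# Sum the sales within each day; only days that occur in the data appear
daily = sales.groupby("day")["price"].sum()

# Align to a fixed range of days, treating missing days as zero sales
full = daily.reindex(range(1, 6), fill_value=0)

print(daily)
print(full)
```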
plt.figure(figsize=(10, 7))  # set the figure size
# Plot against the days present in the result so x and y always have the same length
plt.scatter(number.index, number['price'], marker='D')
plt.plot(number.index, number['price'])
plt.title('Restaurant sales trend, August 2016')
plt.xlabel('Date')
plt.ylabel('Sales')
plt.xticks(range(0, 32, 7))
plt.show()
data_gb = data[['weekday_name', 'price']].groupby(by='weekday_name')
# Aggregate: sum the sales for each day of the week
outcome = data_gb.sum()
outcome
# Put the weekdays in calendar order (groupby returns them sorted alphabetically)
weekday_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
outcome2 = outcome.loc[weekday_order, 'price']
outcome2
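The reordering step can be sketched on a small made-up table. Selecting with `.loc` and a list of labels raises a `KeyError` if a weekday is absent from the data, so this example uses `reindex` with `fill_value=0` instead. That substitution is a choice made for this sketch, not something the original analysis requires when all seven weekdays are present.

```python
import pandas as pd

# Hypothetical rows covering only three weekdays
sales = pd.DataFrame({
    "weekday_name": ["Sunday", "Monday", "Friday", "Monday"],
    "price": [40, 10, 25, 5],
})

# groupby sorts the keys alphabetically: Friday, Monday, Sunday
by_name = sales.groupby("weekday_name")["price"].sum()

# Reorder into calendar order; weekdays without sales get 0
calendar_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"]
ordered = by_name.reindex(calendar_order, fill_value=0)

print(ordered)
```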
plt.bar(range(1, len(outcome2) + 1), outcome2, width=0.5, alpha=0.5)
plt.xticks(range(1, len(outcome2) + 1), outcome2.index)
plt.title('Sales by day of the week')
# Write each bar's total just above the bar
for i, j in zip(range(1, len(outcome2) + 1), outcome2):
    plt.text(i, j, '%i' % j, ha='center', va='bottom')
plt.show()
Draw a donut chart:
plt.style.use('Solarize_Light2')  # apply the style before creating the figure
plt.figure(figsize=(5, 5))
# A wedge width below 1 leaves a hole in the middle, turning the pie into a donut
plt.pie(outcome2, labels=outcome2.index, autopct='%.2f %%', wedgeprops=dict(width=0.6, edgecolor='w'))
plt.title('Share of sales by day of the week')
plt.show()
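The percentages printed on the wedges can also be computed directly, which is handy for checking the chart or reporting the numbers in text. The sketch below uses made-up weekday totals and the same format string as `autopct`. It is an illustration of the arithmetic (each value divided by the total, times 100), not a reproduction of matplotlib's internals.

```python
import pandas as pd

# Hypothetical weekday totals, only for illustration
weekly = pd.Series({"Monday": 150.0, "Saturday": 450.0, "Sunday": 400.0})

# Share of each weekday in the overall total, in percent
share = weekly / weekly.sum() * 100

# Format the shares the same way as autopct='%.2f %%'
labels = ["%.2f %%" % p for p in share]

print(dict(zip(weekly.index, labels)))
```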
Group and aggregate by day:
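data_gb = data[['order_id', 'price', 'day']].groupby(by='day')
# Count the distinct order IDs in a group (an order ID may appear on several rows)
def count_unique(values):
    return len(np.unique(values))
# For each day: total sales and number of distinct orders
outcome3 = data_gb.agg({'price': 'sum', 'order_id': count_unique})
outcome3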
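The same per-day summary can be written with pandas' built-in `nunique` aggregation and named output columns. This is a sketch on made-up rows, and the output column name `orders` is a hypothetical choice for this example. It shows why counting distinct IDs matters: day 1 has three rows but only two orders.

```python
import pandas as pd

# Hypothetical order lines: order 101 appears on two rows
items = pd.DataFrame({
    "day": [1, 1, 1, 2],
    "order_id": [101, 101, 102, 103],
    "price": [30, 24, 10, 50],
})

# Named aggregation: total sales and number of distinct orders per day
summary = items.groupby("day").agg(
    price=("price", "sum"),
    orders=("order_id", "nunique"),
)

print(summary)
```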
plt.figure(figsize=(10, 6))
# Bubble chart: height shows daily sales, marker size shows the number of orders
plt.scatter(outcome3.index, outcome3['price'], s=outcome3['order_id'])
plt.title('Order volume and sales over time')
plt.xlabel('Time')
plt.ylabel('Sales')
plt.show()
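This article used the pandas library in Python to read and merge the tables, then grouped and aggregated the data with pandas groupby, with NumPy supplying aggregation functions. Finally, the aggregated data was plotted with matplotlib as a line chart, a bar chart, a donut chart, and a bubble chart, which together answer the question about August revenue.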
Dataset (Baidu Netdisk):
Link: https://pan.baidu.com/s/1t8K2reUb1cIqIwbMU1XkzA
Extraction code: wlfn