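I. Objectives
1. Learn to use the pandas read_csv function
2. Learn to use the plot function in matplotlib.pyplot and its parameters
3. Learn to use the scatter function in matplotlib.pyplot and its parameters
4. Learn to use the pie function in matplotlib.pyplot and its parameters
5. Learn to use the bar function in matplotlib.pyplot and its parameters
6. Look up and use the histogram function in matplotlib.pyplot and its parameters
II. Tasks and Results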
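1. Use pandas to read the CSV at the following link and store the result in a variable named "df": (https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv)
- Hint: pandas does not parse a raw CSV string directly; read_csv expects a path, a URL, or a file object. A file object can be created from the downloaded text with the following code:
- import requests, io
- response = requests.get('YOUR_LINK')
- file_object = io.StringIO(response.content.decode('utf-8'))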
import requests
import io
import pandas as pd
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
print(df)
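To illustrate the hint without a network connection, the sketch below wraps an in-memory CSV string in io.StringIO before passing it to read_csv. The three rows are made-up sample values, not the real dataset, and the name sample_df is hypothetical; only the column names month_number and total_profit are taken from the code in this report.

```python
import io
import pandas as pd

# Made-up sample rows for illustration; not the real company_sales_data.csv
csv_text = "month_number,total_profit\n1,1000\n2,1500\n3,1200\n"

# read_csv treats a plain str as a path or URL, so wrap the text in a file object
file_object = io.StringIO(csv_text)
sample_df = pd.read_csv(file_object)

print(sample_df.shape)
print(sample_df["total_profit"].tolist())
```

read_csv also accepts a URL directly, so pd.read_csv('YOUR_LINK') may be a shorter alternative when the server accepts the request.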
2. Read the total profit for all months and display it as a line chart. The line chart should look like this:
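A minimal sketch of a plain line chart for this task, under stated assumptions: the profit values are made up, df_sample stands in for the df loaded in task 1, and the axis labels and title are placeholders rather than the ones in the expected figure. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up profits for illustration; in the report, df comes from read_csv
df_sample = pd.DataFrame({
    "month_number": [1, 2, 3, 4],
    "total_profit": [1000, 1500, 1200, 1800],
})

fig, ax = plt.subplots()
# plot() with default styling draws a solid line through the points
(line,) = ax.plot(df_sample["month_number"], df_sample["total_profit"])
ax.set_xlabel("Month number")             # placeholder label
ax.set_ylabel("Total profit")             # placeholder label
ax.set_title("Company profit per month")  # placeholder title
```

With an interactive backend, plt.show() after these calls would open the figure window.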
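3. Get the total profit for all months and display a line chart with the following style properties. The line chart should look like this: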
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['month_number'].tolist()
y = df['total_profit'].tolist()
plt.plot(x, y, color='r', marker='o', markerfacecolor='black', linestyle='dashed')
plt.xlabel("Month number")
plt.ylabel("profit in dollar")
plt.title("Company Sales data of last year")
plt.ylim(ymin=100000, ymax=500000)
y_major_locator = plt.MultipleLocator(100000)
x_major_locator = plt.MultipleLocator(1)
ax = plt.gca()
ax.xaxis.set_major_locator(x_major_locator)
ax.yaxis.set_major_locator(y_major_locator)
plt.legend(('profit data of last year',), loc='lower right')
plt.show()
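4. Read the sales data of all products and display it as a multi-line plot, that is, one separate line per product. The plot should look like this: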
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['month_number'].tolist()
y1 = df.iloc[:, 1].tolist()
y2 = df.iloc[:, 2].tolist()
y3 = df.iloc[:, 3].tolist()
y4 = df.iloc[:, 4].tolist()
y5 = df.iloc[:, 5].tolist()
y6 = df.iloc[:, 6].tolist()
plt.plot(x, y1, marker='o')
plt.plot(x, y2, marker='o')
plt.plot(x, y3, marker='o')
plt.plot(x, y4, marker='o')
plt.plot(x, y5, marker='o')
plt.plot(x, y6, marker='o')
plt.xlabel("Month number")
plt.ylabel("Sales units in number")
plt.title("Sales data")
plt.ylim(ymax=18000)
x_major_locator = plt.MultipleLocator(1)
ax = plt.gca()
ax.xaxis.set_major_locator(x_major_locator)
plt.legend(('Face cream Sales Data', 'Face Wash Sales Data', 'ToothPaste Sales Data',
'BathingSoap Sales Data', 'Shampoo Sales Data', 'Moisturizer Sales Data'), loc='upper left')
plt.show()
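As an alternative to one variable per product, the lines can be drawn in a loop and labelled as they are plotted, so the legend cannot drift out of order. This is only a sketch: the values are made up, and the column names facecream and facewash are assumptions about the CSV header, which the code above accesses by position instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values; the product column names are assumed, not confirmed
df_sample = pd.DataFrame({
    "month_number": [1, 2, 3],
    "facecream": [10, 20, 15],
    "facewash": [5, 8, 12],
})

fig, ax = plt.subplots()
# Every column except month_number is treated as one product
product_columns = [c for c in df_sample.columns if c != "month_number"]
for column in product_columns:
    ax.plot(df_sample["month_number"], df_sample[column],
            marker="o", label=f"{column} Sales Data")
ax.set_xlabel("Month number")
ax.set_ylabel("Sales units in number")
ax.legend(loc="upper left")  # picks up the label= given to each plot call

legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_labels)
```

On the real data the column list would need to exclude the total columns as well, for example by slicing df.columns[1:7] as the positional code above does.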
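5. Read the toothpaste sales data for each month and display it as a scatter plot. Also add a grid to the plot. The grid line style should be "--". The scatter plot should look like this: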
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['month_number'].tolist()
y3 = df.iloc[:, 3].tolist()
plt.scatter(x, y3, marker='o')
plt.xlabel("Month number")
plt.ylabel("Number of units Sold")
plt.title("Tooth paste Sales data")
y_major_locator = plt.MultipleLocator(500)
x_major_locator = plt.MultipleLocator(1)
ax = plt.gca()
ax.yaxis.set_major_locator(y_major_locator)
ax.xaxis.set_major_locator(x_major_locator)
plt.legend(('Tooth paste Sales data',), loc='upper left')
plt.grid(linestyle='--')
plt.show()
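6. Read the face cream and face wash sales data and display it as a bar chart. The bar chart should show the number of units sold per month for each product. The bar chart should look like this: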
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['month_number'].tolist()
y1 = df.iloc[:, 1].tolist()
y2 = df.iloc[:, 2].tolist()
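bar_width = 0.2
index = np.arange(len(x)) + 1  # bar positions 1..12, one per month
plt.bar(index, y1, bar_width)
# Shift the second series by one bar width so the bars sit side by side
plt.bar(index + bar_width, y2, bar_width)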
plt.xlabel("Month number")
plt.ylabel("Sales units in number")
plt.title("Facewash and facecream sales data")
plt.ylim(ymin=0)
x_major_locator = plt.MultipleLocator(1)
y_major_locator = plt.MultipleLocator(500)
ax = plt.gca()
ax.xaxis.set_major_locator(x_major_locator)
ax.yaxis.set_major_locator(y_major_locator)
plt.legend(('Face Cream sales data', 'Face Wash sales data', ), loc='upper left')
plt.grid(linestyle='--')
plt.show()
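In the code above the first bar of each pair is centred on the month tick and the second sits to its right. If the pair as a whole should be centred on the tick, one possible approach is to shift both series by half a bar width in opposite directions. The sketch below uses made-up values and hypothetical names (months, left_positions, right_positions).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 4)   # hypothetical month numbers 1..3
cream = [10, 20, 15]       # made-up units sold
wash = [5, 8, 12]          # made-up units sold
bar_width = 0.25

# Offsets that centre each pair of bars on its month tick
left_positions = months - bar_width / 2
right_positions = months + bar_width / 2

fig, ax = plt.subplots()
ax.bar(left_positions, cream, width=bar_width, label="Face Cream sales data")
ax.bar(right_positions, wash, width=bar_width, label="Face Wash sales data")
ax.set_xticks(months)
ax.legend(loc="upper left")

print(left_positions.tolist())
print(right_positions.tolist())
```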
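7. Read the bathing soap sales data for all months and display it as a bar chart, then save the plot to disk (path: D:\7.png, dpi 150). The bar chart should look like this: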
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['month_number'].tolist()
y4 = df.iloc[:, 4].tolist()
plt.bar(x, y4)
plt.xlabel("Month Number")
plt.ylabel("Sales units in number")
plt.title("bathingsoap sales data")
plt.ylim(ymin=0)
x_major_locator = plt.MultipleLocator(1)
y_major_locator = plt.MultipleLocator(2000)
ax = plt.gca()
ax.xaxis.set_major_locator(x_major_locator)
ax.yaxis.set_major_locator(y_major_locator)
plt.grid(linestyle='--')
plt.savefig('D:/7.png', dpi=150)
plt.show()
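8. Calculate last year's total sales for each product and display it as a pie chart. Note: the pie chart should show each product's share of the units sold over the year as a percentage. The pie chart should look like this: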
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
# Yearly total for each product (columns 1 to 6 hold the six products)
y1 = sum(df.iloc[:, 1].tolist())
y2 = sum(df.iloc[:, 2].tolist())
y3 = sum(df.iloc[:, 3].tolist())
y4 = sum(df.iloc[:, 4].tolist())
y5 = sum(df.iloc[:, 5].tolist())
y6 = sum(df.iloc[:, 6].tolist())
pienum = [y1, y2, y3, y4, y5, y6]
langs = ['Face cream Sales Data', 'Face Wash Sales Data', 'ToothPaste Sales Data',
'BathingSoap Sales Data', 'Shampoo Sales Data', 'Moisturizer Sales Data']
plt.pie(pienum, labels = langs,autopct='%1.2f%%')
plt.title("Sales data")
plt.legend(('Face cream', 'Face Wash', 'ToothPaste',
'BathingSoap', 'Shampoo', 'Moisturizer'), loc='lower right')
plt.show()
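The six yearly totals can also be computed in one step with DataFrame.sum, which adds up each column. A small sketch with made-up numbers and assumed column names (facecream, facewash); it also reproduces the percentage that autopct prints on each wedge.

```python
import pandas as pd

# Made-up values; the product column names are assumed, not confirmed
df_sample = pd.DataFrame({
    "month_number": [1, 2, 3],
    "facecream": [10, 20, 30],
    "facewash": [5, 5, 10],
})

# Column-wise sum over the product columns only
totals = df_sample.drop(columns="month_number").sum()
# Share of each product in percent, rounded like autopct='%1.2f%%'
shares = (totals / totals.sum() * 100).round(2)

print(totals.to_dict())
print(shares.to_dict())
```

The resulting totals could be passed to plt.pie in place of the pienum list, with totals.index supplying the labels.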
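9. Read the bathing soap and face wash sales data for all months and display them using subplots. The subplots should look like this: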
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['month_number'].tolist()
y4 = df.iloc[:, 4].tolist()
y2 = df.iloc[:, 2].tolist()
plt.subplot(211)
plt.plot(x, y4, color='black', marker='o')
plt.title("Sales data of a Bathingsoap")
plt.xticks([])
plt.subplot(212)
plt.plot(x, y2, color='r', marker='o')
plt.title("Sales data of a facewash")
x_major_locator = plt.MultipleLocator(1)
ax = plt.gca()
ax.xaxis.set_major_locator(x_major_locator)
plt.xlabel("Month Number")
plt.ylabel("Sales units in number")
plt.show()
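The same two-panel layout can be sketched with plt.subplots, which returns both axes at once; sharex=True links the x-axes and hides the tick labels of the upper panel. The data values and the names ax_top and ax_bottom are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt

months = [1, 2, 3]    # hypothetical month numbers
soap = [10, 30, 20]   # made-up bathing soap units
wash = [5, 15, 10]    # made-up face wash units

# Two rows, one column; both panels share the same x-axis
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, sharex=True)
ax_top.plot(months, soap, color="black", marker="o")
ax_top.set_title("Sales data of a Bathingsoap")
ax_bottom.plot(months, wash, color="r", marker="o")
ax_bottom.set_title("Sales data of a facewash")
ax_bottom.set_xlabel("Month Number")
ax_bottom.set_ylabel("Sales units in number")
fig.tight_layout()  # keep titles and labels from overlapping
```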
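10. Look up the documentation for histograms (plt.hist). Then read the total profit for each month and display it as a histogram (with profit ranges on the horizontal axis). The histogram should look like this: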
import requests
import io
import pandas as pd
import matplotlib.pyplot as plt
response = requests.get('https://pynative.com/wp-content/uploads/2019/01/company_sales_data.csv')
file_object = io.StringIO(response.content.decode('utf-8'))
df = pd.read_csv(file_object)
x = df['total_profit'].tolist()
bins = [150000, 175000, 200000, 225000, 250000, 300000, 350000]
plt.hist(x, bins)
plt.xticks(bins)
plt.xlabel("profit in dollar")
plt.ylabel("Actual Profit in dollar")
plt.title('Profit data')
plt.show()
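When explicit bin edges are passed, values outside the first and last edge are not counted. A quick way to check the bar heights and spot such values is numpy.histogram, which applies the same binning rule. The profit values below are made up for illustration; only the bin edges come from the code above.

```python
import numpy as np

profits = [160000, 180000, 190000, 260000, 400000]  # made-up values
bins = [150000, 175000, 200000, 225000, 250000, 300000, 350000]

# counts[i] is the number of values in [bins[i], bins[i+1]);
# the last bin also includes its right edge
counts, edges = np.histogram(profits, bins=bins)

# Values that fall outside the bin range and are therefore dropped
outside = [p for p in profits if p < bins[0] or p > bins[-1]]

print(counts.tolist())  # [1, 2, 0, 0, 1, 0]
print(outside)          # [400000]
```

If every month should appear in the histogram, the bin list would need to extend past the largest profit in the data.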