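This article applies Apriori to a supermarket market-basket dataset and FP-growth to a movie dataset, as a hands-on exercise with both algorithms.
1. Dataset:
The data files are in the Data folder of https://github.com/ywchiu/python_for_data_science. Read the csv files as shown in the code in this article.
2. Apriori for market-basket analysis
2.1. Code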
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
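# Read the transaction data
df = pd.read_csv('F:/python_for_data_science-master/Data/Market_Basket.csv', header = None)
df.head()
df.count()  # count the non-null values in each column
# Build the list of transactions
transactions = []
for i in range(0, 7501):
    # Items in the i-th transaction (i.e. the i-th customer's basket).
    # Empty cells are NaN, and str() turns them into the string 'nan',
    # so 'nan' is treated as an item and shows up in the rules.
    transactions.append([str(df.values[i,j]) for j in range(0, 20)])
#transactions[1]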
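Because short baskets are padded with empty cells, the loop above turns every missing value into the literal item 'nan'. A minimal sketch of one way to skip those cells is shown here; the three-row `toy` table and the helper name `rows_to_transactions` are assumptions made for illustration, not part of the original code.

```python
import pandas as pd

# Hypothetical three-basket table standing in for Market_Basket.csv;
# short baskets are padded with missing values, as in the real file.
toy = pd.DataFrame([
    ["milk", "spaghetti", "avocado"],
    ["milk", "spaghetti", None],
    ["chicken", None, None],
])

def rows_to_transactions(frame):
    """Return one list of item names per row, skipping missing cells."""
    return [[str(item) for item in row if pd.notna(item)]
            for row in frame.values.tolist()]

clean_transactions = rows_to_transactions(toy)
print(clean_transactions)
```

Applied to the real `df`, this would presumably remove the rules that contain 'nan', so the printed rule list would be shorter than the one shown in this article.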
# Generate association rules with Apriori
from apyori import apriori
rules = apriori(transactions, min_support = 0.003, min_confidence = 0.2, min_lift = 3, min_length = 2)  # run the Apriori algorithm
# Visualising the results
results = list(rules)
results[0]
results[0].ordered_statistics[0]
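The three thresholds passed to `apriori` (`min_support`, `min_confidence`, `min_lift`) are easier to tune once the quantities are computed by hand. The sketch below does that on five made-up baskets using the standard definitions; the baskets and function names are assumptions for illustration and are not how apyori is implemented internally.

```python
# Hypothetical toy baskets, not taken from Market_Basket.csv.
baskets = [
    {"milk", "spaghetti", "avocado"},
    {"milk", "spaghetti"},
    {"milk", "bread"},
    {"spaghetti", "avocado"},
    {"bread"},
]

def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(base, add):
    """support(base and add together) divided by support(base)."""
    return support(set(base) | set(add)) / support(base)

def lift(base, add):
    """Confidence divided by the support of the right-hand side alone."""
    return confidence(base, add) / support(add)

rule = ({"avocado"}, {"spaghetti"})  # avocado => spaghetti
print(support(rule[0] | rule[1]), confidence(*rule), lift(*rule))
```

A lift above 1 means the right-hand item appears more often alongside the left-hand items than it does overall, which is why the call above keeps only rules with a lift of at least 3.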
# Print the association rules (only the first rule of each record)
for rec in results:
    left_hands = rec.ordered_statistics[0].items_base
    right_hands = rec.ordered_statistics[0].items_add
    l = ';'.join([item for item in left_hands])
    r = ';'.join([item for item in right_hands])
    print('{} => {}'.format(l,r))
# Build item pairs from the frequent itemsets, e.g. the itemset
# 'milk', 'spaghetti', 'avocado' yields the pairs
#'milk', 'spaghetti'
#'milk', 'avocado'
#'spaghetti', 'avocado'
import itertools
for ele in itertools.combinations(['milk', 'spaghetti', 'avocado'], 2):
    print(ele)
itemsets = []
for rec in results:
    #print(rec.items)
    for ele in itertools.combinations(rec.items, 2):
        itemsets.append(ele)
#itemsets[0:10]
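The pair list built above can contain the same pair many times, and in either order. As a hedged sketch, the pairs could instead be counted so that each undirected edge appears once with a weight; the `frequent_itemsets` list below is made up and stands in for the `rec.items` values.

```python
import itertools
from collections import Counter

# Hypothetical frequent itemsets standing in for `rec.items` above.
frequent_itemsets = [
    frozenset({"milk", "spaghetti", "avocado"}),
    frozenset({"milk", "spaghetti"}),
    frozenset({"chicken", "light cream"}),
]

edge_weights = Counter()
for items in frequent_itemsets:
    # Sort first so ('milk', 'spaghetti') and ('spaghetti', 'milk')
    # are counted as the same undirected edge.
    for pair in itertools.combinations(sorted(items), 2):
        edge_weights[pair] += 1

for (source, target), weight in sorted(edge_weights.items()):
    print(source, target, weight)
```

The weight can then be written out as an extra column next to Source and Target if edge thickness should reflect how often two items occur together.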
# Export the pairs as an edge list so the network can be visualized in Gephi.
import pandas
df2 = pandas.DataFrame(itemsets)
df2.columns = ['Source','Target']
df2['Type'] = 'undirected'
df2.to_csv('F:/python_for_data_science-master/result/transactions.csv')
Output of the rule-printing loop (for rec in results):
light cream ====> chicken
mushroom cream sauce ====> escalope
pasta ====> escalope
fromage blanc ====> honey
herb & pepper ====> ground beef
tomato sauce ====> ground beef
light cream ====> olive oil
whole wheat pasta ====> olive oil
pasta ====> shrimp
avocado;spaghetti ====> milk
cake;milk ====> burgers
turkey;chocolate ====> burgers
turkey;milk ====> burgers
cake;frozen vegetables ====> tomatoes
ground beef;cereals ====> spaghetti
ground beef;chicken ====> milk
light cream;nan ====> chicken
chicken;milk ====> olive oil
chicken;spaghetti ====> olive oil
frozen vegetables;chocolate ====> shrimp
herb & pepper;chocolate ====> ground beef
soup;chocolate ====> milk
ground beef;cooking oil ====> spaghetti
eggs;ground beef ====> herb & pepper
eggs;red wine ====> spaghetti
mushroom cream sauce;nan ====> escalope
nan;pasta ====> escalope
ground beef;french fries ====> herb & pepper
fromage blanc;nan ====> honey
green tea;frozen vegetables ====> tomatoes
spaghetti;frozen vegetables ====> ground beef
frozen vegetables;milk ====> olive oil
soup;frozen vegetables ====> milk
milk;tomatoes ====> frozen vegetables
shrimp;mineral water ====> frozen vegetables
spaghetti;frozen vegetables ====> olive oil
spaghetti;frozen vegetables ====> shrimp
frozen vegetables;shrimp ====> tomatoes
spaghetti;frozen vegetables ====> tomatoes
grated cheese;spaghetti ====> ground beef
ground beef;green tea ====> tomatoes
herb & pepper;milk ====> ground beef
herb & pepper;mineral water ====> ground beef
herb & pepper;nan ====> ground beef
herb & pepper;spaghetti ====> ground beef
ground beef;milk ====> olive oil
ground beef;soup ====> milk
tomato sauce;nan ====> ground beef
pepper;spaghetti ====> ground beef
ground beef;shrimp ====> spaghetti
ground beef;tomato sauce ====> spaghetti
light cream;nan ====> olive oil
olive oil;shrimp ====> milk
olive oil;milk ====> soup
spaghetti;milk ====> olive oil
milk;tomatoes ====> soup
whole wheat pasta;spaghetti ====> milk
soup;mineral water ====> olive oil
whole wheat pasta;mineral water ====> olive oil
whole wheat pasta;nan ====> olive oil
nan;pasta ====> shrimp
pancakes;spaghetti ====> olive oil
olive oil;tomatoes ====> spaghetti
whole wheat rice;spaghetti ====> tomatoes
avocado;spaghetti;nan ====> milk
cake;nan;milk ====> burgers
turkey;chocolate;nan ====> burgers
turkey;nan;milk ====> burgers
cake;frozen vegetables;nan ====> tomatoes
ground beef;cereals;nan ====> spaghetti
ground beef;chicken;nan ====> milk
chicken;nan;milk ====> olive oil
chicken;spaghetti;nan ====> olive oil
eggs;chocolate;mineral water ====> ground beef
frozen vegetables;chocolate;mineral water ====> ground beef
frozen vegetables;chocolate;ground beef ====> spaghetti
frozen vegetables;chocolate;mineral water ====> milk
frozen vegetables;spaghetti;chocolate ====> milk
frozen vegetables;chocolate;mineral water ====> shrimp
frozen vegetables;chocolate;nan ====> shrimp
herb & pepper;chocolate;nan ====> ground beef
soup;chocolate;nan ====> milk
spaghetti;chocolate;mineral water ====> olive oil
spaghetti;chocolate;mineral water ====> shrimp
ground beef;cooking oil;nan ====> spaghetti
eggs;frozen vegetables;mineral water ====> milk
eggs;nan;ground beef ====> herb & pepper
eggs;red wine;nan ====> spaghetti
ground beef;french fries;nan ====> herb & pepper
spaghetti;frozen smoothie;mineral water ====> milk
green tea;frozen vegetables;nan ====> tomatoes
ground beef;frozen vegetables;mineral water ====> milk
ground beef;frozen vegetables;milk ====> spaghetti
spaghetti;frozen vegetables;mineral water ====> ground beef
spaghetti;frozen vegetables;nan ====> ground beef
frozen vegetables;milk;mineral water ====> olive oil
frozen vegetables;milk;mineral water ====> soup
spaghetti;milk;mineral water ====> frozen vegetables
frozen vegetables;milk;nan ====> olive oil
soup;frozen vegetables;nan ====> milk
nan;milk;tomatoes ====> frozen vegetables
shrimp;nan;mineral water ====> frozen vegetables
spaghetti;frozen vegetables;mineral water ====> shrimp
spaghetti;frozen vegetables;mineral water ====> tomatoes
spaghetti;frozen vegetables;nan ====> olive oil
spaghetti;frozen vegetables;nan ====> shrimp
shrimp;frozen vegetables;nan ====> tomatoes
spaghetti;frozen vegetables;nan ====> tomatoes
grated cheese;spaghetti;nan ====> ground beef
ground beef;green tea;nan ====> tomatoes
herb & pepper;nan;milk ====> ground beef
herb & pepper;nan;mineral water ====> ground beef
herb & pepper;spaghetti;nan ====> ground beef
ground beef;nan;milk ====> olive oil
ground beef;soup;nan ====> milk
olive oil;spaghetti;mineral water ====> ground beef
ground beef;tomatoes;mineral water ====> spaghetti
pepper;spaghetti;nan ====> ground beef
ground beef;nan;shrimp ====> spaghetti
ground beef;tomato sauce;nan ====> spaghetti
spaghetti;milk;mineral water ====> olive oil
spaghetti;milk;mineral water ====> tomatoes
olive oil;nan;shrimp ====> milk
olive oil;nan;milk ====> soup
spaghetti;nan;milk ====> olive oil
nan;milk;tomatoes ====> soup
whole wheat pasta;spaghetti;nan ====> milk
soup;nan;mineral water ====> olive oil
whole wheat pasta;nan;mineral water ====> olive oil
pancakes;spaghetti;nan ====> olive oil
olive oil;nan;tomatoes ====> spaghetti
whole wheat rice;spaghetti;nan ====> tomatoes
eggs;nan;chocolate;mineral water ====> ground beef
frozen vegetables;nan;chocolate;mineral water ====> ground beef
frozen vegetables;chocolate;ground beef;nan ====> spaghetti
frozen vegetables;nan;chocolate;mineral water ====> milk
frozen vegetables;spaghetti;chocolate;nan ====> milk
frozen vegetables;nan;chocolate;mineral water ====> shrimp
nan;spaghetti;chocolate;mineral water ====> olive oil
nan;spaghetti;chocolate;mineral water ====> shrimp
eggs;nan;frozen vegetables;mineral water ====> milk
nan;spaghetti;frozen smoothie;mineral water ====> milk
ground beef;nan;frozen vegetables;mineral water ====> milk
ground beef;frozen vegetables;milk;nan ====> spaghetti
nan;spaghetti;frozen vegetables;mineral water ====> ground beef
nan;frozen vegetables;milk;mineral water ====> olive oil
nan;frozen vegetables;milk;mineral water ====> soup
spaghetti;nan;milk;mineral water ====> frozen vegetables
nan;spaghetti;frozen vegetables;mineral water ====> shrimp
nan;spaghetti;frozen vegetables;mineral water ====> tomatoes
olive oil;spaghetti;nan;mineral water ====> ground beef
ground beef;tomatoes;nan;mineral water ====> spaghetti
spaghetti;nan;milk;mineral water ====> olive oil
spaghetti;nan;milk;mineral water ====> tomatoes
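2.2. Visualization in Gephi
For the detailed steps, see the Gephi tutorial at https://blog.csdn.net/jp_zhou256/article/details/83689224
Both the nodes and the edges have to be configured in Gephi to get the effect shown in the figure:
Select chicken, and the labels that appear show what else people who bought chicken also buy.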
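Selecting a node in Gephi amounts to looking up that node's neighbours in the undirected graph. A minimal sketch of the same lookup in code follows; the `edges` list is a made-up sample in the same (Source, Target) form as the exported pairs.

```python
from collections import defaultdict

# Hypothetical undirected edges in (Source, Target) form.
edges = [
    ("light cream", "chicken"),
    ("chicken", "milk"),
    ("milk", "olive oil"),
    ("chicken", "olive oil"),
]

# Store each edge in both directions because the graph is undirected.
neighbours = defaultdict(set)
for source, target in edges:
    neighbours[source].add(target)
    neighbours[target].add(source)

print(sorted(neighbours["chicken"]))  # ['light cream', 'milk', 'olive oil']
```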
3. FP-growth for movie dataset analysis
import pandas
movie = pandas.read_csv('F:/buyingVideo/pandas资料/python_for_data_science-master/Data/movies.csv')
movie.head()
movie_dic = {}
for rec in movie.iterrows():
    movie_dic[rec[1].movieId] = rec[1].title  # map each row's movieId to its title
movie_dic.get(1)
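The id-to-title dictionary can also be built without a row-by-row loop. The sketch below pairs the two columns directly; `toy_movies` is a hypothetical two-row stand-in that assumes the same `movieId` and `title` column names as movies.csv.

```python
import pandas as pd

# Hypothetical two-row stand-in for movies.csv (columns movieId, title).
toy_movies = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["Toy Story (1995)", "Jumanji (1995)"],
})

# Pair the two columns directly instead of iterating row by row.
toy_movie_dic = dict(zip(toy_movies["movieId"], toy_movies["title"]))
print(toy_movie_dic.get(1))
```

Note that `.get` returns None for an unknown id, so a later `' ;'.join(...)` over looked-up titles would fail if any id were missing from the dictionary.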
import pandas
import datetime
df = pandas.read_csv('F:/buyingVideo/pandas资料/python_for_data_science-master/Data/ratings.csv')
df.info()
df = df[df['timestamp'] >= 1325376000]  # keep ratings from 2012-01-01 00:00:00 UTC onward
df.info()
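The cutoff in the filter above is a Unix timestamp in seconds. A small sketch, using only the standard library, of how to check which date such a number corresponds to and how to derive a cutoff for a different year:

```python
from datetime import datetime, timezone

cutoff = 1325376000  # the threshold used in the filter above

# Interpret the integer as seconds since the Unix epoch, in UTC.
cutoff_date = datetime.fromtimestamp(cutoff, tz=timezone.utc)
print(cutoff_date.isoformat())

# Going the other way: build a cutoff from a calendar date.
start_of_2013 = int(datetime(2013, 1, 1, tzinfo=timezone.utc).timestamp())
print(start_of_2013)
```

The first value prints as 2012-01-01T00:00:00+00:00, so the filter keeps every rating made from the start of 2012 onward; a filter starting in 2013 would use 1356998400 instead.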
from apyori import apriori
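# The line below builds the entire input to Apriori. Once it is clear what this data structure represents, applying the algorithm to other problems becomes straightforward.
#df1=[ele for ele in df.groupby('userId')['movieId'].apply(list)]  # the set of movies each user rated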
transactions = [ele for ele in df.groupby('userId')['movieId'].apply(list)]
rules = apriori(transactions, min_support = 0.2, min_confidence = 0.5, min_lift = 3, min_length = 2)
results = list(rules)
for rec in results:
    print(' ;\n'.join([movie_dic.get(item) for item in rec.items]))
#[print(item) for item in results[1].items]
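The `groupby` line above is what turns a table of individual ratings into one transaction per user. The sketch below shows the same step on a made-up six-row table; `toy_ratings` is hypothetical and only assumes the `userId` and `movieId` column names used above.

```python
import pandas as pd

# Hypothetical ratings rows with the same column names as ratings.csv.
toy_ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 2, 3],
    "movieId": [10, 20, 10, 20, 30, 30],
})

# One list of movieIds per user; the groups come back ordered by userId.
per_user = toy_ratings.groupby("userId")["movieId"].apply(list)
toy_transactions = per_user.tolist()
print(toy_transactions)  # [[10, 20], [10, 20, 30], [30]]
```

Each inner list plays the role of a shopping basket, with movies in place of products.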
from pymining import itemmining
fp_input = itemmining.get_fptree(transactions)
#FPgrowth
report = itemmining.fpgrowth(fp_input, min_support=30, pruning=True)
for ele in report:
    if len(ele) >=6:  # keep frequent itemsets with at least 6 items
        print(' ;'.join([movie_dic.get(item) for item in ele]))
        print('\n')
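The two calls above use different units for support: the `apriori` call takes a fraction of all transactions (0.2), while the `fpgrowth` call appears to take an absolute number of transactions (30). The sketch below illustrates the conversion on a toy input. It counts itemsets by brute force rather than with an FP-tree, so it is not FP-growth and is only practical for tiny data; the function name and transactions are made up.

```python
import math
from itertools import combinations
from collections import Counter

# Hypothetical per-user movieId lists.
toy_transactions = [[10, 20], [10, 20, 30], [30], [10, 30]]

def frequent_itemsets(transactions, min_count):
    """Brute-force count of every itemset; only suitable for tiny inputs."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            counts.update(combinations(items, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_count}

# A relative threshold of 0.5 over 4 transactions is an absolute count of 2.
min_count = math.ceil(0.5 * len(toy_transactions))
print(frequent_itemsets(toy_transactions, min_count))
```

When comparing the output of the two libraries on the same data, the thresholds should be converted this way so that both runs describe the same cutoff.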
#for rec in results:
Kill Bill: Vol. 1 (2003) ;
Kill Bill: Vol. 2 (2004)
Star Wars: Episode VI - Return of the Jedi (1983) ;
Star Wars: Episode V - The Empire Strikes Back (1980) ;
Matrix, The (1999) ;
Star Wars: Episode IV - A New Hope (1977) ;
Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)
Star Wars: Episode VI - Return of the Jedi (1983) ;
Star Wars: Episode V - The Empire Strikes Back (1980) ;
Matrix, The (1999) ;
Star Wars: Episode IV - A New Hope (1977) ;
#for ele in report:
Lord of the Rings: The Two Towers, The (2002)
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Star Wars: Episode V - The Empire Strikes Back (1980) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Batman Begins (2005) ;Lord of the Rings: The Return of the King, The (2003) ;Matrix, The (1999) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Matrix, The (1999) ;Inception (2010) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Matrix, The (1999) ;Star Wars: Episode V - The Empire Strikes Back (1980) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Inception (2010) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Batman Begins (2005) ;Lord of the Rings: The Return of the King, The (2003) ;Inception (2010) ;Dark Knight, The (2008)
============
Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Star Wars: Episode V - The Empire Strikes Back (1980) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Inception (2010)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Inception (2010) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Star Wars: Episode V - The Empire Strikes Back (1980)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Inception (2010) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Dark Knight, The (2008)
============
Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Matrix, The (1999) ;Inception (2010) ;Dark Knight, The (2008)
============
Lord of the Rings: The Two Towers, The (2002) ;Lord of the Rings: The Fellowship of the Ring, The (2001) ;Lord of the Rings: The Return of the King, The (2003) ;Star Wars: Episode IV - A New Hope (1977) ;Star Wars: Episode V - The Empire Strikes Back (1980) ;Dark Knight, The (2008)
============