Apriori association rules

pip install mlxtend  # Note: inside Jupyter, prefix the command with !
The following command must be run outside of the IPython shell:

    $ pip install mlxtend

The Python package manager (pip) can only be used from outside of IPython.
Please reissue the `pip` command in a separate terminal or command prompt.

See the Python documentation for more information on how to install packages:

    https://docs.python.org/3/installing/
!pip install mlxtend
Collecting mlxtend
  Downloading https://files.pythonhosted.org/packages/86/30/781c0b962a70848db83339567ecab656638c62f05adb064cb33c0ae49244/mlxtend-0.18.0-py2.py3-none-any.whl (1.3MB)
Collecting scipy>=1.2.1 (from mlxtend)
  Downloading https://files.pythonhosted.org/packages/e1/8b/d05bd3bcd0057954f08f61472db95f4ac71c3f0bf5432abe651694025396/scipy-1.6.3-cp37-cp37m-win_amd64.whl (32.6MB)
Collecting scikit-learn>=0.20.3 (from mlxtend)
  Downloading https://files.pythonhosted.org/packages/33/ac/98a9c3f4b6e810c45196f6e15e04f9d83fe3d6000eebbb74dfd084446432/scikit_learn-0.24.2-cp37-cp37m-win_amd64.whl (6.8MB)
Collecting joblib>=0.13.2 (from mlxtend)
  Downloading https://files.pythonhosted.org/packages/55/85/70c6602b078bd9e6f3da4f467047e906525c355a4dacd4f71b97a35d9897/joblib-1.0.1-py3-none-any.whl (303kB)
Requirement already satisfied: matplotlib>=3.0.0 in c:\programdata\anaconda3\lib\site-packages (from mlxtend) (3.0.2)
Collecting pandas>=0.24.2 (from mlxtend)
  Downloading https://files.pythonhosted.org/packages/74/8c/9cf2e5304f4466dbc759a799b97bfd75cd3dc93b00d49558ca93bfc29173/pandas-1.2.4-cp37-cp37m-win_amd64.whl (9.1MB)
Requirement already satisfied: setuptools in c:\programdata\anaconda3\lib\site-packages (from mlxtend) (40.6.3)
Collecting numpy>=1.16.2 (from mlxtend)
  Downloading https://files.pythonhosted.org/packages/ce/de/0ed39fd77c5584cd9e44b4305ee4444ea7af1b38d4d71734ae684fc14184/numpy-1.20.3-cp37-cp37m-win_amd64.whl (13.6MB)
Collecting threadpoolctl>=2.0.0 (from scikit-learn>=0.20.3->mlxtend)
  Downloading https://files.pythonhosted.org/packages/f7/12/ec3f2e203afa394a149911729357aa48affc59c20e2c1c8297a60f33f133/threadpoolctl-2.1.0-py3-none-any.whl
Requirement already satisfied: cycler>=0.10 in c:\programdata\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\programdata\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (1.0.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in c:\programdata\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (2.3.0)
Requirement already satisfied: python-dateutil>=2.1 in c:\programdata\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (2.7.5)
Requirement already satisfied: pytz>=2017.3 in c:\programdata\anaconda3\lib\site-packages (from pandas>=0.24.2->mlxtend) (2018.7)
Requirement already satisfied: six in c:\programdata\anaconda3\lib\site-packages (from cycler>=0.10->matplotlib>=3.0.0->mlxtend) (1.12.0)
Installing collected packages: numpy, scipy, joblib, threadpoolctl, scikit-learn, pandas, mlxtend
  Found existing installation: numpy 1.15.4
    Uninstalling numpy-1.15.4:
      Successfully uninstalled numpy-1.15.4
  Found existing installation: scipy 1.1.0
    Uninstalling scipy-1.1.0:
      Successfully uninstalled scipy-1.1.0
  Found existing installation: joblib 0.13.0
    Uninstalling joblib-0.13.0:
      Successfully uninstalled joblib-0.13.0
  Found existing installation: scikit-learn 0.20.1
    Uninstalling scikit-learn-0.20.1:
      Successfully uninstalled scikit-learn-0.20.1
  Found existing installation: pandas 0.23.4
    Uninstalling pandas-0.23.4:
      Successfully uninstalled pandas-0.23.4
Successfully installed joblib-1.0.1 mlxtend-0.18.0 numpy-1.20.3 pandas-1.2.4 scikit-learn-0.24.2 scipy-1.6.3 threadpoolctl-2.1.0
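import pandas as pd
# Five example shopping baskets. Item names are kept in Chinese so they match the output:
# 牛奶 = milk, 面包 = bread, 尿布 = diapers, 啤酒 = beer, 土豆 = potatoes, 可乐 = cola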
item_list = [['牛奶','面包'],
            ['面包','尿布','啤酒','土豆'],
            ['牛奶','尿布','啤酒','可乐'],
            ['面包','牛奶','尿布','啤酒'],
            ['面包','牛奶','尿布','可乐']]
item_df = pd.DataFrame(item_list)
from mlxtend.preprocessing import TransactionEncoder
te = TransactionEncoder()
df_tf = te.fit_transform(item_list)
df = pd.DataFrame(df_tf,columns=te.columns_)
print(df)     # Data preparation: the model expects a one-hot encoded table of boolean values
      可乐     啤酒     土豆     尿布     牛奶     面包
0  False  False  False  False   True   True
1  False   True   True   True  False   True
2   True   True  False   True   True  False
3  False   True  False   True   True   True
4   True  False  False   True   True   True
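
To make the encoding step concrete, the boolean table above can be rebuilt by hand in plain Python. This is only a sketch of the idea, not mlxtend's implementation; the names `columns` and `encoded` are illustrative, and sorting the item names is an assumption chosen because it reproduces the column order printed above.

```python
# The same five baskets used in the example above.
item_list = [['牛奶', '面包'],
             ['面包', '尿布', '啤酒', '土豆'],
             ['牛奶', '尿布', '啤酒', '可乐'],
             ['面包', '牛奶', '尿布', '啤酒'],
             ['面包', '牛奶', '尿布', '可乐']]

# One column per distinct item, in sorted order.
columns = sorted({item for basket in item_list for item in basket})

# One row per basket: True where the basket contains the column's item.
encoded = [[item in basket for item in columns] for basket in item_list]

print(columns)
for row in encoded:
    print(row)
```

Each row is one basket and each column one item, which is the layout that `apriori` consumes once the rows are wrapped in a DataFrame.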
from mlxtend.frequent_patterns import apriori
# use_colnames=True labels itemsets with the item (column) names; the default False uses column indices.
# min_support sets the minimum support threshold.
frequent_itemsets = apriori(df, min_support=0.05, use_colnames=True)
frequent_itemsets.sort_values(by='support', ascending=False, inplace=True)
# Select the frequent itemsets that contain exactly 2 items
print(frequent_itemsets[frequent_itemsets.itemsets.apply(lambda x: len(x)) == 2])

    support  itemsets
17      0.6  (面包, 尿布)
18      0.6  (面包, 牛奶)
11      0.6  (啤酒, 尿布)
16      0.6  (牛奶, 尿布)
13      0.4  (面包, 啤酒)
7       0.4  (可乐, 尿布)
12      0.4  (啤酒, 牛奶)
8       0.4  (可乐, 牛奶)
14      0.2  (尿布, 土豆)
6       0.2  (可乐, 啤酒)
15      0.2  (面包, 土豆)
9       0.2  (面包, 可乐)
10      0.2  (啤酒, 土豆)
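
The support values above can be checked by hand: the support of an itemset is the fraction of baskets that contain every item in it. The following sketch counts this directly for all item pairs. It is a brute-force check rather than the Apriori algorithm itself, and the helper name `support` is illustrative.

```python
from itertools import combinations

# The same five baskets, stored as sets for subset tests.
transactions = [
    {'牛奶', '面包'},
    {'面包', '尿布', '啤酒', '土豆'},
    {'牛奶', '尿布', '啤酒', '可乐'},
    {'面包', '牛奶', '尿布', '啤酒'},
    {'面包', '牛奶', '尿布', '可乐'},
]

def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

items = sorted(set().union(*transactions))

# Keep only the pairs that reach the same threshold as above (min_support=0.05).
frequent_pairs = {
    pair: support(pair)
    for pair in combinations(items, 2)
    if support(pair) >= 0.05
}

for pair, value in sorted(frequent_pairs.items(), key=lambda kv: -kv[1]):
    print(pair, value)
```

With only five baskets, any pair that occurs at least once has support 0.2 or more, so a threshold of 0.05 keeps every pair that appears in the data.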
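# Compute the association rules
# metric selects the measure to filter on; the metric columns of the returned table
# (such as support, confidence, lift, leverage and conviction) can be used here
from mlxtend.frequent_patterns import association_rules
association_rule = association_rules(frequent_itemsets,metric='confidence',min_threshold=0.9)
# Sort the rules by lift, highest first
association_rule.sort_values(by='lift',ascending=False,inplace=True)
association_rule
# Each rule reads: antecedents -> consequents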

    antecedents         consequents         antecedent support  consequent support  support  confidence      lift  leverage  conviction
15  (土豆)              (面包, 啤酒, 尿布)                 0.2                 0.4      0.2         1.0  2.500000      0.12         inf
30  (土豆)              (面包, 啤酒)                       0.2                 0.4      0.2         1.0  2.500000      0.12         inf
12  (土豆, 尿布)        (面包, 啤酒)                       0.2                 0.4      0.2         1.0  2.500000      0.12         inf
24  (土豆)              (面包, 尿布)                       0.2                 0.6      0.2         1.0  1.666667      0.08         inf
36  (土豆)              (啤酒)                             0.2                 0.6      0.2         1.0  1.666667      0.08         inf
21  (可乐, 啤酒)        (牛奶, 尿布)                       0.2                 0.6      0.2         1.0  1.666667      0.08         inf
25  (土豆, 尿布)        (啤酒)                             0.2                 0.6      0.2         1.0  1.666667      0.08         inf
 5  (可乐)              (牛奶, 尿布)                       0.4                 0.6      0.4         1.0  1.666667      0.16         inf
18  (可乐, 面包)        (牛奶, 尿布)                       0.2                 0.6      0.2         1.0  1.666667      0.08         inf
27  (土豆)              (啤酒, 尿布)                       0.2                 0.6      0.2         1.0  1.666667      0.08         inf
 9  (面包, 土豆, 尿布)  (啤酒)                             0.2                 0.6      0.2         1.0  1.666667      0.08         inf
28  (面包, 土豆)        (啤酒)                             0.2                 0.6      0.2         1.0  1.666667      0.08         inf
13  (面包, 土豆)        (啤酒, 尿布)                       0.2                 0.6      0.2         1.0  1.666667      0.08         inf
14  (啤酒, 土豆)        (面包, 尿布)                       0.2                 0.6      0.2         1.0  1.666667      0.08         inf
26  (啤酒, 土豆)        (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
 0  (啤酒)              (尿布)                             0.6                 0.8      0.6         1.0  1.250000      0.12         inf
23  (面包, 土豆)        (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
31  (土豆)              (面包)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
32  (可乐, 面包)        (牛奶)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
33  (可乐, 面包)        (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
34  (可乐, 啤酒)        (牛奶)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
35  (可乐, 啤酒)        (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
29  (啤酒, 土豆)        (面包)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
19  (可乐, 啤酒, 牛奶)  (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
22  (土豆, 尿布)        (面包)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
20  (可乐, 啤酒, 尿布)  (牛奶)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
 1  (面包, 啤酒)        (尿布)                             0.4                 0.8      0.4         1.0  1.250000      0.08         inf
17  (可乐, 面包, 尿布)  (牛奶)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
16  (可乐, 面包, 牛奶)  (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
11  (面包, 啤酒, 土豆)  (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
10  (啤酒, 土豆, 尿布)  (面包)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
 8  (土豆)              (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
 7  (可乐)              (牛奶)                             0.4                 0.8      0.4         1.0  1.250000      0.08         inf
 6  (可乐)              (尿布)                             0.4                 0.8      0.4         1.0  1.250000      0.08         inf
 4  (可乐, 尿布)        (牛奶)                             0.4                 0.8      0.4         1.0  1.250000      0.08         inf
 3  (可乐, 牛奶)        (尿布)                             0.4                 0.8      0.4         1.0  1.250000      0.08         inf
 2  (啤酒, 牛奶)        (尿布)                             0.4                 0.8      0.4         1.0  1.250000      0.08         inf
37  (面包, 啤酒, 牛奶)  (尿布)                             0.2                 0.8      0.2         1.0  1.250000      0.04         inf
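
The metric columns in the table above follow from the support values alone. As a rough sketch using the standard definitions of confidence, lift, leverage and conviction, the rule (啤酒) -> (尿布) can be recomputed by hand. The helper names `support` and `rule_metrics` are illustrative and not part of mlxtend.

```python
import math

# The same five baskets, stored as sets for subset tests.
transactions = [
    {'牛奶', '面包'},
    {'面包', '尿布', '啤酒', '土豆'},
    {'牛奶', '尿布', '啤酒', '可乐'},
    {'面包', '牛奶', '尿布', '啤酒'},
    {'面包', '牛奶', '尿布', '可乐'},
]

def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(antecedent, consequent):
    """Metrics for the rule antecedent -> consequent."""
    sup_a = support(antecedent)
    sup_c = support(consequent)
    sup_ac = support(set(antecedent) | set(consequent))
    confidence = sup_ac / sup_a
    return {
        'antecedent support': sup_a,
        'consequent support': sup_c,
        'support': sup_ac,
        'confidence': confidence,
        'lift': confidence / sup_c,
        'leverage': sup_ac - sup_a * sup_c,
        # Conviction divides by (1 - confidence), so it is infinite
        # when the rule holds in every basket containing the antecedent.
        'conviction': math.inf if confidence == 1 else (1 - sup_c) / (1 - confidence),
    }

metrics = rule_metrics({'啤酒'}, {'尿布'})
for name, value in metrics.items():
    print(name, value)
```

Every rule in the table has confidence 1.0, which is why the conviction column shows inf throughout: the denominator 1 - confidence is zero.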
