Python 3.6: Association Analysis with the Apriori Algorithm

Notes on code problems in Chapter 11 of Machine Learning in Action under Python 3.6

Listing 11-1
from numpy import *
def loadDataSet():
    return [[1,3,4],[2,3,5],[1,2,3,5],[2,5]]
def createC1(dataSet):
    C1=[]
    for transaction in dataSet:
        for item in transaction:
            if not [item] in C1:
                C1.append([item])
    C1.sort()
    return list(map(frozenset,C1))
def scanD(D,Ck,minSupport):
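    numItems=float(len(list(D)))# Python 3.6 does not support len() on a map object (D); note that list(D) uses D up, so D is only good for one pass
    print(list(D))# I moved this line up from where it sat in the original code; it shows that the map object D is already empty at this point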
    ssCnt={}
    for tid in D:
        for can in Ck:
            if can.issubset(tid):
                if not can in ssCnt:
                    ssCnt[can]=1
                else:ssCnt[can]+=1
    print(list(D))
    #numItems=float(len(D))
    print(numItems)
    retList=[]
    supportData={}
    for key in ssCnt:
        support=ssCnt[key]/numItems
        if support>=minSupport:
            retList.insert(0,key)
        supportData[key]=support
    return retList,supportData
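There are two problems:
1. In Python 3.6, dict no longer has has_key(); use if not can in ssCnt: instead (can is the key, ssCnt is the dict).
2. In Python 3.6, a map object is empty after it has been used once.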


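I did not notice this at first. On top of that, the original code needs len(D) changed to len(list(D)), otherwise it raises TypeError: object of type 'map' has no len()
With that change alone, everything printed for the problem case came out empty, which was a bit maddening, since nothing looked wrong.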
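
The behavior described above can be reproduced in a few lines. This is only a minimal sketch: it assumes D is built with map(set, dataSet), as in the book's example, and the names data_set, first_pass and second_pass are made up for illustration.

```python
data_set = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]

# In Python 3, map() returns a lazy iterator, not a list.
D = map(set, data_set)

# len() is not defined for a map object.
try:
    len(D)
    len_error = None
except TypeError as exc:
    len_error = str(exc)

first_pass = list(D)   # this pass consumes the iterator
second_pass = list(D)  # nothing is left for a second pass

print(len_error)
print(len(first_pass))  # 4
print(second_pass)      # []
```

So the for loop in scanD sees no transactions at all once list(D) has already run, which is why the results looked empty even though no error was raised.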

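To get around the map object being usable only once, copy its contents into a list l1. Either of the two methods below works; I tried the second one in Spyder:
Method 1: l1 = list(D)
Method 2: l1 = [i for i in D]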
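
As a quick check that the two methods are equivalent, here is a small sketch. It assumes the same sample data as loadDataSet(); l2, count_first and count_second are names I made up, and a fresh map object is created for each method because the first one would already be used up.

```python
data_set = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]

# Method 1: list() drains the map object into a list.
l1 = list(map(set, data_set))

# Method 2: a list comprehension does the same job.
# It needs its own map object, since the one above is already consumed.
l2 = [i for i in map(set, data_set)]

# Unlike a map object, a list supports len() and repeated iteration.
count_first = sum(1 for _ in l1)
count_second = sum(1 for _ in l1)

print(l1 == l2)                          # True
print(len(l1), count_first, count_second)  # 4 4 4
```

Another option, if you control the calling code, is to build the list once at the call site, e.g. D = list(map(set, dataSet)), so that scanD never receives a one-shot iterator in the first place.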


The final code ends up looking like this:

"""
@author: asus
"""
from numpy import *
def loadDataSet():
    return [[1,3,4],[2,3,5],[1,2,3,5],[2,5]]
def createC1(dataSet):
    C1=[]
    for transaction in dataSet:
        for item in transaction:
            if not [item] in C1:
                C1.append([item])
    C1.sort()
    return list(map(frozenset,C1))
def scanD(D,Ck,minSupport):
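    # code added after this line
    M=list(D)
    numItems=float(len(M))# M is a list, so len() works on it
    print(list(M))# this line confirms that M can be used more than once
    ssCnt={}
    for tid in M:# every later D has to become M, since the data is needed more than once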
        for can in Ck:
            if can.issubset(tid):
                if not can in ssCnt:
                    ssCnt[can]=1
                else:ssCnt[can]+=1
    #print(list(D))
    #numItems=float(len(D))
    print(numItems)
    retList=[]
    supportData={}
    for key in ssCnt:
        support=ssCnt[key]/numItems
        if support>=minSupport:
            retList.insert(0,key)
        supportData[key]=support
    return retList,supportData
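
To see what the fixed version produces, here is a condensed sketch of the same logic without the debug prints. The snake_case names (load_data_set, create_c1, scan_d) are my own renaming and not from the book, and the minimum support of 0.5 is just a value picked for the example.

```python
def load_data_set():
    return [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]

def create_c1(data_set):
    # Candidate 1-itemsets, one frozenset per distinct item, in sorted order.
    items = sorted({item for transaction in data_set for item in transaction})
    return [frozenset([item]) for item in items]

def scan_d(D, Ck, min_support):
    # Materialize D once so it can be measured and iterated safely.
    transactions = list(D)
    num_items = float(len(transactions))
    counts = {}
    for tid in transactions:
        for can in Ck:
            if can.issubset(tid):
                counts[can] = counts.get(can, 0) + 1
    ret_list = []
    support_data = {}
    for key, count in counts.items():
        support = count / num_items
        if support >= min_support:
            ret_list.insert(0, key)
        support_data[key] = support
    return ret_list, support_data

data_set = load_data_set()
C1 = create_c1(data_set)
D = map(set, data_set)  # a one-shot map object is fine here
L1, support_data = scan_d(D, C1, 0.5)

print(sorted(sorted(s) for s in L1))   # [[1], [2], [3], [5]]
print(support_data[frozenset([4])])    # 0.25
```

Item 4 appears in only one of the four transactions, so its support of 0.25 falls below 0.5 and it is dropped from L1, while its support is still recorded in support_data.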


