A Python Implementation of the dEclat Algorithm

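Starting from the Eclat implementation in this blog post (https://www.cnblogs.com/infaraway/p/6774521.html), I made some modifications to obtain Python code for the dEclat algorithm, which I share here: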

def dEclat(prefix, diffsets, transaction_num, min_support, freq_items):
    '''
    Input:
        prefix: the items of the current prefix, type: list
        diffsets: (item, diffset) pairs, where each diffset holds the IDs of
                  the transactions that do not contain prefix + item,
                  type: list of tuples, example: [(item_1, {diffid1, diffid4}), ...]
        transaction_num: the number of transactions, type: int
        min_support: the minimum support as an absolute count (not a ratio),
                     type: int or float
        freq_items: frequent itemsets found so far,
                    type: dict(), example: {items1:[sup,relative_sup],...}
    Output:
        freq_items: frequent itemsets,
                    type: dict(), example: {items1:[sup,relative_sup],...}
    '''

    while diffsets:
        # fetch an item and its diffset
        item, diffset = diffsets.pop()
        #print(item,': ',diffset)
        # support of the itemset = number of transactions minus the size of its diffset
        key_support = transaction_num - len(diffset)
        if key_support >= min_support:
            # add the itemset and its support to the set of frequent itemsets
            freq_items[frozenset(sorted(prefix+[item]))] = [key_support,round(key_support/transaction_num,2)]
            suffix = []  # list of suffixes
            for other_item, other_diffset in diffsets:
                # diffset of the current itemset combined with the other item
                new_diffset = diffset.union(other_diffset)
                # keep the combination as a candidate only if its own support
                # (computed from new_diffset) reaches the minimum support
                if (transaction_num - len(new_diffset)) >= min_support:
                    suffix.append((other_item, new_diffset))
            # find frequent itemsets that extend the current prefix + item
            suffix = sorted(suffix, key=lambda pair: len(pair[1]), reverse=True)
            dEclat(prefix+[item], suffix, transaction_num, min_support, freq_items)
    return freq_items
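
The union used to combine two diffsets works because the diffsets here are taken relative to the full transaction set: the diffset of an itemset is the set of IDs of the transactions that do not contain it, which is what the support formula `transaction_num - len(diffset)` implies. This differs from the prefix-relative diffsets in the original dEclat formulation. A minimal check of the identity, using made-up toy data and hypothetical helper names:

```python
# Toy data (hypothetical): transaction ID -> items in that transaction
transactions = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a", "b", "c"}, 4: {"b"}}
all_tids = set(transactions)

def tidset(itemset):
    """IDs of the transactions that contain every item of `itemset`."""
    return {tid for tid, items in transactions.items() if itemset <= items}

def complement_diffset(itemset):
    """IDs of the transactions that do NOT contain `itemset`."""
    return all_tids - tidset(itemset)

d_a = complement_diffset({"a"})        # {4}
d_b = complement_diffset({"b"})        # {2}
d_ab = complement_diffset({"a", "b"})  # {2, 4}

# The diffset of the combined itemset equals the union of the two diffsets,
# so its support is N minus the size of that union.
support_ab = len(all_tids) - len(d_a | d_b)
```

Here `support_ab` matches the size of the tidset of `{"a", "b"}`, which is the quantity the function compares against `min_support`.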

Note: apart from diffsets, the data passed to the function is the same as in the blog post referenced above.
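
As a sketch of how the `diffsets` argument might be prepared, the helper below builds the initial list of (item, diffset) pairs from raw transactions. The name `build_diffsets`, the toy data, and the choice to use list indices as transaction IDs are assumptions for illustration and do not come from the referenced post.

```python
def build_diffsets(transactions):
    """Build the initial (item, diffset) list from a list of transactions.

    Assumption: the diffset of an item is the set of indices of the
    transactions that do NOT contain it, which matches the support formula
    transaction_num - len(diffset).
    """
    all_tids = set(range(len(transactions)))
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    diffsets = [(item, all_tids - tids) for item, tids in tidsets.items()]
    # Same ordering as the recursive call: largest diffset first
    return sorted(diffsets, key=lambda pair: len(pair[1]), reverse=True)

# Hypothetical toy data
transactions = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
initial = build_diffsets(transactions)
```

With the function above, a call would then look like `dEclat([], initial, len(transactions), 2, {})`, where 2 is an absolute minimum support count. Note that `dEclat` pops from the list it is given, so `initial` is consumed by the call.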
