『字典树』单词的压缩编码820

题目相关

原题链接：820. 单词的压缩编码 - 力扣（LeetCode）
涉及知识：字典树
题目难度：★★

题目解读

经过简单分析可知，如果能够实现编码复用，必须得满足某个单词是另一个较长单词的后缀才可以。很明显，我们可以通过构造字典树来实现。

Python相关

见下面

具体实现

自己的实现方案：

class Solution:
    def minimumLengthEncoding(self, words: List[str]) -> int:
        res = []
        dic = {}
        for word in words:
            temp = dic
            for letter in word[::-1]:
                temp = temp.setdefault(letter, {})
        def helper(node, length):
            if not node:
                res.append(length + 1)
            else:
                for node in node.values():
                    helper(node, length + 1)
        helper(dic, 0)
        return sum(res)

官方给出的更加Pythonic的实现：

class Solution:
    def minimumLengthEncoding(self, words: List[str]) -> int:
        words = list(set(words)) #remove duplicates
        #Trie is a nested dictionary with nodes created
        # when fetched entries are missing
        Trie = lambda: collections.defaultdict(Trie)
        trie = Trie()

        #reduce(..., S, trie) is trie[S[0]][S[1]][S[2]][...][S[S.length - 1]]
        nodes = [reduce(dict.__getitem__, word[::-1], trie)
                 for word in words]

        #Add word to the answer if it's node has no neighbors
        return sum(len(word) + 1
                   for i, word in enumerate(words)
                   if len(nodes[i]) == 0)

下面好好品一品这个代码。
首先是：

Trie = lambda: collections.defaultdict(Trie)
trie = Trie()

这两行，不得不说很Pythonic，这里Trie是一个函数，是通过lambda匿名函数表达式实现的，而后面的collections.defaultdict中的构造函数也是Trie，就有点递归函数嵌套定义的意思了。总之，这两行的意思是构造一个能构造defaultdict对象的函数，并且该对象的默认属性值也是一个defaultdict对象。然后通过函数调用实例化一个对象trie。
后面是：

nodes = [functools.reduce(dict.__getitem__, word[::-1], trie)
    for word in words]

不得不承认这句更加Pythonic，该句的意思是对于words中的word，都将其逆序对tire对象进行构造，其中reduce函数接收一个函数，一个迭代器，以及一个可选的初始值，功能是：

函数将一个数据集合（链表，元组等）中的所有数据进行下列操作：用传给 reduce 中的函数 function（有两个参数）先对集合中的第 1、2 个元素进行操作，得到的结果再与第三个数据用 function 函数运算，最后得到一个结果。

现在我们看一个示例：

a = {'b': 3}
print(dict.__getitem__(a, 'b'))

输出结果自然是3，因为此时的调用dict.__getitem__(a, 'b')完全等价于调用a['b']。而defaultdict是dict的子类，自然也有此性质。
假设word为'love'的话，那么reduce(dict.__getitem__, word[::-1], trie)语句的执行流程就是先执行trie['e']语法糖，假设trie此时没有'e'这个item，，那么便会执行类似trie['e'] = Trie()即trie['e'] = collections.defaultdict(Trie)类似的操作，并返回trie['e']这个对象，然后再执行trie['e']['v'] = Trie()、trie['e']['v']['o'] = Trie()、trie['e']['v']['o']['l'] = Trie()这一系列操作，最终的返回结果则是trie['e']['v']['o']['l']这个对象。
因此：

nodes = [reduce(dict.__getitem__, word[::-1], trie)
    for word in words]

语句的执行结果，则是nodes中存储的都是一系列defaultdict(Trie)对象，如果有的单词是另外单词的后缀，那么此时对应的对象则不为空，否则为空，对其调用len函数的结果是0。
因此，我们还可以将代码改写为：

class Solution:
    def minimumLengthEncoding(self, words: List[str]) -> int:
        words = list(set(words)) 

        Trie = lambda: collections.defaultdict(Trie)
        trie = Trie()

        nodes = [functools.reduce(dict.__getitem__, word[::-1], trie)
                 for word in words]

        return sum(len(word) + 1
                   for node, word in zip(nodes, words)
                   if not node)

参考

python - Any Functional Programming method of traversing a nested dictionary? - Stack Overflow
python3 Tire solution - LeetCode Discuss

『字典树』单词的压缩编码820

题目相关

题目解读

Python相关

具体实现

参考

你可能感兴趣的:(『字典树』单词的压缩编码820)