Python Programming: Emulating Container Types and Sorting Objects of Custom Types

Emulating container types

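While reading the cachetools source code recently, I kept running into the methods __getitem__ and __setitem__, for example in the protocol methods that Cache implements: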

def __getitem__(self, key):
    try:
        return self.__data[key]
    except KeyError:
        return self.__missing__(key)

def __setitem__(self, key, value):
    maxsize = self.__maxsize
    size = self.getsizeof(value)
    if size > maxsize:
        raise ValueError("value too large")
    if key not in self.__data or self.__size[key] < size:
        while self.__currsize + size > maxsize:
            self.popitem()
    if key in self.__data:
        diffsize = size - self.__size[key]
    else:
        diffsize = size
    self.__data[key] = value
    self.__size[key] = size
    self.__currsize += diffsize

These two methods are in fact the basic methods for emulating a container. The official documentation says:

3.3.7. Emulating container types

The following methods can be defined to implement container objects. Containers usually are sequences (such as lists or tuples) or mappings (like dictionaries), but can represent other containers as well. The first set of methods is used either to emulate a sequence or to emulate a mapping; the difference is that for a sequence, the allowable keys should be the integers k for which 0 <= k < N where N is the length of the sequence, or slice objects, which define a range of items. It is also recommended that mappings provide the methods keys(), values(), items(), get(), clear(), setdefault(), pop(), popitem(), copy(), and update() behaving similar to those for Python’s standard dictionary objects. The collections.abc module provides a MutableMapping abstract base class to help create those methods from a base set of __getitem__(), __setitem__(), __delitem__(), and keys(). Mutable sequences should provide methods append(), count(), index(), extend(), insert(), pop(), remove(), reverse() and sort(), like Python standard list objects. Finally, sequence types should implement addition (meaning concatenation) and multiplication (meaning repetition) by defining the methods __add__(), __radd__(), __iadd__(), __mul__(), __rmul__() and __imul__() described below; they should not define other numerical operators. It is recommended that both mappings and sequences implement the __contains__() method to allow efficient use of the in operator; for mappings, in should search the mapping’s keys; for sequences, it should search through the values. It is further recommended that both mappings and sequences implement the __iter__() method to allow efficient iteration through the container; for mappings, __iter__() should iterate through the object’s keys; for sequences, it should iterate through the values.

from: Python official documentation, https://docs.python.org/3/reference/datamodel.html?highlight=getitem#emulating-container-types
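As a minimal sketch of the sequence side of that description, the class below (the name `Countdown` is made up for illustration) accepts integer indices with 0 <= k < N as well as slice objects, and defines `__len__`, `__contains__`, and `__iter__` along the lines the documentation recommends. It delegates everything to an internal list, which is only one of many ways to do this.

```python
from collections.abc import Iterator
from typing import Union


class Countdown:
    """Read-only sequence n, n-1, ..., 1 (hypothetical example class)."""

    def __init__(self, start: int) -> None:
        self._items: list[int] = list(range(start, 0, -1))

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, index: Union[int, slice]) -> Union[int, list[int]]:
        # Integer keys and slice objects are both delegated to the list,
        # which raises IndexError for out-of-range integers.
        return self._items[index]

    def __contains__(self, value: object) -> bool:
        # For sequences, `in` searches the values.
        return value in self._items

    def __iter__(self) -> Iterator[int]:
        # For sequences, iteration yields the values.
        return iter(self._items)


countdown = Countdown(5)
print(len(countdown))    # 5
print(countdown[0])      # 5
print(countdown[1:3])    # [4, 3]
print(3 in countdown)    # True
print(list(countdown))   # [5, 4, 3, 2, 1]
```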

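Next, I'll build an emulated container, a dictionary that does not allow duplicate values, and implement the common operations plus length retrieval. Note that the integer keys are only there to keep the demonstration simple: integer keys are what a sequence expects, whereas a mapping can accept any hashable key, so this is not a recommendation to use integers as keys. To present the code piece by piece, I don't inherit from collections.abc.MutableMapping at first; the complete code implementing that interface appears further below.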

Creating key-value pairs with __setitem__

__setitem__ normally takes two parameters, a key and a value. Instances of a class that implements it support the obj[key] = value syntax for storing a value.

class MySelfDefineDict:
    def __init__(self) -> None:
        self.length: int = 0
        self.data: dict[int, str] = {}

    def __setitem__(self, key: int, value: str) -> None:
        if value not in self.data.values():
            if key not in self.data:
                self.length += 1  # count new keys only; used later to report the length
            self.data[key] = value
            print("Added a set of key-value pairs")
        else:
            raise Exception("The value already exists")  # in real code, define a custom exception
        
        
if __name__ == "__main__":
    my_dict: MySelfDefineDict = MySelfDefineDict()
    my_dict[1] = "233"
    """
        Output: Added a set of key-value pairs
    """
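The comment in the code above suggests a custom exception for real use. One possible sketch follows; the names `DuplicateValueError` and `UniqueValueStore` are invented here, and deriving from `ValueError` is just one reasonable choice, not something the original code prescribes.

```python
class DuplicateValueError(ValueError):
    """Raised when a value is already stored (hypothetical name)."""


class UniqueValueStore:
    """Stripped-down stand-in for the class above (hypothetical name)."""

    def __init__(self) -> None:
        self.data: dict[int, str] = {}

    def __setitem__(self, key: int, value: str) -> None:
        if value in self.data.values():
            raise DuplicateValueError(f"value {value!r} is already stored")
        self.data[key] = value


store = UniqueValueStore()
store[1] = "233"
try:
    store[2] = "233"  # same value under a different key
except DuplicateValueError as exc:
    print(f"rejected: {exc}")  # rejected: value '233' is already stored
```

A dedicated exception type lets callers catch exactly this failure without swallowing unrelated errors.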

Reading the value for a key with __getitem__

__getitem__ returns the value stored for a key and normally takes a single argument, the key. We add it to the class above:

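def __getitem__(self, key: int) -> str:
    # Raise KeyError for a missing key, as dict does. Returning a
    # placeholder string would make a missing key look like a stored value.
    return self.data[key]

Running it gives the following result:

if __name__ == "__main__":
    my_dict: MySelfDefineDict = MySelfDefineDict()
    my_dict[1] = "233"
    print(my_dict[1])
    try:
        print(my_dict[3])
    except KeyError:
        print("This key does not exist")

    """
        Output:
            Added a set of key-value pairs
            233
            This key does not exist
    """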
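The cachetools snippet at the top calls `self.__missing__(key)` when the lookup fails. Subclasses of the built-in `dict` get a similar hook for free: `dict.__getitem__` calls `__missing__` when the key is absent. The sketch below uses a made-up class name, `DefaultingDict`, and an arbitrary placeholder string to show how a fallback can be supplied without making the key look present.

```python
class DefaultingDict(dict):
    """dict subclass with a fallback for missing keys (hypothetical name)."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is not present.
        return f"<no value for {key!r}>"


d = DefaultingDict({1: "233"})
print(d[1])       # 233
print(d[3])       # <no value for 3>
print(3 in d)     # False
print(d.get(3))   # None
```

Note that `in` and `get()` do not go through `__missing__`, so membership tests still reflect what is really stored.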

Deleting a key-value pair with __delitem__

__delitem__ also takes a key as its argument, naming the key to delete:

def __delitem__(self, key: int) -> None:
    if key in self.data:
        del self.data[key]
        self.length -= 1
        print("The key value was removed successfully")
    else:
        raise KeyError(key)  # the key is missing, so report it the way dict does

Code and result:

if __name__ == "__main__":
    my_dict: MySelfDefineDict = MySelfDefineDict()
    my_dict[1] = "233"
    del my_dict[1]

    """
        Output:
            Added a set of key-value pairs
            The key value was removed successfully
    """

Getting the length with __len__

Implementing __len__ lets len() work on the object. This one is comparatively simple, so here is the code directly:

def __len__(self) -> int:
    return self.length

Code and result:

if __name__ == "__main__":
    my_dict: MySelfDefineDict = MySelfDefineDict()
    my_dict[1] = "233"
    my_dict[2] = "2333"
    del my_dict[1]
    print(len(my_dict))

    """
        Output:
            Added a set of key-value pairs
            Added a set of key-value pairs
            The key value was removed successfully
            1
    """
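A separate counter has to be kept in sync on every write and delete. A simpler variant, sketched below with a made-up `Basket` class, derives the length from the underlying container instead. It also shows a side effect of `__len__`: when a class defines `__len__` but not `__bool__`, Python uses the length to decide the object's truth value.

```python
class Basket:
    """Container whose length comes from its storage (hypothetical class)."""

    def __init__(self) -> None:
        self.items: list[str] = []

    def __len__(self) -> int:
        # No separate counter to maintain.
        return len(self.items)


basket = Basket()
print(len(basket), bool(basket))   # 0 False
basket.items.append("apple")
print(len(basket), bool(basket))   # 1 True
```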

Fully implementing the collections.abc.MutableMapping interface

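To implement the collections.abc.MutableMapping interface, the class needs __iter__ in addition to the methods above, which makes its instances iterable. Note that this is not the iterator protocol: the object itself has no __next__ and simply hands back an iterator. For a mapping, __iter__ should iterate over the keys, as the documentation quoted earlier says, and the methods that MutableMapping supplies (keys(), items(), and so on) rely on that, so I return an iterator over the dictionary's keys:

def __iter__(self) -> Iterator[int]:  # needs: from collections.abc import Iterator
    return iter(self.data)

Running the code gives:

if __name__ == "__main__":
    my_dict: MySelfDefineDict = MySelfDefineDict()
    my_dict[1] = "233"
    my_dict[2] = "2333"
    my_dict[3] = "23333"
    my_dict[4] = "233333"

    del my_dict[1]
    print(len(my_dict))

    for key in my_dict:
        print(my_dict[key], end="  ")

    """
        Output:
            Added a set of key-value pairs
            Added a set of key-value pairs
            Added a set of key-value pairs
            Added a set of key-value pairs
            The key value was removed successfully
            3
            2333  23333  233333
    """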

The complete implementation is as follows:

from collections.abc import Iterator, MutableMapping


class MySelfDefineDict(MutableMapping):
    def __init__(self) -> None:
        self.length: int = 0
        self.data: dict[int, str] = {}

    def __setitem__(self, key: int, value: str) -> None:
        if value not in self.data.values():
            if key not in self.data:
                self.length += 1  # count new keys only
            self.data[key] = value
            print("Added a set of key-value pairs")
        else:
            raise Exception("The value already exists")

    def __getitem__(self, key: int) -> str:
        # KeyError for a missing key; the inherited `in` and get() rely on it
        return self.data[key]

    def __delitem__(self, key: int) -> None:
        if key in self.data:
            del self.data[key]
            self.length -= 1
            print("The key value was removed successfully")
        else:
            raise KeyError(key)

    def __len__(self) -> int:
        return self.length

    def __iter__(self) -> Iterator[int]:
        # Mappings iterate over their keys
        return iter(self.data)
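What inheriting from `MutableMapping` buys is the set of methods it derives from the five you write yourself. The following self-contained sketch uses a compact variant of the class (the name `UniqueValueDict` and the choice of `ValueError` are mine, not from the code above) and assumes `__iter__` yields keys and `__getitem__` raises `KeyError`, which is what the inherited methods expect.

```python
from collections.abc import Iterator, MutableMapping


class UniqueValueDict(MutableMapping):
    """Mapping that rejects duplicate values (illustrative name)."""

    def __init__(self) -> None:
        self._data: dict[int, str] = {}

    def __getitem__(self, key: int) -> str:
        return self._data[key]        # KeyError for a missing key

    def __setitem__(self, key: int, value: str) -> None:
        # Reject a value that is already stored under a different key.
        if any(v == value for k, v in self._data.items() if k != key):
            raise ValueError(f"value {value!r} is already stored")
        self._data[key] = value

    def __delitem__(self, key: int) -> None:
        del self._data[key]           # KeyError for a missing key

    def __iter__(self) -> Iterator[int]:
        return iter(self._data)       # mappings iterate over keys

    def __len__(self) -> int:
        return len(self._data)


d = UniqueValueDict()
d.update({1: "233", 2: "2333"})   # update() is built on __setitem__
print(list(d.keys()))             # [1, 2]
print(list(d.items()))            # [(1, '233'), (2, '2333')]
print(d.get(3, "missing"))        # missing
print(2 in d)                     # True
print(d.pop(1))                   # 233
print(len(d))                     # 1
```

None of `update()`, `keys()`, `items()`, `get()`, `pop()`, or `in` is defined in the class; they all come from the abstract base class.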

Sorting objects of custom types

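As the above shows, emulating a container type really comes down to implementing the corresponding protocol. Once the protocol is satisfied, the object works as a container, and we don't care whether it is the same thing underneath. In Python, what matters is an object's behavior, not its type: if an object walks like a duck, swims like a duck, and quacks like a duck, then it is a duck. In other words, you can judge whether an object fits an interface by what it can do. By the same token, our custom types can implement other protocols to get the behavior we want. To make objects of a custom type sortable, for example, we implement the basic methods that sorting relies on.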
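A small sketch of that idea: the `describe` function below (a made-up helper) works with any object that supports `len()` and iteration, and the `Playlist` class (also made up) qualifies without inheriting from anything. The abstract base classes `Sized` and `Iterable` in `collections.abc` recognize it too, because they check for the presence of `__len__` and `__iter__` rather than for inheritance.

```python
from collections.abc import Iterable, Iterator, Sized


class Playlist:
    """Inherits from nothing special (hypothetical class)."""

    def __init__(self, songs: list[str]) -> None:
        self._songs = songs

    def __len__(self) -> int:
        return len(self._songs)

    def __iter__(self) -> Iterator[str]:
        return iter(self._songs)


def describe(collection) -> str:
    # Works with anything that supports len() and iteration over strings.
    return f"{len(collection)} items: {', '.join(collection)}"


playlist = Playlist(["intro", "outro"])
print(describe(playlist))               # 2 items: intro, outro
print(describe(["a", "b", "c"]))        # 3 items: a, b, c
print(isinstance(playlist, Sized))      # True
print(isinstance(playlist, Iterable))   # True
```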

The basic methods behind sorting

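Non-identical instances of a class normally compare as non-equal unless the class defines the __eq__() method.

Instances of a class cannot be ordered with respect to other instances of the same class, or other types of object, unless the class defines enough of the methods __lt__(), __le__(), __gt__(), and __ge__() (in general, __lt__() and __eq__() are sufficient if you want the conventional meanings of the comparison operators).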

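from: Python official documentation (Chinese edition), https://docs.python.org/zh-cn/3/library/stdtypes.html?highlight=sort#comparisons

Note: I don't particularly recommend reading the Chinese edition, since even the official translation is sometimes ambiguous. This passage just happens to be well put, so I quoted it first. For more details, see:

Python official documentation: https://docs.python.org/3/reference/datamodel.html?highlight=%20lt%20#object._lt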

Magic method           Operator   Behavior
__lt__(self, other)    <          less than
__le__(self, other)    <=         less than or equal to
__eq__(self, other)    ==         equal to
__ne__(self, other)    !=         not equal to
__gt__(self, other)    >          greater than
__ge__(self, other)    >=         greater than or equal to
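Writing all six methods by hand is rarely necessary. The standard library's `functools.total_ordering` decorator fills in the remaining ordering methods when a class defines `__eq__` and one of `__lt__`, `__le__`, `__gt__`, or `__ge__`. The `Version` class below is a made-up example to illustrate this.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Two-part version number (hypothetical example class)."""

    def __init__(self, major: int, minor: int) -> None:
        self.major = major
        self.minor = minor

    def _key(self) -> tuple[int, int]:
        return (self.major, self.minor)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


print(Version(1, 2) < Version(1, 10))    # True  (written by hand)
print(Version(1, 2) >= Version(1, 10))   # False (supplied by total_ordering)
print(Version(2, 0) > Version(1, 10))    # True  (supplied by total_ordering)
print(Version(1, 0) <= Version(1, 0))    # True  (supplied by total_ordering)
```

Returning `NotImplemented` for foreign types lets Python try the other operand's method, so comparing with an unrelated type fails cleanly instead of raising `AttributeError`.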

Sorting example

Define a Person class and implement the less-than, less-than-or-equal, and equality magic methods:

from typing import Self  # available in Python 3.11 and later


class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name: str = name
        self.age: int = age

    # Less-than operator
    def __lt__(self, other: Self) -> bool:
        return self.age < other.age

    # Less-than-or-equal operator
    def __le__(self, other: Self) -> bool:
        return self.age <= other.age

    # Equality operator
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Person):
            return NotImplemented
        return self.age == other.age


Write a main program to verify it:

if __name__ == "__main__":
    # Create some Person objects
    person1: Person = Person("Alice", 25)
    person2: Person = Person("Bob", 30)
    person3: Person = Person("Charlie", 20)
    person4: Person = Person("Frank", 20)

    # Sort the list of objects with sorted()
    people: list[Person] = [person1, person2, person3, person4]
    sorted_people: list[Person] = sorted(people)

    # Print the sorted result
    for person in sorted_people:
        print(person.name, person.age, "  ", end='')

The result:

Charlie 20   Frank 20   Alice 25   Bob 30
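When an ordering is only needed in one place, a key function is an alternative to defining comparison methods on the class. The sketch below uses a made-up `Employee` class with no comparison methods at all. Because Python's sort is stable, Charlie stays ahead of Frank in both directions, matching their original order in the list.

```python
from operator import attrgetter


class Employee:
    """Plain data holder without comparison methods (hypothetical class)."""

    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age


staff = [
    Employee("Alice", 25),
    Employee("Bob", 30),
    Employee("Charlie", 20),
    Employee("Frank", 20),
]

# Sort by the age attribute, youngest first.
by_age = sorted(staff, key=attrgetter("age"))
print([e.name for e in by_age])         # ['Charlie', 'Frank', 'Alice', 'Bob']

# reverse=True flips the order but keeps equal elements in original order.
oldest_first = sorted(staff, key=attrgetter("age"), reverse=True)
print([e.name for e in oldest_first])   # ['Bob', 'Alice', 'Charlie', 'Frank']
```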

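To make the output friendlier to read, we can also write a __repr__ method.

__repr__ is a special method in Python that defines an object's string representation. We can modify our program like this: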

# Add to the Person class:
def __repr__(self) -> str:
    return f"The age of {self.name} is {self.age}"

# In the main program, print each object directly:
for person in sorted_people:
    print(person)

Running it gives:

The age of Charlie is 20
The age of Frank is 20
The age of Alice is 25
The age of Bob is 30
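print() works here because str() falls back to __repr__ when a class does not define __str__. A minimal sketch with a made-up `Point` class shows the places where __repr__ is picked up:

```python
class Point:
    """Defines __repr__ only (hypothetical example class)."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"Point(x={self.x}, y={self.y})"


p = Point(1, 2)
print(p)           # Point(x=1, y=2)
print([p, p])      # [Point(x=1, y=2), Point(x=1, y=2)]
print(f"{p!r}")    # Point(x=1, y=2)
```

Containers such as lists always use the repr of their elements, which is why a readable __repr__ helps when printing a whole sorted list at once.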
