python hash函数
Python hash() is one of the built-in function. Today we will look into the usage of hash() function and how we can override it for our custom object.
Python hash()是内置函数之一。 今天,我们将研究hash()函数的用法以及如何为自定义对象覆盖它。
In simplest terms, a hash is a fixed size integer which identifies a particular value. Please note that this is the simplest explanation.
用最简单的术语来说,哈希是标识特定值的固定大小的整数。 请注意,这是最简单的解释。
Let us point out what a fixed hash can mean:
让我们指出固定哈希值的含义:
We can move into great detail about hashing but an important point about making a GOOD Hash function is worth mentioning here:
我们可以深入介绍有关散列的细节,但是在此处值得一提的是有关制作GOOD Hash函数的重要一点:
Apart from the above definition, hash value of an object should be cheap to calculate in terms of space and memory complexity.
除了上述定义之外,就空间和内存复杂性而言,对象的哈希值应该便宜。
Hash codes are most used in when comparison for dictionary keys is done. Hash code of dictionary keys is compared when dictionary lookup is done for a specific key. Comparing hash is much faster than comparing the complete key values because the set of integers that the hash function maps each dictionary key to is much smaller than the set of objects itself.
在完成字典键的比较时,最常用哈希码。 当对特定键进行字典查找时,将比较字典键的哈希码。 比较散列比比较完整键值要快得多,因为散列函数将每个字典键映射到的整数集比对象集本身小得多。
Also, note that if two numeric values can compare as equal, they will have the same hash as well, even if they belong to different data types, like 1 and 1.0.
另外,请注意,如果两个数值可以比较相等,则即使它们属于不同的数据类型(例如1和1.0),它们也将具有相同的哈希值。
Let us start constructing simple examples and scenarios in which the hash()
function can be very helpful. In this example, we will simply get the hash value of a String.
让我们开始构建简单的示例和方案,其中hash()
函数可能会非常有帮助。 在此示例中,我们将简单地获取String的哈希值。
name = "Shubham"
hash1 = hash(name)
hash2 = hash(name)
print("Hash 1: %s" % hash1)
print("Hash 2: %s" % hash2)
We will obtain the following result when we run this script:
运行此脚本时,将获得以下结果:
Here is an important catch. If you run the same script again, the hash changes as shown below:
这是一个重要的收获。 如果再次运行相同的脚本,则哈希值将发生如下变化:
So, the life of a hash is only for the program scope and it can change as soon as the program has ended.
因此,哈希的寿命仅适用于程序范围,并且可以在程序结束后立即更改。
Here, we will see how a slight change in data can change a hash value. Will it change completely or just a little? A better way is to find out through a script!
在这里,我们将看到数据的微小变化如何改变哈希值。 它会完全改变还是改变一点? 更好的方法是通过脚本找出答案!
name1 = "Shubham"
name2 = "Shubham!"
hash1 = hash(name1)
hash2 = hash(name2)
print("Hash 1: %s" % hash1)
print("Hash 2: %s" % hash2)
Let’s run this script now:
现在运行此脚本:
See how the hash changed completely when only one character changed in original data? This makes a hash value completely unpredictable!
看看当原始数据中只有一个字符改变时,哈希如何完全改变? 这使得哈希值完全不可预测!
Internally, hash()
function works by overriding the __hash__()
function. It is worth noticing that not every object is hashable (mutable collections aren’t hashable). We can also define this function for our custom class. Actually, that’s what we will do now. Before that, let’s point out some important points:
在内部, hash()
函数通过重写__hash__()
函数来工作。 值得注意的是,并非每个对象都是可哈希的(可变集合不是可哈希的)。 我们还可以为我们的自定义类定义此函数。 实际上,这就是我们现在要做的。 在此之前,让我们指出一些要点:
__eq__()
function implementation as it is defined for all objects. 我们不必为所有对象定义自定义__eq__()
函数实现。 Now, let us define an object and override the __hash__()
function:
现在,让我们定义一个对象并覆盖__hash__()
函数:
class Student:
def __init__(self, age, name):
self.age = age
self.name = name
def __eq__(self, other):
return self.age == other.age and self.name == other.name
def __hash__(self):
return hash((self.age, self.name))
student = Student(23, 'Shubham')
print("The hash is: %d" % hash(student))
Let’s run this script now:
现在运行此脚本:
This program actually described how we can override both the __eq__()
and the __hash__()
functions. This way, we can actually define our own logic to compare any objects.
该程序实际上描述了我们如何覆盖__eq__()
和__hash__()
函数。 这样,我们实际上可以定义自己的逻辑来比较任何对象。
As we already know, only immutable objects can be hashed. This restriction of not allowing a mutable object to be hashed simplify the hash table a lot. Let’s understand how.
众所周知,只能对不可变的对象进行哈希处理。 不允许对可变对象进行哈希处理的限制大大简化了哈希表。 让我们了解如何。
If a mutable object is allowed to be hashed, we need to update the hash table every time the value of the objects updates. This means that we will have to move the object to a completely different bucket. This is a very costly operation to be performed.
如果允许对可变对象进行哈希处理,则每次对象值更新时,我们都需要更新哈希表。 这意味着我们将不得不将对象移至完全不同的存储桶。 这是要执行的非常昂贵的操作。
In Python, we have two objects that uses hash tables, dictionaries and sets:
在Python中,我们有两个使用哈希表的对象, 字典和集合 :
That’s all for a quick roundup on python hash() function.
这就是对python hash()函数的快速汇总。
Reference: API Doc
参考: API文档
翻译自: https://www.journaldev.com/17357/python-hash-function
python hash函数