Why Ruby Hashes with Default Values Are Dangerous

Ruby hashes are simple yet powerful data structures that every Ruby developer uses about ten times a day. Setting default values with the Hash.new constructor feels intuitive and makes developers’ lives even easier. But overusing this language feature can lead to surprises down the line, and to a lot of “fun” debugging.

My journey into Ruby hashes and their default values started with the following code (this is the simplified version):

values_by_type = Hash.new(){[]}
some_values_list = [1, 2, 1, 2, 3]

some_values_list.each do |value|
  type = 1
  # some logic is omitted
  if !values_by_type[type].include?(value)
    values_by_type[type] << value
    # put something to database
  end
end

# some usage of values_by_type hash
p values_by_type[1]

Of course, this piece of code isn’t great, but we’ll focus on resolving the main issues, not rewriting the whole thing. Our goal is to understand why it isn’t working properly.

This code saves everything in the database, but the resulting hash is empty. An experienced (or just attentive) Ruby developer will quickly spot the issue: changing Hash.new(){[]} to Hash.new([]) makes the snippet produce the expected result. But why, and is that really a fix?
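
To make the symptom reproducible without a database, here is a minimal sketch of the same loop. The run_with helper and the saved array are hypothetical stand-ins introduced only for this illustration; saved plays the role of the database.

```ruby
# Hypothetical helper: runs the article's loop against the given hash.
# `saved` stands in for the database from the original snippet.
def run_with(values_by_type)
  saved = []
  [1, 2, 1, 2, 3].each do |value|
    type = 1
    unless values_by_type[type].include?(value)
      values_by_type[type] << value
      saved << value # stands in for "put something to database"
    end
  end
  { saved: saved, lookup: values_by_type[1], stored_keys: values_by_type.keys }
end

block_result = run_with(Hash.new { [] })
p block_result[:saved]        # => [1, 2, 1, 2, 3]
p block_result[:lookup]       # => []
p block_result[:stored_keys]  # => []

shared_result = run_with(Hash.new([]))
p shared_result[:saved]       # => [1, 2, 3]
p shared_result[:lookup]      # => [1, 2, 3]
p shared_result[:stored_keys] # => []
```

In this simplified model the block version never sees a duplicate, because every lookup returns a brand-new array. The Hash.new([]) version returns the expected list, yet it still stores no keys at all.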

We’ll dive deeper into the Hash.new method later; for now, let’s just look at a few examples that are easy to reproduce in irb.

# Example 1. hash with a simple default value
> h1 = Hash.new(1) # => {}
> h1[1] # => 1
> h1[2] # => 1
> h1[1] += 1 # => 2
> h1[2] # => 1
> h1 # => {1=>2}

# Example 2. hash with array as a default value
> h2 = Hash.new([]) # => {}
> h2[1] # => []
> h2[2] # => []
> h2[1] << 'x' # => ["x"]
> h2[1] # => ["x"]
> h2 # => {}

# Example 3. hash with block provided
> h3 = Hash.new(){ [] } # => {}
> h3[1] # => []
> h3[1] << 'x' # => ["x"]
> h3[1] # => []
> h3 # => {}
> h3[1] = ['x'] # => ["x"]
> h3[1] # => ["x"]
> h3 # => {1=>["x"]}

This looks a bit confusing, but let’s try to understand the logic behind it.

First, let’s establish why the third example can’t work (and doesn’t make sense): its block returns a fresh array on every lookup but never stores it in the hash. The correct usage of Hash.new with a block is:

Hash.new { |hash, key| hash[key] = [] }

This example will help us understand the issue.
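
Applied to the opening snippet, the assigning block form might look like the sketch below. The group_values method name is an assumption made for this illustration, and the database call is left out.

```ruby
# Hypothetical helper: groups values under a fixed type, as in the opening snippet.
def group_values(values)
  # The block stores a new array under the missing key before returning it.
  values_by_type = Hash.new { |hash, key| hash[key] = [] }
  values.each do |value|
    type = 1
    values_by_type[type] << value unless values_by_type[type].include?(value)
  end
  values_by_type
end

grouped = group_values([1, 2, 1, 2, 3])
p grouped[1]    # => [1, 2, 3]
p grouped.keys  # => [1]

grouped[2]      # a plain read of a missing key...
p grouped.keys  # => [1, 2]  ...also stores that key
```

One trade-off worth knowing: with this form, merely reading a missing key inserts it, so a hash that is probed with many unknown keys will grow.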

Another hint: if you haven’t left irb yet, run the following code:

> h2[3]  # => ["x"]

Hmm, the pieces of the puzzle are starting to fit together.

Internally, a Ruby hash stores its key-value pairs in an ST table and keeps a separate slot for the default.

A default value passed to the hash constructor, via either argument or block, is saved in the IFNONE structure. The only difference between the default value and the default block is that the RHASH_PROC_DEFAULT flag is only set for the block.
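
The IFNONE slot itself is not visible from Ruby code, but the public accessors Hash#default and Hash#default_proc report which of the two forms a hash was built with. The following sketch assumes these accessors reflect that internal state; the variable names are illustrative only.

```ruby
with_value = Hash.new([])
with_block = Hash.new { |hash, key| hash[key] = [] }

# A hash built with an argument exposes the default object and no proc.
p with_value.default            # => []
p with_value.default_proc       # => nil

# A hash built with a block exposes the proc and no default object.
p with_block.default            # => nil
p with_block.default_proc.class # => Proc
```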

When you try to read a value from the hash, Ruby runs code that looks something like this (the real implementation is written in C, not Ruby):

def [](key)
  value = st_table.fetch(key)
  return value unless value.nil?

  get_default(key)
end

def get_default(key)
  if ifnone && RHASH_PROC_DEFAULT
    ifnone.call(self, key)
  else
    ifnone
  end
end
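
The pseudocode above can be turned into a small runnable model. This is only a toy sketch: the ToyHash class and its instance variables are invented for illustration, borrow their names from the article, and do not mirror Ruby's actual C source.

```ruby
# Toy model of the lookup path described above (not Ruby's real implementation).
class ToyHash
  def initialize(default = nil, &block)
    @st_table = {}               # stored key-value pairs
    @ifnone = block || default   # one slot holds either the proc or the value
    @proc_default = !block.nil?  # plays the role of RHASH_PROC_DEFAULT
  end

  def [](key)
    return @st_table[key] if @st_table.key?(key)

    get_default(key)
  end

  def []=(key, value)
    @st_table[key] = value
  end

  # Exposes the stored pairs so the effect of each lookup can be inspected.
  def stored
    @st_table
  end

  private

  def get_default(key)
    @proc_default ? @ifnone.call(self, key) : @ifnone
  end
end

shared = ToyHash.new([])
shared[1] << 'x'             # mutates the single default array
p shared[2]                  # => ["x"]
p shared.stored.size         # => 0

assigning = ToyHash.new { |hash, key| hash[key] = [] }
assigning[1] << 'x'          # the block stored a new array under key 1 first
p assigning[2]               # => []
p assigning.stored.size      # => 2
```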

Returning to our examples: modifying a value in the hash with a default array, h2[1] << 'x', didn’t update a value in the ST table; it updated the default object shared by the whole hash. Asking the hash for any other key that isn’t present in the ST table returns that same, already modified, default object.

In my opinion, this is exactly the point: the hash returns the default object, not a copy of the default value. And as we know, most objects in Ruby are mutable.
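
Object identity makes this easy to check. The sketch below uses equal?, which compares identity rather than content. The frozen variant at the end is one possible guard against accidental mutation, added here as an assumption; it is not something the article itself proposes.

```ruby
h = Hash.new([])
h[1] << 'x'

p h[1].equal?(h[2])       # => true  (both lookups return the same object)
p h.default.equal?(h[1])  # => true  (and that object is the default itself)
p h.empty?                # => true  (nothing was ever stored)

# Assumption for illustration: freezing the default turns silent sharing into an error.
frozen = Hash.new([].freeze)
begin
  frozen[1] << 'x'
rescue FrozenError => e
  puts "mutation rejected: #{e.class}"
end
```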

The only question left is why h1[1] += 1, applied to a hash with a default number, didn’t change the result of h1[2]. I think the answer is pretty obvious, isn’t it?
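
For anyone who wants to verify it: += is shorthand for a read followed by an assignment, so it stores a new object under the key and never touches the default. A short sketch, with variable names chosen only for this example:

```ruby
counts = Hash.new(1)
counts[1] += 1         # shorthand for: counts[1] = counts[1] + 1

p counts[1]        # => 2
p counts[2]        # => 1
p counts.default   # => 1
p counts.key?(1)   # => true

# The same holds for arrays: += builds a new array and assigns it,
# whereas << would mutate the shared default in place.
lists = Hash.new([])
lists[1] += ['x']
p lists[1]         # => ["x"]
p lists[2]         # => []
```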

Conclusion

I don’t expect this small article about the hashes’ inner structure to be extremely useful, but we can probably extract the following advice from it:

  1. Try to use the simplest possible values for hash defaults, ideally immutable ones such as numbers.
  2. Use the Hash.new { |hash, key| hash[key] = … } form. It’s the clearest and most customizable way to set a default value.
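
Both pieces of advice can be seen side by side in a short sketch. The word list and variable names are made up for this example.

```ruby
words = %w[apple avocado banana blueberry cherry]

# Advice 1: a simple, immutable default (a number) is safe to share.
counts = Hash.new(0)
words.each { |word| counts[word[0]] += 1 }

# Advice 2: for mutable defaults, let the block store a fresh object per key.
groups = Hash.new { |hash, key| hash[key] = [] }
words.each { |word| groups[word[0]] << word }

p counts['a']  # => 2
p groups['b']  # => ["banana", "blueberry"]
p groups.keys  # => ["a", "b", "c"]
```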

Of course, everything written here is covered in the Ruby documentation, but who really reads it attentively? And isn’t it much more fun to discover this stuff deep within Ruby internals?

Links

Translated from: https://medium.com/@oleg0potapov/why-ruby-hashes-with-default-values-are-dangerous-df7f03533b55
