Go Slice vs Maps（翻译）

原文链接
Slices 和 Maps 是 GO 中两个重要的数据类型。这篇博客将记录我关于这两个数据结构性能的一些关键发现。

在进入性能方面之前，让我们先简要看一下 Slices 和 Maps

Slice:
Slices 是建立在数组之上的一个抽象数据结构。Slice 有一个指向数组开始的指针，从数组中可以使用的最大容量以及数组的长度。Slices 可以根据需要增长或收缩。增加一个切片通常涉及到为底层数组重新分配内存。像 copy 和 append 这样的函数可以帮助增长数组

Maps:
GO 中的 Maps 和其他语言(内部可能会有所不同)类似。GO中的Map 创建段(每段可以容纳8个keys)。

性能统计:
我对这两个数据结构运行一些基准测试，结果记录如下。

样例总数	大小	f(n)-[]int(o(N))-循环并if(lcheck最后一个元素)	map[int]int直接查找o(1)
2百万	5	5.42ns/op	12.7ns/op
2百万	10	8.19ns/op	17.8ns/op
2百万	100	63.3ns/op	16.5ns/op
2百万	200	118ns/op	16.6ns/op
2百万	400	228ns/op	18.4ns/op
2百万	1000	573ns/op	17.0ns/op
2百万	10000	5674ns/op	17.6ns/op
2百万	100000	55141ns/op	15.1ns/op

TEST-1:在 slice 中查询一个INT元素 vs 在map中查找元素 -
这里，让我们试着在长度为n的slice中查找一个条目，并与在map中查找键进行比较。在slice中查找项，我们将遍历该slice并且简单地使用if来校验该元素。对于map，我们将简单地查询key。在我们所有的测试中查找最坏的情况。

样例总数	大小	f(n)-[]int(o(N))-循环并if(lcheck最后一个元素)	map[int]int直接查找o(1)
2百万	5	5.42ns/op	12.7ns/op
2百万	10	8.19ns/op	17.8ns/op
2百万	100	63.3ns/op	16.5ns/op
2百万	200	118ns/op	16.6ns/op
2百万	400	228ns/op	18.4ns/op
2百万	1000	573ns/op	17.0ns/op
2百万	10000	5674ns/op	17.6ns/op
2百万	100000	55141ns/op	15.1ns/op

我们可以看到(正如预期那样)，对于变化的 n 来说，map 查找的复杂度为常量(O(1))。然而，有趣的是对于一个小的数n 一个简单的for循环和if判断，切片花费的时间比map要小，很大的n值将需要更多的时间。

image

结论：我推荐使用map来查找给定的key。对于一个很小的大小(n)，仍然可以使用slice

样本总数	长度 – 大小 ( n )	f(n) []string [for loop and if (lcheck for last element)]	f(n) map[string]string
2 百万	5	30.4 ns/op	32.7 ns/op
2 百万	10	56.5 ns/op	23.5 ns/op
2 百万	100	128 ns/op	25.7 ns/op
2 百万	200	665 ns/op	23.6 ns/op
2 百万	400	1766 ns/op	23.7 ns/op
2 百万	1000	905 ns/op	25.7 ns/op
2 百万	10000	8488 ns/op	24.4 ns/op
2 百万	100000	82444 ns/op	25.9 ns/op

TEST-2:在一个 slice 中查找一个STRING元素 vs 在 map 中查找一个STRING Key-
我们采取和TEST-1中一样的步骤，仅仅不同的是这儿我们使用string。

样本总数	长度 – 大小 ( n )	f(n) []string [for loop and if (lcheck for last element)]	f(n) map[string]string
2 百万	5	30.4 ns/op	32.7 ns/op
2 百万	10	56.5 ns/op	23.5 ns/op
2 百万	100	128 ns/op	25.7 ns/op
2 百万	200	665 ns/op	23.6 ns/op
2 百万	400	1766 ns/op	23.7 ns/op
2 百万	1000	905 ns/op	25.7 ns/op
2 百万	10000	8488 ns/op	24.4 ns/op
2 百万	100000	82444 ns/op	25.9 ns/op

从上面的数据，我们可以看到给定一个key,查找字符串(使用MAP)的复杂度为O(1)。对于字符串比较，Maps击败了slices

image

结论：我推荐使用使用maps查找一个给定的字符串类型的key。即使是更小的'n',使用map也是很好的。

样本总数	长度 – 大小	[]int (直接索引循环) – O(1)	[]string (直接索引循环) – O(1)	map[int]int 直接循环 o(1)	map[string]string o(1)
2 百万	5	0.30 ns/op	0.29 ns/op	12.7	32.7
2 百万	10	0.29 ns/op	0.29 ns/op	17.8	23.5
2 百万	100	0.29 ns/op	0.29 ns/op	16.5	25.7
2 百万	200	0.29 ns/op	0.29 ns/op	16.6	23.6
2 百万	400	0.29 ns/op	0.29 ns/op	18.4	23.7
2 百万	1000	0.29 ns/op	0.29 ns/op	17	25.7
2 百万	10000	0.58 ns/op	0.57 ns/op	17.6	24.4
2 百万	100000	0.58 ns/op	0.55 ns/op	15.1	25.9

TEST-3:查找给定索引的一个 slice 元素
如果我们知道索引，然后在GO中查找一个切片，类似于在任何语言中查找数组，就像数组一样简单。

样本总数	长度 – 大小	[]int (直接索引循环) – O(1)	[]string (直接索引循环) – O(1)	map[int]int 直接循环 o(1)	map[string]string o(1)
2 百万	5	0.30 ns/op	0.29 ns/op	12.7	32.7
2 百万	10	0.29 ns/op	0.29 ns/op	17.8	23.5
2 百万	100	0.29 ns/op	0.29 ns/op	16.5	25.7
2 百万	200	0.29 ns/op	0.29 ns/op	16.6	23.6
2 百万	400	0.29 ns/op	0.29 ns/op	18.4	23.7
2 百万	1000	0.29 ns/op	0.29 ns/op	17	25.7
2 百万	10000	0.58 ns/op	0.57 ns/op	17.6	24.4
2 百万	100000	0.58 ns/op	0.55 ns/op	15.1	25.9

如上图所示，直接查找切片是O(1)恒定的生长速率。

image

结论：直接查找是的复杂度是常量，因此，如果您知道索引，那么i使用哪一个就不重要了。然而，索引是已知的话，slice/array查找仍然是比map查找要快

样本总数	长度 – 大小 ( n )	遍历整型 slice O(N)	遍历整型 Map O(N)
2 百万	5	9.02 ns/op	107 ns/op
2 百万	10	12.5 ns/op	196 ns/op
2 百万	100	59.2 ns/op	1717 ns/o
2 百万	200	84.9 ns/op	3356 ns/op
2 百万	400	155 ns/op	6677 ns/op
2 百万	1000	315 ns/op	18906 ns/op
2 百万	10000	2881 ns/op	178804 ns/op***
2 百万	100000	29012 ns/op	1802439 ns/op***

TEST-4:遍历一个Slice vs 遍历一个Map
这里我尝试遍历一个map和slice，并且在这个循环中执行一个恒定的操作。整体的复杂度将仍然是O(N)。

样本总数	长度 – 大小 ( n )	遍历整型 slice O(N)	遍历整型 Map O(N)
2 百万	5	9.02 ns/op	107 ns/op
2 百万	10	12.5 ns/op	196 ns/op
2 百万	100	59.2 ns/op	1717 ns/o
2 百万	200	84.9 ns/op	3356 ns/op
2 百万	400	155 ns/op	6677 ns/op
2 百万	1000	315 ns/op	18906 ns/op
2 百万	10000	2881 ns/op	178804 ns/op***
2 百万	100000	29012 ns/op	1802439 ns/op***

正如我们在上面看到的。遍历一个slice比遍历一个map快将近20倍。这个原因是slice(基于数组的抽象)被创建是在一个连续的内在块上，不像maps。在maps的例子中，循环不得不迭代键空间(在GO中称为桶)并且内存块被分配不是连续的。这就是为什么每次运行时，遍历映射的结果会以不同的顺序显示键/值。

image

系统详情	goos:darwin	Go-1.9.2
MAC-OSX	goarch:amd64

结论：如果要求是打印或执行一个操作而不是查找整个列表元素，slice是最好的选择。上面描述的原因，遍历Maps花费更多的时候
另外，请注意，在slice上的append操作保证了O(1)，就像插入到map，但这是一个平摊的常数。Append可能偶尔需要重新分配底层数组的内存
(***)-样例大小减少到2000而不是200万作为Go的大型Maps基准测试函数
关于测试的详情:

系统详情	goos:darwin	Go-1.9.2
MAC-OSX	goarch:amd64

源代码：

//m-c02jn0m1f1g4:slicevsmap user1$ cat slicemap.go 

package slicemap
//You can uncomment print to see the results(to confirm if code is working). 
//But for benchmark, we don't care about printing results.We are concerned about time it takes to run

//import "fmt"

func RangeSliceInt(input []int, find int) (index int) {
    for index,value := range input {
        if (value == find) {
      // fmt.Println("found at",index)
            return index
        }
    }

    return -1
}


func RangeSliceIntPrint(input []int) {
    for _,_ = range input {
        continue
    }
}

func MapLookupInt(input map[int]int, find int) (key,value int) {
    if value,ok := input[find];ok {
    // fmt.Println("found at", find,value)
        return find,value
    }
    return 0,0
}

func MapRangeInt(input map[int]int) {
    for _,_ = range input {
        continue
    }
}

func DirectSliceInt(input []int, index int) int {
    return input[index]
}

// FOR STRINGS //
func RangeSliceString(input []string, find string) (index int) {
  for index,value := range input {
    if (value == find) {
      //fmt.Println("found at",index)
      return index
    }
  }
  return -1
}

func RangeSliceStringPrint(input []string) {
  for _,_ = range input {
    continue
  }
}

func MapLookupString(input map[string]string, find string) (key,value string) {
  if value,ok := input[find];ok {
    // fmt.Println("found at", find,value)
    return find,value
  }
  return "0","0"
}

func MapRangeString(input map[string]string) {
  for _,_ = range input {
    continue
  }
}

func DirectSliceString(input []string, index int) string {
  return input[index]
}

//m-c02jn0m1f1g4:slicevsmap user1$ cat slicemap_test.go
package slicemap

import (
  "testing"
  "strconv"
)

func Benchmark_TimeRangeSliceInt(b *testing.B) {

  b.StopTimer()

    input := make([]int, 100000, 500000)

    for i := 0; i < 100000; i++ {
        input[i] = i+10
    }

    b.StartTimer()

    b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)

    for i := 0; i < b.N; i++ {
        RangeSliceInt(input, 100009)  //check for the last item for worst case
    }
}


func Benchmark_TimeDirectSliceInt(b *testing.B) {

    b.StopTimer()

  input := make([]int, 100000, 500000)

  for i := 0; i < 100000; i++ {
    input[i] = i+10
  }

  b.StartTimer()        

  b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)    

  for i := 0; i < b.N; i++ {
     DirectSliceInt(input, 99999)  //directly check for value given a index. o(1). sending the index
  }
}

func Benchmark_TimeMapLookupInt(b *testing.B) {

    b.StopTimer()

    input := make(map[int]int)

    //throw in some values into the map

    for i := 1; i <=100000; i++ {

        input[i] = i+10

    }

    b.StartTimer()

    b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)

    for k := 0; k < b.N; k++ { 

        MapLookupInt(input, 100000)

    }

/*

to run this 

go test -bench=Benchmark_TimeMapLookup

*/

}


func Benchmark_TimeSliceRangeInt(b *testing.B) {

    b.StopTimer()

  input := make([]int, 5, 10)

  for i := 0; i < 5; i++ {
    input[i] = i+10
  }

  b.StartTimer()

  b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)

  for k := 0; k < b.N; k++ {
    RangeSliceIntPrint(input)
  }

}

func Benchmark_TimeMapRangeInt(b *testing.B) {

  b.StopTimer()

  input := make(map[int]int)

  //throw in some values into the map

  for i := 1; i <=100000; i++ {
    input[i] = i+10
  }

  b.StartTimer()

  b.N = 2000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)

  for k := 0; k < b.N; k++ {
    MapRangeInt(input)
  }

}


// Tests for String

func Benchmark_TimeRangeSliceString(b *testing.B) {

  b.StopTimer()

  input := make([]string, 100000, 500000)

  for i := 0; i < 100000; i++ {
    input[i] = strconv.FormatInt(int64(i+10),10)
  }

  b.StartTimer()

  b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)

  for i := 0; i < b.N; i++ {
    RangeSliceString(input, "100009")  //check for the last item for worst case
  }

}


func Benchmark_TimeDirectSliceString(b *testing.B) {
  b.StopTimer()
  input := make([]string, 100000, 500000)

  for i := 0; i < 100000; i++ {
    input[i] = strconv.FormatInt(int64(i+10),10)
  }

  b.StartTimer()

  b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)
  for i := 0; i < b.N; i++ {
    DirectSliceString(input, 99999)  //directly check for value given a index. o(1)
  }
}


func Benchmark_TimeMapLookupString(b *testing.B) {
  b.StopTimer()
  input := make(map[string]string)

  //throw in some values into the map

  for i := 1; i <=100000; i++ {
    input[strconv.FormatInt(int64(i),10)] = strconv.FormatInt(int64(i+10),10)
  }

  b.StartTimer()

  b.N = 2000000  //just to avoid 百万s of fmt.Println (in case you are doing fmt.Println in the slicemap.go package)

  for k := 0; k < b.N; k++ {
    MapLookupString(input, "100000")
  }

/*
to run this
go test -bench=Benchmark_TimeMapLookupString
*/

}

Go Slice vs Maps（翻译）

你可能感兴趣的:(Go Slice vs Maps（翻译）)