How to Supercharge Your Depth First Search with Goroutines

Depth first search is a popular graph traversal algorithm. One real-world application of depth first search is site mapping.

A site map is a list of the pages of a website. The pages are organised hierarchically, and the map describes the whole structure of the site starting from a root node.

The Algorithm

Site mapping involves loading a root link, parsing the internal links on the page, and then recursively applying the same process to those links. This gives us a graph data structure, but for simplicity we can assume that it's a tree.
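As a rough illustration of that recursion, here is a minimal sketch that maps a made-up site held in memory. The `site` map and the `crawl` function are hypothetical names used only for this example; a real mapper would fetch and parse each page instead of looking it up.

```go
package main

import "fmt"

// site is a hypothetical in-memory stand-in for a website:
// each page maps to the internal links found on it.
var site = map[string][]string{
	"/":            {"/about", "/blog"},
	"/about":       {"/"},
	"/blog":        {"/blog/post-1", "/blog/post-2"},
	"/blog/post-1": {"/blog"},
	"/blog/post-2": {},
}

// crawl visits page depth first and records pages in the order they are first seen.
func crawl(page string, visited map[string]bool, order *[]string) {
	if visited[page] {
		return // already mapped; this guard lets us treat the graph like a tree
	}
	visited[page] = true
	*order = append(*order, page)
	for _, link := range site[page] {
		crawl(link, visited, order)
	}
}

func main() {
	var order []string
	crawl("/", map[string]bool{}, &order)
	fmt.Println(order) // prints [/ /about /blog /blog/post-1 /blog/post-2]
}
```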

The Problem

If we implement the algorithm this way, every page load and parse blocks the whole traversal until it finishes.

Suppose an HTTP response takes an average of 300ms and there are 100 pages on the site to map. 300 * 100 = 30000ms, or 30 seconds. So the process spends around 30 seconds idle, just waiting for responses.

How Can We Improve This?

While one page is loading, you could be sending other HTTP requests and parsing the HTML pages already received, provided you use a multi-threaded architecture.

In the example measured later in this article, the concurrent version runs roughly 7x faster than the sequential one.

Implementing threads may set off alarm bells in many developers' minds. However, Go provides a beautiful set of concepts such as goroutines, channels, and synchronization utilities that make the job much easier.

I talked about site mapping earlier, but it is simpler to learn the technique by programming a depth first search for a binary tree. You can then apply what you learn in this article to many different problems.

Let's get started!

You can find the code used in this article here on GitHub.

Setting Up the Tree

Node Definition

A Node struct is the basic building block of your binary tree. It holds a piece of data and pointers to a left child and a right child. To simulate the delay in processing a node, each node is also assigned a random sleep duration measured in microseconds.

type Node struct {
	Data interface{}
	Sleep time.Duration
	Left *Node
	Right *Node
}

Node Generator Function

NewNode() returns a pointer to a new node. Sleep is assigned a random duration between 0 and 99 microseconds.

func NewNode(data interface{}) *Node {

	node := new(Node)

	node.Data = data
	node.Left = nil
	node.Right = nil

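	// Seeding on every call follows the original listing; since Go 1.20 the
	// global source is seeded automatically and rand.Seed is deprecated.
	rand.Seed(time.Now().UTC().UnixNano())
	duration := int64(rand.Intn(100)) // 0 to 99
	node.Sleep = time.Duration(duration) * time.Microsecond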

	return node
}

Now that the tree is set up, you can implement the depth first search and a function to process each node.

ProcessNode()

ProcessNode() is a function that will be invoked when the node has to be processed during a traversal.

Normally you would print or store the node's value. However, to show the benefits of goroutines, ProcessNode() simulates a slow task that takes somewhere around 1 second.

During each of its 10,000 iterations, the function sleeps for n.Sleep (a few microseconds). Once the loop completes, it prints the node's value.

func (n *Node) ProcessNode() {

	var hello []int

	for i := 0; i < 10000; i++ {
		time.Sleep(n.Sleep)
		hello = append(hello, i)
	}
    
	fmt.Printf("Node %v ✅\n", n.Data)
}

Depth First Search Recursive Function

This is a single-threaded depth first search implemented via recursion. It visits the left subtree, processes the current node, and then visits the right subtree (an in-order traversal). It will look familiar if you have written one before.

func (n *Node) DFS() {

	if n == nil {
		return
	}

	n.Left.DFS()
	n.ProcessNode()
	n.Right.DFS()
}
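To make the visiting order concrete, the sketch below runs the same left, node, right recursion on the 7-node tree used later in the article and collects the values instead of sleeping. The `inOrder` method and its `visit` callback are hypothetical names for this illustration, and the Node type is pared down to omit Sleep.

```go
package main

import "fmt"

// Node is a pared-down version of the article's node (no Sleep field).
type Node struct {
	Data  interface{}
	Left  *Node
	Right *Node
}

// inOrder mirrors DFS() but hands each node to visit instead of calling ProcessNode().
func (n *Node) inOrder(visit func(*Node)) {
	if n == nil {
		return // calling a pointer-receiver method on a nil *Node is safe
	}
	n.Left.inOrder(visit)
	visit(n)
	n.Right.inOrder(visit)
}

// buildTree creates the complete 7-node tree used in the article's main().
func buildTree() *Node {
	return &Node{Data: 1,
		Left:  &Node{Data: 2, Left: &Node{Data: 4}, Right: &Node{Data: 5}},
		Right: &Node{Data: 3, Left: &Node{Data: 6}, Right: &Node{Data: 7}},
	}
}

func main() {
	var order []interface{}
	buildTree().inOrder(func(n *Node) { order = append(order, n.Data) })
	fmt.Println(order) // prints [4 2 5 1 6 3 7]
}
```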

Implementing the main() Function

In the main function, create a complete binary tree that consists of 7 nodes.

To measure how much time has elapsed, record the current time in start and then begin the DFS at the root. Once the traversal completes, main() prints the elapsed time.

var wg sync.WaitGroup

func main() {

	root := NewNode(1)
	root.Left = NewNode(2)
	root.Right = NewNode(3)
	root.Left.Left = NewNode(4)
	root.Left.Right = NewNode(5)
	root.Right.Left = NewNode(6)
	root.Right.Right = NewNode(7)

	start := time.Now()
	root.DFS()
	fmt.Printf("\nTime elapsed: %v\n\n", time.Since(start))
    
}

Output

It took 8.75s for the depth first search to complete.

For most of that time the processor was idle, waiting for the current node to finish sleeping. While one node slept, no other node could be processed.

In the real world, this situation occurs during I/O or external HTTP calls.

Node 4 ✅
Node 2 ✅
Node 5 ✅
Node 1 ✅
Node 6 ✅
Node 3 ✅
Node 7 ✅

Time elapsed: 8.75086767s

Supercharge Your Depth First Search with Goroutines

Compared to other programming languages, converting the process and depth first search functions to run concurrently takes only minor changes:

  1. Calling the recursive function with the go keyword.

  2. Maintaining a WaitGroup that keeps track of the functions still in progress, so the program doesn't exit before all of them have completed.

DFSParallel()

wg.Add(1): Before each go statement, increment the WaitGroup counter to account for the goroutine that is about to be started.

You can also call wg.Add(3) once and then start the three goroutines, and it will do the same job. However, pairing each wg.Add(1) with its go statement reads better and clearly shows what is going to happen.

defer wg.Done(): Decreases the WaitGroup counter by 1 when the function returns. This signals that the goroutine has completed.

go: Starts the function call in a new goroutine.

func (n *Node) DFSParallel() {

	defer wg.Done()

	if n == nil {
		return
	}

	wg.Add(1)
	go n.Left.DFSParallel()

	wg.Add(1)
	go n.ProcessNodeParallel()

	wg.Add(1)
	go n.Right.DFSParallel()
}
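As a variation on the listing above, the sketch below uses the single wg.Add(3) call mentioned earlier and passes the WaitGroup as a parameter instead of relying on a package-level variable. The names `walk`, `visit`, and `countNodes` are hypothetical and exist only for this example; the processing step is reduced to counting nodes so the result is easy to check.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Node struct {
	Data  interface{}
	Left  *Node
	Right *Node
}

// walk is a hypothetical variant of DFSParallel: the WaitGroup is passed in,
// and visit stands in for ProcessNodeParallel.
func (n *Node) walk(wg *sync.WaitGroup, visit func(*Node)) {
	defer wg.Done()
	if n == nil {
		return
	}
	wg.Add(3) // one call covering the three goroutines started below
	go n.Left.walk(wg, visit)
	go func() {
		defer wg.Done()
		visit(n)
	}()
	go n.Right.walk(wg, visit)
}

// countNodes walks the tree concurrently and returns how many nodes were visited.
func countNodes(root *Node) int64 {
	var wg sync.WaitGroup
	var count int64
	wg.Add(1) // account for the root goroutine before starting it
	go root.walk(&wg, func(*Node) { atomic.AddInt64(&count, 1) })
	wg.Wait()
	return count
}

func main() {
	root := &Node{Data: 1,
		Left:  &Node{Data: 2, Left: &Node{Data: 4}, Right: &Node{Data: 5}},
		Right: &Node{Data: 3, Left: &Node{Data: 6}, Right: &Node{Data: 7}},
	}
	fmt.Println(countNodes(root)) // prints 7
}
```

Because several goroutines call visit at the same time, the counter is updated with an atomic add rather than a plain increment.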

ProcessNodeParallel()

There isn't much to change here: just add defer wg.Done() at the top of the function. It informs the WaitGroup that this goroutine has finished.

func (n *Node) ProcessNodeParallel() {

	defer wg.Done()

	var hello []int
    
	for i := 0; i < 10000; i++ {
		time.Sleep(n.Sleep)
		hello = append(hello, i)
	}
    
	fmt.Printf("Node %v ✅\n", n.Data)
}

Calling DFSParallel() in main()

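GOMAXPROCS controls how many operating system threads the Go runtime (not the compiler) may use to execute Go code at the same time. Setting it to the number of logical cores lets goroutines run on all of them; since Go 1.5 this is already the default.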

This helps you process multiple nodes at once. The concurrent design pattern implemented here shows the benefit of having multiple cores on the computer: not only can the program process other nodes while one is sleeping, it can also process several nodes at the same time.

As before, add 1 to the wait group and then start DFSParallel() on the root as a goroutine.

wg.Wait() blocks until all goroutines have completed. It waits for the WaitGroup counter to reach 0 and then lets execution continue.

Output

Node 7 ✅
Node 4 ✅
Node 2 ✅
Node 6 ✅
Node 5 ✅
Node 1 ✅
Node 3 ✅

Processors: 8 Time elapsed: 1.295332809s

As expected, the depth first search completes in just 1.3 seconds, as opposed to 8.75 seconds in the previous implementation.

Explanation

Normal Implementation

The functions ran serially, in the fixed in-order sequence you would expect. Each call took ~1.1 seconds to complete, which adds up to the long run time.

Almost all of that time is spent sleeping (~1 second per node), during which the processor sits idle because everything runs in a single thread.

Concurrent Implementation

The functions ran independently, and almost every one of them started at roughly the same moment, close to 0 seconds. Each ran for about 1 second, and then every goroutine completed.

However, you can see that the output order differs from the previous implementation. This is because the goroutines run independently and finish at different times. Since they all started at roughly the same time, the whole traversal completed in roughly the time a single call takes.

Conclusion

I found this result pretty amazing, since it took no more than a few concepts and 5-6 extra lines to make this program roughly 7x faster.

This technique can prove to be a major boost to your Go program if you can identify functions which can run independently at the same time. If your functions require synchronization, you can use channels to achieve that task.
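As one possible illustration of the channel approach, the sketch below has each goroutine send its node's value over a channel while a single receiver collects them. The names `send` and `collect` are hypothetical, and the Node type is simplified to hold an int.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type Node struct {
	Data        int
	Left, Right *Node
}

// send walks the tree concurrently and pushes every node's value into out.
func (n *Node) send(wg *sync.WaitGroup, out chan<- int) {
	defer wg.Done()
	if n == nil {
		return
	}
	wg.Add(2)
	go n.Left.send(wg, out)
	go n.Right.send(wg, out)
	out <- n.Data // blocks until the receiver in collect takes the value
}

// collect gathers all values from the tree and returns them sorted.
func collect(root *Node) []int {
	var wg sync.WaitGroup
	out := make(chan int)
	wg.Add(1)
	go root.send(&wg, out)
	go func() {
		wg.Wait()  // every sender has finished
		close(out) // lets the range loop below end
	}()
	var values []int
	for v := range out {
		values = append(values, v)
	}
	sort.Ints(values) // arrival order is nondeterministic, so sort for a stable result
	return values
}

func main() {
	root := &Node{Data: 1,
		Left:  &Node{Data: 2, Left: &Node{Data: 4}, Right: &Node{Data: 5}},
		Right: &Node{Data: 3, Left: &Node{Data: 6}, Right: &Node{Data: 7}},
	}
	fmt.Println(collect(root)) // prints [1 2 3 4 5 6 7]
}
```

Only the receiving goroutine appends to the slice, so no mutex is needed around it.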

You can find the code used in this article here on GitHub.

Supplementary Stuff

  1. https://medium.com/rungo/anatomy-of-goroutines-in-go-concurrency-in-go-a4cb9272ff88

  2. https://blog.golang.org/defer-panic-and-recover

  3. https://medium.com/@houzier.saurav/dfs-and-bfs-golang-d5818ec690d3

  4. https://medium.com/rungo/anatomy-of-channels-in-go-concurrency-in-go-1ec336086adb

Translated from: https://www.freecodecamp.org/news/supercharge-your-dfs-with-goroutines/
