Pitfalls When Using Channels

I have been working through the MIT 6.824 course recently, and in Part III of Lab 1 I ran into a channel-related bug, which I am recording here.

Code first:

package mapreduce

import (
    "fmt"
    "sync"
)

//
// schedule() starts and waits for all tasks in the given phase (mapPhase
// or reducePhase). the mapFiles argument holds the names of the files that
// are the inputs to the map phase, one per map task. nReduce is the
// number of reduce tasks. the registerChan argument yields a stream
// of registered workers; each item is the worker's RPC address,
// suitable for passing to call(). registerChan will yield all
// existing registered workers (if any) and new ones as they register.
//
func schedule(jobName string, mapFiles []string, nReduce int, phase jobPhase, registerChan chan string) {
    var ntasks int
    var n_other int // number of inputs (for reduce) or outputs (for map)
    switch phase {
    case mapPhase:
        ntasks = len(mapFiles)
        n_other = nReduce
    case reducePhase:
        ntasks = nReduce
        n_other = len(mapFiles)
    }

    fmt.Printf("Schedule: %v %v tasks (%d I/Os)\n", ntasks, phase, n_other)

    // All ntasks tasks have to be scheduled on workers. Once all tasks
    // have completed successfully, schedule() should return.
    //
    // Your code here (Part III, Part IV).
    //
    var wg sync.WaitGroup

    //workIndex := make(chan int)
    //
    //workIndex <- 0

    //flag := make(chan bool)
    //flag <- false

    for i := 0; i < ntasks; {
        select {
        case workAddress := <-registerChan:
            //fmt.Printf("WorkName : %s   the WorkNum : %d \n", workAddress, i)
            wg.Add(1)

            go func(workAddress string, jobName string, taskIndex int, numOtherPhase int) {
                //taskIndex := <- taskNumber

                oneTask := DoTaskArgs{
                    JobName:       jobName,
                    File:          mapFiles[taskIndex],
                    Phase:         phase,
                    TaskNumber:    taskIndex,
                    NumOtherPhase: numOtherPhase}

                ok := call(workAddress, "Worker.DoTask", oneTask, nil)
                //fmt.Printf("the %d task have return value \n", taskIndex)

                if ok {
                    i++
                    // Mark the task done before handing the worker back, so a
                    // blocked send cannot keep wg.Wait() from returning.
                    wg.Done()
                    registerChan <- workAddress
                } else {
                    fmt.Printf("Phase: %s work error , work index is %d \n", phase, i)
                }
            }(workAddress, jobName, i, n_other)

        default:
        }
    }

    //fmt.Println("jump jump !!")

    wg.Wait()

    fmt.Printf("Schedule: %v done\n", phase)
}

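This is the final version, without the bug. Its contents are exactly the same as my original buggy version; only the order of two statements differs. Here is the buggy version: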

                    if ok {
                        i++
                        registerChan <- workAddress
                        wg.Done()
                    } else {
                        fmt.Printf("Phase: %s work error , work index is %d \n", phase, i)
                    }

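Simply swapping the order of wg.Done() and registerChan <- workAddress fixed the bug. As you have probably spotted, the cause is that a send on a channel blocks until a receiver takes the value (or, for a buffered channel, until there is free buffer space). If registerChan <- workAddress comes before wg.Done(), then once the for loop has exited there is no <-registerChan left to take the value out, so the send blocks forever, wg.Done() is never executed, and wg.Wait() stays blocked as well.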

The commented-out channels in the first listing serve no purpose, but if you uncomment them the code likewise waits forever, for the same reason.

In short, be careful when using channels: every send needs a matching receive. Otherwise it is easy to end up with strange bugs (usually a wait that never ends).
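When a matching receive cannot be guaranteed, one common option is to give the sender a way out. The sketch below is only an illustration of that idea, not part of the lab: `returnWorker`, `workers`, and `quit` are hypothetical names, and the sender gives up once `quit` is closed.

```go
package main

import "fmt"

// returnWorker tries to hand addr back on workers, but gives up once quit is
// closed. It reports whether the value was actually delivered.
func returnWorker(workers chan<- string, quit <-chan struct{}, addr string) bool {
	select {
	case workers <- addr:
		return true
	case <-quit:
		return false
	}
}

func main() {
	workers := make(chan string)
	quit := make(chan struct{})

	// A receiver is present: the send is paired with a receive.
	got := make(chan string)
	go func() { got <- <-workers }()
	fmt.Println(returnWorker(workers, quit, "w1"), <-got)

	// No receiver any more: closing quit lets the sender walk away.
	close(quit)
	fmt.Println(returnWorker(workers, quit, "w2"))
}
```

With this shape, a goroutine that has nobody left to hand its value to returns instead of staying blocked on the send.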
