Fixing the flaky tests that became frequent after modifying ExecSync

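Before the change, the timeout parameter passed in was not handled. The code called GetExecConfig in a loop, sleeping with time.Sleep(100 * time.Millisecond) between calls, and used the Running field to decide whether the exec had finished.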

	// ExecSync executes a command in the container, and returns the stdout output.
	// If command exits with a non-zero exit code, an error is returned.
	func (c *CriManager) ExecSync(ctx context.Context, r *runtime.ExecSyncRequest) (*runtime.ExecSyncResponse, error) {
		// TODO: handle timeout.
		id := r.GetContainerId()

		createConfig := &apitypes.ExecCreateConfig{
			Cmd: r.GetCmd(),
		}

		execid, err := c.ContainerMgr.CreateExec(ctx, id, createConfig)
		if err != nil {
			return nil, fmt.Errorf("failed to create exec for container %q: %v", id, err)
		}

		var output bytes.Buffer
		startConfig := &apitypes.ExecStartConfig{}
		attachConfig := &AttachConfig{
			Stdout:    true,
			Stderr:    true,
			MemBuffer: &output,
		}

		err = c.ContainerMgr.StartExec(ctx, execid, startConfig, attachConfig)
		if err != nil {
			return nil, fmt.Errorf("failed to start exec for container %q: %v", id, err)
		}

		var execConfig *ContainerExecConfig
		for {
			execConfig, err = c.ContainerMgr.GetExecConfig(ctx, execid)
			if err != nil {
				return nil, fmt.Errorf("failed to inspect exec for container %q: %v", id, err)
			}
			// Loop until exec finished.
			if !execConfig.Running {
				break
			}
			time.Sleep(100 * time.Millisecond)
		}

		var stderr []byte
		if execConfig.Error != nil {
			stderr = []byte(execConfig.Error.Error())
		}

		return &runtime.ExecSyncResponse{
			Stdout:   output.Bytes(),
			Stderr:   stderr,
			ExitCode: int32(execConfig.ExitCode),
		}, nil
	}
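The TODO above leaves the timeout unhandled. As a rough illustration only, a polling loop can be bounded with a context deadline. This is a minimal sketch, not pouch's code: waitUntilStopped and simulateExec are hypothetical names, and the fake exec simply reports "running" for a fixed number of checks.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitUntilStopped polls isRunning every interval until it reports false,
// returns an error, or ctx is done (for example because a timeout expired).
func waitUntilStopped(ctx context.Context, interval time.Duration, isRunning func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		running, err := isRunning()
		if err != nil {
			return err
		}
		if !running {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// simulateExec is a hypothetical helper for this demo: the fake exec reports
// "running" for the first pollsUntilDone checks. It reports whether the wait
// ended because the timeout expired.
func simulateExec(timeoutMs int, pollsUntilDone int) (timedOut bool, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()

	checks := 0
	err = waitUntilStopped(ctx, 5*time.Millisecond, func() (bool, error) {
		checks++
		return checks <= pollsUntilDone, nil
	})
	if errors.Is(err, context.DeadlineExceeded) {
		return true, nil
	}
	return false, err
}

func main() {
	timedOut, err := simulateExec(2000, 3)
	fmt.Println("short exec timed out:", timedOut, "err:", err)

	timedOut, err = simulateExec(20, 1000000)
	fmt.Println("long exec timed out:", timedOut, "err:", err)
}
```

Even with a deadline, this approach still queries the daemon on every tick, which is the cost the change described next removes.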

After the change, the MemBuffer field in the AttachConfig struct was replaced with a pipe of type io.PipeWriter.

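io.Copy is now used to determine whether StartExec has finished. In effect, one end of the pipe is handed over to containerio, and the remaining read side is drained inside ExecSync into a buffer. io.Copy returns in two cases: first, it returns immediately when it hits an error; second, it also returns when it reads EOF, but in that case the returned error is nil.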
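The two return cases of io.Copy can be seen with a plain io.Pipe. The following is a minimal sketch that stands in for the containerio side with a goroutine; collectOutput and errExecFailed are hypothetical names used only for this illustration.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// errExecFailed is a hypothetical error used to show the failure path.
var errExecFailed = errors.New("exec failed")

// collectOutput writes chunks into the write end of a pipe from a separate
// goroutine (standing in for the container IO side) and drains the read end
// into a buffer. io.Copy returns only once the writer is closed: a plain
// close is seen as EOF and reported as a nil error, while closing with an
// error makes io.Copy return that error.
func collectOutput(chunks []string, closeErr error) (string, error) {
	reader, writer := io.Pipe()

	go func() {
		for _, chunk := range chunks {
			if _, err := io.WriteString(writer, chunk); err != nil {
				return
			}
		}
		// CloseWithError(nil) behaves like Close: the reader sees EOF.
		writer.CloseWithError(closeErr)
	}()

	var output bytes.Buffer
	_, err := io.Copy(&output, reader)
	return output.String(), err
}

func main() {
	out, err := collectOutput([]string{"hello ", "from exec\n"}, nil)
	fmt.Printf("stdout=%q err=%v\n", out, err)

	out, err = collectOutput([]string{"partial"}, errExecFailed)
	fmt.Printf("stdout=%q err=%v\n", out, err)
}
```

Note that a nil return from io.Copy only says the writer was closed. It says nothing about whether other state, such as the exit code, has been recorded yet.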

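This neatly solved the problem of pinging pouchd so frequently! But after this PR was merged, the CRI tests started to flake frequently, even on documentation-only PRs that had nothing to do with the code.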

• Failure [0.653 seconds]
[k8s.io] Security Context
/home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/framework/framework.go:72
  SeccompProfilePath
  /home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/validate/security_context.go:411
    runtime should support an seccomp profile that blocks setting hostname with SYS_ADMIN [It]
    /home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/validate/security_context.go:517

    cmd [hostname ANewHostName], stdout "hostname: sethostname: Operation not permitted\n", stderr ""
    Expected an error to have occurred.  Got:
        <nil>: nil

    /home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/validate/security_context.go:1046

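The source of the error is not in the CRI part but in pouchd. In the execSync method, the flow for determining whether the IO has finished is as follows: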


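So the fix is simply to move the IO-closing step in pouchd's execExitedAndRelease method to after the execConfig update. Otherwise, closing the IO can let io.Copy return before the exit code has been written, so ExecSync may read a stale execConfig. That matches the failure above, where the command printed "Operation not permitted" but no error was reported.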
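The ordering can be sketched with a small simulation. This is not pouch's actual code: execState, exitAndRelease, and runExec are hypothetical names, and the sketch only assumes that the caller waits on io.Copy and then reads the stored state.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// execState is a hypothetical stand-in for the exec config kept by the daemon.
type execState struct {
	mu       sync.Mutex
	running  bool
	exitCode int
}

func (s *execState) markExited(code int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.running = false
	s.exitCode = code
}

func (s *execState) snapshot() (running bool, exitCode int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.running, s.exitCode
}

// exitAndRelease records the exit status first and only then closes the IO.
// Closing the writer is what unblocks the reader, so everything the reader
// needs must already be stored at that point.
func exitAndRelease(s *execState, stdout *io.PipeWriter, code int) {
	s.markExited(code)
	stdout.Close()
}

type execResult struct {
	stdout   string
	running  bool
	exitCode int
}

// runExec simulates one exec: a goroutine writes the output and exits with
// code, while the caller waits on io.Copy before reading the state.
func runExec(output string, code int) (execResult, error) {
	state := &execState{running: true}
	reader, writer := io.Pipe()

	go func() {
		io.WriteString(writer, output)
		exitAndRelease(state, writer, code)
	}()

	var buf bytes.Buffer
	if _, err := io.Copy(&buf, reader); err != nil {
		return execResult{}, err
	}
	running, exitCode := state.snapshot()
	return execResult{stdout: buf.String(), running: running, exitCode: exitCode}, nil
}

func main() {
	res, err := runExec("hostname: sethostname: Operation not permitted\n", 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("stdout=%q running=%v exitCode=%d\n", res.stdout, res.running, res.exitCode)
}
```

If the two calls in exitAndRelease were swapped, the reader could wake up between them and see running still true and the exit code still 0, which is the kind of intermittent result a flaky test shows.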
