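I ended up digging into this because one of my programs never returned after calling the wait method of Popen. A Google search showed that wait can deadlock. To get to the bottom of the problem, I collected some material on it.
Here is someone else's example: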
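A problem I ran into today. In short: when Popen from the subprocess module is used to call an external program with the stdout or stderr argument set to a pipe, and the program writes more output than the operating system's pipe size, waiting for the program to finish with Popen.wait() in order to get its return code causes a deadlock, and the program hangs in the wait() call. The child blocks while writing to the full pipe, and the parent, blocked in wait(), never reads from it.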
The pipe size reported by ulimit -a is 4KB, but that is only the size of one page; after looking it up, the default pipe size on Linux is 64KB.
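The pipe size can also be measured rather than looked up. The following is a minimal sketch, assuming a POSIX system and Python 3.5 or later (for os.set_blocking); the function name measure_pipe_capacity is made up for this illustration. It fills a fresh pipe with non-blocking writes until the kernel refuses more data:

```python
import os


def measure_pipe_capacity(chunk_size=1024):
    """Return how many bytes fit into a fresh pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    # Non-blocking mode makes a full pipe raise BlockingIOError instead of hanging.
    os.set_blocking(write_fd, False)
    total = 0
    chunk = b"x" * chunk_size
    try:
        while True:
            total += os.write(write_fd, chunk)
    except BlockingIOError:
        pass  # the pipe is full
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total


if __name__ == "__main__":
    print("pipe capacity: %d bytes" % measure_pipe_capacity())
```

The reported value depends on the operating system and kernel settings, so it may differ from the 64KB mentioned above.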
Consider this example:
#!/usr/bin/env python
# coding: utf-8
# yc@2013/04/28
import subprocess


def test(size):
    print('start')
    # dd writes exactly `size` random bytes to stdout; its own status messages are discarded
    cmd = 'dd if=/dev/urandom bs=1 count=%d 2>/dev/null' % size
    p = subprocess.Popen(args=cmd, shell=True, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, close_fds=True)
    #p.communicate()
    p.wait()  # nothing reads the pipe, so this hangs once the pipe is full
    print('end')


# 64KB
test(64 * 1024)
# 64KB + 1B
test(64 * 1024 + 1)
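First, test the case where the output is exactly 64KB. dd produces exactly 64KB on standard output, is called through subprocess.Popen, and wait() is then used to wait for dd to finish. Both start and end are printed as expected. Then test the case with more than 64KB: this time only start is printed, which means execution is stuck in p.wait() and the program has deadlocked. The actual output is: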
start
end
start
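The hang can also be observed without blocking forever. The sketch below assumes Python 3.3 or later, where wait() accepts a timeout; the helper name wait_gets_stuck and the use of a child Python process in place of dd are assumptions made for this illustration.

```python
import subprocess
import sys


def wait_gets_stuck(size, timeout=2.0):
    """Return True if wait() times out because the child is blocked on a full pipe."""
    # The child writes `size` bytes to its stdout, which is a pipe nobody reads.
    child_code = "import sys; sys.stdout.write('x' * %d)" % size
    p = subprocess.Popen([sys.executable, "-c", child_code],
                         stdout=subprocess.PIPE)
    try:
        p.wait(timeout=timeout)
        stuck = False
    except subprocess.TimeoutExpired:
        p.kill()  # give up on the blocked child
        stuck = True
    p.communicate()  # drain the pipe, close it, and reap the child
    return stuck


if __name__ == "__main__":
    print(wait_gets_stuck(1024))         # small output fits into the pipe
    print(wait_gets_stuck(1024 * 1024))  # large output fills the pipe
```

The timeout only turns the hang into an exception so that it can be detected; it does not make wait() read from the pipe.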
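So how can the deadlock be avoided? The official documentation recommends Popen.communicate(). This method keeps reading the output into memory, so the pipe never fills up; the limit then depends on the amount of available memory, which is usually not a problem. If the program's return code is needed, read Popen.returncode after calling Popen.communicate().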
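A minimal Python 3 sketch of this approach follows. The helper name run_and_capture is made up for this illustration, and a child Python process stands in for dd so that the snippet does not depend on external tools.

```python
import subprocess
import sys


def run_and_capture(size):
    """Run a child that writes `size` bytes and return (returncode, bytes read)."""
    child_code = "import sys; sys.stdout.write('x' * %d)" % size
    p = subprocess.Popen([sys.executable, "-c", child_code],
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() reads until end-of-file and then waits for the child.
    out, _ = p.communicate()
    return p.returncode, len(out)


if __name__ == "__main__":
    # One byte more than the 64KB that made wait() hang in the example above.
    print(run_and_capture(64 * 1024 + 1))
```

Because communicate() drains the pipe while the child is still running, the child is never left blocked on a full pipe.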
Conclusion: when using subprocess.Popen with stdout or stderr set to a pipe, do not wait for the external program with Popen.wait(); use Popen.communicate() instead.
Popen.wait()
Wait for child process to terminate. Set and return returncode attribute.
Warning
This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
Popen.communicate(input=None)
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional input argument should be a string to be sent to the child process, or None, if no data should be sent to the child.
communicate() returns a tuple (stdoutdata, stderrdata).
Note that if you want to send data to the process’s stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than None in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.
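The stdin and result-tuple behavior described above can be sketched as follows, assuming Python 3, where the data passed to and returned by communicate() is bytes unless text mode is requested. The helper name shout and the child command are made up for this illustration.

```python
import subprocess
import sys


def shout(data):
    """Send `data` to a child that upper-cases its stdin; return (stdout, stderr)."""
    child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
    p = subprocess.Popen([sys.executable, "-c", child_code],
                         stdin=subprocess.PIPE,   # required to send input
                         stdout=subprocess.PIPE)  # required to get output back
    # stderr was not redirected, so the second element of the tuple is None.
    return p.communicate(data)


if __name__ == "__main__":
    print(shout(b"hello"))
```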
Note
The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
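When the output is too large to hold in memory, one possible workaround (not taken from the material quoted here) is to send the child's output to a file instead of a pipe. With no pipe involved, there is nothing to fill up, so wait() can be used. The helper name run_to_tempfile and the child command are made up for this Python 3 sketch.

```python
import subprocess
import sys
import tempfile


def run_to_tempfile(size):
    """Run a child that writes `size` bytes to a temp file; return (returncode, byte count)."""
    child_code = "import sys; sys.stdout.write('x' * %d)" % size
    with tempfile.TemporaryFile() as out:
        p = subprocess.Popen([sys.executable, "-c", child_code],
                             stdout=out, stderr=subprocess.STDOUT)
        returncode = p.wait()  # no pipe is involved, so this cannot block on a full buffer
        out.seek(0)
        total = 0
        # Process the output in chunks rather than loading it all at once.
        for chunk in iter(lambda: out.read(8192), b""):
            total += len(chunk)
    return returncode, total


if __name__ == "__main__":
    print(run_to_tempfile(200000))
```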
Two ways to use subprocess:
1) If you want the call to block until the child program has finished:
Depending on how you want to work your script you have two options. If you want the commands to block and not do anything while it is executing, you can just use subprocess.call.
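# start and block until done
# (">" inside an argument list is not treated as a shell redirect, so stdout is passed explicitly)
with open(diz['d'] + "/points.xml", "w") as out:
    subprocess.call([data["om_points"]], stdout=out)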
2) If you do not want to block:
If you want to do things while it is executing or feed things into stdin, you can use communicate after the popen call.
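# start and process things, then wait
# (stdout is passed explicitly; ">" in an argument list is not a shell redirect)
with open(diz['d'] + "/points.xml", "w") as out:
    p = subprocess.Popen([data["om_points"]], stdout=out)
    print("Happens while running")
    p.communicate()  # now wait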
As stated in the documentation, wait can deadlock, so communicate is advisable.
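The non-blocking pattern can be sketched in a self-contained Python 3 form. The helper name run_in_background and the child command are made up for this illustration; poll() returns None while the child is still running.

```python
import subprocess
import sys


def run_in_background():
    """Start a child, check on it while it runs, then collect its output."""
    child_code = "import time; time.sleep(0.2); print('done')"
    p = subprocess.Popen([sys.executable, "-c", child_code],
                         stdout=subprocess.PIPE)
    # Other work can happen here; poll() checks on the child without blocking.
    still_running = p.poll() is None
    out, _ = p.communicate()  # now read the output and wait for the child
    return still_running, out.strip(), p.returncode


if __name__ == "__main__":
    print(run_in_background())
```

If the child produces a lot of output before communicate() is called, it simply pauses on the full pipe and resumes once communicate() starts reading, so this ordering does not deadlock.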