Threading in Python: working with threads

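The original article targets Python 2.x and is written in English. While studying it, I ported the examples to Python 3.3, translated the text into Chinese, and added some of my own notes.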

Thread Objects


The simplest way to use a Thread is to instantiate it with a target function and call start() to let it begin working.

import threading
def worker():
    """thread worker function"""
    print ('Worker')
    return
threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()

Output:

>>>
WorkerWorkerWorkerWorkerWorker

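Something odd happens here: the output is not printed on separate lines. A likely cause is that print() writes the text and the trailing newline as two separate writes, so output from different threads can interleave between them.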
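One way to keep each line intact is to serialize the calls to print(). The sketch below is not from the original article: the names print_lock and finished are made up for illustration, and it assumes that a plain threading.Lock around print() is acceptable for a small demo.

```python
import threading

print_lock = threading.Lock()  # hypothetical helper: serializes access to stdout
finished = []                  # hypothetical record of the workers that ran


def worker():
    """Print one complete line while holding the lock."""
    with print_lock:
        print('Worker')
        finished.append('Worker')


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until every worker has printed its line
```

Only one thread at a time can be inside the with block, so the text and its newline are always written together.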


It is useful to be able to spawn a thread and pass it arguments to tell it what work to do. This example passes a number, which the thread then prints.

import threading
def worker(num):
    """thread worker function"""
    print ('Worker:%s'%num)
    return
threads = []
for i in range(5):
    t = threading.Thread(target=worker,args=(i,))
    #print(type(t.getName))
    #print(type(t.name))
    threads.append(t)
    t.start()

Output:

>>>
Worker:0Worker:2Worker:1Worker:3Worker:4

Again the output is not split across lines, which is strange.
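Besides args, the Thread constructor also accepts a kwargs dictionary for keyword arguments. The following sketch illustrates both; the label parameter and the results dictionary are assumptions added for the example and do not appear in the original article.

```python
import threading

results = {}  # hypothetical shared dict; each thread writes only its own key


def worker(num, label='Worker'):
    """Build a message from one positional and one keyword argument."""
    results[num] = '%s:%s' % (label, num)


threads = []
for i in range(3):
    # args is a tuple of positional arguments, kwargs a dict of keyword arguments
    t = threading.Thread(target=worker, args=(i,), kwargs={'label': 'Job'})
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(sorted(results.items()))
```

Storing results rather than printing them from each thread also sidesteps the interleaved output seen above.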




Determining the Current Thread


Using arguments to identify or name the thread is cumbersome and unnecessary. Each Thread instance has a name with a default value that can be changed as the thread is created. Naming threads is useful in server processes with multiple service threads handling different operations.

import time
import threading
def worker():
    print (threading.currentThread().getName()+'Starting\n')
    time.sleep(2)
    print (threading.currentThread().getName()+ 'Exiting\n')
def my_service():
    print (threading.currentThread().getName()+ 'Starting\n')
    time.sleep(3)
    print (threading.currentThread().getName()+ 'Exiting\n')
t = threading.Thread(name='my_service', target=my_service)
w = threading.Thread(name='worker', target=worker)
w2 = threading.Thread(target=worker) # use default name
w.start()
w2.start()
t.start()

To make the output easier to read, I had to add an extra newline character to each message.

The debug output includes the name of the current thread on each line.

The lines with "Thread-1" in the thread name column correspond to the unnamed thread w2, that is, the Thread object bound to the variable w2 in the code above, which was created without a name argument and therefore received the default name.

>>>
workerStarting
Thread-1Starting
my_serviceStarting
>>>
workerExiting
Thread-1Exiting
my_serviceExiting
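The same information is available through the current_thread() function and the name attribute, which are the snake_case spellings of currentThread() and getName(). A minimal sketch, with the seen list added purely for illustration:

```python
import threading

seen = []  # hypothetical list of names reported from inside each thread


def worker():
    # current_thread() returns the Thread object that is running this code
    seen.append(threading.current_thread().name)


named = threading.Thread(name='worker', target=worker)
unnamed = threading.Thread(target=worker)  # receives a default "Thread-N" style name

for t in (named, unnamed):
    t.start()
for t in (named, unnamed):
    t.join()

print(seen)
```

The exact form of the default name depends on the Python version, so the sketch relies only on the "Thread-" prefix.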


Most programs do not use print to debug. The logging module supports embedding the thread name in every log message using the formatter code %(threadName)s.

Including thread names in log messages makes it easier to trace those messages back to their source.

import logging
import threading
import time
logging.basicConfig(level=logging.DEBUG,
                    format='[%(levelname)s] (%(threadName)-10s) %(message)s',
                    )
def worker():
    logging.debug('Starting')
    time.sleep(2)
    logging.debug('Exiting')
def my_service():
    logging.debug('Starting')
    time.sleep(3)
    logging.debug('Exiting')
t = threading.Thread(name='my_service', target=my_service)
w = threading.Thread(name='worker', target=worker)
w2 = threading.Thread(target=worker) # use default name
w.start()
w2.start()
t.start()

logging is also thread-safe, so messages from different threads are kept distinct in the output.

>>>
[DEBUG] (worker    ) Starting
[DEBUG] (Thread-1  ) Starting
[DEBUG] (my_service) Starting
[DEBUG] (worker    ) Exiting
[DEBUG] (Thread-1  ) Exiting
[DEBUG] (my_service) Exiting
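basicConfig() configures the root logger and does nothing if the root logger already has handlers. As a hedged alternative, the sketch below attaches the same kind of format string to a dedicated logger and captures the output in memory so it can be inspected. The logger name threading_demo and the StringIO stream are assumptions made for the example.

```python
import io
import logging
import threading

stream = io.StringIO()  # in-memory target so the output can be checked
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('(%(threadName)-10s) %(message)s'))

logger = logging.getLogger('threading_demo')  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep the records away from the root logger


def worker():
    logger.debug('Starting')


t = threading.Thread(name='worker', target=worker)
t.start()
t.join()

print(stream.getvalue(), end='')
```

The -10s in the format pads the thread name to ten characters, which is what lines up the columns in the output above.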




Daemon vs. Non-Daemon Threads


Up to this point, the example programs have implicitly waited to exit until all threads have completed their work. Sometimes programs spawn a thread as a daemon that runs without blocking the main program from exiting. In other words, the program may exit while a daemon thread is still running, and the thread is then stopped abruptly. Using daemon threads is useful for services where there may not be an easy way to interrupt the thread or where letting the thread die in the middle of its work does not lose or corrupt data (for example, a thread that generates “heart beats” for a service monitoring tool). To mark a thread as a daemon, call its setDaemon() method with a boolean argument. The default is for threads to not be daemons, so passing True turns the daemon mode on.


import threading
import time
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
def daemon():
    logging.debug('Starting')
    time.sleep(2)
    logging.debug('Exiting')
d = threading.Thread(name='daemon', target=daemon)
d.setDaemon(True)
def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')
t = threading.Thread(name='non-daemon', target=non_daemon)
d.start()
t.start()

My output:

>>> (daemon    ) Starting
(non-daemon) Starting
(non-daemon) Exiting
(daemon    ) Exiting

The output shown in the original article:

$ python threading_daemon.py

(daemon    ) Starting
(non-daemon) Starting
(non-daemon) Exiting

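The original article explains:

Notice that the output does not include the "Exiting" message from the daemon thread, since all of the non-daemon threads (including the main thread) exit before the daemon thread wakes up from its two second sleep.

I am not sure whether something changed in Python 3.3; I will revise this once I find out. A likely explanation is that I ran the script from IDLE, where the interpreter process stays alive after the script finishes, so the daemon thread still gets the chance to wake up and log its "Exiting" message.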
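Since Python 3.3 the daemon flag can also be passed directly to the Thread constructor, and it can be read back through the daemon attribute. The sketch below only inspects the flag; the function name background and the short sleep are placeholders, and it assumes the code runs from a non-daemon thread such as the main thread.

```python
import threading
import time


def background():
    time.sleep(0.1)  # stand-in for real work


# daemon=True is equivalent to calling setDaemon(True) before start()
d = threading.Thread(name='daemon', target=background, daemon=True)

# Without the argument, a thread inherits the flag from the thread that creates it
t = threading.Thread(name='non-daemon', target=background)

print(d.daemon, t.daemon)

d.start()
d.join()
```

The flag has to be set before start() is called; changing it on a running thread raises RuntimeError.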


To wait until a daemon thread has completed its work, use the join() method.

import threading
import time
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
def daemon():
    logging.debug('Starting')
    time.sleep(2)
    logging.debug('Exiting')
d = threading.Thread(name='daemon', target=daemon)
d.setDaemon(True)
def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')
t = threading.Thread(name='non-daemon', target=non_daemon)
d.start()
t.start()
d.join()
t.join()

Waiting for the daemon thread to exit using join() means it has a chance to produce its "Exiting" message.

>>>
(daemon    ) Starting
(non-daemon) Starting
(non-daemon) Exiting
(daemon    ) Exiting


By default, join() blocks indefinitely. It is also possible to pass a timeout argument (a float representing the number of seconds to wait for the thread to become inactive). If the thread does not complete within the timeout period, join() returns anyway.

import threading
import time
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
def daemon():
    logging.debug('Starting')
    time.sleep(2)
    logging.debug('Exiting')
d = threading.Thread(name='daemon', target=daemon)
d.setDaemon(True)
def non_daemon():
    logging.debug('Starting')
    logging.debug('Exiting')
t = threading.Thread(name='non-daemon', target=non_daemon)
d.start()
t.start()
d.join(1)
print ('d.is_alive() ' + str(d.is_alive()))  # isAlive() was removed in Python 3.9
t.join()

Since the timeout passed is less than the amount of time the daemon thread sleeps, the thread is still “alive” after join() returns.

I also tried d.join(3). In that case the thread was no longer alive afterwards, because it had already finished before the timeout expired.
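join() always returns None, so the only way to tell whether it timed out is to check is_alive() afterwards. Here is a small sketch of both cases with shortened delays; the variable names are made up for the example.

```python
import threading
import time


def slow():
    time.sleep(0.5)  # shortened stand-in for the 2 second sleep


d = threading.Thread(name='daemon', target=slow, daemon=True)
d.start()

d.join(0.1)  # timeout shorter than the sleep: join() gives up first
alive_after_timeout = d.is_alive()

d.join()     # no timeout: blocks until the thread has finished
alive_after_full_join = d.is_alive()

print(alive_after_timeout, alive_after_full_join)
```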



Enumerating All Threads


It is not necessary to retain an explicit handle to all of the daemon threads in order to ensure they have completed before exiting the main process. enumerate() returns a list of active Thread instances. The list includes the current thread, and since joining the current thread is not allowed (it introduces a deadlock situation), it must be skipped.

import random
import threading
import time
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
def worker():
    """thread worker function"""
    t = threading.currentThread()
    pause = random.randint(1,5)
    logging.debug('sleeping %s', pause)
    time.sleep(pause)
    logging.debug('ending')
    return
for i in range(3):
    t = threading.Thread(target=worker)
    t.setDaemon(True)
    t.start()
main_thread = threading.currentThread()
for t in threading.enumerate():
    if t is main_thread:
        continue
    logging.debug('joining %s', t.getName())
    t.join()

Since the worker is sleeping for a random amount of time, the output from this program may vary. The original article shows output like this:

$ python threading_enumerate.py
(Thread-1  ) sleeping 3
(Thread-2  ) sleeping 2
(Thread-3  ) sleeping 5
(MainThread) joining Thread-1
(Thread-2  ) ending
(Thread-1  ) ending
(MainThread) joining Thread-3
(Thread-3  ) ending
(MainThread) joining Thread-2

But when I ran it, the output looked like this:

(Thread-1  ) sleeping 2
(Thread-2  ) sleeping 3
(Thread-3  ) sleeping 3
(MainThread) joining Thread-1
(Thread-1  ) ending
(MainThread) joining SockThread
(Thread-2  ) ending
(Thread-3  ) ending

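At first only "joining SockThread" appeared, and later "joining Thread-1" showed up unexpectedly as well. After analyzing the problem, searching, and asking more experienced people, my conclusion is that threading.enumerate() is the cause. It returns every thread that is alive in the process, not just the threads this script created. In my tests a thread named SockThread appeared in the list. Where did it come from? My guess was that it came from running the script in IDLE.

Running the script directly, without IDLE, produced output like the author's. So my guess is that SockThread is a thread that belongs to IDLE.

Lesson learned: use threading.enumerate() with care. If you need a list of your threads, it is safer to keep your own.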
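Keeping your own list can look like the sketch below. It is a variation on the example above rather than code from the original article: the workers list is the only addition, and the delays are shortened.

```python
import random
import threading
import time


def worker():
    """Sleep for a short random time, as in the example above."""
    time.sleep(random.uniform(0.01, 0.05))


workers = []  # only the threads this script created
for i in range(3):
    t = threading.Thread(target=worker, daemon=True)
    workers.append(t)
    t.start()

# No need to skip the main thread or any threads owned by the environment
for t in workers:
    t.join()

print([t.is_alive() for t in workers])
```

Because the list contains only threads the script started, threads created by IDLE or by other libraries are never joined by accident.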




Subclassing Thread

At startup, a Thread does some basic initialization and then calls its run() method, which calls the target function passed to the constructor. To create a subclass of Thread, override run() to do whatever is necessary.

import threading
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
class MyThread(threading.Thread):
    def run(self):
        logging.debug('running')
        return
for i in range(5):
    t = MyThread()
    t.start()

The return value of run() is ignored.

(Thread-1  ) running
(Thread-2  ) running
(Thread-3  ) running
(Thread-4  ) running
(Thread-5  ) running
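Because the return value is discarded, a common workaround is to store the result on the thread object and read it after join(). The class name ResultThread and its result attribute are assumptions made for this sketch, not part of the threading API.

```python
import threading


class ResultThread(threading.Thread):
    """Hypothetical subclass that keeps the outcome of run() on the instance."""

    result = None  # stays None until run() has finished

    def run(self):
        # A return statement here would be ignored, so store the value instead
        self.result = sum(range(10))


t = ResultThread()
t.start()
t.join()  # after join() returns, the result is safe to read

print(t.result)
```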


Because the args and kwargs values passed to the Thread constructor are saved in private variables, they are not easily accessed from a subclass. To pass arguments to a custom thread type, redefine the constructor to save the values in an instance attribute that can be seen in the subclass.

import threading
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
class MyThreadWithArgs(threading.Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None):
        threading.Thread.__init__(self, group=group, target=target, name=name)
        self.args = args
        self.kwargs = kwargs
        return
    def run(self):
        logging.debug('running with %s and %s', self.args, self.kwargs)
        return
for i in range(5):
    t = MyThreadWithArgs(args=(i,), kwargs={'a':i, 'b':'B'})
    t.start()

MyThreadWithArgs uses the same API as Thread, but another class could easily change the constructor method to take more or different arguments more directly related to the purpose of the thread, as with any other class.

(P.S.: args holds the positional arguments and kwargs holds the keyword arguments; args comes before kwargs.)

(Thread-1  ) running with (0,) and {'a': 0, 'b': 'B'}
(Thread-2  ) running with (1,) and {'a': 1, 'b': 'B'}
(Thread-3  ) running with (2,) and {'a': 2, 'b': 'B'}
(Thread-4  ) running with (3,) and {'a': 3, 'b': 'B'}
(Thread-5  ) running with (4,) and {'a': 4, 'b': 'B'}
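As an illustration of a constructor tailored to the thread's purpose, the sketch below takes its own arguments and forwards everything else to Thread. The class GreeterThread and its greeting, times, and lines names are invented for the example.

```python
import threading


class GreeterThread(threading.Thread):
    """Hypothetical thread type whose constructor takes task-specific arguments."""

    def __init__(self, greeting, times=1, **thread_kwargs):
        # Forward standard options such as name or daemon to Thread
        super().__init__(**thread_kwargs)
        self.greeting = greeting
        self.times = times
        self.lines = []

    def run(self):
        for _ in range(self.times):
            self.lines.append(self.greeting)


t = GreeterThread('hello', times=2, name='greeter')
t.start()
t.join()

print(t.name, t.lines)
```

Calling the base class constructor before start() is required; otherwise Thread raises RuntimeError.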


Timer Threads

One example of a reason to subclass Thread is provided by Timer, also included in threading. A Timer starts its work after a delay, and can be canceled at any point within that delay time period.

import threading
import time
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )
def delayed():
    logging.debug('worker running')
    return
t1 = threading.Timer(3, delayed)
t1.setName('t1')
t2 = threading.Timer(3, delayed)
t2.setName('t2')
logging.debug('starting timers')
t1.start()
t2.start()
logging.debug('waiting before canceling %s', t2.getName())
time.sleep(2)
logging.debug('canceling %s', t2.getName())
t2.cancel()
logging.debug('done')

Notice that the second timer is never run, and the first timer appears to run after the rest of the main program is done. Since it is not a daemon thread, it is joined implicitly when the main thread is done.

(MainThread) starting timers
(MainThread) waiting before canceling t2
(MainThread) canceling t2
(MainThread) done
>>> (t1        ) worker running

(I also tried changing the delay from 3 seconds to 1:)

t1 = threading.Timer(1, delayed)
t1.setName('t1')
t2 = threading.Timer(1, delayed)
t2.setName('t2')

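(The output is below. Both timers fire after 1 second, before the main thread's 2 second sleep ends, so cancel() comes too late to stop t2.)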

(MainThread) starting timers
(MainThread) waiting before canceling t2
(t2        ) worker running
(t1        ) worker running
(MainThread) canceling t2
(MainThread) done
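The effect of cancel() can be checked without reading log output by recording which timers actually fired. In this sketch the fired list and the label argument are additions for illustration, and the delay is shortened to 0.2 seconds.

```python
import threading

fired = []  # hypothetical record of the timers that ran their function


def delayed(label):
    fired.append(label)


# Timer(interval, function, args=...) calls function(*args) after interval seconds
t1 = threading.Timer(0.2, delayed, args=('t1',))
t2 = threading.Timer(0.2, delayed, args=('t2',))

t1.start()
t2.start()
t2.cancel()  # canceled well inside the delay, so delayed('t2') never runs

t1.join()
t2.join()    # a canceled timer thread finishes immediately

print(fired)
```

cancel() only has an effect while the timer is still waiting; once the function has started, it runs to completion.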



To be continued.


