Normal thread exit and resource reclamation

In a multi-threaded program I developed recently, I noticed that after a thread called pthread_exit() to terminate, the process's VSZ did not shrink; as more and more such threads exited, the VSZ kept growing.

At first I assumed the program was leaking memory somewhere, but after reviewing every call to new, I found no leak.

Analyzing with pmap showed that, compared with a run of the process in which no threads had exited, the following extra memory regions appeared; everything else was identical.

pmap 19661

...

00007f80eeb5c000      4K -----    [ anon ]
00007f80eeb5d000   8192K rwx--    [ anon ]

...

Examining these addresses in gdb showed no meaningful content.

valgrind also found nothing wrong:

valgrind --tool=memcheck --leak-check=full -v --track-origins=yes --log-file=val.log --track-fds=yes --time-stamp=yes --show-reachable=yes my_app

So I began to suspect that the thread's resources were not being released when it exited.

Searching online turned up some articles on chinaunix about safely releasing thread resources, which give the following advice:

If a thread is joinable, the main thread (or whichever thread is responsible for reaping threads) must call pthread_join() to reclaim it.

If you do not want the reaping thread to block, and instead want the system to reclaim the thread's resources automatically, i.e. without calling pthread_join(), then the thread must be detached.

Joinable vs. detached is set via pthread_attr_setdetachstate().

Since my reaping thread also has other work to do and cannot afford to block for long, and since timing pthread_join() showed that it could still wait up to 5 seconds even after the target thread had already exited, I ultimately chose pthread_exit() + detached rather than pthread_exit() + pthread_join().


Now back to the question of why 4K and 8M.

First, download the glibc source; pthread_create.c can be found in the nptl directory.

__pthread_create_2_1()->ALLOCATE_STACK()->allocate_stack()

/* Allocate some anonymous memory.  If possible use the cache.  */

-->get_cached_stack()

Attach gdb to the running application:

(gdb) p stack_cache       ===========> make sure gdb can load libc's symbol table here
$3 = {next = 0x7f80edb599c0, prev = 0x7f80ed3589c0}

(gdb) p sizeof(struct pthread)
$6 = 2304               ==================>4k

(gdb) p *(struct pthread *)0x7f80eeb5b700
$9 = {{header = {tcb = 0x7f80eeb5b700, dtv = 0x1ff1190, self = 0x7f80eeb5b700, multiple_threads = 1, gscope_flag = 0, sysinfo = 0, stack_guard = 16092494444486863360, pointer_guard = 1023798611218601545, 
      vgetcpu_cache = {0, 0}, private_futex = 128, rtld_must_xmm_save = 0, __private_tm = {0x0, 0x0, 0x0, 0x0, 0x0}, __unused2 = 0, rtld_savespace_sse = {{{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 
            0, 0}}, {{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, {{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, {{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, {{0, 0, 
            0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, {{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, {{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}, {{0, 0, 0, 0}, {0, 0, 0, 
            0}, {0, 0, 0, 0}, {0, 0, 0, 0}}}, __padding = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, __padding = {0x7f80eeb5b700, 0x1ff1190, 0x7f80eeb5b700, 0x1, 0x0, 0xdf54066f81962a00, 
      0xe3543699f391649, 0x0, 0x0, 0x80, 0x0 }}, list = {next = 0x7f80efb5d9c0, prev = 0x7f8106d82230}, tid = 8395, pid = 19661, robust_prev = 0x7f80eeb5b9e0, robust_head = {
    list = 0x7f80eeb5b9e0, futex_offset = -32, list_op_pending = 0x0}, cleanup = 0x0, cleanup_jmp_buf = 0x7f80eeb5af30, cancelhandling = 2, flags = 0, specific_1stblock = {{seq = 1, data = 0x11bbbc0}, {
      seq = 0, data = 0x0} }, specific = {0x7f80eeb5ba10, 0x0 }, specific_used = true, report_events = true, user_stack = false, stopped_start = false, 
  parent_cancelhandling = 0, lock = 0, setxid_futex = 0, cpuclock_offset = 2940773369682629, joinid = 0x0, result = 0x0, schedparam = {__sched_priority = 0}, schedpolicy = 0, 
  start_routine = 0x69e32f , arg = 0x11fa5b0, eventbuf = {eventmask = {event_bits = {0, 0}}, eventnum = TD_ALL_EVENTS, eventdata = 0x0}, nextevent = 0x0, exc = {
    exception_class = 0, exception_cleanup = 0, private_1 = 0, private_2 = 0}, stackblock = 0x7f80ee35b000, stackblock_size = 8392704, guardsize = 4096, reported_guardsize = 4096, tpp = 0x0, res = {
    retrans = 0, retry = 0, options = 0, nscount = 0, nsaddr_list = {{sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}, {sin_family = 0, sin_port = 0, 
        sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}, {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}}, id = 0, dnsrch = {0x0, 
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, defdname = '\000' , pfcode = 0, ndots = 0, nsort = 0, ipv6_unavail = 0, unused = 0, sort_list = {{addr = {s_addr = 0}, mask = 0}, {addr = {
          s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {
        addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}}, qhook = 0, rhook = 0, res_h_errno = 0, _vcsock = 0, _flags = 0, _u = {
      pad = '\000' , _ext = {nscount = 0, nsmap = {0, 0, 0}, nssocks = {0, 0, 0}, nscount6 = 0, nsinit = 0, nsaddrs = {0x0, 0x0, 0x0}, initstamp = 0}}}, end_padding = 0x7f80eeb5b700 ""}
(gdb) p ((struct pthread *)0x7f80eeb5b700)->stackblock_size
$10 = 8392704           ============>8M

Comparing against the regions dumped by pmap: the 4K no-access page plus the 8192K rwx region add up to exactly stackblock_size = 8392704 (8M + 4K), i.e. one thread stack block. The 4K ----- page matches guardsize = 4096 in the dump, so it is the stack guard page rather than the struct pthread itself; the 2304-byte struct pthread sits near the top of the 8M rwx stack region (note that self = 0x7f80eeb5b700 falls inside it). The 8M region is the thread's stack.

And judging from the addresses, all the stacks are handed out via stack_cache: freed stacks are cached for reuse rather than unmapped, which is why the stack addresses of successive threads are contiguous.

So where do these regions get released? Let's look at pthread_join():

pthread_join()->__free_tcb()->__deallocate_stack()

This is why, in some situations, pthread_join() must be called explicitly.

