Notes on a memory leak in a Go service

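pprof inuse_space looks normal

The usual first step when checking for a memory leak is to look at inuse_space and see whether each function's share matches expectations. That is normally enough to locate the leak.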

pprof goroutine count looks normal

Next, leaked goroutines are another common source of memory leaks.
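As a rough illustration of these two checks, the sketch below renders the heap and goroutine profiles in pprof's text format from inside the process. The helper name dumpProfile is made up for this example; in a real service the same data is usually fetched over HTTP through net/http/pprof. With debug level 1, the heap profile also ends with a "# runtime.MemStats" section like the one shown later in this post.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// dumpProfile (hypothetical helper) renders a named runtime profile such as
// "heap" or "goroutine" in pprof's human-readable text format (debug=1).
func dumpProfile(name string) (string, error) {
	p := pprof.Lookup(name)
	if p == nil {
		return "", fmt.Errorf("unknown profile %q", name)
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// A goroutine count that keeps climbing over time would point to a goroutine leak.
	fmt.Println("goroutines:", runtime.NumGoroutine())

	for _, name := range []string{"heap", "goroutine"} {
		out, err := dumpProfile(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s profile: %d bytes of text\n", name, len(out))
	}
}
```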

debug.FreeOSMemory() does not release it

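So the first hypothesis was that the Go GC was holding on to memory it had not yet returned to the OS.

Calling debug.FreeOSMemory() had no effect.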
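A minimal sketch of how that experiment can be measured, assuming it is acceptable to call debug.FreeOSMemory() in the running process. The memSnapshot type and the helper names are invented for this example. If HeapReleased grows while the process RSS stays flat, the memory is probably not held by the Go heap.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// memSnapshot (hypothetical type) keeps the few MemStats fields that matter here.
type memSnapshot struct {
	Sys, HeapIdle, HeapReleased uint64
}

// snapshot reads the current runtime memory statistics.
func snapshot() memSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return memSnapshot{Sys: m.Sys, HeapIdle: m.HeapIdle, HeapReleased: m.HeapReleased}
}

// freeAndMeasure forces a GC plus a best-effort return of idle heap memory
// to the OS, and reports the statistics before and after.
func freeAndMeasure() (before, after memSnapshot) {
	before = snapshot()
	debug.FreeOSMemory()
	after = snapshot()
	return before, after
}

func main() {
	before, after := freeAndMeasure()
	fmt.Printf("before: %+v\n", before)
	fmt.Printf("after:  %+v\n", after)
}
```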

Inspecting the Go process's memory with pmap

Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000   20568   12460       0 r-x-- game
0000000001816000   22524   14816       0 r---- game
0000000002e15000    1272     448     200 rw--- game
0000000002f53000     412     280     280 rw---   [ anon ]
000000c000000000 1146880  722488  722488 rw---   [ anon ]
000000c046000000   32768       0       0 -----   [ anon ]
00007f51ed034000 8227428 8223952 8223952 rw---   [ anon ]
00007f53e32d4000   11148   11092   11092 rw---   [ anon ]
00007f53e3dc1000    7432    7396    7396 rw---   [ anon ]
00007f53e4504000    5380    5360    5360 rw---   [ anon ]
00007f53e4a4b000    7948    7936    7936 rw---   [ anon ]
00007f53e5212000    1088    1084    1084 rw---   [ anon ]
00007f53e532a000   47176   15380   15380 rw---   [ anon ]
00007f53e813c000  263680       0       0 -----   [ anon ]
00007f53f82bc000       4       4       4 rw---   [ anon ]
00007f53f82bd000  293564       0       0 -----   [ anon ]
00007f540a16c000       4       4       4 rw---   [ anon ]
00007f540a16d000   36692       0       0 -----   [ anon ]
00007f540c542000       4       4       4 rw---   [ anon ]
00007f540c543000    4580       0       0 -----   [ anon ]
00007f540c9bc000       4       4       4 rw---   [ anon ]
00007f540c9bd000     508       0       0 -----   [ anon ]
00007f540ca3c000     384      72      72 rw---   [ anon ]
00007ffe19062000     132      16      16 rw---   [ stack ]
00007ffe191bc000       8       4       0 r-x--   [ anon ]
ffffffffff600000       4       0       0 r-x--   [ anon ]
---------------- ------- ------- -------
total kB         10131592 9022800 8995272
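  • The 000000c000000000 segment is an anonymous mapping (about 1.1 GiB mapped, about 705 MiB resident)
  • The 00007f51ed034000 segment is an anonymous mapping (about 7.8 GiB mapped, almost all of it resident)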

Viewing the ReadMemStats output through pprof

# runtime.MemStats
# Alloc = 511215184
# TotalAlloc = 2387040868864
# Sys = 9002755864
# Lookups = 0
# Mallocs = 62979597032
# Frees = 62974616503
# HeapAlloc = 511215184
# HeapSys = 1170604032
# HeapIdle = 566894592
# HeapInuse = 603709440
# HeapReleased = 407527424
# HeapObjects = 4980529
# Stack = 3801088 / 3801088
# MSpan = 7122184 / 14925824
# MCache = 19200 / 32768
# BuckHashSys = 17712073
# GCSys = 52772768
# OtherSys = 7742907311
# NextGC = 666518528
# LastGC = 1650076291400621539
# PauseNs = [84573 100742 95422 86156 133630 119480 157597 95889 102826 130018 142217 121599 101298 110370 82708 104168 138280 117345 107443 86141 127083 90278 170451 116930 107839 170532 140186 111303 117594 125248 99441 106593 148126 107836 107129 127699 163856 92907 125739 119035 166194 127290 102103 140246 136436 87921 145587 126129 111681 156016 118715 91237 99338 108100 102268 106736 113862 105178 107456 88706 132915 128570 117247 106522 283094 91266 131885 105878 184512 95340 130203 94400 87651 110036 88427 109461 147071 144883 100193 94076 103126 123305 96265 119695 100927 138258 110019 104490 117543 128856 149506 127457 112336 157489 164125 112896 116319 95642 99765 119875 113373 99349 152705 118192 105129 92584 103465 137815 110037 138332 124392 125516 95171 138386 83369 110704 105518 113375 99595 94450 105359 120784 110603 120853 104693 109312 94535 104162 148201 112623 96646 110248 95551 92161 139134 113652 100086 89162 148071 106154 116013 97700 176868 82888 166987 82671 124975 93754 155597 109755 111886 90802 113215 89200 126413 88021 211183 111504 112437 155757 170727 89700 95937 109558 93163 138129 142533 84487 144022 133514 109569 109482 123691 85565 104291 122424 126503 92026 98294 124989 118067 86417 121080 116200 140696 104853 127884 88646 168958 132182 96966 84681 117802 104542 170031 163697 139453 104788 105676 122178 110008 148136 109332 104812 111864 124278 92567 130310 128673 116356 84261 98960 119525 94409 107832 90657 112062 102394 138133 90932 89751 134503 113936 282115 123879 114994 173142 117095 115490 148489 122029 116195 99539 127257 109065 124970 112145 87292 108048 109239 103802 110349 125632 88913 86671 111746 123310 103824 120306 116051 113493 134173 128964 129422 110302 150610]
# PauseEnd
# NumGC = 7380
# NumForcedGC = 26
# GCCPUFraction = 0.0012668645429657324
# DebugGC = false
# MaxRSS = 8586375168

Looking closely, the culprit is OtherSys = 7742907311: OtherSys allocations hold about 7.7 GB, most of Sys = 9002755864.

Here is the documentation for OtherSys:

// OtherSys is bytes of memory in miscellaneous off-heap
// runtime allocations.

In other words, OtherSys is memory the Go runtime allocates outside the heap for miscellaneous purposes.
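To make the proportion concrete, here is a small sketch that breaks Sys down into its per-category fields. The sysBreakdown type and its methods are invented for this example; the numbers in main are copied from the MemStats dump above, and they add up exactly to the reported Sys. OtherSys accounts for roughly 86% of it.

```go
package main

import (
	"fmt"
	"runtime"
)

// sysBreakdown (hypothetical type) holds the per-category "Sys" fields of runtime.MemStats.
type sysBreakdown struct {
	Heap, Stack, MSpan, MCache, BuckHash, GC, Other uint64
}

// fromMemStats copies the relevant fields out of a runtime.MemStats value.
func fromMemStats(m *runtime.MemStats) sysBreakdown {
	return sysBreakdown{
		Heap: m.HeapSys, Stack: m.StackSys, MSpan: m.MSpanSys, MCache: m.MCacheSys,
		BuckHash: m.BuckHashSys, GC: m.GCSys, Other: m.OtherSys,
	}
}

// Total is the sum of all categories; MemStats.Sys is documented as this sum.
func (b sysBreakdown) Total() uint64 {
	return b.Heap + b.Stack + b.MSpan + b.MCache + b.BuckHash + b.GC + b.Other
}

// OtherShare is the fraction of the total held by OtherSys (0 if the total is 0).
func (b sysBreakdown) OtherShare() float64 {
	total := b.Total()
	if total == 0 {
		return 0
	}
	return float64(b.Other) / float64(total)
}

func main() {
	// Values taken from the dump in this post.
	post := sysBreakdown{
		Heap: 1170604032, Stack: 3801088, MSpan: 14925824, MCache: 32768,
		BuckHash: 17712073, GC: 52772768, Other: 7742907311,
	}
	fmt.Printf("post: total=%d other=%.1f%%\n", post.Total(), 100*post.OtherShare())

	// The same breakdown for the current process.
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	live := fromMemStats(&m)
	fmt.Printf("live: total=%d other=%.1f%%\n", live.Total(), 100*live.OtherShare())
}
```

In a healthy process the OtherSys share is typically small, so a value this large is a hint to look at runtime features rather than at application allocations.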

That was baffling. Was Go itself at fault?

runtime.StartTrace

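The investigation finally traced it to sloppy code: in some cases it called runtime.StartTrace without ever calling runtime.StopTrace, so the Go runtime itself kept allocating memory.

That memory does not show up in pprof inuse_space.

Here is the doc comment on runtime.StartTrace: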

// StartTrace enables tracing for the current process.
// While tracing, the data will be buffered and available via ReadTrace.
// StartTrace returns an error if tracing is already enabled.
// Most clients should use the runtime/trace package or the testing package’s
// -test.trace flag instead of calling StartTrace directly.

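"While tracing, the data will be buffered": for as long as tracing is on, trace data keeps being buffered, and that memory is allocated off-heap rather than counted in the heap. This is why pprof was blind to it.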
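The offending code is not shown in this post, so the following is only a sketch of the safer pattern, not the actual fix. It uses the runtime/trace package, which the comment above recommends over calling StartTrace directly, and pairs every Start with a deferred Stop. The helper name runTraced is invented for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/trace"
)

// runTraced (hypothetical helper) runs fn with the execution tracer enabled
// and always stops the tracer afterwards. It returns the number of trace
// bytes that were written.
func runTraced(fn func()) (int, error) {
	var buf bytes.Buffer
	// Start fails if tracing is already enabled, e.g. by an earlier unpaired Start.
	if err := trace.Start(&buf); err != nil {
		return 0, err
	}
	func() {
		// Deferred so the tracer is stopped even if fn panics.
		defer trace.Stop()
		fn()
	}()
	// Stop returns only after all trace writes have completed,
	// so reading the buffer here is safe.
	return buf.Len(), nil
}

func main() {
	n, err := runTraced(func() {
		sum := 0
		for i := 0; i < 1000; i++ {
			sum += i
		}
		_ = sum
	})
	if err != nil {
		fmt.Println("trace error:", err)
		return
	}
	fmt.Printf("collected %d bytes of trace data\n", n)
}
```

Because the tracer is stopped on every path, a second call to runTraced succeeds instead of returning the "already enabled" error, and trace buffers do not accumulate.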

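Other notes

Finally, some silent cursing at the colleague who wrote this code: a few lines dashed off casually turned into a large debugging cost later.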

When typing in an unfamiliar API, two reactions should be instinctive:

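  1. How is teardown handled? (As in C++: when you declare a pointer, the first thing that should come to mind is where that pointer gets freed.)
  2. What effect does this API have on the program while it is running?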

That's all.
