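Android's Low Memory Killer (LMK) is a memory-management mechanism derived from the OOM killer of the standard Linux kernel. When the system runs short of memory, it kills expendable processes to reclaim their memory. Two criteria decide which process is expendable: its oom_adj and the amount of memory it occupies.
oom_adj expresses a process's priority: the higher the value, the lower the priority and the more readily the process is killed. Each oom_adj level can be paired with a free-memory threshold. From time to time the Android kernel checks whether free memory has fallen below one of these thresholds. If it has, the expendable process with the largest oom_adj is killed; if several processes share that value, the one occupying the most memory is killed first. This repeats until free memory is back above the threshold.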
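The LowMemoryKiller settings are stored mainly in two files:
/sys/module/lowmemorykiller/parameters/adj
/sys/module/lowmemorykiller/parameters/minfree
The adj file holds the adj levels at which the system kills processes; the values in minfree are in memory pages (a page is usually 4 KB). When free memory drops below one of the thresholds, processes at or above the corresponding adj are killed. For example, if the minfree entry paired with adj 900 is 102400 pages (4 KB * 102400), apps with adj 900 are reclaimed first once free memory falls below that amount.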
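As a rough illustration of how the two files pair up, the sketch below parses comma-separated adj and minfree values and converts each minfree entry from pages to kilobytes. The class name `LmkParams`, its helpers, and the sample values in `main` are hypothetical; the 4 KB page size is the usual value mentioned above, not something read from the device.

```java
import java.util.Arrays;

final class LmkParams {
    // Assumption: 4 KB pages, as stated in the text above.
    static final int PAGE_SIZE_KB = 4;

    /** Parses a comma-separated sysfs value such as "0,100,200". */
    static int[] parseCsv(String line) {
        return Arrays.stream(line.trim().split(","))
                .mapToInt(s -> Integer.parseInt(s.trim()))
                .toArray();
    }

    /** Converts a minfree entry (in pages) to kilobytes. */
    static long pagesToKb(int pages) {
        return (long) pages * PAGE_SIZE_KB;
    }

    public static void main(String[] args) {
        // Hypothetical file contents, for illustration only.
        int[] adj = parseCsv("0,100,200,300,900,906");
        int[] minfree = parseCsv("18432,23040,27648,32256,55296,80640");
        for (int i = 0; i < adj.length; i++) {
            System.out.println("free < " + pagesToKb(minfree[i])
                    + " KB -> kill adj >= " + adj[i]);
        }
    }
}
```

On a real device the two strings would come from reading the sysfs files listed above, which normally requires root access.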
The adj and minfree thresholds are set by updateOomLevels():
private void updateOomLevels(int displayWidth, int displayHeight, boolean write)
// Scale buckets from avail memory: at 300MB we use the lowest values t
.............
if (write) {
ByteBuffer buf = ByteBuffer.allocate(4 * (2*mOomAdj.length + 1));
buf.putInt(LMK_TARGET);
for (int i=0; i
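Functions that adjust a process's adj:
The most commonly used of these is computeOomAdjLocked, which the other methods call whenever they need to update adj.
updateOomAdjLocked calls computeOomAdjLocked and then applyOomAdjLocked. What actually gets set is the process's oom_score_adj, which is written to a file under /proc/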
private final boolean updateOomAdjLocked(ProcessRecord app, int cachedAdj,
ProcessRecord TOP_APP, boolean doingAll, long now) {
if (app.thread == null) {
return false;
}
computeOomAdjLocked(app, cachedAdj, TOP_APP, doingAll, now);
return applyOomAdjLocked(app, doingAll, now, SystemClock.elapsedRealtime());
}
private final boolean applyOomAdjLocked(ProcessRecord app, boolean doingAll, long now,
long nowElapsed) {
boolean success = true;
// Copy curRawAdj into setRawAdj
if (app.curRawAdj != app.setRawAdj) {
app.setRawAdj = app.curRawAdj;
}
if (app.curAdj != app.setAdj) {
// Send the app's adj value to the lmkd daemon
ProcessList.setOomAdj(app.pid, app.info.uid, app.curAdj);
app.setAdj = app.curAdj;
}
.....
}
ProcessList defines the numeric value of each OOM priority level:
/**
* Activity manager code dealing with processes.
*/
public final class ProcessList {
// OOM adjustments for processes in various states:
// Uninitialized value for any major or minor adj fields
static final int INVALID_ADJ = -10000;
// Adjustment used in certain places where we don't know it yet.
// (Generally this is something that is going to be cached, but we
// don't know the exact value in the cached range to assign yet.)
// In short: a process about to be cached whose exact value is not yet known
static final int UNKNOWN_ADJ = 1001;
// This is a process only hosting activities that are not visible,
// so it can be killed without any disruption.
static final int CACHED_APP_MAX_ADJ = 906;
static final int CACHED_APP_MIN_ADJ = 900;
// The B list of SERVICE_ADJ -- these are the old and decrepit
// services that aren't as shiny and interesting as the ones in the A list.
static final int SERVICE_B_ADJ = 800;
// This is the process of the previous application that the user was in.
// This process is kept above other things, because it is very common to
// switch back to the previous app. This is important both for recent
// task switch (toggling between the two top recent apps) as well as normal
// UI flow such as clicking on a URI in the e-mail app to view in the browser,
// and then pressing back to return to e-mail.
static final int PREVIOUS_APP_ADJ = 700;
// This is a process holding the home application -- we want to try
// avoiding killing it, even if it would normally be in the background,
// because the user interacts with it so much.
static final int HOME_APP_ADJ = 600;
// This is a process holding an application service -- killing it will not
// have much of an impact as far as the user is concerned.
static final int SERVICE_ADJ = 500;
// This is a process with a heavy-weight application. It is in the
// background, but we want to try to avoid killing it. Value set in
// system/rootdir/init.rc on startup.
// In short: a heavy-weight background process
static final int HEAVY_WEIGHT_APP_ADJ = 400;
// This is a process currently hosting a backup operation. Killing it
// is not entirely fatal but is generally a bad idea.
static final int BACKUP_APP_ADJ = 300;
// This is a process only hosting components that are perceptible to the
// user, and we really want to avoid killing them, but they are not
// immediately visible. An example is background music playback.
// In short: a perceptible process, e.g. background music playback
static final int PERCEPTIBLE_APP_ADJ = 200;
// This is a process only hosting activities that are visible to the
// user, so we'd prefer they don't disappear.
static final int VISIBLE_APP_ADJ = 100;
static final int VISIBLE_APP_LAYER_MAX = PERCEPTIBLE_APP_ADJ - VISIBLE_APP_ADJ - 1;
// This is the process running the current foreground app. We'd really
// rather not kill it!
static final int FOREGROUND_APP_ADJ = 0;
// This is a process that the system or a persistent process has bound to,
// and indicated it is important.
// In short: a process bound by the system or by a persistent process
static final int PERSISTENT_SERVICE_ADJ = -700;
// This is a system persistent process, such as telephony. Definitely
// don't want to kill it, but doing so is not completely fatal.
static final int PERSISTENT_PROC_ADJ = -800;
// The system process runs at the default adjustment.
static final int SYSTEM_ADJ = -900;
// Special code for native processes that are not being managed by the system (so
// don't have an oom adj assigned by the system).
static final int NATIVE_ADJ = -1000;
// Memory pages are 4K.
static final int PAGE_SIZE = 4*1024;
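The kernel OOM mechanism has three key parameters: oom_adj, oom_score_adj, and oom_score.
Every process has all three. On Linux, when memory is low, the system uses these values to decide which process to kill.
- oom_adj: the process's priority. The larger the value, the lower the priority and the more readily the process is killed. Range: [-17, 15], where -17 exempts the process from OOM killing.
- oom_score_adj: range [-1000, 1000]. This is the value that AMS sets.
- oom_score: it does not appear to be used anywhere in the LMK policy; presumably only the kernel OOM killer uses it. To raise a process's priority and keep it from being killed, its oom_score_adj has to be lowered.
Newer kernels no longer use oom_adj and rely on oom_score_adj instead; oom_adj is kept only for backward compatibility.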
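The relationship between the legacy oom_adj scale and oom_score_adj can be sketched as a simple linear mapping. The version below mirrors the conversion the kernel applies when the old oom_adj file is written, as far as I recall it; the class name `OomAdjCompat` is hypothetical, and the constants are the kernel's OOM_DISABLE, OOM_ADJUST_MAX and OOM_SCORE_ADJ_MAX values.

```java
final class OomAdjCompat {
    static final int OOM_DISABLE = -17;        // legacy value that exempts a process
    static final int OOM_ADJUST_MAX = 15;      // largest legacy oom_adj
    static final int OOM_SCORE_ADJ_MAX = 1000; // largest oom_score_adj

    /** Scales a legacy oom_adj in [-17, 15] to oom_score_adj in [-1000, 1000]. */
    static int toOomScoreAdj(int oomAdj) {
        if (oomAdj == OOM_ADJUST_MAX) {
            // The top of the old scale maps exactly to the top of the new one.
            return OOM_SCORE_ADJ_MAX;
        }
        // Integer division truncates toward zero, as it does in C.
        return (oomAdj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
    }

    public static void main(String[] args) {
        for (int adj : new int[] {-17, 0, 8, 15}) {
            System.out.println(adj + " -> " + toOomScoreAdj(adj));
        }
    }
}
```

Because of the integer division, the mapping is not exactly reversible, which is one reason to work with oom_score_adj directly.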
For Android's lowmemorykiller, it is mainly the AMS service that dynamically updates the OOM levels to adjust the priority of app processes.
How AMS sets oom_score_adj
frameworks\base\services\core\java\com\android\server\am\ProcessList.java
public static final void setOomAdj(int pid, int uid, int amt) {
long start = SystemClock.elapsedRealtime();
ByteBuffer buf = ByteBuffer.allocate(4 * 4);
buf.putInt(LMK_PROCPRIO);
buf.putInt(pid);
buf.putInt(uid);
buf.putInt(amt);
writeLmkd(buf);
……….
}
writeLmkd talks to lmkd over a socket, using the LocalSocket mechanism:
sLmkdSocket = new LocalSocket(LocalSocket.SOCKET_SEQPACKET);
sLmkdSocket.connect(
new LocalSocketAddress("lmkd", LocalSocketAddress.Namespace.RESERVED));
sLmkdOutputStream = sLmkdSocket.getOutputStream();
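To make the wire format concrete, here is a minimal sketch of the 16-byte message that setOomAdj builds: four 32-bit integers in ByteBuffer's default big-endian order. The class `LmkdPacket` and the helper `procPrio` are hypothetical, and the numeric value of LMK_PROCPRIO is an assumption, since the code above only shows the constant's name. No socket is opened here.

```java
import java.nio.ByteBuffer;

final class LmkdPacket {
    // Assumption: command code for "set process priority"; the source
    // shows only the constant's name, not its value.
    static final int LMK_PROCPRIO = 1;

    /** Builds the 4-int message: command, pid, uid, oom_score_adj. */
    static byte[] procPrio(int pid, int uid, int oomScoreAdj) {
        ByteBuffer buf = ByteBuffer.allocate(4 * 4); // big-endian by default
        buf.putInt(LMK_PROCPRIO);
        buf.putInt(pid);
        buf.putInt(uid);
        buf.putInt(oomScoreAdj);
        return buf.array();
    }

    public static void main(String[] args) {
        // Hypothetical pid/uid, for illustration only.
        byte[] msg = procPrio(1234, 10057, 900);
        ByteBuffer view = ByteBuffer.wrap(msg);
        System.out.println("bytes=" + msg.length
                + " cmd=" + view.getInt(0)
                + " pid=" + view.getInt(4)
                + " uid=" + view.getInt(8)
                + " adj=" + view.getInt(12));
    }
}
```

A SOCK_SEQPACKET socket preserves message boundaries, so each such buffer arrives at the receiver as one complete command.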
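The lowmemorykiller driver
The lowmemorykiller driver lives in drivers/staging/android/lowmemorykiller.c.
LMK is implemented by registering a shrinker. The shrinker is the Linux kernel's standard mechanism for reclaiming pages and is driven by the kernel thread kswapd; see kswapd in mm/vmscan.c. Alternatively, when an app allocates memory and finds that too little is available, the kernel blocks the allocating process and enters the slow path of the allocator to reclaim memory (including ZRAM compression); see __alloc_pages_slowpath in mm/page_alloc.c.
The core idea of LMK:
Among the processes with the largest oom_score_adj, pick the one with the largest RSS as the process to kill. The implementation is lowmem_scan in lowmemorykiller.c.
lowmem_scan in lowmemorykiller.c is registered on the shrinker list of vmscan (kernel/mm/vmscan.c).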
static struct shrinker lowmem_shrinker = {
.scan_objects = lowmem_scan,
.count_objects = lowmem_count,
.seeks = DEFAULT_SEEKS * 16
};
static int __init lowmem_init(void)
{ …………
register_shrinker(&lowmem_shrinker);
}
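Then, whenever the Linux memory-management thread kswapd is scheduled, it triggers lowmem_scan through kswapd_shrink_node -> …… -> scan_objects, which scans memory and runs the low memory killer.
lowmem_scan uses the system's current free memory and each process's oom_score_adj to decide which process will be killed.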
int other_free = global_page_state(NR_FREE_PAGES) - totalreserve_pages;
int other_file = global_node_page_state(NR_FILE_PAGES) -
global_node_page_state(NR_SHMEM) -
global_node_page_state(NR_UNEVICTABLE) -
total_swapcache_pages();
………….
for (i = 0; i < array_size; i++) {
minfree = lowmem_minfree[i];
// Trigger condition: free size < minfree and cache size < minfree
if (other_free < minfree && other_file < minfree) {
if (to_be_aggressive != 0 && i > 3) {
i -= to_be_aggressive;
if (i < 3)
i = 3;
}
min_score_adj = lowmem_adj[i];
break;
}
}
other_free roughly corresponds to the free size in /proc/meminfo; other_file roughly corresponds to the cache size in /proc/meminfo.
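The threshold lookup can be restated outside the kernel as a small function: walk the minfree table from the lowest level up and return the adj of the first entry that both other_free and other_file fall below. This is a simplified sketch that leaves out the to_be_aggressive adjustment shown above; the class name `LmkThreshold` is hypothetical, and the "nothing to kill" sentinel of 1001 assumes the driver starts from OOM_SCORE_ADJ_MAX + 1.

```java
final class LmkThreshold {
    // Assumption: sentinel one above the largest oom_score_adj,
    // meaning that no process qualifies for killing.
    static final int NO_KILL = 1001;

    /**
     * Returns the minimum oom_score_adj a process must have to be killed.
     * All memory quantities are in pages.
     */
    static int minScoreAdj(int[] minfree, int[] adj, long otherFree, long otherFile) {
        int n = Math.min(minfree.length, adj.length);
        for (int i = 0; i < n; i++) {
            // Both free memory and file cache must be below the threshold.
            if (otherFree < minfree[i] && otherFile < minfree[i]) {
                return adj[i];
            }
        }
        return NO_KILL;
    }

    public static void main(String[] args) {
        // Hypothetical tables, for illustration only.
        int[] minfree = {1000, 2000, 3000};
        int[] adj = {0, 100, 900};
        System.out.println(minScoreAdj(minfree, adj, 2500, 2500));
        System.out.println(minScoreAdj(minfree, adj, 500, 5000));
    }
}
```

Note that a large file cache alone is enough to suppress killing, even when free memory is very low, because both conditions must hold.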
// Among the processes with the largest oom_score_adj, pick the one with the largest RSS.
selected_oom_score_adj = min_score_adj;
for_each_process(tsk) {
………..
if (selected) {
if (oom_score_adj < selected_oom_score_adj)
continue;
if (oom_score_adj == selected_oom_score_adj && tasksize <= selected_tasksize)
continue;
}
selected = p;
selected_tasksize = tasksize;
selected_oom_score_adj = oom_score_adj;
lowmem_print(2, "select '%s' (%d), adj %hd, size %d, to kill\n", p->comm, p->pid, oom_score_adj, tasksize);
}
…………….
send_sig(SIGKILL, selected, 0);
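The selection rule in the loop above can be sketched in user-space Java as follows: keep the candidate with the highest oom_score_adj and break ties by the larger RSS. The class `LmkVictim`, the `Proc` type, and the sample processes are hypothetical. The two early `continue` checks (below the threshold, or no resident memory) stand in for the part of the kernel loop that is elided above, so they are an assumption rather than a transcription.

```java
import java.util.ArrayList;
import java.util.List;

final class LmkVictim {
    /** Hypothetical stand-in for the fields the driver reads per task. */
    static final class Proc {
        final String name;
        final int oomScoreAdj;
        final long rssPages;

        Proc(String name, int oomScoreAdj, long rssPages) {
            this.name = name;
            this.oomScoreAdj = oomScoreAdj;
            this.rssPages = rssPages;
        }
    }

    /** Returns the process to kill, or null if none qualifies. */
    static Proc select(List<Proc> procs, int minScoreAdj) {
        Proc selected = null;
        for (Proc p : procs) {
            // Assumed pre-filters (elided in the excerpt above).
            if (p.oomScoreAdj < minScoreAdj || p.rssPages <= 0) {
                continue;
            }
            if (selected != null) {
                if (p.oomScoreAdj < selected.oomScoreAdj) {
                    continue; // lower adj means more important
                }
                if (p.oomScoreAdj == selected.oomScoreAdj
                        && p.rssPages <= selected.rssPages) {
                    continue; // same adj: prefer the larger RSS
                }
            }
            selected = p;
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Proc> procs = new ArrayList<>();
        procs.add(new Proc("cached.a", 900, 100));
        procs.add(new Proc("cached.b", 906, 50));
        procs.add(new Proc("cached.c", 906, 80));
        procs.add(new Proc("foreground", 0, 1000));
        Proc victim = select(procs, 900);
        System.out.println(victim == null ? "nothing to kill" : "kill " + victim.name);
    }
}
```

With these sample values the foreground process is never considered, despite having the largest RSS, because its oom_score_adj is below the threshold.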
When memory is reclaimed
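Memory has three watermarks: min, low, and high. kswapd is woken when free memory falls below the low watermark and reclaims until free memory is back above the high watermark. If kswapd reclaims more slowly than memory is consumed and free memory drops to the min watermark, direct reclaim starts and blocks the application.
The path for swapping memory out to the swap partition: kswapd() --> balance_pgdat() --> shrink_zone() --> shrink_inactive_list()
Setting the watermarks
Each zone has its own watermarks. The min watermark can be set through /proc/sys/vm/min_free_kbytes; this parameter determines the value of watermark[min] for every zone in the system. The kernel then derives each zone's low and high watermarks from min and from the size of the zone. The relevant code is __setup_per_zone_wmarks in /mm/page_alloc.c.
The watermarks of a zone can be inspected with the following command:
XXXXX:/ # cat /proc/zoneinfo
Node 0, zone DMA
per-node stats
………….
pages free 285389
min 1392
low 2054
high 2716
node_scanned 0
…………..
Finally: to raise a process's priority and keep it from being killed, its oom_score_adj has to be lowered. When an activity or a service is created, started, or finished, the call chain is AMS.applyOomAdjLocked ==> ProcessList.setOomAdj ==> lmkd updates /proc/pid/oom_score_adj.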