omap-panda-3.0 tickless bug and its fix

HaiPeng ([email protected])

I. How the Linux kernel accounts CPU utilization

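Reading the source of top and vmstat, the tools most widely used on Linux for this purpose, shows that they obtain CPU utilization by reading /proc/stat and post-processing the data. The utilization they report is therefore based on the timer interrupt: each time a tick fires, the accounting functions account_user_time, account_system_time, account_idle_time and so on charge the elapsed time to the current process's utime or stime (stored in its task_struct) and add it to the per-CPU struct cpu_usage_stat.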
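To make the tick-based calculation concrete, here is a minimal user-space sketch of how a tool might turn two samples of the aggregate "cpu" line in /proc/stat into a busy percentage. The names cpu_sample, parse_cpu_line and busy_percent are hypothetical, and counting iowait as non-busy time is an assumption; this is not the actual top or vmstat code.

```c
#include <assert.h>
#include <stdio.h>

/* One sample of the aggregate "cpu" line in /proc/stat, in clock ticks.
 * Hypothetical type for illustration only. */
struct cpu_sample {
    unsigned long long user, nice, system, idle, iowait, irq, softirq;
};

/* Parse the aggregate line, e.g. "cpu  100 0 50 850 0 0 0".
 * Returns 0 on success, -1 if the line does not have this shape.
 * Per-CPU lines ("cpu0 ...") are not handled by this sketch. */
static int parse_cpu_line(const char *line, struct cpu_sample *s)
{
    assert(line != NULL && s != NULL);
    int n = sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu",
                   &s->user, &s->nice, &s->system, &s->idle,
                   &s->iowait, &s->irq, &s->softirq);
    return n == 7 ? 0 : -1;
}

/* Sum of all tick counters in one sample. */
static unsigned long long total_ticks(const struct cpu_sample *s)
{
    return s->user + s->nice + s->system + s->idle +
           s->iowait + s->irq + s->softirq;
}

/* Busy percentage between an earlier sample a and a later sample b.
 * Assumption: idle and iowait both count as "not busy". */
static unsigned int busy_percent(const struct cpu_sample *a,
                                 const struct cpu_sample *b)
{
    unsigned long long total = total_ticks(b) - total_ticks(a);
    unsigned long long idle = (b->idle + b->iowait) - (a->idle + a->iowait);

    if (total == 0)
        return 0; /* no ticks elapsed between the samples */
    return (unsigned int)(100 * (total - idle) / total);
}
```

Because every counter only advances when a tick is accounted, this style of measurement sees nothing of what happens while the tick is stopped.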

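cpufreq uses get_cpu_idle_time and get_cpu_iowait_time to obtain a CPU's idle time and iowait time. When tickless is in effect (CONFIG_NO_HZ=y), get_cpu_idle_time calls get_cpu_idle_time_us to account for the idle time spent in that state; otherwise it calls get_cpu_idle_time_jiffy, which reads the idle time from struct cpu_usage_stat. In other words, cpufreq's view of CPU utilization takes the tickless state into account, which top and vmstat do not.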

II. The bug caused by tickless

1.      Modify the dbs_check_cpu function in drivers/cpufreq/cpufreq_ondemand.c, adding a statement at line 486 that prints the CPU load:

485                 load = 100 * (wall_time - idle_time) / wall_time;
486                 printk("cpu %d load %d\n",j,load);
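As a worked example of the formula on line 485, the sketch below reproduces the integer arithmetic in user space. The function name load_percent is hypothetical, and returning 0 for an empty or inconsistent interval is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Load in percent over one sampling interval:
 * load = 100 * (wall_time - idle_time) / wall_time, using integer division.
 * Assumption: an empty or inconsistent interval yields 0 in this sketch. */
static unsigned int load_percent(uint64_t wall_time, uint64_t idle_time)
{
    if (wall_time == 0 || idle_time > wall_time)
        return 0;

    unsigned int load =
        (unsigned int)(100 * (wall_time - idle_time) / wall_time);

    /* Guaranteed by the guard above: idle_time <= wall_time. */
    assert(load <= 100);
    return load;
}
```

For instance, an interval of 1000000 us with 300000 us of idle time gives a load of 70, while the same interval with 990000 us of idle time gives a load of 1. The reported load depends entirely on which idle-time figure goes into the formula.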

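2.     Rebuild the kernel, write the resulting files to the SD card of the PandaBoard, and reboot. Once the system has settled into an idle state, printk reports the following CPU load: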

[ 2343.020874] cpu 0 load 70
[ 2343.024353] cpu 1 load 2
[ 2344.023040] cpu 0 load 68
[ 2344.026245] cpu 1 load 1
[ 2345.014465] cpu 0 load 68
[ 2345.017974] cpu 1 load 7
[ 2346.095092] cpu 0 load 77
[ 2346.095092] cpu 1 load 2
[ 2347.013793] cpu 0 load 68
[ 2347.017028] cpu 1 load 3
[ 2348.017242] cpu 0 load 59
[ 2348.020629] cpu 1 load 14
[ 2349.020202] cpu 0 load 67
[ 2349.023437] cpu 1 load 5
[ 2350.015808] cpu 0 load 82
[ 2350.019012] cpu 1 load 2
[ 2351.062286] cpu 0 load 65
[ 2351.065490] cpu 1 load 3
[ 2352.007965] cpu 0 load 68
[ 2352.011444] cpu 1 load 2
[ 2353.015167] cpu 0 load 51
[ 2353.018524] cpu 1 load 10
[ 2354.021636] cpu 0 load 69
[ 2354.025146] cpu 1 load 2
[ 2355.023162] cpu 0 load 72
[ 2355.026367] cpu 1 load 3
[ 2356.009704] cpu 0 load 63
[ 2356.012939] cpu 1 load 4
[ 2357.007904] cpu 0 load 68
[ 2357.007965] cpu 1 load 2
[ 2358.014495] cpu 0 load 66

The vmstat output is:

procs  memory                       system          cpu
 r  b    free mapped   anon   slab    in   cs  flt  us ni sy id wa ir
 0  0  399960 122640 110720  10708   188   32    0   0  0  0 99  0  0
 0  0  399960 122640 110720  10708   298  223    0  11  0  2 99  0  0
 0  0  399960 122640 110740  10708   195   32    0   0  0  1 99  0  0
 0  0  399960 122640 110748  10708   186   29    0   0  0  0 99  0  0
 0  0  399960 122640 110748  10708   191   28    0   0  0  0 99  0  0
 0  0  399960 122640 110748  10708   189   31    0   0  0  0 99  0  0
 0  0  399960 122640 110748  10708   231   34    0   0  0  4 99  0  0
 0  0  399960 122640 110748  10708   201   29    0   0  0  1 99  0  0
 0  0  399960 122640 110748  10708   181   32    0   0  0  0 99  0  0
 0  0  399960 122640 110748  10704   188   21    0   0  0  0 99  0  0
 0  0  399960 122640 110748  10704   193   35    0   0  0  1 99  0  0
 1  0  399960 122640 110748  10704   206   28    0   0  0  1 99  0  0
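To put a number on the disagreement, the per-CPU loads printed by the governor can be averaged and compared with vmstat's busy share (100 minus the id column). The sketch below does that arithmetic. The function names are hypothetical, and it assumes vmstat's id column is a system-wide figure covering both CPUs.

```c
#include <assert.h>
#include <stddef.h>

/* Average of the per-CPU load percentages printed by the governor. */
static unsigned int average_load(const unsigned int *loads, size_t n)
{
    assert(loads != NULL);
    if (n == 0)
        return 0;

    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += loads[i];
    return (unsigned int)(sum / n);
}

/* Difference, in percentage points, between the governor's average load
 * and the busy share implied by vmstat's idle column.
 * Assumption: vmstat_idle is a system-wide percentage in 0..100. */
static int load_gap(unsigned int governor_avg, unsigned int vmstat_idle)
{
    unsigned int vmstat_busy = vmstat_idle > 100 ? 0 : 100 - vmstat_idle;
    return (int)governor_avg - (int)vmstat_busy;
}
```

Taking the first pair of samples above (cpu 0 at 70, cpu 1 at 2) gives an average of 36, against a vmstat busy share of 1 when id is 99, a gap of 35 percentage points.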

After taking cpu1 offline, the cpu0 load printed by printk is:

[ 2720.071258] cpu 0 load 71
[ 2721.148071] cpu 0 load 76
[ 2722.187713] cpu 0 load 76
[ 2723.251342] cpu 0 load 70
[ 2724.336151] cpu 0 load 81
[ 2725.351806] cpu 0 load 75
[ 2726.379638] cpu 0 load 74
[ 2727.469604] cpu 0 load 75
[ 2728.500091] cpu 0 load 71
[ 2729.569946] cpu 0 load 72
[ 2730.577789] cpu 0 load 79
[ 2731.664886] cpu 0 load 80
[ 2732.836181] cpu 0 load 75
[ 2733.866851] cpu 0 load 76
[ 2734.946563] cpu 0 load 80
[ 2735.965270] cpu 0 load 80
[ 2736.968200] cpu 0 load 68
[ 2737.968200] cpu 0 load 79
[ 2738.973175] cpu 0 load 72
[ 2739.983154] cpu 0 load 83
[ 2741.062103] cpu 0 load 71
[ 2742.109619] cpu 0 load 73
[ 2743.189453] cpu 0 load 73
[ 2744.226165] cpu 0 load 83
[ 2745.281463] cpu 0 load 68
[ 2746.324768] cpu 0 load 75
[ 2747.384948] cpu 0 load 77
[ 2748.405853] cpu 0 load 80

The cpu0 load shown by vmstat is:

procs  memory                       system          cpu
 r  b    free mapped   anon   slab    in   cs  flt  us ni sy id wa ir
 0  0  400608 122636 110676  10708   167   19    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   203   64    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   170   31    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   183   35    0   0  0  0 98  0  0
 0  0  400608 122640 110676  10708   174   26    0   1  0  0 99  0  0
 0  0  400608 122640 110676  10708   179   31    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   175   28    0   0  0  0 98  0  0
 0  0  400608 122640 110676  10708   173   26    0   0  0  1 98  0  0
 0  0  400608 122640 110676  10708   163   25    0   0  0  1 99  0  0
 0  0  400608 122640 110676  10708   173   24    0   0  0  1 99  0  0
 0  0  400608 122640 110676  10708   173   29    0   0  0  0 99  0  0
 1  0  400608 122640 110676  10708   179   25    0   0  0  1 99  0  0
 1  0  400608 122640 110676  10708   172   22    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   177   32    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   173   29    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   173   28    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   169   27    0   0  0  2 99  0  0
 0  0  400608 122640 110676  10708   175   33    0   0  0  1 99  0  0
 0  0  400608 122640 110676  10708   169   19    0   0  0  0 99  0  0
 0  0  400608 122640 110676  10708   177   23    0   0  0  0 99  0  0
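
3.      On x86 the two figures agree. The ondemand policy of cpufreq calls dbs_check_cpu to check the CPU load (the load value printed by printk) and adjusts the CPU frequency accordingly. On the PandaBoard, however, the load seen by dbs_check_cpu differs greatly from the load vmstat reports, and the only difference between the two calculations is how they handle tickless. Tickless was therefore disabled in the PandaBoard kernel (CONFIG_NO_HZ=n) and the experiment above repeated: the problem went away.

4.    The kernel running on the PandaBoard is android-omap-panda-3.0, obtained as follows: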
git clone https://android.googlesource.com/kernel/omap.git
cd omap
git checkout origin/android-omap-panda-3.0