Linux Boot Time Optimization: Kernel and User-Space Boot Optimization in Practice

Reposted from: https://www.cnblogs.com/arnoldlu/p/9187775.html

Keywords: initcall, bootgraph.py, bootchartd, pybootchart.

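Boot time optimization is split into two parts: the kernel and user space.

Kernel boot is measured from kernel timestamp 0.000000 to the point where free_initmem() prints "Freeing init memory".

bootgraph.py analyzes the kernel kmsg and produces bootgraph.html plus a CSV file of initcall durations.

Immediately after free_initmem() the init process is started, which marks the start of user space. The end of the kernel phase and the start of user space can therefore be treated as seamlessly adjacent.

In user space, bootchartd samples /proc/uptime, /proc/stat, /proc/diskstats, /proc/xxx/stat and /proc/meminfo, and finally packs the logs into bootlog.tgz.

pybootchart.py parses bootlog.tgz and renders charts of CPU utilization, memory usage, disk throughput and the execution of each process.

With these kernel and user-space outputs, abnormal initcalls and process startups can be identified.

For example: which initcall takes abnormally long, or which process takes too long to start. For the latter, inspect the process's startup function for blocking calls and similar problems.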

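1. Kernel Boot Optimization

The kernel source tree ships with a tool (scripts/bootgraph.pl) for analyzing boot time; it generates output.svg.

However, the output of bootgraph.py is more readable and makes problems easier to spot.

1.1 Preparation

The kernel changes involve initcall_debug and CONFIG_LOG_BUF_SHIFT.

1.1.1 Enable initcall_debug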

bool initcall_debug = true;

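This makes the kernel record a "calling" line and an "initcall" line, each with a timestamp, for every initcall in kmsg; the analysis tool depends on these messages. (Passing initcall_debug on the kernel command line enables the same logging without changing the source.)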

static int __init_or_module do_one_initcall_debug(initcall_t fn)
{
    ktime_t calltime, delta, rettime;
    unsigned long long duration;
    int ret;

    printk(KERN_DEBUG "calling  %pF @ %i\n", fn, task_pid_nr(current)); /* log emitted when the initcall starts */
    calltime = ktime_get();
    ret = fn();
    rettime = ktime_get();
    delta = ktime_sub(rettime, calltime);
    duration = (unsigned long long) ktime_to_ns(delta) >> 10;
    printk(KERN_DEBUG "initcall %pF returned %d after %lld usecs\n", fn,
        ret, duration); /* log emitted when the initcall returns */

    return ret;
}

int __init_or_module do_one_initcall(initcall_t fn)
{
    int count = preempt_count();
    int ret;

    if (initcall_debug)
        ret = do_one_initcall_debug(fn);
    else
        ret = fn();
...
}
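
As a rough illustration of how these two kmsg lines can be paired up, here is a minimal Python sketch. It is not the bootgraph.py implementation: the regular expressions are written from the two printk() formats above, the "[ timestamp]" prefix assumes kernel timestamps are enabled, and the sample lines and the function name foo_init are made up for the example.

```python
import re

# Patterns derived from the two printk() formats shown above.
# Assumption: every kmsg line carries a "[   seconds.micros]" timestamp.
CALLING_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+calling\s+(?P<fn>\S+)\s+@\s+(?P<pid>\d+)")
RETURN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+initcall\s+(?P<fn>\S+)\s+returned\s+(?P<ret>-?\d+)"
    r"\s+after\s+(?P<usecs>\d+)\s+usecs")


def parse_initcalls(lines):
    """Return a list of (function, start_s, end_s, usecs, ret) tuples."""
    started = {}   # function name -> timestamp of its "calling" line
    results = []
    for line in lines:
        m = CALLING_RE.search(line)
        if m:
            started[m.group("fn")] = float(m.group("ts"))
            continue
        m = RETURN_RE.search(line)
        if m and m.group("fn") in started:
            fn = m.group("fn")
            results.append((fn, started.pop(fn), float(m.group("ts")),
                            int(m.group("usecs")), int(m.group("ret"))))
    return results


# Hypothetical kmsg excerpt, for illustration only.
sample = [
    "<7>[    0.512000] calling  foo_init+0x0/0x40 @ 1",
    "<7>[    0.530000] initcall foo_init+0x0/0x40 returned 0 after 17578 usecs",
]
calls = parse_initcalls(sample)
print(calls)
```

Keying the pending calls by function name keeps the sketch short; it would not distinguish two initcalls that print the same symbol and offset.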

1.1.2 Enlarge the log_buf

log_buf stores printk messages. It behaves like a ring buffer: once it is full, new messages overwrite the oldest ones.

#define __LOG_BUF_LEN    (1 << CONFIG_LOG_BUF_SHIFT)

static char __log_buf[__LOG_BUF_LEN];
static char *log_buf = __log_buf;

So increase CONFIG_LOG_BUF_SHIFT from 16 to 18, which grows log_buf from 64K to 256K.
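
The size arithmetic can be checked with a few lines of Python. This is only a sketch of the `1 << CONFIG_LOG_BUF_SHIFT` expression from the macro above; the helper name log_buf_len is made up for the example.

```python
def log_buf_len(shift):
    """Buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT value."""
    return 1 << shift


for shift in (16, 18):
    print("shift=%d -> %d KiB" % (shift, log_buf_len(shift) // 1024))
```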

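1.1.3 Improvements to bootgraph.py

1.1.3.1 Delimiting the start and end of kernel boot

The start of kernel boot is easy to define: time 0.

User space starts with the init process, so the end of the kernel phase is placed just before init is launched.

This makes the position of the initcalls within overall kernel initialization clearly visible.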

static inline int free_area(unsigned long pfn, unsigned long end, char *s)
{
    unsigned int pages = 0, size = (end - pfn) << (PAGE_SHIFT - 10);
...
    if (size && s)
        printk(KERN_INFO "Freeing %s memory: %dK\n", s, size); /* prints "Freeing init memory:" to kmsg */

    return pages;
}

void free_initmem(void)
{
...
    if (!machine_is_integrator() && !machine_is_cintegrator())
        totalram_pages += free_area(__phys_to_pfn(__pa(__init_begin)),
                        __phys_to_pfn(__pa(__init_end)),
                        "init");
}

static noinline int init_post(void)
{
    /* need to finish all async __init code before freeing the memory */
    async_synchronize_full();
    free_initmem(); /* end of the kernel phase */
...
    run_init_process("/sbin/init"); /* start of user space */
    run_init_process("/etc/init");
    run_init_process("/bin/init");
    run_init_process("/bin/sh");
...
}

Kernel and user-space initialization are split at "Freeing init memory" (commit: Split kernel and userspace by free_area()).

commit 6195fa73b5522ec5f2461932c894421c30fc3cd7
Author: Arnold Lu 
Date:   Tue Jun 19 22:49:09 2018 +0800

    Split kernel and userspace by free_area()

diff --git a/bootgraph.py b/bootgraph.py
index 8ee626c..dafe359 100755
--- a/bootgraph.py
+++ b/bootgraph.py
@@ -63,6 +63,7 @@ class SystemValues(aslib.SystemValues):
     timeformat = '%.6f'
     bootloader = 'grub'
     blexec = []
+    last_init=0
     def __init__(self):
         self.hostname = platform.node()
         self.testtime = datetime.now().strftime('%Y-%m-%d_%H:%M:%S')
@@ -223,7 +224,7 @@ class Data(aslib.Data):
             'kernel': {'list': dict(), 'start': -1.0, 'end': -1.0, 'row': 0,
                 'order': 0, 'color': 'linear-gradient(to bottom, #fff, #bcf)'},
             'user': {'list': dict(), 'start': -1.0, 'end': -1.0, 'row': 0,
-                'order': 1, 'color': '#fff'}
+                'order': 1, 'color': 'linear-gradient(to bottom, #456, #cde)'}
         }
     def deviceTopology(self):
         return ''
@@ -345,17 +346,18 @@ def parseKernelLog():
         m = re.match('^initcall *(?P<f>.*)\+.* returned (?P<r>.*) after (?P<t>.*) usecs', msg)
         if(m):
             data.valid = True
-            data.end = ktime
+            sysvals.last_init = '%.0f'%(ktime*1000)
             f, r, t = m.group('f', 'r', 't')
             if(f in devtemp):
                 start, pid = devtemp[f]
                 data.newAction(phase, f, pid, start, ktime, int(r), int(t))
                 del devtemp[f]
             continue
-        if(re.match('^Freeing unused kernel memory.*', msg)):
+        if(re.match('^Freeing init kernel memory.*', msg)):
             data.tUserMode = ktime
             data.dmesg['kernel']['end'] = ktime
             data.dmesg['user']['start'] = ktime
+            data.end = ktime+0.1
             phase = 'user'
 
     if tp.stamp:
@@ -531,8 +533,8 @@ def createBootGraph(data):
         print('ERROR: No timeline data')
         return False
     user_mode = '%.0f'%(data.tUserMode*1000)
-    last_init = '%.0f'%(tTotal*1000)
-    devtl.html += html_timetotal.format(user_mode, last_init)
+    #last_init = '%.0f'%(tTotal*1000)
+    devtl.html += html_timetotal.format(user_mode, sysvals.last_init)
 
     # determine the maximum number of rows we need to draw
     devlist = []
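
The idea of using one kmsg line as the kernel/user-space boundary can be sketched on its own. The marker pattern below follows the log line quoted later in this article ("Freeing init memory: 340K"); the wording differs between kernel versions, so the pattern is an assumption to adjust for the target kernel, and kernel_end_ms is a made-up helper name.

```python
import re

# Assumption: the boundary message reads "Freeing init memory" and the line
# carries a "[ seconds.micros]" timestamp.
MARKER_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Freeing init memory")


def kernel_end_ms(lines):
    """Return the timestamp (in ms) of the first marker line, or None."""
    for line in lines:
        m = MARKER_RE.search(line)
        if m:
            return round(float(m.group(1)) * 1000)
    return None


end_ms = kernel_end_ms(["<6>[    2.881049] Freeing init memory: 340K"])
print(end_ms)
```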

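1.1.3.2 Recording each initcall to CSV

Graphs are intuitive, but sometimes more precise data is needed for sorting and analysis.

Data that can be opened in Excel makes that convenient.

The code below writes devinit.csv alongside bootgraph.html (commit: Record data to csv file.).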

commit 7bcb705ed30b1e1a0ca3385d01b412f8e6f23b4e
Author: Arnold Lu 
Date:   Tue Jun 19 22:52:43 2018 +0800

    Record data to csv file.

diff --git a/bootgraph.py b/bootgraph.py
index dafe359..7f43cb7 100755
--- a/bootgraph.py
+++ b/bootgraph.py
@@ -33,6 +33,7 @@ import shutil
 from datetime import datetime, timedelta
 from subprocess import call, Popen, PIPE
 import sleepgraph as aslib
+import csv
 
 # ----------------- CLASSES --------------------
 
@@ -48,6 +49,7 @@ class SystemValues(aslib.SystemValues):
     kernel = ''
     dmesgfile = ''
     ftracefile = ''
+    csvfile = 'devinit.csv'
     htmlfile = 'bootgraph.html'
     testdir = ''
     kparams = ''
@@ -300,6 +302,9 @@ def parseKernelLog():
         lf = open(sysvals.dmesgfile, 'r')
     else:
         lf = Popen('dmesg', stdout=PIPE).stdout
+    csvfile = open(sysvals.csvfile, 'wb');
+    csvwriter = csv.writer(csvfile)
+    csvwriter.writerow(['Func', 'Start(ms)', 'End(ms)', 'Duration(ms)', 'Return'])
     for line in lf:
         line = line.replace('\r\n', '')
         # grab the stamp and sysinfo
@@ -351,6 +356,7 @@ def parseKernelLog():
             if(f in devtemp):
                 start, pid = devtemp[f]
                 data.newAction(phase, f, pid, start, ktime, int(r), int(t))
+                csvwriter.writerow([f, start*1000, ktime*1000, float(t)/1000, int(r)]);
                 del devtemp[f]
             continue
         if(re.match('^Freeing init kernel memory.*', msg)):
@@ -364,6 +370,7 @@ def parseKernelLog():
         sysvals.stamp = 0
         tp.parseStamp(data, sysvals)
     data.dmesg['user']['end'] = data.end
+    csvfile.close()
     lf.close()
     return data

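1.2 Generating the results

Running the command below produces two files, bootgraph.html and devinit.csv.

bootgraph.py relies on the "calling"/"initcall" lines in kmsg to identify the start and end of each initcall, and on "Freeing init memory" as the end of kernel boot.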

./bootgraph.py -dmesg kmsg.txt -addlogs

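PS: the function names are masked in both screenshots below.

1.2.1 Analyzing bootgraph.html

The graph shows that kernel initialization lasts until 2672 ms, and that initcalls make up the bulk of it.

It also shows which initcalls take the longest; clicking one shows how long it ran, whether it succeeded, and so on.

1.2.2 Analyzing devinit.csv

Compared with bootgraph.html, devinit.csv is easier to quantify.

Sorting devinit.csv by Duration in descending order shows which initcalls take the most time.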
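
The same sort can be done without a spreadsheet. The sketch below assumes the column headers written by the CSV patch above (Func, Start(ms), End(ms), Duration(ms), Return); the rows and the helper name top_initcalls are invented for illustration.

```python
import csv
import io


def top_initcalls(csv_text, n=10):
    """Return the n longest initcalls as (function, duration_ms) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = sorted(reader, key=lambda r: float(r["Duration(ms)"]), reverse=True)
    return [(r["Func"], float(r["Duration(ms)"])) for r in rows[:n]]


# Made-up rows in the devinit.csv layout.
sample = """Func,Start(ms),End(ms),Duration(ms),Return
a_init,100.0,105.0,5.0,0
b_init,105.0,405.0,300.0,0
c_init,405.0,425.0,20.0,0
"""
top = top_initcalls(sample, n=2)
print(top)
```

For a real run, the text would come from reading devinit.csv instead of the inline sample.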

1.3 Optimization examples

1.3.1 Serial log optimization

At a serial baud rate of 115200, one character takes roughly 1/(115200/10) s ≈ 0.087 ms, so 100 characters take about 8.7 ms.
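
The estimate is easy to reproduce. The sketch below assumes 10 bits on the wire per character (one start bit, eight data bits, one stop bit), which is what the division by 10 above implies; uart_time_ms is a made-up helper name.

```python
def uart_time_ms(n_chars, baud=115200, bits_per_char=10):
    """Approximate time in ms to send n_chars over a UART at the given baud rate."""
    return n_chars * bits_per_char * 1000.0 / baud


print("1 char:    %.3f ms" % uart_time_ms(1))
print("100 chars: %.2f ms" % uart_time_ms(100))
```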

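Printing a lot of serial log output during kernel initialization is therefore very costly.

Reducing it is not a sophisticated technique, but it is very effective.

1.3.1.1 Initial state

With initcall_debug disabled and console_printk at its default setting, kernel boot takes 2881 ms in total.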

<6>[ 2.881049] Freeing init memory: 340K

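1.3.1.2 With initcall_debug enabled

Enabling initcall_debug for debugging introduces extra overhead.

But initcall_debug is still needed to find the problems.

Kernel boot takes 3404 ms in total, an overhead of 523 ms.

The list of initcall durations is as follows:

1.3.1.3 With initcall_debug enabled and console output disabled

With console output disabled, serial output is suppressed as far as possible.

Kernel boot takes 1281 ms, 1600 ms less than the initial state. In other words, more than half of the kernel initialization time is saved.

With the serial console disabled, the initcall durations drop significantly.

1.3.2 Optimizing the top 10 most expensive initcalls

Refer to the list in the figure above and optimize inside each initcall.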

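2. User-Space Boot Optimization

User-space optimization relies on bootchartd to collect logs, which are then analyzed with pybootchart.py.

The following sections cover: how to enable bootchartd in busybox; a brief look at bootchartd; a brief look at pybootchart.py; and finally an analysis of the test results.

2.1 Enabling bootchartd

Enabling bootchartd requires changing the kernel command line so that init is launched through bootchartd, plus support for bootchartd itself, tar and dmesg.

2.1.1 Modify the command-line parameters in the bootloader

Modify the command line that the bootloader passes to Linux. If bootchartd lives in a ramfs, use rdinit=/sbin/bootchartd.

If bootchartd is not in a ramfs: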

init=/sbin/bootchartd

This makes bootchartd run as init; bootchartd then starts /sbin/init.

The kernel's init_setup() function parses the init parameter from the command line and assigns it to execute_command.

init_post() then passes it to run_init_process().

static int __init init_setup(char *str)
{
    unsigned int i;

    execute_command = str; /* value of init= parsed from the command line */
    /*
     * In case LILO is going to boot us with default command line,
     * it prepends "auto" before the whole cmdline which makes
     * the shell think it should execute a script with such name.
     * So we ignore all arguments entered _before_ init=... [MJ]
     */
    for (i = 1; i < MAX_INIT_ARGS; i++)
        argv_init[i] = NULL;
    return 1;
}
__setup("init=", init_setup);

static noinline int init_post(void)
{
...
    free_initmem();
...
    if (execute_command) {
        run_init_process(execute_command); /* if execute_command is set, run it as init; on success, the run_init_process() calls below are never reached */
        printk(KERN_WARNING "Failed to execute %s.  Attempting "
                    "defaults...\n", execute_command);
    }
    run_init_process("/sbin/init");
    run_init_process("/etc/init");
    run_init_process("/bin/init");
    run_init_process("/bin/sh");

    panic("No init found.  Try passing init= option to kernel. "
          "See Linux Documentation/init.txt for guidance.");
}
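
The lookup order can be modelled in a few lines of Python. This is a simplified sketch of the logic in init_setup() and init_post() above, not kernel code: it ignores rdinit= and argument handling, and init_candidates is a made-up helper name.

```python
DEFAULT_INITS = ["/sbin/init", "/etc/init", "/bin/init", "/bin/sh"]


def init_candidates(cmdline):
    """Return the programs that would be tried as init, in order."""
    execute_command = None
    for arg in cmdline.split():
        if arg.startswith("init="):
            execute_command = arg[len("init="):]
    # The command from init= is tried first; the defaults are only fallbacks.
    return ([execute_command] if execute_command else []) + DEFAULT_INITS


print(init_candidates("console=ttyS0,115200 init=/sbin/bootchartd"))
```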

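2.1.2 BusyBox configuration changes

In the BusyBox configuration, enable the bootchartd applet, and also tar, because the generated files need to be packed.

dmesg support is needed as well, because the kernel kmsg has to be captured.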

CONFIG_FEATURE_SEAMLESS_GZ=y
CONFIG_GUNZIP=y
CONFIG_GZIP=y
CONFIG_FEATURE_GZIP_LONG_OPTIONS=y
CONFIG_TAR=y
CONFIG_FEATURE_TAR_CREATE=y
CONFIG_FEATURE_TAR_AUTODETECT=y
CONFIG_FEATURE_TAR_FROM=y
CONFIG_FEATURE_TAR_OLDGNU_COMPATIBILITY=y
CONFIG_FEATURE_TAR_OLDSUN_COMPATIBILITY=y
CONFIG_FEATURE_TAR_GNU_EXTENSIONS=y
CONFIG_FEATURE_TAR_LONG_OPTIONS=y
CONFIG_FEATURE_TAR_TO_COMMAND=y
CONFIG_FEATURE_TAR_UNAME_GNAME=y
CONFIG_FEATURE_TAR_NOPRESERVE_TIME=y
CONFIG_BOOTCHARTD=y
CONFIG_FEATURE_BOOTCHARTD_BLOATED_HEADER=y
CONFIG_DMESG=y

2.1.3 Tuning bootchartd

bootchartd can be configured through a configuration file, enabled by ENABLE_FEATURE_BOOTCHARTD_CONFIG_FILE.

Alternatively, change sample_period_us and process_accounting directly in the source.

int bootchartd_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int bootchartd_main(int argc UNUSED_PARAM, char **argv)
{
...
    /* Read config file: */
    sample_period_us = 200 * 1000; /* default 200 ms; if this is too coarse and details are lost, sample more often, at the cost of bootchartd using more CPU */
    process_accounting = 0;
    if (ENABLE_FEATURE_BOOTCHARTD_CONFIG_FILE) {
        char* token[2];
        parser_t *parser = config_open2("/etc/bootchartd.conf" + 5, fopen_for_read);
        if (!parser)
            parser = config_open2("/etc/bootchartd.conf", fopen_for_read);
        while (config_read(parser, token, 2, 0, "#=", PARSE_NORMAL & ~PARSE_COLLAPSE)) {
            if (strcmp(token[0], "SAMPLE_PERIOD") == 0 && token[1])
                sample_period_us = atof(token[1]) * 1000000;
            if (strcmp(token[0], "PROCESS_ACCOUNTING") == 0 && token[1]
             && (strcmp(token[1], "on") == 0 || strcmp(token[1], "yes") == 0)
            ) {
                process_accounting = 1;
            }
        }
        config_close(parser);
        if ((int)sample_period_us <= 0)
            sample_period_us = 1; /* prevent division by 0 */
    }
...
    return EXIT_SUCCESS;
}
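
To make the config handling concrete, here is a simplified Python sketch of the same rules: SAMPLE_PERIOD is given in seconds and converted to microseconds, PROCESS_ACCOUNTING is enabled by "on" or "yes", and a non-positive period is clamped to 1. It only mimics the C code above; quoting and other details of the real parser are not handled, and parse_bootchartd_conf is a made-up name.

```python
def parse_bootchartd_conf(text):
    """Return (sample_period_us, process_accounting) from config file text."""
    sample_period_us = 200 * 1000      # same default as the C code
    process_accounting = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "SAMPLE_PERIOD" and value:
            sample_period_us = int(round(float(value) * 1000000))
        elif key == "PROCESS_ACCOUNTING" and value in ("on", "yes"):
            process_accounting = True
    if sample_period_us <= 0:
        sample_period_us = 1           # prevent division by zero later
    return sample_period_us, process_accounting


print(parse_bootchartd_conf("SAMPLE_PERIOD=0.1\nPROCESS_ACCOUNTING=yes\n"))
```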

2.1.4 Add meminfo and dmesg

Enable collection of /proc/meminfo; the raw data is saved in proc_meminfo.log.

The kernel kmsg is also saved, to a file named dmesg.

@@ -212,6 +212,7 @@
 {
     FILE *proc_stat = xfopen("proc_stat.log", "w");
     FILE *proc_diskstats = xfopen("proc_diskstats.log", "w");
+    FILE *proc_meminfo = xfopen("proc_meminfo.log", "w");
     //FILE *proc_netdev = xfopen("proc_netdev.log", "w");
     FILE *proc_ps = xfopen("proc_ps.log", "w");
     int look_for_login_process = (getppid() == 1);
@@ -240,6 +241,7 @@
 
         dump_file(proc_stat, "/proc/stat");
         dump_file(proc_diskstats, "/proc/diskstats");
+        dump_file(proc_meminfo, "/proc/meminfo");
         //dump_file(proc_netdev, "/proc/net/dev");
         if (dump_procs(proc_ps, look_for_login_process)) {
             /* dump_procs saw a getty or {g,k,x}dm
@@ -306,8 +308,11 @@
     }
     fclose(header_fp);
 
+    system(xasprintf("dmesg >dmesg"));
+
     /* Package log files */
-    system(xasprintf("tar -zcf /var/log/bootlog.tgz header %s *.log", process_accounting ? "kernel_pacct" : ""));
+    //system(xasprintf("tar -zcf /var/log/bootlog.tgz header %s *.log", process_accounting ? "kernel_pacct" : ""));
+    system(xasprintf("tar -zcf /var/log/bootlog.tgz header dmesg %s *.log", process_accounting ? "kernel_pacct" : ""));
     /* Clean up (if we are not in detached tmpfs) */
     if (tempdir) {
         unlink("header");
@@ -315,6 +320,7 @@
         unlink("proc_diskstats.log");
         //unlink("proc_netdev.log");
         unlink("proc_ps.log");
+        unlink("dmesg");
         if (process_accounting)
             unlink("kernel_pacct");
         rmdir(tempdir);

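2.2 bootchartd analysis

The entry point of bootchartd is bootchartd_main().

bootchartd_main() mainly parses the start/init/stop arguments. If bootchartd.conf support is enabled, it also reads sample_period_us and process_accounting from it.

bootchartd_main() does its work mainly through do_logging(), which collects the logs, and finalize(), which packs them up.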

static void do_logging(unsigned sample_period_us, int process_accounting)
{
    FILE *proc_stat = xfopen("proc_stat.log", "w");
    FILE *proc_diskstats = xfopen("proc_diskstats.log", "w");
    FILE *proc_meminfo = xfopen("proc_meminfo.log", "w");
    //FILE *proc_netdev = xfopen("proc_netdev.log", "w");
    FILE *proc_ps = xfopen("proc_ps.log", "w");
    int look_for_login_process = (getppid() == 1);
    unsigned count = 60*1000*1000 / sample_period_us; /* ~1 minute: upper bound on how long bootchartd samples */

    if (process_accounting) {
        close(xopen("kernel_pacct", O_WRONLY | O_CREAT | O_TRUNC));
        acct("kernel_pacct");
    }

    while (--count && !bb_got_signal) { /* stop sampling once count reaches 0 or a signal arrives */
        char *p;
        int len = open_read_close("/proc/uptime", G.jiffy_line, sizeof(G.jiffy_line)-2);
        if (len < 0)
            goto wait_more;
        /* /proc/uptime has format "NNNNNN.MM NNNNNNN.MM" */
        /* we convert it to "NNNNNNMM\n" (using first value) */
        G.jiffy_line[len] = '\0';
        p = strchr(G.jiffy_line, '.');
        if (!p)
            goto wait_more;
        while (isdigit(*++p))
            p[-1] = *p;
        p[-1] = '\n';
        p[0] = '\0';

        dump_file(proc_stat, "/proc/stat"); /* append /proc/stat to proc_stat.log */
        dump_file(proc_diskstats, "/proc/diskstats"); /* append /proc/diskstats to proc_diskstats.log */
        dump_file(proc_meminfo, "/proc/meminfo"); /* append /proc/meminfo to proc_meminfo.log */
        //dump_file(proc_netdev, "/proc/net/dev");
        if (dump_procs(proc_ps, look_for_login_process)) { /* walk every process under /proc and log it to proc_ps.log */
            /* dump_procs saw a getty or {g,k,x}dm
             * stop logging in 2 seconds:
             */
            if (count > 2*1000*1000 / sample_period_us)
                count = 2*1000*1000 / sample_period_us;
        }
        fflush_all();
 wait_more:
        usleep(sample_period_us); /* sleep sample_period_us after each sample to make the sampling periodic */
    }
}
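
Two details of do_logging() are easy to check in isolation: the timestamp line written before each sample, and the initial value of count. The Python sketch below mirrors the "NNNNNN.MM" to "NNNNNNMM" conversion described in the C comments; jiffy_line and initial_count are made-up helper names.

```python
def jiffy_line(uptime_text):
    """Convert /proc/uptime text ("NNNNNN.MM NNNNNNN.MM") to "NNNNNNMM\n"."""
    first = uptime_text.split()[0]          # only the first value is used
    whole, _, frac = first.partition(".")
    return whole + frac + "\n"


def initial_count(sample_period_us):
    """Initial loop counter covering roughly one minute of sampling."""
    return 60 * 1000 * 1000 // sample_period_us


print(repr(jiffy_line("12.34 20.50\n")))
print(initial_count(200 * 1000))
```

With the default 200 ms period the counter starts at 300, which is where the one-minute limit comes from.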

dump_procs() processes the stat file of every pid under /proc.

static int dump_procs(FILE *fp, int look_for_login_process)
{
    struct dirent *entry;
    DIR *dir = opendir("/proc");
    int found_login_process = 0;

    fputs(G.jiffy_line, fp);
    while ((entry = readdir(dir)) != NULL) { /* iterate over /proc; each entry is a struct dirent */
        char name[sizeof("/proc/%u/cmdline") + sizeof(int)*3];
        int stat_fd;
        unsigned pid = bb_strtou(entry->d_name, NULL, 10); /* only numeric names (PIDs) are used; other entries are skipped */
        if (errno)
            continue;

        /* Android's version reads /proc/PID/cmdline and extracts
         * non-truncated process name. Do we want to do that? */

        sprintf(name, "/proc/%u/stat", pid);
        stat_fd = open(name, O_RDONLY);
        if (stat_fd >= 0) {
            char *p;
            char stat_line[4*1024];
            int rd = safe_read(stat_fd, stat_line, sizeof(stat_line)-2);

            close(stat_fd);
            if (rd < 0)
                continue;
            stat_line[rd] = '\0';
            p = strchrnul(stat_line, '\n');
            *p++ = '\n';
            *p = '\0';
            fputs(stat_line, fp); /* write the /proc/xxx/stat contents to fp */
            if (!look_for_login_process)
                continue;
...
        }
    }
    closedir(dir);
    fputc('\n', fp);
    return found_login_process;
}
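
The filtering step, keeping only the numeric directory names, looks like this in Python. It is a sketch of the idea rather than a port of dump_procs(); numeric_entries is a made-up name, and the input list stands in for a listing of /proc.

```python
def numeric_entries(names):
    """Keep only names that are plain decimal numbers, as PIDs under /proc are."""
    return [int(n) for n in names if n.isascii() and n.isdigit()]


# Stand-in for os.listdir("/proc") on a Linux system.
print(numeric_entries(["1", "self", "42", "cpuinfo", "meminfo"]))
```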

finalize() generates header and dmesg, then packs them together with the files produced by do_logging() into bootlog.tgz.

static void finalize(char *tempdir, const char *prog, int process_accounting)
{
    //# Stop process accounting if configured
    //local pacct=
    //[ -e kernel_pacct ] && pacct=kernel_pacct

    FILE *header_fp = xfopen("header", "w");

    if (process_accounting)
        acct(NULL);

    if (prog)
        fprintf(header_fp, "profile.process = %s\n", prog);

    fputs("version = "BC_VERSION_STR"\n", header_fp);
    if (ENABLE_FEATURE_BOOTCHARTD_BLOATED_HEADER) {
        char *hostname;
        char *kcmdline;
        time_t t;
        struct tm tm_time;
        /* x2 for possible localized weekday/month names */
        char date_buf[sizeof("Mon Jun 21 05:29:03 CEST 2010") * 2];
        struct utsname unamebuf;

        hostname = safe_gethostname();
        time(&t);
        localtime_r(&t, &tm_time);
        strftime(date_buf, sizeof(date_buf), "%a %b %e %H:%M:%S %Z %Y", &tm_time);
        fprintf(header_fp, "title = Boot chart for %s (%s)\n", hostname, date_buf);
        if (ENABLE_FEATURE_CLEAN_UP)
            free(hostname);

        uname(&unamebuf); /* never fails */
        /* same as uname -srvm */
        fprintf(header_fp, "system.uname = %s %s %s %s\n",
                unamebuf.sysname,
                unamebuf.release,
                unamebuf.version,
                unamebuf.machine
        );

        //system.release = `cat /etc/DISTRO-release`
        //system.cpu = `grep '^model name' /proc/cpuinfo | head -1` ($cpucount)

        kcmdline = xmalloc_open_read_close("/proc/cmdline", NULL);
        /* kcmdline includes trailing "\n" */
        fprintf(header_fp, "system.kernel.options = %s", kcmdline);
        if (ENABLE_FEATURE_CLEAN_UP)
            free(kcmdline);
    }
    fclose(header_fp);

    system(xasprintf("dmesg >dmesg"));

    /* Package log files */
    //system(xasprintf("tar -zcf /var/log/bootlog.tgz header %s *.log", process_accounting ? "kernel_pacct" : ""));
    system(xasprintf("tar -zcf /var/log/bootlog.tgz header dmesg %s *.log", process_accounting ? "kernel_pacct" : ""));
    /* Clean up (if we are not in detached tmpfs) */
    if (tempdir) {
        unlink("header");
        unlink("proc_stat.log");
        unlink("proc_diskstats.log");
        //unlink("proc_netdev.log");
        unlink("proc_ps.log");
        unlink("dmesg");
        if (process_accounting)
            unlink("kernel_pacct");
        rmdir(tempdir);
    }

    /* shell-based bootchartd tries to run /usr/bin/bootchart if $AUTO_RENDER=yes:
     * /usr/bin/bootchart -o "$AUTO_RENDER_DIR" -f $AUTO_RENDER_FORMAT "$BOOTLOG_DEST"
     */
}

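2.3 pybootchart analysis

pybootchart has two main parts: parsing and drawing.

_do_parse() shows which log file each piece of data comes from; those log files are in turn collected from kernel nodes by do_logging().

Reading _do_parse() together with do_logging() makes clear what each value in the resulting charts means in kernel terms.

2.3.1 pybootchart parsing of bootlog.tgz

While parsing these log files, pybootchart also parses the time taken from /proc/uptime and uses it as the time axis.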

def _do_parse(writer, state, name, file):
    writer.status("parsing '%s'" % name)
    t1 = clock()
    if name == "header":
        state.headers = _parse_headers(file)
    elif name == "proc_diskstats.log":
        state.disk_stats = _parse_proc_disk_stat_log(file, get_num_cpus(state.headers))
    elif name == "taskstats.log":
        state.ps_stats = _parse_taskstats_log(writer, file)
        state.taskstats = True
    elif name == "proc_stat.log":
        state.cpu_stats = _parse_proc_stat_log(file)
    elif name == "proc_meminfo.log":
        state.mem_stats = _parse_proc_meminfo_log(file)
    elif name == "dmesg":
        state.kernel = _parse_dmesg(writer, file)
    elif name == "cmdline2.log":
        state.cmdline = _parse_cmdline_log(writer, file)
    elif name == "paternity.log":
        state.parent_map = _parse_paternity_log(writer, file)
    elif name == "proc_ps.log":  # obsoleted by TASKSTATS
        state.ps_stats = _parse_proc_ps_log(writer, file)
    elif name == "kernel_pacct": # obsoleted by PROC_EVENTS
        state.parent_map = _parse_pacct(writer, file)
    t2 = clock()
    writer.info("  %s seconds" % str(t2-t1))
    return state
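
Before handing a tarball to pybootchart it can be useful to confirm which log files it contains, since _do_parse() dispatches purely on the member name. The sketch below uses only the standard tarfile module; the archive is built in memory with made-up contents, and both helper names are hypothetical.

```python
import io
import tarfile


def list_log_members(tgz_bytes):
    """Return {member name: size in bytes} for regular files in a .tgz."""
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as tar:
        return {m.name: m.size for m in tar.getmembers() if m.isfile()}


def make_sample_tgz(files):
    """Build a small .tgz in memory from {name: bytes}, for the demo only."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Made-up file contents; a real bootlog.tgz comes from bootchartd.
sample = make_sample_tgz({"header": b"title = demo\n", "proc_stat.log": b"1234\n"})
members = list_log_members(sample)
print(members)
```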

2.3.2 pybootchart rendering

The results parsed by _do_parse() are rendered in render().

#
# Render the chart.
#
def render(ctx, options, xscale, trace):
    (w, h) = extents (options, xscale, trace)
    global OPTIONS
    OPTIONS = options.app_options

    proc_tree = options.proc_tree (trace)

    # x, y, w, h
    clip = ctx.clip_extents()

    sec_w = int (xscale * sec_w_base)
    ctx.set_line_width(1.0)
    ctx.select_font_face(FONT_NAME)
    draw_fill_rect(ctx, WHITE, (0, 0, max(w, MIN_IMG_W), h))
    w -= 2*off_x
    # draw the title and headers
    if proc_tree.idle:
        duration = proc_tree.idle
    else:
        duration = proc_tree.duration

    if not options.kernel_only:
        curr_y = draw_header (ctx, trace.headers, duration)
    else:
        curr_y = off_y;

    if options.charts:
        curr_y = render_charts (ctx, options, clip, trace, curr_y, w, h, sec_w)

    # draw process boxes
    proc_height = h
    if proc_tree.taskstats and options.cumulative:
        proc_height -= CUML_HEIGHT

    draw_process_bar_chart(ctx, clip, options, proc_tree, trace.times,
                   curr_y, w, proc_height, sec_w)

    curr_y = proc_height
    ctx.set_font_size(SIG_FONT_SIZE)
    draw_text(ctx, SIGNATURE, SIG_COLOR, off_x + 5, proc_height - 8)

    # draw a cumulative CPU-time-per-process graph
    if proc_tree.taskstats and options.cumulative:
        cuml_rect = (off_x, curr_y + off_y, w, CUML_HEIGHT/2 - off_y * 2)
        if clip_visible (clip, cuml_rect):
            draw_cuml_graph(ctx, proc_tree, cuml_rect, duration, sec_w, STAT_TYPE_CPU)

    # draw a cumulative I/O-time-per-process graph
    if proc_tree.taskstats and options.cumulative:
        cuml_rect = (off_x, curr_y + off_y * 100, w, CUML_HEIGHT/2 - off_y * 2)
        if clip_visible (clip, cuml_rect):
            draw_cuml_graph(ctx, proc_tree, cuml_rect, duration, sec_w, STAT_TYPE_IO)

Most of the chart rendering is done in render_charts().

def render_charts(ctx, options, clip, trace, curr_y, w, h, sec_w):
    proc_tree = options.proc_tree(trace)

    # render bar legend
    ctx.set_font_size(LEGEND_FONT_SIZE)

    draw_legend_box(ctx, "CPU (user+sys)", CPU_COLOR, off_x, curr_y+20, leg_s)  # CPU utilization chart
    draw_legend_box(ctx, "I/O (wait)", IO_COLOR, off_x + 120, curr_y+20, leg_s)

    # render I/O wait
    chart_rect = (off_x, curr_y+30, w, bar_h)
    if clip_visible (clip, chart_rect):
        draw_box_ticks (ctx, chart_rect, sec_w)
        draw_annotations (ctx, proc_tree, trace.times, chart_rect)
        draw_chart (ctx, IO_COLOR, True, chart_rect, \
                [(sample.time, sample.user + sample.sys + sample.io) for sample in trace.cpu_stats], \
                proc_tree, None)
        # render CPU load
        draw_chart (ctx, CPU_COLOR, True, chart_rect, \
                [(sample.time, sample.user + sample.sys) for sample in trace.cpu_stats], \
                proc_tree, None)

    curr_y = curr_y + 30 + bar_h

    # render second chart
    draw_legend_line(ctx, "Disk throughput", DISK_TPUT_COLOR, off_x, curr_y+20, leg_s)  # disk throughput chart
    draw_legend_box(ctx, "Disk utilization", IO_COLOR, off_x + 120, curr_y+20, leg_s)

        # render I/O utilization
    chart_rect = (off_x, curr_y+30, w, bar_h)
    if clip_visible (clip, chart_rect):
        draw_box_ticks (ctx, chart_rect, sec_w)
        draw_annotations (ctx, proc_tree, trace.times, chart_rect)
        draw_chart (ctx, IO_COLOR, True, chart_rect, \
                [(sample.time, sample.util) for sample in trace.disk_stats], \
                proc_tree, None)

    # render disk throughput
    max_sample = max (trace.disk_stats, key = lambda s: s.tput)
    if clip_visible (clip, chart_rect):
        draw_chart (ctx, DISK_TPUT_COLOR, False, chart_rect, \
                [(sample.time, sample.tput) for sample in trace.disk_stats], \
                proc_tree, None)

    pos_x = off_x + ((max_sample.time - proc_tree.start_time) * w / proc_tree.duration)

    shift_x, shift_y = -20, 20
    if (pos_x < off_x + 245):
        shift_x, shift_y = 5, 40

    label = "%dMB/s" % round ((max_sample.tput) / 1024.0)
    draw_text (ctx, label, DISK_TPUT_COLOR, pos_x + shift_x, curr_y + shift_y)

    curr_y = curr_y + 30 + bar_h

    # render mem usage
    chart_rect = (off_x, curr_y+30, w, meminfo_bar_h)
    mem_stats = trace.mem_stats
    if mem_stats and clip_visible (clip, chart_rect):
        #mem_scale = max(sample.records['MemTotal'] - sample.records['MemFree'] for sample in mem_stats)
        mem_scale = max(sample.records['MemTotal'] for sample in mem_stats)
        draw_legend_box(ctx, "Mem cached (scale: %u MiB)" % (float(mem_scale) / 1024), MEM_CACHED_COLOR, off_x, curr_y+20, leg_s)
        draw_legend_box(ctx, "Used", MEM_USED_COLOR, off_x + 240, curr_y+20, leg_s)
        draw_legend_box(ctx, "Buffers", MEM_BUFFERS_COLOR, off_x + 360, curr_y+20, leg_s)
        draw_legend_line(ctx, "Swap (scale: %u MiB)" % max([(sample.records['SwapTotal'] - sample.records['SwapFree'])/1024 for sample in mem_stats]), \
                 MEM_SWAP_COLOR, off_x + 480, curr_y+20, leg_s)
        draw_legend_box(ctx, "Free", MEM_FREE_COLOR, off_x + 700, curr_y+20, leg_s)
        draw_box_ticks(ctx, chart_rect, sec_w)
        draw_annotations(ctx, proc_tree, trace.times, chart_rect)

        draw_chart(ctx, MEM_FREE_COLOR, True, chart_rect, \
               [(sample.time, sample.records['MemTotal']) for sample in trace.mem_stats], \
               proc_tree, [0, mem_scale])
        draw_chart(ctx, MEM_BUFFERS_COLOR, True, chart_rect, \
               [(sample.time, sample.records['MemTotal'] - sample.records['MemFree']) for sample in trace.mem_stats], \
               proc_tree, [0, mem_scale])
        draw_chart(ctx, MEM_CACHED_COLOR, True, chart_rect, \
               [(sample.time, sample.records['MemTotal'] - sample.records['MemFree'] - sample.records['Buffers']) for sample in trace.mem_stats], \
               proc_tree, [0, mem_scale])
        draw_chart(ctx, MEM_USED_COLOR, True, chart_rect, \
               [(sample.time, sample.records['MemTotal'] - sample.records['MemFree'] - sample.records['Buffers'] - sample.records['Cached']) for sample in trace.mem_stats], \
               proc_tree, [0, mem_scale])
        draw_chart(ctx, MEM_SWAP_COLOR, False, chart_rect, \
               [(sample.time, float(sample.records['SwapTotal'] - sample.records['SwapFree'])) for sample in mem_stats], \
               proc_tree, None)
        curr_y = curr_y + meminfo_bar_h

    return curr_y
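
The memory chart stacks four layers derived from /proc/meminfo: each draw_chart() call above subtracts one more field from MemTotal. The arithmetic can be sketched as follows; the parser is a simplified stand-in for pybootchart's, the sample numbers are invented, and both function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse "Key:   value kB" lines into a {key: int} dict."""
    records = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            records[key.strip()] = int(fields[0])
    return records


def mem_layers(records):
    """Split MemTotal into the used/cached/buffers/free layers of the chart."""
    free = records["MemFree"]
    buffers = records["Buffers"]
    cached = records["Cached"]
    used = records["MemTotal"] - free - buffers - cached
    return {"used": used, "cached": cached, "buffers": buffers, "free": free}


# Made-up sample in /proc/meminfo format.
sample = "MemTotal: 1000 kB\nMemFree: 400 kB\nBuffers: 50 kB\nCached: 250 kB\n"
layers = mem_layers(parse_meminfo(sample))
print(layers)
```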

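2.3.3 bootchart process state analysis

bootchart's process state analysis relies on the information read from /proc/xxx/stat, including when each process started and terminated and how its state changed in between.

2.3.3.1 /proc/xxx/stat explained

Every process has its own set of proc nodes; the process state, start point and end point shown in bootchart come from analyzing /proc/xxx/stat.

Every sample_period_us, bootchartd walks the /proc directory and saves the stat information found there.

The stat contents are produced by do_task_stat().

The contents of proc_ps.log correspond to the fields printed by do_task_stat().

pybootchart parses this information in _parse_proc_ps_log().

start_time determines when the process started, the state sampled at each point in time determines the state drawn in bootchart, and ppid determines the parent-child relationship, which bootchart draws as a dotted connecting line.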

static const struct pid_entry tid_base_stuff[] = {
...
    ONE("stat",      S_IRUGO, proc_tid_stat),
...
}

int proc_tid_stat(struct seq_file *m, struct pid_namespace *ns,
            struct pid *pid, struct task_struct *task)
{
    return do_task_stat(m, ns, pid, task, 0);
}


static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
            struct pid *pid, struct task_struct *task, int whole)
{
    unsigned long vsize, eip, esp, wchan = ~0UL;
    long priority, nice;
    int tty_pgrp = -1, tty_nr = 0;
    sigset_t sigign, sigcatch;
    char state;
    pid_t ppid = 0, pgid = -1, sid = -1;
    int num_threads = 0;
    int permitted;
    struct mm_struct *mm;
    unsigned long long start_time;
    unsigned long cmin_flt = 0, cmaj_flt = 0;
    unsigned long  min_flt = 0,  maj_flt = 0;
    cputime_t cutime, cstime, utime, stime;
    cputime_t cgtime, gtime;
    unsigned long rsslim = 0;
    char tcomm[sizeof(task->comm)];
    unsigned long flags;
...
    /* scale priority and nice values from timeslices to -20..20 */
    /* to make it look like a "normal" Unix priority/nice value  */
    priority = task_prio(task);
    nice = task_nice(task);

    /* Temporary variable needed for gcc-2.96 */
    /* convert timespec -> nsec*/
    start_time =
        (unsigned long long)task->real_start_time.tv_sec * NSEC_PER_SEC
                + task->real_start_time.tv_nsec;
    /* convert nsec -> ticks */
    start_time = nsec_to_clock_t(start_time); /* process start time, in clock ticks */

    seq_printf(m, "%d (%s) %c", pid_nr_ns(pid, ns), tcomm, state); /* pid, name and state; the state characters are explained in section 2.3.3.2 */
    seq_put_decimal_ll(m, ' ', ppid); /* parent pid */
    seq_put_decimal_ll(m, ' ', pgid);
    seq_put_decimal_ll(m, ' ', sid);
    seq_put_decimal_ll(m, ' ', tty_nr);
    seq_put_decimal_ll(m, ' ', tty_pgrp);
    seq_put_decimal_ull(m, ' ', task->flags);
    seq_put_decimal_ull(m, ' ', min_flt);
    seq_put_decimal_ull(m, ' ', cmin_flt);
    seq_put_decimal_ull(m, ' ', maj_flt);
    seq_put_decimal_ull(m, ' ', cmaj_flt);
    seq_put_decimal_ull(m, ' ', cputime_to_clock_t(utime)); /* time spent in user mode */
    seq_put_decimal_ull(m, ' ', cputime_to_clock_t(stime)); /* time spent in kernel mode */
    seq_put_decimal_ll(m, ' ', cputime_to_clock_t(cutime));
    seq_put_decimal_ll(m, ' ', cputime_to_clock_t(cstime));
    seq_put_decimal_ll(m, ' ', priority);
    seq_put_decimal_ll(m, ' ', nice);
    seq_put_decimal_ll(m, ' ', num_threads);
    seq_put_decimal_ull(m, ' ', 0);
    seq_put_decimal_ull(m, ' ', start_time);
    seq_put_decimal_ull(m, ' ', vsize);
    seq_put_decimal_ll(m, ' ', mm ? get_mm_rss(mm) : 0);
    seq_put_decimal_ull(m, ' ', rsslim);
    seq_put_decimal_ull(m, ' ', mm ? (permitted ? mm->start_code : 1) : 0);
    seq_put_decimal_ull(m, ' ', mm ? (permitted ? mm->end_code : 1) : 0);
    seq_put_decimal_ull(m, ' ', (permitted && mm) ? mm->start_stack : 0);
    seq_put_decimal_ull(m, ' ', esp);
    seq_put_decimal_ull(m, ' ', eip);
    /* The signal information here is obsolete.
     * It must be decimal for Linux 2.0 compatibility.
     * Use /proc/#/status for real-time signals.
     */
    seq_put_decimal_ull(m, ' ', task->pending.signal.sig[0] & 0x7fffffffUL);
    seq_put_decimal_ull(m, ' ', task->blocked.sig[0] & 0x7fffffffUL);
    seq_put_decimal_ull(m, ' ', sigign.sig[0] & 0x7fffffffUL);
    seq_put_decimal_ull(m, ' ', sigcatch.sig[0] & 0x7fffffffUL);
    seq_put_decimal_ull(m, ' ', wchan);
    seq_put_decimal_ull(m, ' ', 0);
    seq_put_decimal_ull(m, ' ', 0);
    seq_put_decimal_ll(m, ' ', task->exit_signal);
    seq_put_decimal_ll(m, ' ', task_cpu(task));
    seq_put_decimal_ull(m, ' ', task->rt_priority);
    seq_put_decimal_ull(m, ' ', task->policy);
...
    seq_putc(m, '\n');
    if (mm)
        mmput(mm);
    return 0;
}
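
The field order written by the seq_put_decimal_*() calls above can be read back from user space. The following is a minimal sketch, not the parser pybootchart actually uses; the names ProcStat and parse_proc_stat_line are hypothetical, and only the fields annotated above are extracted.

```python
from collections import namedtuple

# Hypothetical container for the fields annotated in the listing above.
ProcStat = namedtuple("ProcStat", "pid comm state ppid utime stime starttime")

def parse_proc_stat_line(line):
    """Parse one /proc/<pid>/stat line into a ProcStat tuple."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 1:].split()
    # rest[0] is field 3 (state), so field N of the line is rest[N - 3].
    return ProcStat(
        pid=pid,
        comm=comm,
        state=rest[0],
        ppid=int(rest[1]),
        utime=int(rest[11]),      # field 14, in clock ticks
        stime=int(rest[12]),      # field 15, in clock ticks
        starttime=int(rest[19]),  # field 22, in clock ticks since boot
    )
```

Splitting on the last ')' matters because a process name is free-form text; a plain split() on whitespace would shift every later field for a name that contains a space.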

2.3.3.2 Process states in bootchart

The process states shown in bootchart are read and parsed from each process's /proc/x/stat.

def draw_process_bar_chart(ctx, clip, options, proc_tree, times, curr_y, w, h, sec_w):
    header_size = 0
    if not options.kernel_only:
        draw_legend_box (ctx, "Running (%cpu)", PROC_COLOR_R, off_x    , curr_y + 45, leg_s)
        draw_legend_box (ctx, "Unint.sleep (I/O)", PROC_COLOR_D, off_x+120, curr_y + 45, leg_s)
        draw_legend_box (ctx, "Sleeping", PROC_COLOR_S, off_x+240, curr_y + 45, leg_s)
        draw_legend_box (ctx, "Zombie", PROC_COLOR_Z, off_x+360, curr_y + 45, leg_s)

The state seen in /proc/x/stat is a single character from "RSDTtZXxKW".

How these characters map to task_struct->state in the kernel can be determined from the following code.

static const char * const task_state_array[] = {
    "R (running)",        /*   0 */
    "S (sleeping)",        /*   1 */
    "D (disk sleep)",    /*   2 */
    "T (stopped)",        /*   4 */
    "t (tracing stop)",    /*   8 */
    "Z (zombie)",        /*  16 */
    "X (dead)",        /*  32 */
    "x (dead)",        /*  64 */
    "K (wakekill)",        /* 128 */
    "W (waking)",        /* 256 */
};

#define TASK_RUNNING        0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE    2
#define __TASK_STOPPED        4
#define __TASK_TRACED        8
/* in tsk->exit_state */
#define EXIT_ZOMBIE        16
#define EXIT_DEAD        32
/* in tsk->state again */
#define TASK_DEAD        64
#define TASK_WAKEKILL        128
#define TASK_WAKING        256
#define TASK_STATE_MAX        512

#define TASK_STATE_TO_CHAR_STR "RSDTtZXxKW"

So the relationship between them is as follows:

Bootchart process state	/proc state	task_struct state
Running	R	TASK_RUNNING
Unint.sleep(I/O)	D	TASK_UNINTERRUPTIBLE
Sleeping	S	TASK_INTERRUPTIBLE
Zombie	Z	EXIT_ZOMBIE
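
The table above can be written down as a small lookup. This is only an illustrative sketch: BOOTCHART_STATE and bootchart_state are hypothetical names, and returning None for the other state letters is an assumption, since the source does not show how pybootchart treats them.

```python
# State letters as written in /proc/<pid>/stat, mapped to the four
# legend entries drawn by draw_process_bar_chart().
BOOTCHART_STATE = {
    "R": "Running",
    "D": "Unint.sleep (I/O)",
    "S": "Sleeping",
    "Z": "Zombie",
}

def bootchart_state(state_char):
    """Return the legend label for a state letter, or None if the table has no entry."""
    return BOOTCHART_STATE.get(state_char)
```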

2.3.4 bootchart analysis of the kernel log

The _parse_dmesg() function analyzes the dmesg file.

The end of kernel boot is defined as "Freeing init memory"; an initcall starts at its "calling" line and ends at its "initcall" line.
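
The matching rule can be sketched as follows. This is not the real _parse_dmesg(); it is a minimal illustration built on the two printk() formats in do_one_initcall_debug(), and it assumes that printk timestamps of the form "[    0.123456]" are present in the log. The name parse_initcalls is hypothetical.

```python
import re

# Patterns follow the two printk() formats in do_one_initcall_debug().
CALLING_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+calling\s+(\S+)\s+@\s+(\d+)")
RETURN_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+initcall\s+(\S+)\s+returned\s+(-?\d+)\s+after\s+(\d+)\s+usecs"
)
STAMP_RE = re.compile(r"\[\s*(\d+\.\d+)\]")

def parse_initcalls(lines):
    """Return ([(name, start_seconds, duration_usecs), ...], kernel_end_seconds)."""
    started = {}
    initcalls = []
    kernel_end = None
    for line in lines:
        m = CALLING_RE.search(line)
        if m:
            started[m.group(2)] = float(m.group(1))  # remember when it started
            continue
        m = RETURN_RE.search(line)
        if m:
            name = m.group(2)
            initcalls.append((name, started.pop(name, None), int(m.group(4))))
            continue
        if "Freeing init memory" in line:
            m = STAMP_RE.search(line)
            kernel_end = float(m.group(1)) if m else None
            break  # everything after this point belongs to user space
    return initcalls, kernel_end
```

Sorting the returned list by duration gives a quick view of the most expensive initcalls without rendering a chart.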

2.3.5 bootchartd analysis of meminfo

A sample proc_meminfo.log is shown below. _parse_proc_meminfo_log() parses it and mainly extracts the MemTotal, MemFree, Buffers, and Cached values.

render_charts() in draw.py then draws the curves.

MemTotal: 63436 kB
MemFree: 51572 kB
Buffers: 0 kB
Cached: 452 kB
SwapCached: 0 kB
...
SwapTotal: 0 kB
SwapFree: 0 kB
...
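
A minimal sketch of reading such a block is shown here. It is not the real _parse_proc_meminfo_log(); parse_meminfo and used_kb are hypothetical names, and treating "used" as MemTotal minus MemFree, Buffers, and Cached is an assumption made for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {name: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skips elided ("...") or malformed lines
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values

def used_kb(values):
    """Memory that is neither free nor buffers/page cache, in kB (assumed definition)."""
    return values["MemTotal"] - values["MemFree"] - values["Buffers"] - values["Cached"]
```

For the sample above this gives 63436 - 51572 - 0 - 452 = 11412 kB in use.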

2.3.6 bootchart analysis of CPU utilization

bootchart records CPU utilization by saving snapshots of /proc/stat.

cpu 0 0 140 16 0 0 0 0 0 0
cpu0 0 0 140 16 0 0 0 0 0 0
intr 42288 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 254 0 0 0 0 138 0 0 315 0 55 0 0 139 139 0 0 0 0 0 0 0 0 0 0 0 0 2639 0 0 0 0 0 0 0 0 0 93 0 0 0 0 0 0 0 0 0 0 0 0 0 0 105 0 0 534 0 0 0 54 0 0 0 37821 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 10926
btime 946692305
processes 708
procs_running 2
procs_blocked 0
softirq 243 0 243 0 0 0 0 0 0 0 0

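2.3.6.1 Parsing /proc/stat

This information is produced by the kernel's show_stat(). The analysis here focuses on the first line, which is the sum over all CPUs.

The first line gives the overall CPU usage; its columns are, in order: user nice system idle iowait irq softirq steal guest guest_nice.

Inside the kernel this time is accumulated per tick. jiffies is a kernel global variable that counts the ticks generated since the system booted; in Linux, one tick can roughly be understood as the smallest time slice used for process scheduling.

The values exported in /proc/stat, however, are not in jiffies but in units defined by USER_HZ, so one unit is 10 ms.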

# define USER_HZ    100        /* some user interfaces are */
# define CLOCKS_PER_SEC    (USER_HZ)       /* in "ticks" like times() */

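user: time spent in user mode, accumulated from boot to now, by processes whose nice value is less than or equal to 0.

nice: CPU time spent in user mode, accumulated from boot to now, by processes whose nice value is greater than 0.

system: time spent in kernel mode, accumulated from boot to now, not including interrupt time.

idle: waiting time accumulated from boot to now, other than I/O wait.

iowait: I/O wait time accumulated from boot to now.

irq: hard interrupt time accumulated from boot to now.

softirq: soft interrupt time accumulated from boot to now.

Total CPU time = user + nice + system + idle + iowait + irq + softirq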
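
As a worked example of this sum, the sketch below applies it to the "cpu" line of the /proc/stat sample shown earlier and converts the result to seconds. The function names are hypothetical, and USER_HZ is hard-coded to 100 to match the definition quoted above.

```python
USER_HZ = 100  # matches the USER_HZ definition quoted above

def total_cpu_ticks(cpu_line):
    """Sum user..softirq (the first seven columns) of a 'cpu' line from /proc/stat."""
    fields = [int(tok) for tok in cpu_line.split()[1:]]
    return sum(fields[:7])

def ticks_to_seconds(ticks):
    """Convert USER_HZ ticks to seconds."""
    return ticks / USER_HZ
```

For "cpu 0 0 140 16 0 0 0 0 0 0" the total is 140 + 16 = 156 ticks, which is 1.56 seconds at 10 ms per tick.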

Before analyzing show_stat(), it helps to understand two data structures, kernel_cpustat and kernel_stat; the instances of both are per-CPU.

enum cpu_usage_stat {
    CPUTIME_USER,
    CPUTIME_NICE,
    CPUTIME_SYSTEM,
    CPUTIME_SOFTIRQ,
    CPUTIME_IRQ,
    CPUTIME_IDLE,
    CPUTIME_IOWAIT,
    CPUTIME_STEAL,
    CPUTIME_GUEST,
    CPUTIME_GUEST_NICE,
    NR_STATS,
};

struct kernel_cpustat {
    u64 cpustat[NR_STATS];
};

struct kernel_stat {
#ifndef CONFIG_GENERIC_HARDIRQS
       unsigned int irqs[NR_IRQS];
#endif
    unsigned long irqs_sum;
    unsigned int softirqs[NR_SOFTIRQS];
};

The kernel's tick interrupt handler calls update_process_times() to update these statistics.

void update_process_times(int user_tick)
{
    struct task_struct *p = current;
    int cpu = smp_processor_id();

    account_process_tick(p, user_tick);
...
}

void account_process_tick(struct task_struct *p, int user_tick)
{
    cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
    struct rq *rq = this_rq();

    if (sched_clock_irqtime) {
        irqtime_account_process_tick(p, user_tick, rq);--------------------Used instead when IRQ time has to be accounted.
        return;
    }

    if (steal_account_process_tick())--------------------------------------Accumulate into CPUTIME_STEAL.
        return;

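    if (user_tick)---------------------------------------------------------In user mode: update the user-mode statistics.
        account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
    else if ((p != rq->idle) || (irq_count() != HARDIRQ_OFFSET))-----------Not a user tick, so the CPU is in kernel mode; this branch is taken when the current task is not the idle task, or when irq_count() != HARDIRQ_OFFSET.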
        account_system_time(p, HARDIRQ_OFFSET, cputime_one_jiffy,
                    one_jiffy_scaled);
    else
        account_idle_time(cputime_one_jiffy);------------------------------Idle time.
}

void account_user_time(struct task_struct *p, cputime_t cputime,
               cputime_t cputime_scaled)
{
    int index;

    /* Add user time to process. */
    p->utime += cputime;
    p->utimescaled += cputime_scaled;
    account_group_user_time(p, cputime);

    index = (TASK_NICE(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;---------------Processes with nice greater than 0 accumulate into CPUTIME_NICE; processes with nice less than or equal to 0 accumulate into CPUTIME_USER.

    /* Add user time to cpustat. */
    task_group_account_field(p, index, (__force u64) cputime);

    /* Account for user time used */
    acct_update_integrals(p);
}

void account_system_time(struct task_struct *p, int hardirq_offset,
             cputime_t cputime, cputime_t cputime_scaled)
{
    int index;

    if ((p->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0)) {-----In a virtualized environment: accumulate into CPUTIME_GUEST and CPUTIME_GUEST_NICE.
        account_guest_time(p, cputime, cputime_scaled);
        return;
    }

    if (hardirq_count() - hardirq_offset)----------------------------------In a hardware interrupt: accumulate into CPUTIME_IRQ.
        index = CPUTIME_IRQ;
    else if (in_serving_softirq())-----------------------------------------Serving a softirq: accumulate into CPUTIME_SOFTIRQ.
        index = CPUTIME_SOFTIRQ;
    else
        index = CPUTIME_SYSTEM;--------------------------------------------Kernel time that is not idle, hard IRQ, or softirq: accumulate into CPUTIME_SYSTEM.

    __account_system_time(p, cputime, cputime_scaled, index);
}

void account_idle_time(cputime_t cputime)
{
    u64 *cpustat = kcpustat_this_cpu->cpustat;
    struct rq *rq = this_rq();

    if (atomic_read(&rq->nr_iowait) > 0)
        cpustat[CPUTIME_IOWAIT] += (__force u64) cputime;------------------Tasks on this runqueue are waiting for I/O: accumulate into CPUTIME_IOWAIT.
    else
        cpustat[CPUTIME_IDLE] += (__force u64) cputime;--------------------Plain idle time: accumulate into CPUTIME_IDLE.
}
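
The decision tree spread across account_process_tick(), account_user_time(), account_system_time(), and account_idle_time() can be condensed into a small model. This is a simplified sketch, not kernel code: classify_tick and its parameters are hypothetical, the irqtime, steal, and guest paths are left out, and the test irq_count() != HARDIRQ_OFFSET is approximated by "nested hard IRQ or serving a softirq".

```python
def classify_tick(user_tick, nice=0, is_idle_task=False,
                  nested_hardirq=False, serving_softirq=False,
                  nr_iowait=0):
    """Return the cpustat bucket one tick would be charged to (simplified model)."""
    if user_tick:
        # account_user_time(): positive nice goes to NICE, everything else to USER
        return "CPUTIME_NICE" if nice > 0 else "CPUTIME_USER"
    if not is_idle_task or nested_hardirq or serving_softirq:
        # account_system_time(): hard IRQ first, then softirq, else plain system
        if nested_hardirq:
            return "CPUTIME_IRQ"
        if serving_softirq:
            return "CPUTIME_SOFTIRQ"
        return "CPUTIME_SYSTEM"
    # account_idle_time(): idle task with no interrupt context underneath
    return "CPUTIME_IOWAIT" if nr_iowait > 0 else "CPUTIME_IDLE"
```

Because a whole tick is charged to whatever happens to be running when the tick fires, the resulting counters are statistical samples rather than exact measurements.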

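Interrupt statistics are collected through dedicated hooks in the hard interrupt and softirq paths.

Every hard interrupt handled calls kstat_incr_irqs_this_cpu(), which updates the per-CPU counter kernel_stat->irqs_sum and also updates irq_desc->kstat_irqs.

The softirq handler handle_pending_softirqs() updates the counter of the corresponding softirq in kernel_stat->softirqs[].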

#define kstat_incr_irqs_this_cpu(irqno, DESC)        \
do {                            \
    __this_cpu_inc(*(DESC)->kstat_irqs);        \
    __this_cpu_inc(kstat.irqs_sum);            \
} while (0)


static void handle_pending_softirqs(u32 pending, int cpu, int need_rcu_bh_qs)
{
    struct softirq_action *h = softirq_vec;
    unsigned int prev_count = preempt_count();

    local_irq_enable();
    for ( ; pending; h++, pending >>= 1) {
...
        kstat_incr_softirqs_this_cpu(vec_nr);
...
    }
    local_irq_disable();
}


static inline unsigned int kstat_softirqs_cpu(unsigned int irq, int cpu)
{
       return kstat_cpu(cpu).softirqs[irq];
}

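The kernel keeps updating these statistics on every tick, so to learn the CPU utilization, user space only needs to parse /proc/stat.

The function behind /proc/stat is show_stat(), shown below.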

static int show_stat(struct seq_file *p, void *v)
{
    int i, j;
    unsigned long jif;
    u64 user, nice, system, idle, iowait, irq, softirq, steal;
    u64 guest, guest_nice;
    u64 sum = 0;
    u64 sum_softirq = 0;
    unsigned int per_softirq_sums[NR_SOFTIRQS] = {0};
    struct timespec boottime;

    user = nice = system = idle = iowait =
        irq = softirq = steal = 0;
    guest = guest_nice = 0;
    getboottime(&boottime);
    jif = boottime.tv_sec;

    for_each_possible_cpu(i) {------------------------------------------Walk the cpustat of every possible CPU and accumulate, giving one total over all CPUs. The sums below correspond one-to-one to cpu_usage_stat.
        user += kcpustat_cpu(i).cpustat[CPUTIME_USER];
        nice += kcpustat_cpu(i).cpustat[CPUTIME_NICE];
        system += kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
        idle += get_idle_time(i);
        iowait += get_iowait_time(i);
        irq += kcpustat_cpu(i).cpustat[CPUTIME_IRQ];
        softirq += kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ];
        steal += kcpustat_cpu(i).cpustat[CPUTIME_STEAL];
        guest += kcpustat_cpu(i).cpustat[CPUTIME_GUEST];
        guest_nice += kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE];
        sum += kstat_cpu_irqs_sum(i);-----------------------------------Number of interrupts since boot, kernel_stat->irqs_sum.
        sum += arch_irq_stat_cpu(i);

        for (j = 0; j < NR_SOFTIRQS; j++) {-----------------------------Walk all softirqs.
            unsigned int softirq_stat = kstat_softirqs_cpu(j, i);-------Number of softirqs since boot, kernel_stat->softirqs[j] of CPU i.

            per_softirq_sums[j] += softirq_stat;
            sum_softirq += softirq_stat;
        }
    }
    sum += arch_irq_stat();

    seq_puts(p, "cpu ");
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(user));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(nice));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(system));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(idle));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(iowait));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(irq));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(softirq));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(steal));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(guest));
    seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(guest_nice));
    seq_putc(p, '\n');

    for_each_online_cpu(i) {-------------------------------------------Below, the statistics of each individual online CPU are printed.
        /* Copy values here to work around gcc-2.95.3, gcc-2.96 */
        user = kcpustat_cpu(i).cpustat[CPUTIME_USER];
        nice = kcpustat_cpu(i).cpustat[CPUTIME_NICE];
        system = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
        idle = get_idle_time(i);
        iowait = get_iowait_time(i);
        irq = kcpustat_cpu(i).cpustat[CPUTIME_IRQ];
        softirq = kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ];
        steal = kcpustat_cpu(i).cpustat[CPUTIME_STEAL];
        guest = kcpustat_cpu(i).cpustat[CPUTIME_GUEST];
        guest_nice = kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE];
        seq_printf(p, "cpu%d", i);
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(user));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(nice));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(system));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(idle));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(iowait));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(irq));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(softirq));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(steal));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(guest));
        seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(guest_nice));
        seq_putc(p, '\n');
    }
    seq_printf(p, "intr %llu", (unsigned long long)sum);------------------Hard interrupt count over all CPUs.

    /* sum again ? it could be updated? */
    for_each_irq_nr(j)
        seq_put_decimal_ull(p, ' ', kstat_irqs_usr(j));-------------------Walk all hardware interrupt descriptors again and print how many times each interrupt fired.

    seq_printf(p,
        "\nctxt %llu\n"
        "btime %lu\n"
        "processes %lu\n"
        "procs_running %lu\n"
        "procs_blocked %lu\n",
        nr_context_switches(),-------------------------------------------Sum of context switches over all cores.
        (unsigned long)jif,
        total_forks,
        nr_running(),----------------------------------------------------Number of running processes.
        nr_iowait());----------------------------------------------------Number of processes in I/O wait.

    seq_printf(p, "softirq %llu", (unsigned long long)sum_softirq);------Total softirq count.

    for (i = 0; i < NR_SOFTIRQS; i++)
        seq_put_decimal_ull(p, ' ', per_softirq_sums[i]);----------------Count per softirq, in the order HI_SOFTIRQ, TIMER_SOFTIRQ, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ, BLOCK_SOFTIRQ, BLOCK_IOPOLL_SOFTIRQ, TASKLET_SOFTIRQ, SCHED_SOFTIRQ, HRTIMER_SOFTIRQ, RCU_SOFTIRQ.
    seq_putc(p, '\n');

    return 0;
}
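
Seen from user space, the output layout of show_stat() can be read back with a few lines. The following is a minimal sketch and not part of bootchart; parse_proc_stat is a hypothetical name, and for the "intr" and "softirq" lines only the leading grand total is kept.

```python
def parse_proc_stat(text):
    """Split /proc/stat text into per-CPU tick lists and scalar counters."""
    cpus = {}
    counters = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        key = tokens[0]
        values = [int(tok) for tok in tokens[1:]]
        if key.startswith("cpu"):
            cpus[key] = values  # "cpu" is the sum over all CPUs, "cpuN" is one core
        else:
            # For "intr" and "softirq" the first number is the grand total and
            # the rest are per-source counts; the other keys carry one value.
            counters[key] = values[0]
    return cpus, counters
```

On a single-core system, as in the sample earlier, the "cpu" and "cpu0" lists are identical.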

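_parse_proc_stat_log() shows which times bootchart accounts for.

Because /proc/stat holds cumulative times, each sample has to have the previous sample's values subtracted from it.

In the bootchart chart, CPU = user + system, so bootchart groups the kernel's counters into three classes, which relate to the kernel counters as follows.

CPU = user + nice + system + irq + softirq, iowait = iowait, and the remainder is idle. Because everything is measured in ticks, this utilization is only approximate.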

def _parse_proc_stat_log(file):
    samples = []
    ltimes = None
    for time, lines in _parse_timed_blocks(file):
        # skip empty lines
        if not lines:
            continue
        tokens = lines[0].split()
        if len(tokens) < 8:
            continue
        # CPU times {user, nice, system, idle, io_wait, irq, softirq}
        times = [ int(token) for token in tokens[1:] ]
        if ltimes:
            user = float((times[0] + times[1]) - (ltimes[0] + ltimes[1]))----------------------------------bootchart's user time covers the kernel's user + nice
            system = float((times[2] + times[5] + times[6]) - (ltimes[2] + ltimes[5] + ltimes[6]))---------bootchart's system time covers the kernel's system + irq + softirq
            idle = float(times[3] - ltimes[3])-------------------------------------------------------------bootchart's idle equals the kernel's idle
            iowait = float(times[4] - ltimes[4])-----------------------------------------------------------bootchart's iowait equals the kernel's iowait

            aSum = max(user + system + idle + iowait, 1)
            samples.append( CPUSample(time, user/aSum, system/aSum, iowait/aSum) )

        ltimes = times
        # skip the rest of statistics lines
    return samples
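
To make the delta step concrete, here is a stand-alone sketch of the same arithmetic applied to two consecutive samples. cpu_fractions is a hypothetical helper, not a pybootchart function, and the tick values used with it are made up for illustration.

```python
def cpu_fractions(prev, cur):
    """Return (user, system, iowait) fractions between two 'cpu' tick lists."""
    delta = [c - p for c, p in zip(cur, prev)]
    user = delta[0] + delta[1]               # user + nice
    system = delta[2] + delta[5] + delta[6]  # system + irq + softirq
    idle = delta[3]
    iowait = delta[4]
    total = max(user + system + idle + iowait, 1)  # avoid division by zero
    return user / total, system / total, iowait / total
```

With prev = [100, 0, 50, 800, 50, 0, 0] and cur = [130, 10, 60, 830, 60, 5, 5], the deltas are user 40, system 20, idle 30, and iowait 10 out of 100 ticks, so the function returns (0.4, 0.2, 0.1).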

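2.4 Analyzing the test results

bootchartd is already running by the time the system has booted; it can be stopped by running the following command in a shell.

bootchartd stop

This generates bootlog.tgz in /var/log. A typical bootlog.tgz contains log files such as header, dmesg, proc_stat.log, proc_diskstats.log, proc_meminfo.log, and proc_ps.log.

The following command enters interactive mode; without -i it generates a PNG image instead.

./pybootchartgui.py bootlog/bootlog.tgz --show-all -i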

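2.4.1 kernel boot

If bootlog.tgz contains a dmesg file, the k-boot information is generated.

It gives a rough view of the total time taken by kernel boot and of the initcalls that take the most time.

More detailed initcalls are shown as a staircase under Kernel boot, where the length of each step is proportional to the duration of the initcall.

Neither of these two views, however, is as effective as bootgraph.html.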

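2.4.2 Analyzing user-space process startup

The chart can be divided into five parts:

Header: the kernel uname information and the kernel command line. Mainly taken from header.

CPU utilization: split into three parts, CPU usage, I/O usage, and the remaining idle share. Mainly taken from proc_stat.log.

Disk information: disk throughput and saturation. Mainly taken from proc_diskstats.log.

Memory information: split into five parts, in use, cached, buffer, swap, and free memory. Mainly taken from proc_meminfo.log.

Process information: parent-child relationships, start time, end time, run state, and so on. Mainly taken from proc_ps.log.

From the chart, the main problems are:

Too many kernel real-time processes, which delays the start of rc.
internet.sh starts far too late.
The g_xxxx_trace_sy process is delayed.
VpLoopThread is delayed.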

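3. Summary

Graphical tools make problems easier to spot, but solving them still requires treating each problem on its own terms.

Linux boot starts the moment the kernel is entered and ends when user space reaches a usable state.

The definition of "usable" may differ: for some systems it is reaching a shell, for others it is showing the login dialog. As long as there is a fixed end point, there is an optimization target.

When optimizing with bootgraph.py, keep in mind that the test logging itself adds some load. After finding and fixing the problem spots, turn the related logging off and then compare against the original state; that comparison is more accurate.

When optimizing with bootchart, the sampling period has to be adapted to the actual situation.

If the sampling rate is high, the extra load increases considerably, because CPU utilization, disk throughput, memory usage, and process states are all obtained by periodic sampling.

If the sampling rate is too low, some processes may start, run, and exit within a single sampling period and never be sampled.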
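
The effect of a low sampling rate can be illustrated with a small idealized check. This sketch assumes samples are taken at exact multiples of the period starting from time 0, which a real sampler will not do precisely; is_sampled and its millisecond parameters are hypothetical.

```python
def is_sampled(start_ms, end_ms, period_ms):
    """True if a sample taken at a multiple of period_ms falls within [start_ms, end_ms]."""
    # Round the start time up to the next sampling instant.
    first_sample = -(-start_ms // period_ms) * period_ms
    return first_sample <= end_ms
```

A process that lives from 210 ms to 290 ms is missed with a 200 ms period (the next sample is at 400 ms) but caught with a 50 ms period (sample at 250 ms).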
