Linux RT (1): Building, Using, and Testing Hard Real-Time Linux (RT-Preempt Patch) on a PC

Special notice: the copyright of this series of articles belongs to LiAnLab.org; please credit the source when reposting. by @宋宝华Barry

Problems with the vanilla kernel

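The Linux kernel cannot be preempted while it holds a spinlock or runs in irq context, so the time between a high-priority task being woken and that task actually running is not fully deterministic. The Linux kernel itself also does not handle priority inversion. The RT-Preempt Patch is a set of patches applied on top of the community Linux kernel so that Linux can meet hard real-time requirements. This article describes hands-on use of the patch on a PC. Our test environment is Ubuntu 10.10, and by default we use the kernel that ships with Ubuntu 10.10: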

barry@barry-VirtualBox:/lib/modules$ uname -a
2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux
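On Ubuntu 10.10, install the rt test tool suite with apt-get install rt-tests and run its cyclictest tool. With the options below it creates 5 realtime threads using the SCHED_FIFO policy, with priorities 76-80 and periods of 1000, 1500, 2000, 2500 and 3000 microseconds: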

barry@barry-VirtualBox:~/development/panda/android$ sudo cyclictest -p 80 -t5 -n 
[sudo] password for barry: 
policy: fifo: loadavg: 9.22 8.57 6.75 11/374 21385          

T: 0 (20606) P:80 I:1000 C:  18973 Min:     26 Act:   76 Avg:  428 Max:   12637
T: 1 (20607) P:79 I:1500 C:  12648 Min:     31 Act:   68 Avg:  447 Max:   10320
T: 2 (20608) P:78 I:2000 C:   9494 Min:     28 Act:  151 Avg:  383 Max:    9481
T: 3 (20609) P:77 I:2500 C:   7589 Min:     29 Act:  889 Avg:  393 Max:   12670
T: 4 (20610) P:76 I:3000 C:   6325 Min:     37 Act:  167 Avg:  553 Max:   13673

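This shows that on standard Linux the jitter before an rt thread gets to run is very unstable: the minimum is 26-37 microseconds, the average is 383-553 microseconds (the instantaneous Act values range over 68-889 microseconds), and the maximum lies between 9481 and 13673 microseconds.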
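To make clear what these columns mean, here is a minimal sketch of the same measurement idea. It is not cyclictest's actual source: it sleeps until an absolute deadline with clock_nanosleep and records how late the wakeup was. The names latency_stats, timespec_add_ns, timespec_diff_us and measure_wakeup_latency are our own, chosen for illustration, and the sketch does not request a SCHED_FIFO priority, so it only shows how Min/Avg/Max values come about.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical result type: wakeup latency in microseconds. */
struct latency_stats {
    long min_us;
    long max_us;
    long avg_us;
    long cycles;
};

/* Advance an absolute time by ns nanoseconds, keeping tv_nsec normalized. */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Return a - b in microseconds. */
static long timespec_diff_us(const struct timespec *a, const struct timespec *b)
{
    long long ns = (long long)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
                 + (a->tv_nsec - b->tv_nsec);
    return (long)(ns / 1000);
}

/* Sleep to a periodic absolute deadline `cycles` times and record how late
 * each wakeup was. Returns 0 on success, -1 on error. */
int measure_wakeup_latency(long interval_ns, long cycles, struct latency_stats *out)
{
    struct timespec next, now;
    long long sum_us = 0;
    long i;

    assert(out != NULL && cycles > 0);
    assert(interval_ns > 0 && interval_ns <= NSEC_PER_SEC);

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    out->min_us = 0;
    out->max_us = 0;
    for (i = 0; i < cycles; i++) {
        int rc;
        long lat;

        timespec_add_ns(&next, interval_ns);
        /* clock_nanosleep returns an error number directly; retry on EINTR */
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0 || clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        lat = timespec_diff_us(&now, &next);   /* how late we woke up */
        if (i == 0 || lat < out->min_us)
            out->min_us = lat;
        if (i == 0 || lat > out->max_us)
            out->max_us = lat;
        sum_us += lat;
    }
    out->avg_us = (long)(sum_us / cycles);
    out->cycles = cycles;
    return 0;
}
```

Calling measure_wakeup_latency(1000000L, 1000, &stats) corresponds roughly to a single thread with a 1000 microsecond interval; the real tool additionally runs several such threads at realtime priority.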

We run the same test again, but introduce more interference while it is running, for example mount /dev/sdb1 ~/development. The results become:

barry@barry-VirtualBox:~$ sudo cyclictest -p 80 -t5 -n 
policy: fifo: loadavg: 0.14 0.29 0.13 2/308 1908          

T: 0 ( 1874) P:80 I:1000 C:  28521 Min:      0 Act:  440 Avg: 2095 Max:  331482
T: 1 ( 1875) P:79 I:1500 C:  19014 Min:      2 Act:  988 Avg: 2099 Max:  330503
T: 2 ( 1876) P:78 I:2000 C:  14261 Min:      7 Act:  534 Avg: 2096 Max:  329989
T: 3 ( 1877) P:77 I:2500 C:  11409 Min:      4 Act:  554 Avg: 2073 Max:  328490
T: 4 ( 1878) P:76 I:3000 C:   9507 Min:     12 Act:  100 Avg: 2081 Max:  328991

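The irqs, softirqs and spinlocks involved in the mount make the maximum jitter grow markedly, up to 331482 us. This clearly demonstrates how unpredictable the time before an RT thread gets to run is on the standard Linux kernel (a hard real-time requirement means predictability).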

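If we build a kernel with "Voluntary Kernel Preemption (Desktop)" selected, which is similar to the 2.4 situation where kernel preemption is not supported, and run the same case, the timing uncertainty becomes so large that it is almost unacceptable: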

barry@barry-VirtualBox:~$ sudo /usr/local/bin/cyclictest -p 80 -t5 -n
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.23 0.30 0.15 3/247 5086           

T: 0 ( 5082) P:80 I:1000 C:   5637 Min:     60 Act:15108679 Avg:11195196 Max:15108679
T: 1 ( 5083) P:80 I:1500 C:   5723 Min:     48 Act:12364955 Avg:6389691 Max:12364955
T: 2 ( 5084) P:80 I:2000 C:   4821 Min:     32 Act:11119979 Avg:8061814 Max:11661123
T: 3 ( 5085) P:80 I:2500 C:   3909 Min:     27 Act:11176854 Avg:4563549 Max:11176854
T: 4 ( 5086) P:80 I:3000 C:   3598 Min:     37 Act:9951432 Avg:8761137 Max:116026155

Enabling the RT-Preempt Patch

The main changes the RT-Preempt Patch makes to the Linux kernel include:

  • Making in-kernel locking-primitives (using spinlocks) preemptible through reimplementation with rtmutexes:
  • Critical sections protected by e.g. spinlock_t and rwlock_t are now preemptible. The creation of non-preemptible sections (in kernel) is still possible with raw_spinlock_t (same APIs as spinlock_t)
  • Implementing priority inheritance for in-kernel spinlocks and semaphores. For more information on priority inversion and priority inheritance please consult Introduction to Priority Inversion
  • Converting interrupt handlers into preemptible kernel threads: The RT-Preempt patch treats soft interrupt handlers in kernel thread context, which is represented by a task_struct like a common userspace process. However it is also possible to register an IRQ in kernel context.
  • Converting the old Linux timer API into separate infrastructures for high resolution kernel timers plus one for timeouts, leading to userspace POSIX timers with high resolution.
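Priority inheritance is easiest to try from user space. The sketch below is a user-space analogue only, not the in-kernel rtmutex code the patch introduces: it creates a POSIX mutex with the PTHREAD_PRIO_INHERIT protocol, so that a low-priority thread holding the lock is temporarily boosted while a higher-priority thread waits for it. The helper name init_pi_mutex is our own.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Initialise *m as a priority-inheritance mutex.
 * Returns 0 on success or an error number from the pthread calls. */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for priority inheritance instead of the default protocol. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

The mutex is then used with the ordinary pthread_mutex_lock and pthread_mutex_unlock calls; depending on the toolchain, the program may need to be linked with -lpthread.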

In this experiment, the kernel tree with the RT-Preempt Patch that we use is git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git, on its v3.4-rt-rebase branch. When building the kernel we selected the "Fully Preemptible Kernel" preemption model:

┌───────────────────────── Preemption Model ─────────────────────────┐

│ │          ( ) No Forced Preemption (Server)                 
│ │          ( ) Voluntary Kernel Preemption (Desktop)       
│ │          ( ) Preemptible Kernel (Low-Latency Desktop)    
│ │          ( ) Preemptible Kernel (Basic RT)                
│ │          (X) Fully Preemptible Kernel (RT)      

In addition, the kernel must support tickless operation and high-resolution timers:

┌───────────────────Processor type and features ─────────────────────────┐
│ │                                      [*] Tickless System (Dynamic Ticks)                                                               
│ │                                      [*] High Resolution Timer Support       
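One way to check from user space whether high-resolution timers are actually active after booting is to ask for the clock resolution. The sketch below assumes the usual behaviour that clock_getres reports 1 ns when high-resolution timers are in use and one tick otherwise; the function names and the threshold parameter are our own.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Return the resolution of CLOCK_MONOTONIC in nanoseconds, or -1 on error. */
long long monotonic_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Heuristic (an assumption of this sketch): treat a resolution at or below
 * threshold_ns as "high resolution". A tick-based clock reports a whole tick,
 * e.g. 4000000 ns at HZ=250. */
int is_high_resolution(long long res_ns, long long threshold_ns)
{
    assert(threshold_ns > 0);
    return res_ns > 0 && res_ns <= threshold_ns;
}
```

For example, is_high_resolution(monotonic_resolution_ns(), 1000) is expected to be true on a kernel built with the options above.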
 

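After make modules_install, make install and mkinitramfs, we have an RT kernel that can boot in Ubuntu. For the detailed build procedure see http://www.linuxidc.com/Linux/2012-01/50749.htm; follow that article and adjust the version number and related details. The commands we ran include: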

Install the modules

barry@barry-VirtualBox:~/development/linux-2.6$ sudo make modules_install
....
  INSTALL /lib/firmware/whiteheat_loader.fw
  INSTALL /lib/firmware/whiteheat.fw
  INSTALL /lib/firmware/keyspan_pda/keyspan_pda.fw
  INSTALL /lib/firmware/keyspan_pda/xircom_pgs.fw
  INSTALL /lib/firmware/cpia2/stv0672_vp4.bin
  INSTALL /lib/firmware/yam/1200.bin
  INSTALL /lib/firmware/yam/9600.bin
  DEPMOD  3.4.11-rt19

Install the kernel

barry@barry-VirtualBox:~/development/linux-2.6$ sudo make install 
sh /home/barry/development/linux-2.6/arch/x86/boot/install.sh 3.4.11-rt19 arch/x86/boot/bzImage \ 
System.map "/boot" 

Create the initrd

barry@barry-VirtualBox:~/development/linux-2.6$ sudo mkinitramfs 3.4.11-rt19 -o /boot/initrd.img-3.4.11-rt19

Modify the grub configuration

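In the grub configuration file (grub.conf, or /boot/grub/grub.cfg under GRUB 2), add a new boot entry modeled on an existing menuentry, changing every version number in it to 3.4.11-rt19. Our modification is as follows: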

 menuentry 'Ubuntu, with Linux 3.4.11-rt19' --class ubuntu --class gnu-linux --class gnu --class os {
    recordfail
    insmod part_msdos
    insmod ext2
    set root='(hd0,msdos1)'
    search --no-floppy --fs-uuid --set a0db5cf0-6ce3-404f-9808-88ce18f0177a
    linux    /boot/vmlinuz-3.4.11-rt19 root=UUID=a0db5cf0-6ce3-404f-9808-88ce18f0177a ro   quiet splash
    initrd    /boot/initrd.img-3.4.11-rt19
}
At boot, select 3.4.11-rt19:

[Image 1: selecting the 3.4.11-rt19 entry at boot]
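Once the system is up, it is worth confirming that the RT kernel is really the one running. The sketch below relies on two conventions that we believe hold for RT-patched kernels but that should be treated as assumptions: the file /sys/kernel/realtime containing 1, and the uname version string containing "PREEMPT RT" (or "PREEMPT_RT" on newer kernels). The function names are our own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Return 1 if a uname version string advertises an RT kernel, else 0. */
int version_string_is_rt(const char *version)
{
    assert(version != NULL);
    return strstr(version, "PREEMPT RT") != NULL ||
           strstr(version, "PREEMPT_RT") != NULL;
}

/* Return 1 if the running kernel looks like an RT kernel, 0 if it does not,
 * -1 if uname() fails. */
int running_kernel_is_rt(void)
{
    struct utsname u;
    FILE *f;
    int flag = 0;

    /* Assumption: RT-patched kernels expose this file with the value 1. */
    f = fopen("/sys/kernel/realtime", "r");
    if (f != NULL) {
        if (fscanf(f, "%d", &flag) != 1)
            flag = 0;
        fclose(f);
        if (flag == 1)
            return 1;
    }

    /* Fall back to the version string, as printed by uname -v. */
    if (uname(&u) != 0)
        return -1;
    return version_string_is_rt(u.version);
}
```

On the stock Ubuntu kernel shown earlier, whose version string is "#67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012", version_string_is_rt returns 0.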

Trying out the RT-Preempt Patch

Running the same cyclictest benchmark tool now gives very different results:

barry@barry-VirtualBox:~$ sudo cyclictest -p 80 -t5 -n
WARNING: Most functions require kernel 2.6
policy: fifo: loadavg: 0.71 0.42 0.17 1/289 1926          

T: 0 ( 1921) P:80 I:1000 C:   7294 Min:      7 Act:   89 Avg:  197 Max:    3177
T: 1 ( 1922) P:79 I:1500 C:   4863 Min:     10 Act:   85 Avg:  186 Max:    2681
T: 2 ( 1923) P:78 I:2000 C:   3647 Min:     15 Act:   93 Avg:  160 Max:    2504
T: 3 ( 1924) P:77 I:2500 C:   2918 Min:     23 Act:   67 Avg:  171 Max:    2114
T: 4 ( 1925) P:76 I:3000 C:   2432 Min:     19 Act:  134 Avg:  339 Max:    3129

We run the same test again, but introduce more interference while it is running, for example mount /dev/sdb1 ~/development. The results become:

barry@barry-VirtualBox:~$ sudo cyclictest -p 80 -t5 -n
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.11 0.12 0.13 1/263 2860          

T: 0 ( 2843) P:80 I:1000 C:  28135 Min:      5 Act:  198 Avg:  200 Max:    7387
T: 1 ( 2844) P:80 I:1500 C:  18756 Min:     22 Act:  169 Avg:  188 Max:    6875
T: 2 ( 2845) P:80 I:2000 C:  14067 Min:      7 Act:   91 Avg:  149 Max:    7288
T: 3 ( 2846) P:80 I:2500 C:  11254 Min:     19 Act:  131 Avg:  155 Max:    6287
T: 4 ( 2847) P:80 I:3000 C:   9378 Min:     25 Act:   58 Avg:  172 Max:    6121
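The latencies stay within a predictable range, and nothing like the 331482 us jitter of the standard kernel appears. Note that this jitter is still larger than we expected, approaching the 10 ms order of magnitude (the maximum here is 7387 us); we believe this is because all of our tests were run inside a VirtualBox virtual machine. According to other documentation, the jitter should be around a few tens of microseconds.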

Running ps aux on this kernel shows the irqs that have been turned into threads:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.8  0.1   2880  1788 ?        Ss   18:39   0:03 init
root         2  0.0  0.0      0     0 ?        S    18:39   0:00 kthreadd
...

root        45  0.0  0.0      0     0 ?        S    18:39   0:00 irq/14-ata_piix
root        46  0.0  0.0      0     0 ?        S    18:39   0:00 irq/15-ata_piix
root        50  0.0  0.0      0     0 ?        S    18:39   0:00 irq/19-ehci_hcd
root        51  0.0  0.0      0     0 ?        S    18:39   0:00 irq/22-ohci_hcd
root        55  0.0  0.0      0     0 ?        S    18:39   0:00 irq/12-i8042
root        56  0.0  0.0      0     0 ?        S    18:39   0:00 irq/1-i8042
root        57  0.0  0.0      0     0 ?        S    18:39   0:00 irq/8-rtc0
root       863  0.0  0.0      0     0 ?        S    18:39   0:00 irq/19-eth0
root       864  0.0  0.0      0     0 ?        S    18:39   0:00 irq/16-eth1
root      1002  0.5  0.0      0     0 ?        S    18:39   0:01 irq/21-snd_inte
...
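The same information can be collected programmatically. The following sketch walks /proc and prints every task whose name has the irq/<number>-<handler> shape seen in the listing above. It assumes /proc/<pid>/comm is available, and the function names are our own. Threads with this naming can also appear on non-RT kernels for drivers that request threaded handlers, so the list is an indication rather than proof of an RT kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if comm looks like "irq/<number>-<handler>", else 0. */
int is_irq_thread_name(const char *comm)
{
    const char *p;

    assert(comm != NULL);
    if (strncmp(comm, "irq/", 4) != 0)
        return 0;
    p = comm + 4;
    if (!isdigit((unsigned char)*p))
        return 0;
    while (isdigit((unsigned char)*p))
        p++;
    return *p == '-' && p[1] != '\0';
}

/* Print "<pid> <name>" for each IRQ thread found under /proc.
 * Returns the number of threads printed, or -1 if /proc cannot be read. */
int list_irq_threads(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    char path[300];
    char comm[64];
    int count = 0;

    if (proc == NULL)
        return -1;

    while ((de = readdir(proc)) != NULL) {
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;                      /* not a PID directory */
        snprintf(path, sizeof(path), "/proc/%s/comm", de->d_name);
        f = fopen(path, "r");
        if (f == NULL)
            continue;                      /* task exited in the meantime */
        if (fgets(comm, sizeof(comm), f) != NULL) {
            comm[strcspn(comm, "\n")] = '\0';
            if (is_irq_thread_name(comm)) {
                printf("%s %s\n", de->d_name, comm);
                count++;
            }
        }
        fclose(f);
    }
    closedir(proc);
    return count;
}
```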

Writing an application with an RT thread on this kernel usually involves the following steps:

  • Setting a real time scheduling policy and priority.
  • Locking memory so that page faults caused by virtual memory will not undermine deterministic behavior
  • Pre-faulting the stack, so that a future stack fault will not undermine deterministic behavior
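Here is the example test_rt.c. The mlockall call prevents the physical pages backing the process's virtual address space from being swapped out, while stack_prefault() deliberately makes the stack grow downward by 8KB ahead of time, so that later function calls and local variable accesses no longer cause stack growth (which depends on page faults and memory allocation):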

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <sched.h>
#include <sys/mman.h>
#include <string.h>

#define MY_PRIORITY (49) /* we use 49 as the PREEMPT_RT use 50
                            as the priority of kernel tasklets
                            and interrupt handler by default */

#define MAX_SAFE_STACK (8*1024) /* The maximum stack size which is
                                   guaranteed safe to access without
                                   faulting */

#define NSEC_PER_SEC    (1000000000) /* The number of nsecs per sec. */

void stack_prefault(void) {

        unsigned char dummy[MAX_SAFE_STACK];

        memset(dummy, 0, MAX_SAFE_STACK);
        return;
}

int main(int argc, char* argv[])
{
        struct timespec t;
        struct sched_param param;
        int interval = 50000; /* 50us*/

        /* Declare ourself as a real time task */

        param.sched_priority = MY_PRIORITY;
        if(sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
                perror("sched_setscheduler failed");
                exit(-1);
        }

        /* Lock memory */

        if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
                perror("mlockall failed");
                exit(-2);
        }

        /* Pre-fault our stack */

        stack_prefault();

        clock_gettime(CLOCK_MONOTONIC ,&t);
        /* start after one second */
        t.tv_sec++;

        while(1) {
                /* wait until next shot */
                clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);

                /* do the stuff */

                /* calculate next shot */
                t.tv_nsec += interval;

                while (t.tv_nsec >= NSEC_PER_SEC) {
                        t.tv_nsec -= NSEC_PER_SEC;
                        t.tv_sec++;
                }
        }
}
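Compile it with: gcc -o test_rt test_rt.c -lrt. That is all for this installment; later posts in the series will describe the main changes the RT-Preempt Patch makes to the kernel and how it works.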
