Summary of KVM System Virtualization Performance Testing

Building with Buildroot

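Why use Buildroot
  1. Broad coverage: it builds the cross-compilation toolchain, the root filesystem, the kernel image, and the bootloader.
  2. Easy to use: it offers the same menuconfig, gconfig, and xconfig configuration interfaces as the kernel, so building a basic system with Buildroot is straightforward.
  3. Many packages: benchmark suites, qemu, kvmtool, and more are already integrated.

Basic introduction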

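Directory structure

[Image 1: Buildroot directory structure]

  • configs: predefined configuration files (*_defconfig)
  • dl: downloaded source packages
  • output: build output
  • package: package versions and build configuration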

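Configuration interface:

[Image 2: Buildroot configuration interface]

The main items to look at:

  • Target options: select the features and parameters of the build target
  • Toolchain: configure the toolchain and compiler features
  • System configuration: configure the configuration files and boot behavior of the generated filesystem
  • Target packages: select and configure the required packages and software environment
  • Filesystem images: choose the image formats of the filesystem that Buildroot builds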

Settings required for the virtual machine's root filesystem:

[Image 3: root filesystem settings]

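Saving the configuration

The configuration is stored as .config in the top-level Buildroot source directory. It is a complete configuration file containing the value of every option. A defconfig stores only the options set to non-default values, which makes it easier to read and modify and suitable for automated builds. For the default Buildroot configuration the defconfig is empty, because everything is at its default.

The configs/ directory contains many ready-made *_defconfig files, from which a .config can be generated: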

make *_defconfig

Then run:

make menuconfig

Saving from menuconfig overwrites the current .config. To save the result as a defconfig, use:

make savedefconfig
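
The load, edit, save cycle described above can be wrapped in a small helper. This is a minimal sketch, not part of Buildroot: the function name br_reconfigure and the MAKE override (there only to allow a dry run) are assumptions made for illustration, and the helper is assumed to run from the top-level Buildroot directory. Passing BR2_DEFCONFIG to savedefconfig tells Buildroot where to write the minimal defconfig.

```shell
# br_reconfigure NAME
# Load configs/NAME_defconfig, open menuconfig, then write the result back
# as a minimal defconfig. Hypothetical helper; run from the Buildroot top dir.
br_reconfigure() {
    name=$1
    make_cmd=${MAKE:-make}    # set MAKE=echo for a dry run
    "$make_cmd" "${name}_defconfig" &&
        "$make_cmd" menuconfig &&
        "$make_cmd" savedefconfig BR2_DEFCONFIG="$PWD/configs/${name}_defconfig"
}
```

For example, br_reconfigure qemu_aarch64_virt would start from configs/qemu_aarch64_virt_defconfig, assuming that file exists in the tree.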
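
Upgrading QEMU

The QEMU version that Buildroot uses by default does not support cortex-a55, whereas QEMU 8.2.0, the latest release at the time of writing, does.

Download qemu-8.2.0.tar.xz and put it in the dl directory.

Edit qemu.mk in the package/qemu directory: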

QEMU_VERSION = 8.2.0
QEMU_SOURCE = qemu-$(QEMU_VERSION).tar.xz
QEMU_SITE = http://download.qemu.org
QEMU_LICENSE = GPL-2.0, LGPL-2.1, MIT, BSD-3-Clause, BSD-2-Clause, Others/BSD-1c
QEMU_LICENSE_FILES = COPYING COPYING.LIB

Also add the following option to the QEMU configure arguments (search for the QEMU_CONFIGURE_CMDS macro):

--disable-hexagon-idef-parser
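
The version bump in qemu.mk can also be scripted. The sketch below is an illustration only: the helper name bump_qemu_version is made up, and it rewrites just the QEMU_VERSION line. Buildroot normally also verifies downloads against package/qemu/qemu.hash, so that file likely needs a matching entry for the new tarball as well.

```shell
# bump_qemu_version MK_FILE NEW_VERSION
# Rewrite the "QEMU_VERSION = ..." line of a Buildroot qemu.mk in place.
# Hypothetical helper; it leaves every other line untouched.
bump_qemu_version() {
    mk_file=$1
    new_version=$2
    sed "s/^QEMU_VERSION = .*/QEMU_VERSION = ${new_version}/" "$mk_file" \
        > "$mk_file.tmp" && mv "$mk_file.tmp" "$mk_file"
}
```

A call such as bump_qemu_version package/qemu/qemu.mk 8.2.0 would apply the change shown above.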

Building the kernel

Configuration
make ARCH=arm64 CROSS_COMPILE=/home/yue/beauty/proj/rk356x_linux_release_v1.3.0b_20221213/prebuilts/gcc/linux-x86/aarch64/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu- menuconfig

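Select KVM:

[Image 4: KVM option in the kernel configuration]

Configure the console:

[Image 5: console option in the kernel configuration]

Attach the root filesystem:

[Image 6: root filesystem option in the kernel configuration]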

Build
make ARCH=arm64 CROSS_COMPILE=/home/yue/beauty/proj/rk356x_linux_release_v1.3.0b_20221213/prebuilts/gcc/linux-x86/aarch64/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu- -j8
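
Because the same ARCH and CROSS_COMPILE arguments are repeated for every kernel make invocation, a small wrapper keeps the commands short. A minimal sketch: kmake is a hypothetical name, CROSS_COMPILE is expected to hold the toolchain prefix used in the commands above, and the MAKE override exists only so the command can be dry-run.

```shell
# kmake ARGS...
# Run the kernel's make for arm64 with the cross toolchain prefix taken from
# the CROSS_COMPILE environment variable. Hypothetical convenience wrapper.
kmake() {
    if [ -z "${CROSS_COMPILE:-}" ]; then
        echo "kmake: set CROSS_COMPILE to the toolchain prefix first" >&2
        return 1
    fi
    "${MAKE:-make}" ARCH=arm64 CROSS_COMPILE="$CROSS_COMPILE" "$@"
}
```

With CROSS_COMPILE exported once, the two steps become kmake menuconfig and kmake -j8.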

Output
  • arch/arm64/boot/Image
  • vmlinux

Running file on them shows:

  • Image: Linux kernel ARM64 boot executable Image, little-endian, 4K pages
  • vmlinux: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, BuildID[sha1]=dad4c3147e36034c9cb13a786bd8e238f740504b, with debug_info, not stripped

The tail of the build log:

  SORTEX  vmlinux
  SYSMAP  System.map
  OBJCOPY arch/arm64/boot/Image
  Building modules, stage 2.
  MODPOST 418 modules

Build errors

When building the dtc library in Buildroot:

error:multiple definition of `yylloc'

Disable the YYLTYPE yylloc definition near line 1205 of dtc-parse.tab.c, or declare it as extern YYLTYPE yylloc:

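/* The semantic value of the lookahead symbol.  */
YYSTYPE yylval;
/* Location data for the lookahead symbol.  */
/* Disabled to avoid the duplicate definition of yylloc; the initializer
   must be disabled together with the declaration.  */
#if 0
YYLTYPE yylloc
# if defined YYLTYPE_IS_TRIVIAL && YYLTYPE_IS_TRIVIAL
  = { 1, 1, 1, 1 }
# endif
;
#endif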

When building ctest (part of host-cmake) in Buildroot:

buildroot/output/build/host-cmake-3.8.2/Source/cmServerProtocol.cxx:626:39: error: ‘numeric_limits’ is not a member of ‘std’

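std::numeric_limits is declared in <limits>, so the missing header includes have to be added to cmServerProtocol.cxx:

#include <limits>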
#include 

When building qemu:

FAILED: target/hexagon/idef-parser

link meson-generated_idef-parser.tab.c.o libglib-2.0.so: error adding symbols: file in wrong format

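Running file on the two files shows that one is built for x86 and the other for aarch64. Disabling the idef-parser with the following configure option avoids the error:

--disable-hexagon-idef-parser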

Checking system information

Kernel version

cat /proc/version
Linux version 4.19.232 (root@yue-yi-machine) ((HEAD: f7165816db073abb32bfe4f754a317d687c7bbcf) (sdk version: rk356x_linux_release_20230710_v1.3.2f.xml) (gcc version 10.3.1 20210621 
root@RK356X:/#

Distribution

cat /etc/issue
Welcome to RK356X Buildroot

CPU information:

root@RK356X:/# lscpu
Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               ARM
  Model name:            Cortex-A55
    Model:               0
    Thread(s) per core:  1
    Core(s) per cluster: 4
    Socket(s):           -
    Cluster(s):          1
    Stepping:            r2p0
    CPU max MHz:         1992.0000
    CPU min MHz:         408.0000
    BogoMIPS:            48.00

In addition, cat /proc/cpuinfo shows the details of each CPU.

Memory information (free -m reports values in MiB):

root@RK356X:/# free -m
               total        used        free      shared  buff/cache   available
Mem:            3837         106        3567           1         163        3690
Swap:              0           0           0

Partition information

me@ubuntu:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            875M     0  875M   0% /dev
tmpfs           185M  3.0M  182M   2% /run
/dev/mmcblk0p2   29G  4.0G   24G  15% /
tmpfs           924M     0  924M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           924M     0  924M   0% /sys/fs/cgroup
/dev/loop0       58M   58M     0 100% /snap/core20/1614
/dev/loop2       92M   92M     0 100% /snap/lxd/24065
/dev/loop1       62M   62M     0 100% /snap/lxd/22761
/dev/loop3       36M   36M     0 100% /snap/snapd/20674
/dev/mmcblk0p1  253M  121M  132M  48% /boot/firmware
tmpfs           185M     0  185M   0% /run/user/1000

Checking whether KVM is enabled:

Check the boot log:

[    0.011336] CPU: All CPU(s) started at EL2
[    0.106179] kvm [1]: IPA Size Limit: 44 bits
[    0.107506] kvm [1]: vgic interrupt IRQ9
[    0.107646] kvm [1]: Hyp mode initialized successfully

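Confirm at build time:

For a Buildroot build, check whether the config file selected under kernel/arch/arm64/configs sets the CONFIG_VIRTUALIZATION option; if it does not, add it yourself.

Check with a command: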

zcat /proc/config.gz | grep "CONFIG_VIRTUALIZATION"

Shell output:

CONFIG_VIRTUALIZATION=y
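
The same check can be scripted so that it fails loudly when an option is missing. A minimal sketch: the function name check_kvm_config is hypothetical, and CONFIG_KVM is not mentioned above; it is included on the assumption that the KVM option itself should also be built in alongside CONFIG_VIRTUALIZATION.

```shell
# check_kvm_config
# Read a kernel config on stdin and succeed only if both options are "=y".
# Prints the first missing option otherwise. Hypothetical helper.
check_kvm_config() {
    cfg=$(cat)
    for opt in CONFIG_VIRTUALIZATION CONFIG_KVM; do
        if ! printf '%s\n' "$cfg" | grep -q "^${opt}=y$"; then
            echo "missing: ${opt}"
            return 1
        fi
    done
    echo "KVM options enabled"
}
```

On the board it could be used as zcat /proc/config.gz | check_kvm_config. When KVM has initialized successfully, the /dev/kvm device node should also be present.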

Setting up the Firefly RK3568 environment

Installing the SDK

Download the SDK from the Firefly website (t-firefly.com).

  1. Unpack the SDK
chmod +x ./sdk_tools.sh

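Create a directory to hold the SDK. For example, the SDK I have here is the RK3588 one, and I unpack it into a directory one level up so that the current directory stays clean: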

mkdir ../firefly_rk3588_SDK
./sdk_tools.sh --unpack -C ../firefly_rk3588_SDK
  2. Restore the working directory

Point the script at the directory that was just unpacked:

./sdk_tools.sh --sync -C ../firefly_rk3588_SDK

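You can either run the script above or do the same thing by hand: enter the directory that was just unpacked and run the repo commands: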

cd ../firefly_rk3588_SDK
.repo/repo/repo sync -l
.repo/repo/repo start firefly --all
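  3. Update the SDK

The first two steps are only needed the first time the SDK is unpacked. For later updates, enter the SDK directory and run only step 3 to update over the network: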

.repo/repo/repo sync -c --no-tags

Build

Run ./build.sh and select rk3568-firefly.

Flashing

Tool used: RKDevTool_Release

[Image 7: RKDevTool_Release flashing tool]

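Loading the virtual machine image

Use tftp to download the image from the host to the development board. The board has the BusyBox tftp applet, a small tftp client for embedded systems, so its usage differs from that of a regular tftp client: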

root@RK356X:/# tftp ?
BusyBox v1.34.1 (2023-12-23 16:25:21 CST) multi-call binary.
Usage: tftp [OPTIONS] HOST [PORT]

Transfer a file from/to tftp server
        -l FILE Local FILE
        -r FILE Remote FILE
        -g      Get file
        -p      Put file
        -b SIZE Transfer blocks in bytes

For example, to fetch Image from the host at 192.168.0.102:

tftp -g -l Image -r Image 192.168.0.102

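Setting up the Raspberry Pi 4B environment

Flashing

Flashing a Raspberry Pi requires an SD card formatted as FAT32. Use the Raspberry Pi Imager tool to write the Linux image to the SD card.

https://www.raspberrypi.com/software/

[Image 8: Raspberry Pi Imager]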

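Configuration changes

Edit cmdline.txt to turn off the silent boot (remove the quiet parameter) so that kernel messages appear on the console: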

console=serial0,115200 console=tty1 root=PARTUUID=686c0ceb-02 rootfstype=ext4 fsck.repair=yes rootwait splash plymouth.ignore-serial-consoles

Edit config.txt to enable the UART:

enable_uart=1

To boot through U-Boot, also add:

kernel=u-boot.bin

Also copy u-boot.bin, uImage, and urootfs.cpio to the root directory of the SD card.

Booting the kernel from U-Boot:

setenv bootargs "8250.nr_uarts=1 console=ttyS0,115200"
fatload mmc 0:1 0x80000 uImage; fatload mmc 0:1 0x3800000 bcm2711-rpi-4-b.dtb; fatload mmc 0:1 0x5800000 urootfs.cpio; bootm 0x80000 0x5800000 0x3800000
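
Running

Install qemu. The system is Debian, so apt can be used directly:

sudo apt install qemu-system

A warning is reported:

hwmon1: Undervoltage detected

The supply voltage was too low: I was powering the board from a USB port on the PC. Swapping the power cable improved things a little.

Later I connected a phone fast charger with its matching cable, and the annoying warning ⚠ never appeared again. The lesson is that the board must be given an adequate power supply.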

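Testing

Benchmark programs
  1. Dhrystone is a simple benchmark for measuring a processor's integer performance
  2. Cachebench evaluates the memory performance of a computer system
  3. Memory bandwidth is widely recognized as a factor that affects system performance
  4. Hackbench measures scheduler performance by timing how long it takes to schedule a given number of tasks

What UnixBench can measure

Test results

Linux kernel: 4.19.232

UnixBench version: BYTE UNIX Benchmarks (Version 5.1.3)

QEMU version: 8.2.0 (the latest release at the time of testing)

Hardware: 4-core A55, 4 GB RAM

Results:

On the RK3568, the guest Linux was booted under QEMU both with and without KVM.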

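  • Without KVM, performance is very poor: in UnixBench, Dhrystone reaches only 2.04% of the bare-metal score
  • With KVM, performance improves dramatically: in UnixBench, both single-core and multi-core Dhrystone reach about 98.4% of bare metal
  • With kvmtool in place of QEMU, single-core and multi-core Dhrystone in UnixBench still reach about 98.35%, practically the same as QEMU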
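
The percentages can be recomputed from the Dhrystone lines in the result listings. A minimal sketch with a hypothetical helper name, percent_of; which runs the original percentages were computed from is an assumption, but pairing the four-copy Dhrystone scores as below reproduces the 2.04% and 98.35% figures.

```shell
# percent_of GUEST_SCORE HOST_SCORE
# Print the guest score as a percentage of the bare-metal score (2 decimals).
percent_of() {
    awk -v guest="$1" -v host="$2" 'BEGIN { printf "%.2f\n", guest / host * 100 }'
}

# Dhrystone lps, 4 parallel copies, taken from the result listings:
percent_of 958472.9 46934003.7     # QEMU without KVM vs. bare metal (tmpfs run)
percent_of 46001918.1 46773415.3   # kvmtool vs. bare metal (sda run)
```

The two calls print 2.04 and 98.35.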
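
Guest Linux kernel: 4.19.232

Host Linux kernel: 6.1.0

UnixBench version: BYTE UNIX Benchmarks (Version 5.1.3)

Hardware: 4-core A72, 2 GB RAM

Results:

On the Raspberry Pi, KVM was enabled and a guest Linux was booted.

  • The guest's Dhrystone result is slightly higher than that of the physical machine, at 100.7%
  • The difference probably comes from the kernel version and configuration, which were not identical

rk3568 (A55, 4 cores, 4GB)
Running Linux directly

First run, with UnixBench placed on tmpfs: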

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK356X: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #1 SMP Fri Dec 22 15:28:34 CST 2023
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (48.0 bogomips)

   CPU 1:  (48.0 bogomips)

   CPU 2:  (48.0 bogomips)

   CPU 3:  (48.0 bogomips)

   01:02:26 up 35 min,  0 users,  load average: 0.09, 0.04, 0.01; runlevel

------------------------------------------------------------------------
Benchmark Run: Tue Jan 02 2024 01:02:26 - 01:30:47
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       12048408.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2729.1 MWIPS (10.1 s, 7 samples)
Execl Throughput                                702.0 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        286853.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           82849.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        687970.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              490535.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  34078.8 lps   (10.0 s, 7 samples)
Process Creation                               2246.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1476.0 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    460.7 lpm   (60.1 s, 2 samples)
System Call Overhead                         746013.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   12048408.6   1032.4
Double-Precision Whetstone                       55.0       2729.1    496.2
Execl Throughput                                 43.0        702.0    163.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     286853.2    724.4
File Copy 256 bufsize 500 maxblocks            1655.0      82849.8    500.6
File Copy 4096 bufsize 8000 maxblocks          5800.0     687970.7   1186.2
Pipe Throughput                               12440.0     490535.2    394.3
Pipe-based Context Switching                   4000.0      34078.8     85.2
Process Creation                                126.0       2246.3    178.3
Shell Scripts (1 concurrent)                     42.4       1476.0    348.1
Shell Scripts (8 concurrent)                      6.0        460.7    767.8
System Call Overhead                          15000.0     746013.0    497.3
                                                                   ========
System Benchmarks Index Score                                         418.2

------------------------------------------------------------------------
Benchmark Run: Tue Jan 02 2024 01:30:47 - 01:59:11
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46934003.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    10671.0 MWIPS (10.1 s, 7 samples)
Execl Throughput                               2514.5 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        544406.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          151998.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1428537.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1915090.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 196978.4 lps   (10.0 s, 7 samples)
Process Creation                               7348.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4310.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    575.1 lpm   (60.2 s, 2 samples)
System Call Overhead                        2696112.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46934003.7   4021.8
Double-Precision Whetstone                       55.0      10671.0   1940.2
Execl Throughput                                 43.0       2514.5    584.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     544406.0   1374.8
File Copy 256 bufsize 500 maxblocks            1655.0     151998.7    918.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1428537.7   2463.0
Pipe Throughput                               12440.0    1915090.3   1539.5
Pipe-based Context Switching                   4000.0     196978.4    492.4
Process Creation                                126.0       7348.1    583.2
Shell Scripts (1 concurrent)                     42.4       4310.9   1016.7
Shell Scripts (8 concurrent)                      6.0        575.1    958.5
System Call Overhead                          15000.0    2696112.3   1797.4
                                                                   ========
System Benchmarks Index Score                                        1221.1
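
When several runs have to be compared, the final index scores can be pulled out of a saved report with awk. A minimal sketch, assuming the report text has been saved to a file; the function name and the file name in the usage note are hypothetical.

```shell
# unixbench_scores
# Read a UnixBench report on stdin and print each "System Benchmarks Index
# Score" value, one per line (single-copy run first, then the parallel run).
unixbench_scores() {
    awk '/^System Benchmarks Index Score/ { print $NF }'
}
```

For the report above, unixbench_scores < rk3568_tmpfs.log would print 418.2 and 1221.1.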

Second run, with UnixBench placed on the sda disk; file copy throughput drops significantly:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK356X: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #1 SMP Fri Dec 22 15:28:34 CST 2023
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (48.0 bogomips)

   CPU 1:  (48.0 bogomips)

   CPU 2:  (48.0 bogomips)

   CPU 3:  (48.0 bogomips)

   08:12:29 up 0 min,  0 users,  load average: 0.26, 0.10, 0.03; runlevel

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 08:12:29 - 08:40:51
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11976868.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2716.7 MWIPS (10.1 s, 7 samples)
Execl Throughput                                689.3 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         88528.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           25141.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        293936.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              488819.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  33583.5 lps   (10.0 s, 7 samples)
Process Creation                               2216.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1384.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    435.5 lpm   (60.1 s, 2 samples)
System Call Overhead                         742257.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11976868.7   1026.3
Double-Precision Whetstone                       55.0       2716.7    493.9
Execl Throughput                                 43.0        689.3    160.3
File Copy 1024 bufsize 2000 maxblocks          3960.0      88528.7    223.6
File Copy 256 bufsize 500 maxblocks            1655.0      25141.9    151.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     293936.5    506.8
Pipe Throughput                               12440.0     488819.7    392.9
Pipe-based Context Switching                   4000.0      33583.5     84.0
Process Creation                                126.0       2216.5    175.9
Shell Scripts (1 concurrent)                     42.4       1384.7    326.6
Shell Scripts (8 concurrent)                      6.0        435.5    725.8
System Call Overhead                          15000.0     742257.1    494.8
                                                                   ========
System Benchmarks Index Score                                         314.9

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 08:40:51 - 09:09:15
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46773415.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    10638.8 MWIPS (10.1 s, 7 samples)
Execl Throughput                               2500.9 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        110127.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           29874.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        397006.1 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1909959.0 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 196549.5 lps   (10.0 s, 7 samples)
Process Creation                               6751.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4099.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    544.8 lpm   (60.2 s, 2 samples)
System Call Overhead                        2690986.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46773415.3   4008.0
Double-Precision Whetstone                       55.0      10638.8   1934.3
Execl Throughput                                 43.0       2500.9    581.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     110127.0    278.1
File Copy 256 bufsize 500 maxblocks            1655.0      29874.4    180.5
File Copy 4096 bufsize 8000 maxblocks          5800.0     397006.1    684.5
Pipe Throughput                               12440.0    1909959.0   1535.3
Pipe-based Context Switching                   4000.0     196549.5    491.4
Process Creation                                126.0       6751.0    535.8
Shell Scripts (1 concurrent)                     42.4       4099.2    966.8
Shell Scripts (8 concurrent)                      6.0        544.8    908.0
System Call Overhead                          15000.0    2690986.9   1794.0
                                                                   ========
System Benchmarks Index Score                                         824.5

qemu (a55, 4 cores, 2GB)

On the RK3568's Linux, run the guest in emulation without enabling KVM:

qemu-system-aarch64 -M virt,virtualization=true -cpu cortex-a55 -nographic -smp 4 -m 2048 -kernel Image --append "console=ttyAMA0"

Result:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #2 SMP PREEMPT Fri Dec 29 09:46:36 CST 2023
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (125.0 bogomips)

   CPU 1:  (125.0 bogomips)

   CPU 2:  (125.0 bogomips)

   CPU 3:  (125.0 bogomips)

   02:50:14 up 2 min,  1 user,  load average: 0.61, 0.30, 0.11; runlevel

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 02:50:15 - 03:20:33
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables         274451.3 lps   (10.3 s, 7 samples)
Double-Precision Whetstone                       81.9 MWIPS (10.0 s, 7 samples)
Execl Throughput                                 20.0 lps   (29.6 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks          2864.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks             775.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks          9293.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                                4334.4 lps   (10.3 s, 7 samples)
Pipe-based Context Switching                    464.6 lps   (10.3 s, 7 samples)
Process Creation                                 37.3 lps   (30.3 s, 2 samples)
Shell Scripts (1 concurrent)                     43.6 lpm   (60.5 s, 2 samples)
Shell Scripts (8 concurrent)                      7.3 lpm   (65.9 s, 2 samples)
System Call Overhead                           5166.9 lps   (10.3 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0     274451.3     23.5
Double-Precision Whetstone                       55.0         81.9     14.9
Execl Throughput                                 43.0         20.0      4.6
File Copy 1024 bufsize 2000 maxblocks          3960.0       2864.5      7.2
File Copy 256 bufsize 500 maxblocks            1655.0        775.0      4.7
File Copy 4096 bufsize 8000 maxblocks          5800.0       9293.7     16.0
Pipe Throughput                               12440.0       4334.4      3.5
Pipe-based Context Switching                   4000.0        464.6      1.2
Process Creation                                126.0         37.3      3.0
Shell Scripts (1 concurrent)                     42.4         43.6     10.3
Shell Scripts (8 concurrent)                      6.0          7.3     12.1
System Call Overhead                          15000.0       5166.9      3.4
                                                                   ========
System Benchmarks Index Score                                           6.4

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 03:20:33 - 03:54:14
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables         958472.9 lps   (10.8 s, 7 samples)
Double-Precision Whetstone                      305.9 MWIPS (10.0 s, 7 samples)
Execl Throughput                                 42.9 lps   (30.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         11160.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks            2090.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks         40914.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                                9138.3 lps   (11.2 s, 7 samples)
Pipe-based Context Switching                    766.3 lps   (11.3 s, 7 samples)
Process Creation                                 74.5 lps   (31.5 s, 2 samples)
Shell Scripts (1 concurrent)                     59.5 lpm   (63.5 s, 2 samples)
Shell Scripts (8 concurrent)                      3.1 lpm   (78.5 s, 2 samples)
System Call Overhead                           5585.6 lps   (11.3 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0     958472.9     82.1
Double-Precision Whetstone                       55.0        305.9     55.6
Execl Throughput                                 43.0         42.9     10.0
File Copy 1024 bufsize 2000 maxblocks          3960.0      11160.0     28.2
File Copy 256 bufsize 500 maxblocks            1655.0       2090.4     12.6
File Copy 4096 bufsize 8000 maxblocks          5800.0      40914.0     70.5
Pipe Throughput                               12440.0       9138.3      7.3
Pipe-based Context Switching                   4000.0        766.3      1.9
Process Creation                                126.0         74.5      5.9
Shell Scripts (1 concurrent)                     42.4         59.5     14.0
Shell Scripts (8 concurrent)                      6.0          3.1      5.1
System Call Overhead                          15000.0       5585.6      3.7
                                                                   ========
System Benchmarks Index Score                                          13.1

qemu (host, 4 cores, 2GB)

On the RK3568's Linux, enter the command:

qemu-system-aarch64 -cpu host -m 2048 -enable-kvm -nographic -machine virt -smp 4 -kernel Image -append "console=ttyAMA0"
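
The choice between the KVM and the emulated CPU arguments can be made automatically by probing the KVM device node. A minimal sketch: qemu_cpu_args is a hypothetical helper, and it only selects the CPU and acceleration flags; it does not reproduce the virtualization=true machine option used in the emulated run.

```shell
# qemu_cpu_args [KVM_DEVICE]
# Print the CPU/acceleration arguments for qemu-system-aarch64: KVM with the
# host CPU model when the KVM device is usable, otherwise an emulated
# cortex-a55. Hypothetical helper; KVM_DEVICE defaults to /dev/kvm.
qemu_cpu_args() {
    kvm_dev=${1:-/dev/kvm}
    if [ -c "$kvm_dev" ] && [ -r "$kvm_dev" ] && [ -w "$kvm_dev" ]; then
        echo "-enable-kvm -cpu host"
    else
        echo "-cpu cortex-a55"
    fi
}
```

It could be used as: qemu-system-aarch64 -machine virt $(qemu_cpu_args) -smp 4 -m 2048 -nographic -kernel Image -append "console=ttyAMA0"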

Result:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #2 SMP PREEMPT Fri Dec 29 09:46:36 CST 2023
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (48.0 bogomips)

   CPU 1:  (48.0 bogomips)

   CPU 2:  (48.0 bogomips)

   CPU 3:  (48.0 bogomips)

   07:02:45 up 0 min,  1 user,  load average: 0.03, 0.01, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 07:02:45 - 07:31:04
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11819460.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2945.1 MWIPS (10.0 s, 7 samples)
Execl Throughput                               1025.8 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        301514.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           89249.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        715154.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              577421.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  31832.7 lps   (10.0 s, 7 samples)
Process Creation                               1459.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1259.9 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    526.5 lpm   (60.1 s, 2 samples)
System Call Overhead                         844864.8 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11819460.8   1012.8
Double-Precision Whetstone                       55.0       2945.1    535.5
Execl Throughput                                 43.0       1025.8    238.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     301514.3    761.4
File Copy 256 bufsize 500 maxblocks            1655.0      89249.3    539.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     715154.7   1233.0
Pipe Throughput                               12440.0     577421.2    464.2
Pipe-based Context Switching                   4000.0      31832.7     79.6
Process Creation                                126.0       1459.0    115.8
Shell Scripts (1 concurrent)                     42.4       1259.9    297.1
Shell Scripts (8 concurrent)                      6.0        526.5    877.5
System Call Overhead                          15000.0     844864.8    563.2
                                                                   ========
System Benchmarks Index Score                                         431.1

------------------------------------------------------------------------
Benchmark Run: Fri Dec 29 2023 07:31:04 - 07:59:26
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46075093.1 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    11500.5 MWIPS (10.0 s, 7 samples)
Execl Throughput                               2586.5 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        617301.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          183004.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1446823.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2253111.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 311796.3 lps   (10.0 s, 7 samples)
Process Creation                               4987.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4130.5 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    564.8 lpm   (60.2 s, 2 samples)
System Call Overhead                        3029954.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46075093.1   3948.2
Double-Precision Whetstone                       55.0      11500.5   2091.0
Execl Throughput                                 43.0       2586.5    601.5
File Copy 1024 bufsize 2000 maxblocks          3960.0     617301.3   1558.8
File Copy 256 bufsize 500 maxblocks            1655.0     183004.5   1105.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1446823.8   2494.5
Pipe Throughput                               12440.0    2253111.7   1811.2
Pipe-based Context Switching                   4000.0     311796.3    779.5
Process Creation                                126.0       4987.0    395.8
Shell Scripts (1 concurrent)                     42.4       4130.5    974.2
Shell Scripts (8 concurrent)                      6.0        564.8    941.4
System Call Overhead                          15000.0    3029954.4   2020.0
                                                                   ========
System Benchmarks Index Score                                        1294.3

kvmtool (host, 4 cores, 2GB)
root@RK356X:/root# lkvm run --kernel Image -m 2048
 # lkvm run -k Image -m 2048 -c 4 --name guest-1568

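The kernel's default configuration is sufficient, and the same goes for the rootfs; there is no need to change the ttyAMA0 console setting.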

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 5.16.12 -- #3 SMP PREEMPT Wed Jan 10 16:17:21 CST 2024
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (48.0 bogomips)

   CPU 1:  (48.0 bogomips)

   CPU 2:  (48.0 bogomips)

   CPU 3:  (48.0 bogomips)

   00:00:21 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Thu Jan 01 1970 00:00:21 - 00:28:42
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       11791200.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2942.2 MWIPS (10.0 s, 7 samples)
Execl Throughput                                417.8 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        245581.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           75954.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        597586.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                              514077.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  26346.4 lps   (10.0 s, 7 samples)
Process Creation                                251.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1071.0 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    394.0 lpm   (60.2 s, 2 samples)
System Call Overhead                         715970.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11791200.9   1010.4
Double-Precision Whetstone                       55.0       2942.2    534.9
Execl Throughput                                 43.0        417.8     97.2
File Copy 1024 bufsize 2000 maxblocks          3960.0     245581.3    620.2
File Copy 256 bufsize 500 maxblocks            1655.0      75954.9    458.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     597586.0   1030.3
Pipe Throughput                               12440.0     514077.4    413.2
Pipe-based Context Switching                   4000.0      26346.4     65.9
Process Creation                                126.0        251.0     19.9
Shell Scripts (1 concurrent)                     42.4       1071.0    252.6
Shell Scripts (8 concurrent)                      6.0        394.0    656.7
System Call Overhead                          15000.0     715970.1    477.3
                                                                   ========
System Benchmarks Index Score                                         305.5

------------------------------------------------------------------------
Benchmark Run: Thu Jan 01 1970 00:28:42 - 00:57:05
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       46001918.1 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    11480.7 MWIPS (10.0 s, 7 samples)
Execl Throughput                               1861.1 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        566906.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          171430.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1294907.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2005545.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 215499.0 lps   (10.0 s, 7 samples)
Process Creation                               4414.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3166.2 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    422.4 lpm   (60.3 s, 2 samples)
System Call Overhead                        2611391.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   46001918.1   3941.9
Double-Precision Whetstone                       55.0      11480.7   2087.4
Execl Throughput                                 43.0       1861.1    432.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     566906.0   1431.6
File Copy 256 bufsize 500 maxblocks            1655.0     171430.7   1035.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1294907.7   2232.6
Pipe Throughput                               12440.0    2005545.6   1612.2
Pipe-based Context Switching                   4000.0     215499.0    538.7
Process Creation                                126.0       4414.2    350.3
Shell Scripts (1 concurrent)                     42.4       3166.2    746.7
Shell Scripts (8 concurrent)                      6.0        422.4    704.0
System Call Overhead                          15000.0    2611391.3   1740.9
                                                                   ========
System Benchmarks Index Score                                        1104.2
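
The INDEX column and the final score can be reproduced by hand. The sketch below assumes the usual UnixBench convention: each index is RESULT / BASELINE × 10, and the overall score is the geometric mean of the indexes. The names `rows`, `index` and `overall_score` are illustrative and not part of UnixBench; the numbers are copied from the 4-copy table above.

```python
import math

# (test name, BASELINE, RESULT) copied from the 4-copy table above.
rows = [
    ("Dhrystone 2 using register variables", 116700.0, 46001918.1),
    ("Double-Precision Whetstone", 55.0, 11480.7),
    ("Execl Throughput", 43.0, 1861.1),
    ("File Copy 1024 bufsize 2000 maxblocks", 3960.0, 566906.0),
    ("File Copy 256 bufsize 500 maxblocks", 1655.0, 171430.7),
    ("File Copy 4096 bufsize 8000 maxblocks", 5800.0, 1294907.7),
    ("Pipe Throughput", 12440.0, 2005545.6),
    ("Pipe-based Context Switching", 4000.0, 215499.0),
    ("Process Creation", 126.0, 4414.2),
    ("Shell Scripts (1 concurrent)", 42.4, 3166.2),
    ("Shell Scripts (8 concurrent)", 6.0, 422.4),
    ("System Call Overhead", 15000.0, 2611391.3),
]


def index(baseline, result):
    """Per-test index: how many times faster than the baseline, times 10."""
    return result / baseline * 10.0


def overall_score(rows):
    """Geometric mean of the per-test indexes."""
    logs = [math.log(index(b, r)) for _, b, r in rows]
    return math.exp(sum(logs) / len(logs))


for name, baseline, result in rows:
    print(f"{name:<45} {index(baseline, result):8.1f}")
print(f"{'System Benchmarks Index Score':<45} {overall_score(rows):8.1f}")
```

Because the score is a geometric mean, one very low index (Process Creation or Pipe-based Context Switching in the single-copy runs) pulls the total down more than an arithmetic mean would.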

Raspberry Pi 4B (A72, 4 cores, 2 GB)
Running Linux directly on the hardware

Run UnixBench straight from tmpfs:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: raspberrypi: GNU/Linux
   OS: GNU/Linux -- 6.1.0-rpi7-rpi-v8 -- #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24)
   Machine: aarch64 (unknown)
   Language: en_US.utf8 (charmap="ANSI_X3.4-1968", collate="ANSI_X3.4-1968")
   CPU 0:  (108.0 bogomips)

   CPU 1:  (108.0 bogomips)

   CPU 2:  (108.0 bogomips)

   CPU 3:  (108.0 bogomips)

   04:53:13 up 10 min,  3 users,  load average: 0.69, 0.54, 0.31; runlevel Jan

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 04:53:13 - 05:21:26
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19203197.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3225.8 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1132.7 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        190043.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           57100.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        520541.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                              182851.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  37719.8 lps   (10.0 s, 7 samples)
Process Creation                               1996.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3365.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    455.4 lpm   (60.1 s, 2 samples)
System Call Overhead                         128006.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19203197.4   1645.5
Double-Precision Whetstone                       55.0       3225.8    586.5
Execl Throughput                                 43.0       1132.7    263.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     190043.9    479.9
File Copy 256 bufsize 500 maxblocks            1655.0      57100.4    345.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     520541.2    897.5
Pipe Throughput                               12440.0     182851.4    147.0
Pipe-based Context Switching                   4000.0      37719.8     94.3
Process Creation                                126.0       1996.6    158.5
Shell Scripts (1 concurrent)                     42.4       3365.8    793.8
Shell Scripts (8 concurrent)                      6.0        455.4    759.0
System Call Overhead                          15000.0     128006.6     85.3
                                                                   ========
System Benchmarks Index Score                                         356.9

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 05:21:26 - 05:49:02
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       27031362.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     5970.9 MWIPS (7.1 s, 7 samples)
Execl Throughput                               1539.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        285477.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           80235.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        798634.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                              257923.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  45554.8 lps   (10.0 s, 7 samples)
Process Creation                               4925.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3740.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    499.8 lpm   (60.1 s, 2 samples)
System Call Overhead                         178635.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   27031362.4   2316.3
Double-Precision Whetstone                       55.0       5970.9   1085.6
Execl Throughput                                 43.0       1539.6    358.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     285477.7    720.9
File Copy 256 bufsize 500 maxblocks            1655.0      80235.0    484.8
File Copy 4096 bufsize 8000 maxblocks          5800.0     798634.9   1377.0
Pipe Throughput                               12440.0     257923.5    207.3
Pipe-based Context Switching                   4000.0      45554.8    113.9
Process Creation                                126.0       4925.6    390.9
Shell Scripts (1 concurrent)                     42.4       3740.2    882.1
Shell Scripts (8 concurrent)                      6.0        499.8    832.9
System Call Overhead                          15000.0     178635.9    119.1
                                                                   ========
System Benchmarks Index Score                                         515.2

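After plugging the board directly into a 5 V/3 A phone charger, performance improved markedly: the 4-copy index score rises from 515.2 to 1149.4. The board clearly needs an adequate power supply: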

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: raspberrypi: GNU/Linux
   OS: GNU/Linux -- 6.1.0-rpi7-rpi-v8 -- #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24)
   Machine: aarch64 (unknown)
   Language: en_US.utf8 (charmap="ANSI_X3.4-1968", collate="ANSI_X3.4-1968")
   CPU 0:  (108.0 bogomips)

   CPU 1:  (108.0 bogomips)

   CPU 2:  (108.0 bogomips)

   CPU 3:  (108.0 bogomips)

   08:27:14 up 0 min,  3 users,  load average: 1.07, 0.39, 0.14; runlevel Jan

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 08:27:14 - 08:55:28
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19259097.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3225.7 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1109.5 lps   (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        172250.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           50151.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        479185.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                              183515.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  37330.6 lps   (10.0 s, 7 samples)
Process Creation                               1973.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3498.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    962.3 lpm   (60.0 s, 2 samples)
System Call Overhead                         128098.5 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19259097.7   1650.3
Double-Precision Whetstone                       55.0       3225.7    586.5
Execl Throughput                                 43.0       1109.5    258.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     172250.3    435.0
File Copy 256 bufsize 500 maxblocks            1655.0      50151.7    303.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     479185.6    826.2
Pipe Throughput                               12440.0     183515.7    147.5
Pipe-based Context Switching                   4000.0      37330.6     93.3
Process Creation                                126.0       1973.5    156.6
Shell Scripts (1 concurrent)                     42.4       3498.7    825.2
Shell Scripts (8 concurrent)                      6.0        962.3   1603.9
System Call Overhead                          15000.0     128098.5     85.4
                                                                   ========
System Benchmarks Index Score                                         370.2

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 08:55:28 - 09:23:44
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       74018347.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12887.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                               3300.0 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        650225.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          193757.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1355042.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              705913.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 127388.2 lps   (10.0 s, 7 samples)
Process Creation                               7114.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   7725.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1020.3 lpm   (60.1 s, 2 samples)
System Call Overhead                         492634.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   74018347.3   6342.6
Double-Precision Whetstone                       55.0      12887.5   2343.2
Execl Throughput                                 43.0       3300.0    767.5
File Copy 1024 bufsize 2000 maxblocks          3960.0     650225.3   1642.0
File Copy 256 bufsize 500 maxblocks            1655.0     193757.5   1170.7
File Copy 4096 bufsize 8000 maxblocks          5800.0    1355042.7   2336.3
Pipe Throughput                               12440.0     705913.3    567.5
Pipe-based Context Switching                   4000.0     127388.2    318.5
Process Creation                                126.0       7114.5    564.6
Shell Scripts (1 concurrent)                     42.4       7725.8   1822.1
Shell Scripts (8 concurrent)                      6.0       1020.3   1700.4
System Call Overhead                          15000.0     492634.0    328.4
                                                                   ========
System Benchmarks Index Score                                        1149.4
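
To compare runs without reading the tables by eye, the summary lines can be pulled out of the logs with two regular expressions. This is a minimal sketch: `first_run_log` and `charger_run_log` are hypothetical variable names holding lines copied from the two Raspberry Pi OS logs above, and `parse_scores` is an illustrative helper, not a UnixBench tool.

```python
import re

# Summary lines copied from the first Raspberry Pi OS run above.
first_run_log = """\
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Score                                         356.9
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Score                                         515.2
"""

# Summary lines copied from the run on the 5 V/3 A charger.
charger_run_log = """\
4 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Score                                         370.2
4 CPUs in system; running 4 parallel copies of tests
System Benchmarks Index Score                                        1149.4
"""

COPIES_RE = re.compile(r"running (\d+) parallel cop(?:y|ies) of tests")
SCORE_RE = re.compile(r"System Benchmarks Index Score\s+([\d.]+)")


def parse_scores(log_text):
    """Return {number of parallel copies: index score} for one log."""
    scores = {}
    copies = None
    for line in log_text.splitlines():
        m = COPIES_RE.search(line)
        if m:
            copies = int(m.group(1))
            continue
        m = SCORE_RE.search(line)
        if m and copies is not None:
            scores[copies] = float(m.group(1))
    return scores


before = parse_scores(first_run_log)
after = parse_scores(charger_run_log)
for copies in sorted(before):
    gain = after[copies] / before[copies]
    print(f"{copies} copies: {before[copies]:.1f} -> {after[copies]:.1f} ({gain:.2f}x)")
```

The single-copy score barely moves (about 1.04x) while the 4-copy score more than doubles (about 2.23x). That would be consistent with the board throttling under multi-core load when under-powered, though this reading is an inference; the logs themselves do not record the supply voltage.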

Re-running the test on Ubuntu 20.04.5 gives worse results (4-copy index score 871.8, versus 1149.4 on Raspberry Pi OS).

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: ubuntu: GNU/Linux
   OS: GNU/Linux -- 5.4.0-1100-raspi -- #112-Ubuntu SMP PREEMPT Fri Nov 24 15:35:17 UTC 2023
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0:  (108.0 bogomips)

   CPU 1:  (108.0 bogomips)

   CPU 2:  (108.0 bogomips)

   CPU 3:  (108.0 bogomips)

   04:38:47 up 6 min,  1 user,  load average: 0.24, 0.23, 0.10; runlevel 2024-01-04

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 04:38:47 - 05:07:02
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       18051001.2 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3218.4 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1029.2 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        105458.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           30120.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        294282.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                              159132.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  33518.0 lps   (10.0 s, 7 samples)
Process Creation                               2875.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2477.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    834.8 lpm   (60.0 s, 2 samples)
System Call Overhead                         202649.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   18051001.2   1546.8
Double-Precision Whetstone                       55.0       3218.4    585.2
Execl Throughput                                 43.0       1029.2    239.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     105458.2    266.3
File Copy 256 bufsize 500 maxblocks            1655.0      30120.0    182.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     294282.6    507.4
Pipe Throughput                               12440.0     159132.6    127.9
Pipe-based Context Switching                   4000.0      33518.0     83.8
Process Creation                                126.0       2875.3    228.2
Shell Scripts (1 concurrent)                     42.4       2477.9    584.4
Shell Scripts (8 concurrent)                      6.0        834.8   1391.4
System Call Overhead                          15000.0     202649.0    135.1
                                                                   ========
System Benchmarks Index Score                                         325.8

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 05:07:02 - 05:35:18
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       70864726.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12882.1 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2917.2 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        205595.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           55555.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        492145.4 KBps  (30.0 s, 2 samples)
Pipe Throughput                              635438.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 148162.1 lps   (10.0 s, 7 samples)
Process Creation                               6958.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6744.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    922.1 lpm   (60.1 s, 2 samples)
System Call Overhead                         793322.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   70864726.5   6072.4
Double-Precision Whetstone                       55.0      12882.1   2342.2
Execl Throughput                                 43.0       2917.2    678.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     205595.6    519.2
File Copy 256 bufsize 500 maxblocks            1655.0      55555.4    335.7
File Copy 4096 bufsize 8000 maxblocks          5800.0     492145.4    848.5
Pipe Throughput                               12440.0     635438.3    510.8
Pipe-based Context Switching                   4000.0     148162.1    370.4
Process Creation                                126.0       6958.9    552.3
Shell Scripts (1 concurrent)                     42.4       6744.2   1590.6
Shell Scripts (8 concurrent)                      6.0        922.1   1536.9
System Call Overhead                          15000.0     793322.3    528.9
                                                                   ========
System Benchmarks Index Score                                         871.8

After switching to the root user:

   System: ubuntu: GNU/Linux
   OS: GNU/Linux -- 5.4.0-1100-raspi -- #112-Ubuntu SMP PREEMPT Fri Nov 24 15:35:17 UTC 2023
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0:  (108.0 bogomips)

   CPU 1:  (108.0 bogomips)

   CPU 2:  (108.0 bogomips)

   CPU 3:  (108.0 bogomips)

   07:55:25 up 1 min,  1 user,  load average: 0.14, 0.11, 0.04; runlevel 2024-01-04

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 07:55:25 - 08:23:40
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       18111605.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3219.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                                997.6 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        107925.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           29536.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        301787.4 KBps  (30.0 s, 2 samples)
Pipe Throughput                              164363.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  33460.7 lps   (10.0 s, 7 samples)
Process Creation                               2821.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2483.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    839.2 lpm   (60.1 s, 2 samples)
System Call Overhead                         203130.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   18111605.8   1552.0
Double-Precision Whetstone                       55.0       3219.5    585.4
Execl Throughput                                 43.0        997.6    232.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     107925.9    272.5
File Copy 256 bufsize 500 maxblocks            1655.0      29536.4    178.5
File Copy 4096 bufsize 8000 maxblocks          5800.0     301787.4    520.3
Pipe Throughput                               12440.0     164363.4    132.1
Pipe-based Context Switching                   4000.0      33460.7     83.7
Process Creation                                126.0       2821.3    223.9
Shell Scripts (1 concurrent)                     42.4       2483.8    585.8
Shell Scripts (8 concurrent)                      6.0        839.2   1398.7
System Call Overhead                          15000.0     203130.0    135.4
                                                                   ========
System Benchmarks Index Score                                         326.4

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 08:23:40 - 08:51:58
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       69996456.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    12829.6 MWIPS (10.0 s, 7 samples)
Execl Throughput                               2890.7 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        202244.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           54757.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        491815.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              629127.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 146257.3 lps   (10.0 s, 7 samples)
Process Creation                               7002.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6768.1 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    920.9 lpm   (60.1 s, 2 samples)
System Call Overhead                         793749.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   69996456.5   5998.0
Double-Precision Whetstone                       55.0      12829.6   2332.6
Execl Throughput                                 43.0       2890.7    672.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     202244.3    510.7
File Copy 256 bufsize 500 maxblocks            1655.0      54757.0    330.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     491815.7    848.0
Pipe Throughput                               12440.0     629127.6    505.7
Pipe-based Context Switching                   4000.0     146257.3    365.6
Process Creation                                126.0       7002.8    555.8
Shell Scripts (1 concurrent)                     42.4       6768.1   1596.2
Shell Scripts (8 concurrent)                      6.0        920.9   1534.8
System Call Overhead                          15000.0     793749.6    529.2
                                                                   ========
System Benchmarks Index Score                                         866.7

QEMU (host CPU, 4 cores, 1 GB)
qemu-system-aarch64 -cpu host -m 1024 -enable-kvm -nographic -machine virt -smp 4 -kernel Image -append "console=ttyAMA0"
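
When the guest is launched from a script, it can help to check that `/dev/kvm` is accessible before passing `-enable-kvm`. The sketch below assembles exactly the flags used in the command above; `kvm_usable` and `build_qemu_cmd` are hypothetical helper names, and the default arguments simply mirror that command.

```python
import os
import shlex


def kvm_usable(dev="/dev/kvm"):
    """True if the KVM device node exists and is readable and writable."""
    return os.path.exists(dev) and os.access(dev, os.R_OK | os.W_OK)


def build_qemu_cmd(kernel="Image", cpus=4, mem_mb=1024, console="ttyAMA0"):
    """Assemble the same argument list as the command above."""
    return [
        "qemu-system-aarch64",
        "-cpu", "host",
        "-m", str(mem_mb),
        "-enable-kvm",
        "-nographic",
        "-machine", "virt",
        "-smp", str(cpus),
        "-kernel", kernel,
        "-append", f"console={console}",
    ]


if __name__ == "__main__":
    if not kvm_usable():
        print("warning: /dev/kvm is not accessible; -enable-kvm will likely fail")
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_qemu_cmd()))
```

Passing the returned list to `subprocess.run` would start the guest; the sketch only prints it, so the same script also works on a machine without QEMU installed.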

Once the guest is up, the results are:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #3 SMP PREEMPT Wed Jan 3 10:24:38 CST 2024
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (108.0 bogomips)

   CPU 1:  (108.0 bogomips)

   CPU 2:  (108.0 bogomips)

   CPU 3:  (108.0 bogomips)

   09:27:56 up 0 min,  1 user,  load average: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 09:27:56 - 09:56:06
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19394971.2 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3269.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2316.3 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        436850.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          157413.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        825354.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              749421.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  40439.3 lps   (10.0 s, 7 samples)
Process Creation                               3964.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3347.3 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    847.3 lpm   (60.0 s, 2 samples)
System Call Overhead                         638685.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19394971.2   1662.0
Double-Precision Whetstone                       55.0       3269.5    594.5
Execl Throughput                                 43.0       2316.3    538.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     436850.9   1103.2
File Copy 256 bufsize 500 maxblocks            1655.0     157413.2    951.1
File Copy 4096 bufsize 8000 maxblocks          5800.0     825354.5   1423.0
Pipe Throughput                               12440.0     749421.9    602.4
Pipe-based Context Switching                   4000.0      40439.3    101.1
Process Creation                                126.0       3964.6    314.6
Shell Scripts (1 concurrent)                     42.4       3347.3    789.5
Shell Scripts (8 concurrent)                      6.0        847.3   1412.2
System Call Overhead                          15000.0     638685.4    425.8
                                                                   ========
System Benchmarks Index Score                                         663.1

------------------------------------------------------------------------
Benchmark Run: Wed Jan 03 2024 09:56:06 - 10:24:18
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       74545237.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    13074.2 MWIPS (9.9 s, 7 samples)
Execl Throughput                               4403.0 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1136175.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          361147.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1916223.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2880998.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 475862.9 lps   (10.0 s, 7 samples)
Process Creation                               7806.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6233.5 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    898.2 lpm   (60.2 s, 2 samples)
System Call Overhead                        2400486.5 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   74545237.5   6387.8
Double-Precision Whetstone                       55.0      13074.2   2377.1
Execl Throughput                                 43.0       4403.0   1023.9
File Copy 1024 bufsize 2000 maxblocks          3960.0    1136175.9   2869.1
File Copy 256 bufsize 500 maxblocks            1655.0     361147.3   2182.2
File Copy 4096 bufsize 8000 maxblocks          5800.0    1916223.8   3303.8
Pipe Throughput                               12440.0    2880998.3   2315.9
Pipe-based Context Switching                   4000.0     475862.9   1189.7
Process Creation                                126.0       7806.2    619.5
Shell Scripts (1 concurrent)                     42.4       6233.5   1470.2
Shell Scripts (8 concurrent)                      6.0        898.2   1497.0
System Call Overhead                          15000.0    2400486.5   1600.3
                                                                   ========
System Benchmarks Index Score                                        1878.7

Running the virtual machine on the Ubuntu host:

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: RK3568_qemu: GNU/Linux
   OS: GNU/Linux -- 4.19.232 -- #3 SMP PREEMPT Wed Jan 3 10:24:38 CST 2024
   Machine: aarch64 (unknown)
   Language:  (charmap=, collate=)
   CPU 0:  (108.0 bogomips)

   CPU 1:  (108.0 bogomips)

   CPU 2:  (108.0 bogomips)

   CPU 3:  (108.0 bogomips)

   03:13:55 up 2 min,  1 user,  load average: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 03:13:55 - 03:42:06
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       19022473.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3265.1 MWIPS (9.9 s, 7 samples)
Execl Throughput                               2390.8 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        436414.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          156531.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        816641.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                              750129.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  47353.1 lps   (10.0 s, 7 samples)
Process Creation                               4155.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2867.3 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    858.4 lpm   (60.0 s, 2 samples)
System Call Overhead                         639944.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   19022473.9   1630.0
Double-Precision Whetstone                       55.0       3265.1    593.6
Execl Throughput                                 43.0       2390.8    556.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     436414.6   1102.1
File Copy 256 bufsize 500 maxblocks            1655.0     156531.9    945.8
File Copy 4096 bufsize 8000 maxblocks          5800.0     816641.9   1408.0
Pipe Throughput                               12440.0     750129.7    603.0
Pipe-based Context Switching                   4000.0      47353.1    118.4
Process Creation                                126.0       4155.5    329.8
Shell Scripts (1 concurrent)                     42.4       2867.3    676.2
Shell Scripts (8 concurrent)                      6.0        858.4   1430.7
System Call Overhead                          15000.0     639944.1    426.6
                                                                   ========
System Benchmarks Index Score                                         666.4

------------------------------------------------------------------------
Benchmark Run: Thu Jan 04 2024 03:42:06 - 04:10:19
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       75362148.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    13078.8 MWIPS (10.0 s, 7 samples)
Execl Throughput                               4618.8 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1075601.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          370916.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1761060.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2935171.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 505143.3 lps   (10.0 s, 7 samples)
Process Creation                               8491.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6652.1 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    931.5 lpm   (60.1 s, 2 samples)
System Call Overhead                        2401011.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   75362148.4   6457.8
Double-Precision Whetstone                       55.0      13078.8   2378.0
Execl Throughput                                 43.0       4618.8   1074.1
File Copy 1024 bufsize 2000 maxblocks          3960.0    1075601.7   2716.2
File Copy 256 bufsize 500 maxblocks            1655.0     370916.8   2241.2
File Copy 4096 bufsize 8000 maxblocks          5800.0    1761060.8   3036.3
Pipe Throughput                               12440.0    2935171.9   2359.5
Pipe-based Context Switching                   4000.0     505143.3   1262.9
Process Creation                                126.0       8491.8    674.0
Shell Scripts (1 concurrent)                     42.4       6652.1   1568.9
Shell Scripts (8 concurrent)                      6.0        931.5   1552.5
System Call Overhead                          15000.0    2401011.6   1600.7
                                                                   ========
System Benchmarks Index Score                                        1912.0

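Where does the difference come from? Most likely from the kernel versions and configurations, which were not kept consistent between the runs.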
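
One way to probe that suspicion is to rank the tests by how far guest and host diverge. The sketch below uses the 4-copy INDEX values from the Raspberry Pi OS run on the 5 V/3 A charger (host kernel 6.1.0) and from the first QEMU guest run (guest kernel 4.19.232). Pairing these two runs is an assumption based on their timestamps, and the names `host`, `guest` and `ranked` are illustrative.

```python
# 4-copy INDEX values copied from the tables above.
tests = [
    "Dhrystone 2 using register variables",
    "Double-Precision Whetstone",
    "Execl Throughput",
    "File Copy 1024 bufsize 2000 maxblocks",
    "File Copy 256 bufsize 500 maxblocks",
    "File Copy 4096 bufsize 8000 maxblocks",
    "Pipe Throughput",
    "Pipe-based Context Switching",
    "Process Creation",
    "Shell Scripts (1 concurrent)",
    "Shell Scripts (8 concurrent)",
    "System Call Overhead",
]
# Raspberry Pi OS on bare metal, 5 V/3 A supply.
host = [6342.6, 2343.2, 767.5, 1642.0, 1170.7, 2336.3,
        567.5, 318.5, 564.6, 1822.1, 1700.4, 328.4]
# First QEMU/KVM guest run.
guest = [6387.8, 2377.1, 1023.9, 2869.1, 2182.2, 3303.8,
         2315.9, 1189.7, 619.5, 1470.2, 1497.0, 1600.3]

# guest/host ratio per test; 1.0 means the guest matches bare metal.
ratios = {name: g / h for name, h, g in zip(tests, host, guest)}

# Largest divergence first.
ranked = sorted(ratios.items(), key=lambda item: item[1], reverse=True)
for name, ratio in ranked:
    print(f"{name:<45} {ratio:5.2f}x")
```

The pure user-space tests (Dhrystone, Whetstone) land within about 1% of each other, while the kernel-heavy tests (System Call Overhead, Pipe Throughput, Pipe-based Context Switching) are several times higher in the guest. A guest cannot outrun its host on the same cores for hardware reasons, so the gap points at the software stack, which fits the kernel version and configuration explanation; it does not prove it.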

Written by xuxeu
