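Test purpose:
---------------------------------------------------------
Measure how enabling or disabling multithreading (OpenMP) affects GraphicsMagick performance.
Passing --disable-openmp to configure disables OpenMP; if the option is omitted, OpenMP is enabled.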
GM version:
---------------------------------------------------------
GraphicsMagick-1.3.17
Test environment and hardware:
---------------------------------------------------------
OS: Red Hat Enterprise Linux Server release 5.6 (Tikanga)
CPU: Intel(R) Xeon(TM) CPU 2.80GHz, 4 cores (a fairly ordinary CPU, slower than the 2-core / 4-thread i5 2410 found in a laptop)
Memory: 8 GB
Test scenarios:
---------------------------------------------------------
A 1024x768 JPG image (140 KB), from which GM generates a 100x100 thumbnail
A 3264x2448 JPG image (2.6 MB), from which GM generates a 100x100 thumbnail
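Test command:
---------------------------------------------------------
Use gm's built-in benchmark command: gm benchmark -iterations 100 -stepthreads 1 followed by the original command line
-iterations 100   number of iterations
-rawcsv           print the results as CSV text with the columns: threads, iterations, user_time (seconds), elapsed_time (seconds)
-stepthreads 1    thread-count step; 1 means one more thread per round, up to the value of the OMP_NUM_THREADS environment variable. In this testing, OMP_NUM_THREADS had to be set before multiple threads (OpenMP) were actually used; where only a single thread was to be tested, OMP_NUM_THREADS was left unset.
-duration         run time (seconds)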
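The -rawcsv columns listed above lend themselves to a quick post-processing step. The following is only a sketch: it assumes the rows really arrive as threads,iterations,user_time,elapsed_time, rawcsv_summary is a made-up helper name, and the sample rows are assembled from the Test 1 figures reported later in this write-up rather than captured from a real -rawcsv run.

```shell
# Summarise -rawcsv rows.
# Assumed column order: threads,iterations,user_time,elapsed_time
# rawcsv_summary is a hypothetical helper, not part of gm.
rawcsv_summary() {
    awk -F, '
        # Skip a header or any other line whose first field is not a number.
        $1 !~ /^[0-9]+$/ { next }
        {
            # iterations / elapsed seconds = iterations per second
            printf "%d threads: %.3f iter/s\n", $1, $2 / $4
        }
    '
}

# Sample rows built from the Test 1 numbers (1 and 4 threads).
printf '%s\n' '1,100,18.46,18.47' '4,100,47.31,11.87' | rawcsv_summary
```

In real use the output of gm benchmark -rawcsv would be piped into the function in place of the printf line.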
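The OMP_NUM_THREADS environment variable:
---------------------------------------------------------
OMP_NUM_THREADS sets the number of threads GM may use. In this testing it had to be set before multiple threads (OpenMP) were actually used.
Set it temporarily (my CPU has 4 cores):
export OMP_NUM_THREADS=4
Set it permanently:
Edit the file: vi /etc/profile
1. Append at the end: export OMP_NUM_THREADS=4
2. Run this command to make it take effect immediately: # source /etc/profile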
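Besides the two ways above, the shell also allows the variable to be set for a single command only, which can be handy when comparing thread counts without touching the session or /etc/profile. A minimal sketch, where sh -c merely stands in for the gm command so that the effect is visible:

```shell
# An assignment placed in front of a command applies to that command only.
# "sh -c" is a stand-in for the gm command line here.
OMP_NUM_THREADS=2 sh -c 'echo "child sees OMP_NUM_THREADS=$OMP_NUM_THREADS"'

# The calling shell keeps whatever value it had before (possibly none).
echo "parent sees OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"
```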
Test 1: 1 to 4 threads, 100 iterations per thread count (small image): a 1024x768 JPG, 140 KB
---------------------------------------------------------
OpenMP enabled
Configure options: ./configure --enable-shared --with-quantum-depth=8 --with-windows-font-dir=/usr/share/fonts/ms_font
gm benchmark -iterations 100 -stepthreads 1 convert -resize 100x100 -quality 90 +profile "*" /tmp/1024x768.jpg /tmp/100x100.jpg
Test results:
Results: 1 threads 100 iter 18.46s user 18.47s total 5.414 iter/s 5.417 iter/cpu 1.00 speedup 1.000 karp-flatt (about 184 ms per image)
Results: 2 threads 100 iter 22.25s user 11.13s total 8.985 iter/s 4.494 iter/cpu 1.66 speedup 0.205 karp-flatt (about 111 ms per image)
Results: 3 threads 100 iter 36.84s user 12.29s total 8.137 iter/s 2.714 iter/cpu 1.50 speedup 0.498 karp-flatt (about 122 ms per image)
Results: 4 threads 100 iter 47.31s user 11.87s total 8.425 iter/s 2.114 iter/cpu 1.56 speedup 0.524 karp-flatt (about 118 ms per image)
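How to read a result line (using the 4-thread line):
4 threads: 4 threads were used
100 iter: 100 iterations in total for this run
47.31s user: user-mode CPU time, summed over all threads
11.87s total: elapsed (wall-clock) time of the run (the same value as the total time printed with -rawcsv)
8.425 iter/s: 8.425 iterations per second of elapsed time (100 / 11.87)
2.114 iter/cpu: 2.114 iterations per second of user CPU time (100 / 47.31)
1.56 speedup: 1.56 times faster than the 1-thread run (18.47 / 11.87)
0.524 karp-flatt: the Karp-Flatt metric, an estimate of the serial fraction of the work, computed as (1/speedup - 1/threads) / (1 - 1/threads); lower is better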
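The speedup and karp-flatt columns can be recomputed from the elapsed times alone, which is a useful cross-check when reading these lines. A small sketch using the standard Karp-Flatt formula; speedup_and_kf is a made-up helper name and the inputs are the Test 1 elapsed times:

```shell
# speedup_and_kf T1 TP P
#   T1: elapsed seconds with 1 thread
#   TP: elapsed seconds with P threads
#   P : number of threads (must be greater than 1)
# Hypothetical helper; prints "speedup karp-flatt".
speedup_and_kf() {
    awk -v t1="$1" -v tp="$2" -v p="$3" 'BEGIN {
        speedup = t1 / tp
        # Karp-Flatt metric: experimentally determined serial fraction
        kf = (1 / speedup - 1 / p) / (1 - 1 / p)
        printf "%.2f %.3f\n", speedup, kf
    }'
}

speedup_and_kf 18.47 11.87 4   # Test 1, 4 threads -> 1.56 0.524
speedup_and_kf 18.47 11.13 2   # Test 1, 2 threads -> 1.66 0.205
```

Both results match the values gm printed for Test 1. The metric rising from 0.205 at 2 threads to 0.524 at 4 threads is another way of seeing that the extra threads add little.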
System load:
(1 thread)  20:51:11 up 6 days, 14:04, 5 users, load average: 0.28, 0.36, 0.34
(2 threads) 20:51:26 up 6 days, 14:04, 5 users, load average: 0.74, 0.46, 0.37
(3 threads) 20:51:36 up 6 days, 14:04, 5 users, load average: 1.25, 0.57, 0.41
(4 threads) 20:51:47 up 6 days, 14:05, 5 users, load average: 1.67, 0.69, 0.45
CPU usage (4 cores):
(1 thread)  25%
(2 threads) 50%
(3 threads) 75%
(4 threads) 99%
Test 2: 1 to 4 threads, 100 iterations per thread count (large image): a 3264x2448 JPG, 2.6 MB
---------------------------------------------------------
OpenMP enabled
Configure options: ./configure --enable-shared --with-quantum-depth=8 --with-windows-font-dir=/usr/share/fonts/ms_font
gm benchmark -iterations 100 -stepthreads 1 convert -resize 100x100 -quality 90 +profile "*" /tmp/3264x2448.jpg /tmp/100x100.jpg
Test results:
Results: 1 threads 100 iter 205.42s user 205.45s total 0.487 iter/s 0.487 iter/cpu 1.00 speedup 1.000 karp-flatt (about 2054 ms per image)
Results: 2 threads 100 iter 247.33s user 124.87s total 0.801 iter/s 0.404 iter/cpu 1.65 speedup 0.216 karp-flatt (about 1248 ms per image)
Results: 3 threads 100 iter 371.59s user 125.78s total 0.795 iter/s 0.269 iter/cpu 1.63 speedup 0.418 karp-flatt (about 1257 ms per image)
Results: 4 threads 100 iter 470.94s user 125.02s total 0.800 iter/s 0.212 iter/cpu 1.64 speedup 0.478 karp-flatt (about 1250 ms per image)
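The per-image times noted in parentheses are simply the elapsed time divided by the iteration count. A sketch of how they could be pulled out of the Results lines automatically; ms_per_image is a made-up helper name and the field positions are taken from the lines shown above:

```shell
# ms_per_image: read "Results:" lines on stdin and print the thread count
# and the elapsed milliseconds per iteration.
# Hypothetical helper, not part of gm.
ms_per_image() {
    awk '$1 == "Results:" {
        threads = $2
        iterations = $4
        elapsed = $8
        sub(/s$/, "", elapsed)      # "124.87s" -> "124.87"
        printf "%d threads: %.1f ms per image\n", threads, elapsed / iterations * 1000
    }'
}

echo 'Results: 2 threads 100 iter 247.33s user 124.87s total 0.801 iter/s 0.404 iter/cpu 1.65 speedup 0.216 karp-flatt' | ms_per_image
```

Lines that do not start with Results: are ignored, so the whole benchmark output can be piped through unchanged.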
System load:
(1 thread)  21:31:00 up 6 days, 14:44, 5 users, load average: 0.99, 0.84, 0.64
(2 threads) 21:37:10 up 6 days, 14:50, 5 users, load average: 1.87, 1.24, 0.84
(3 threads) 21:38:36 up 6 days, 14:51, 5 users, load average: 2.67, 1.65, 1.02
(4 threads) 21:41:35 up 6 days, 14:54, 5 users, load average: 3.85, 2.58, 1.48
CPU usage (4 cores):
(1 thread)  25%
(2 threads) 49.5%
(3 threads) 74.5%
(4 threads) 99%
Test 3: 1 thread, 100 iterations (small image): a 1024x768 JPG, 140 KB
-----------------------------------------------------
OpenMP disabled
Configure options: ./configure --enable-shared --disable-openmp --with-quantum-depth=8 --with-windows-font-dir=/usr/share/fonts/ms_font
gm benchmark -iterations 100 -stepthreads 1 convert -resize 100x100 -quality 90 +profile "*" /tmp/1024x768.jpg /tmp/100x100.jpg
Test results:
Results: 1 threads 100 iter 18.65s user 18.65s total 5.362 iter/s 5.362 iter/cpu 1.00 speedup 1.000 karp-flatt (about 186 ms per image)
System load:
(1 thread) 21:15:52 up 6 days, 14:29, 5 users, load average: 0.48, 0.53, 0.45
CPU usage (4 cores), 1 thread:
21:15:59   CPU   %user   %nice   %system   %iowait   %steal   %idle
21:16:00   all   25.00    0.00      0.00      0.50     0.00   74.50
21:16:01   all   25.00    0.00      0.25      0.00     0.00   74.75
Test 4: 1 thread, 100 iterations (large image): a 3264x2448 JPG, 2.6 MB
-----------------------------------------------------
OpenMP disabled
Configure options: ./configure --enable-shared --disable-openmp --with-quantum-depth=8 --with-windows-font-dir=/usr/share/fonts/ms_font
gm benchmark -iterations 100 -stepthreads 1 convert -resize 100x100 -quality 90 +profile "*" /tmp/3264x2448.jpg /tmp/100x100.jpg
Test results:
Results: 1 threads 100 iter 204.13s user 204.15s total 0.490 iter/s 0.490 iter/cpu 1.00 speedup 1.000 karp-flatt (about 2040 ms per image)
System load:
(1 thread) 21:21:00 up 6 days, 14:34, 5 users, load average: 0.91, 0.57, 0.46
CPU usage (4 cores), 1 thread:
21:20:54   CPU   %user   %nice   %system   %iowait   %steal   %idle
21:20:55   all   24.00    0.00      1.00      0.00     0.00   75.00
21:20:56   all   25.00    0.00      0.25      0.00     0.00   74.75
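A figure of about 25% across 4 cores means roughly one core is fully busy, which is what a single-threaded run should look like. When many samples are collected, the %user column can be averaged instead of read by eye. The sketch below assumes the column layout shown above (it looks like sar output, but that is an assumption); avg_user_cpu is a made-up helper name.

```shell
# avg_user_cpu: average the %user column of the "all" rows.
# Assumed column order: time, CPU, %user, %nice, %system, %iowait, %steal, %idle
# Hypothetical helper name.
avg_user_cpu() {
    awk '$2 == "all" { sum += $3; n++ }
         END { if (n > 0) printf "%.2f\n", sum / n }'
}

# The two samples recorded for Test 4.
printf '%s\n' \
    '21:20:55 all 24.00 0.00 1.00 0.00 0.00 75.00' \
    '21:20:56 all 25.00 0.00 0.25 0.00 0.00 74.75' | avg_user_cpu
```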
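Conclusion:
-----------------------------------------------------
On multi-core CPUs, enabling GM's multithreading is recommended: elapsed time per image dropped by about 40% (from about 184 ms to 111 ms for the small image, and from about 2054 ms to 1248 ms for the large one). In these tests almost all of that gain was already reached with 2 threads.
If your server has 16 logical CPUs in total (2 sockets, each with 4 cores / 8 threads), setting OMP_NUM_THREADS=4 is still recommended; adding more threads brings little further improvement.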
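The recommendation above amounts to using the number of CPUs but never more than 4. One way this could be scripted is sketched here; pick_omp_threads is a made-up helper name, the cap of 4 is just the value suggested in this write-up, and getconf _NPROCESSORS_ONLN is assumed to be available (it is on typical Linux systems).

```shell
# pick_omp_threads CORES [CAP]
# Print the smaller of CORES and CAP. CAP defaults to 4, the value
# suggested in the conclusion. Hypothetical helper name.
pick_omp_threads() {
    cores=$1
    cap=${2:-4}
    if [ "$cores" -lt "$cap" ]; then
        echo "$cores"
    else
        echo "$cap"
    fi
}

# getconf _NPROCESSORS_ONLN reports the number of online logical CPUs.
OMP_NUM_THREADS=$(pick_omp_threads "$(getconf _NPROCESSORS_ONLN)")
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

On the 16-CPU server from the example this would export OMP_NUM_THREADS=4, and on a 2-core machine it would export 2.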