Download iozone tarball from http://www.iozone.org/.
Extract it:
# tar -xvf iozone3XXX.tar
# cd iozone3XXX/src/current/
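Choose your platform (mine is AMD64) and build. Running make with no target lists the supported platforms; for 64-bit Linux on AMD64 the target is linux-AMD64:
# make linux-AMD64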
Read
This test measures the performance of reading an existing file.
Write
This test measures the performance of writing a new file. When a new file is written, not only does the data need to be stored but also the overhead information for keeping track of where the data is located on the storage media. This overhead is called the “metadata”. It consists of the directory information, the space allocation and any other data associated with a file that is not part of the data contained in the file. It is normal for the initial write performance to be lower than the performance of re-writing a file due to this overhead information.
Re-Read
This test measures the performance of reading a file that was recently read. It is normal for the performance to be higher as the operating system generally maintains a cache of the data for files that were recently read. This cache can be used to satisfy reads and improves the performance.
Re-Write
This test measures the performance of writing a file that already exists. When a file is written that already exists the work required is less as the metadata already exists. It is normal for the rewrite performance to be higher than the performance of writing a new file.
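To get a feel for the difference between the Write and Re-Write tests, here is a minimal Python sketch. It is not iozone's own code: the helper name timed_write, the sizes, and the choice to include fsync() in the timing are all assumptions made for illustration.

```python
import os
import tempfile
import time


def timed_write(path, size, record, create):
    """Write `size` bytes in `record`-byte chunks and return Kbytes/sec.

    create=True removes any existing file first (initial write, so the
    metadata has to be built); create=False overwrites the existing file
    in place (rewrite).
    """
    if create and os.path.exists(path):
        os.remove(path)
    flags = os.O_WRONLY | (os.O_CREAT if create else 0)
    buf = b"\0" * record
    start = time.perf_counter()
    fd = os.open(path, flags, 0o600)
    try:
        for _ in range(size // record):
            os.write(fd, buf)
        os.fsync(fd)  # assumption: count the time needed to reach storage
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return (size / 1024) / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test_file")
        first = timed_write(path, 8 * 1024 * 1024, 64 * 1024, create=True)
        again = timed_write(path, 8 * 1024 * 1024, 64 * 1024, create=False)
        print(f"write:   {first:.0f} Kbytes/sec")
        print(f"rewrite: {again:.0f} Kbytes/sec")
```

On a small file like this the two numbers may be close or even reversed; the gap described above shows up more reliably with larger files.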
Read backwards
This test measures the performance of reading a file backwards. This may seem like a strange way to read a file but in fact there are applications that do this. MSC Nastran is an example of an application that reads its files backwards. With MSC Nastran, these files are very large (Gbytes to Tbytes in size). Although many operating systems have special features that enable them to read a file forward more rapidly, there are very few operating systems that detect and enhance the performance of reading a file backwards.
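The backwards access pattern can be sketched in a few lines of Python. This is only an illustration of the pattern, not how iozone implements it; read_backwards is a hypothetical helper, and os.pread() assumes a Unix-like system.

```python
import os


def read_backwards(path, record):
    """Yield the file's records starting from the end of the file."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = size
        while offset > 0:
            length = min(record, offset)
            offset -= length  # step back one record
            yield os.pread(fd, length, offset)
    finally:
        os.close(fd)
```

Each read lands just before the previous one, so a read-ahead mechanism that only looks forward does not help here.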
Random Read
This test measures the performance of reading a file with accesses being made to random locations within the file. The performance of a system under this type of activity can be impacted by several factors such as: Size of operating system’s cache, number of disks, seek latencies, and others.
Random Write
This test measures the performance of writing a file with accesses being made to random locations within the file. Again the performance of a system under this type of activity can be impacted by several factors such as: Size of operating system’s cache, number of disks, seek latencies, and others.
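As a rough illustration of the random access pattern, the sketch below reads every record of a file once, in shuffled order. The helper names, the fixed seed, and the "visit each record exactly once" policy are assumptions for the example and are not taken from iozone; os.pread() assumes a Unix-like system.

```python
import os
import random


def random_offsets(file_size, record, seed=0):
    """Return every record-aligned offset in the file, in shuffled order."""
    offsets = list(range(0, file_size - record + 1, record))
    random.Random(seed).shuffle(offsets)
    return offsets


def random_read(path, record, seed=0):
    """Read each record once at a random position; return the bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset in random_offsets(os.path.getsize(path), record, seed):
            total += len(os.pread(fd, record, offset))
    finally:
        os.close(fd)
    return total
```

A random write test would follow the same offsets with os.pwrite() instead of os.pread().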
Random Mix
This test measures the performance of reading and writing a file with accesses being made to random locations within the file. Again the performance of a system under this type of activity can be impacted by several factors such as: Size of operating system’s cache, number of disks, seek latencies, and others. This test is only available in throughput mode. Each thread/process runs either the read or the write test. The distribution of read/write is done on a round robin basis. More than one thread/process is required for proper operation.
Record Rewrite
This test measures the performance of writing and re-writing a particular spot within a file. This hot spot can have very interesting behaviors. If the size of the spot is small enough to fit in the CPU data cache then the performance is very high. If the size of the spot is bigger than the CPU data cache but still fits in the TLB then one gets a different level of performance. If the size of the spot is larger than the CPU data cache and larger than the TLB but still fits in the operating system cache then one gets another level of performance, and if the size of the spot is bigger than the operating system cache then one gets yet another level of performance.
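The hot-spot behavior can be tried out with a small Python sketch that keeps overwriting one record. The function name record_rewrite and its parameters are hypothetical, and os.pwrite() assumes a Unix-like system; this shows the access pattern only, not iozone's implementation.

```python
import os
import time


def record_rewrite(path, record, repeats, offset=0):
    """Overwrite the same `record` bytes at `offset`, `repeats` times.

    Returns throughput in Kbytes/sec. The file should already exist and
    be at least offset + record bytes long.
    """
    buf = b"x" * record
    fd = os.open(path, os.O_RDWR)
    try:
        start = time.perf_counter()
        for _ in range(repeats):
            os.pwrite(fd, buf, offset)  # always the same spot in the file
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return (record * repeats / 1024) / elapsed
```

Running it with growing values of record is one way to look for the performance levels described above.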
Read Strided
This test measures the performance of reading a file with a strided access behavior. An example would be: read at offset zero for a length of 4 Kbytes, then seek 200 Kbytes, then read for a length of 4 Kbytes, then seek 200 Kbytes, and so on. Here the pattern is to read 4 Kbytes, seek 200 Kbytes, and repeat. This is a typical behavior for applications that have data structures contained within a file and are accessing a particular region of the data structure. Most operating systems do not detect this behavior or implement any techniques to enhance the performance under this type of access behavior.
This access behavior can also sometimes produce interesting performance anomalies. An example would be if the application’s stride causes a particular disk, in a striped file system, to become the bottleneck.
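The "read 4 Kbytes, seek 200 Kbytes" example can be written down as a short Python sketch. It assumes the seek is relative to the position where the previous read ended, so reads start at 0, 204 Kbytes, 408 Kbytes, and so on. The helper names are hypothetical and os.pread() assumes a Unix-like system.

```python
import os


def strided_offsets(file_size, read_len=4 * 1024, skip=200 * 1024):
    """Return the start offset of every read in the strided pattern."""
    offsets = []
    pos = 0
    while pos + read_len <= file_size:
        offsets.append(pos)
        pos += read_len + skip  # the seek starts where the read ended
    return offsets


def strided_read(path, read_len=4 * 1024, skip=200 * 1024):
    """Read the file with the strided pattern; return the bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset in strided_offsets(os.path.getsize(path), read_len, skip):
            total += len(os.pread(fd, read_len, offset))
    finally:
        os.close(fd)
    return total
```

On a striped file system, comparing the offsets with the stripe size shows quickly whether every read would hit the same disk.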
Fread
This test measures the performance of reading a file using the library function fread(). This is a library routine that performs buffered & blocked read operations. The buffer is within the user’s address space. If an application were to read in very small size transfers then the buffered & blocked I/O functionality of fread() can enhance the performance of the application by reducing the number of actual operating system calls and increasing the size of the transfers when operating system calls are made.
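The effect of a user-space buffer on the number of operating system calls can be illustrated in Python. Python has no fread(), so io.BufferedReader stands in for the stdio buffer here; that substitution, the in-memory CountingRaw class, and the count_raw_reads helper are all assumptions for this sketch.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts how often it is asked for data.

    Each readinto() call stands in for one operating system read call.
    """

    def __init__(self, data):
        super().__init__()
        self._data = data
        self._pos = 0
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        self.calls += 1
        chunk = self._data[self._pos:self._pos + len(buf)]
        buf[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


def count_raw_reads(total, request, buffer_size):
    """Read `total` bytes in `request`-byte pieces through a buffer and
    return how many raw reads were needed."""
    raw = CountingRaw(bytes(total))
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    while reader.read(request):
        pass
    return raw.calls


if __name__ == "__main__":
    # 4096 application reads of 16 bytes each, served from a 4 Kbyte buffer.
    print(count_raw_reads(64 * 1024, 16, 4096))
```

The application issues thousands of tiny reads, but only a small number of them reach the raw layer, each one for a full buffer.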
Freread
This test is the same as fread above except that in this test the file that is being read was read in the recent past. This should result in higher performance as the operating system is likely to have the file data in cache.
Fwrite
This test measures the performance of writing a file using the library function fwrite(). This is a library routine that performs buffered write operations. The buffer is within the user’s address space. If an application were to write in very small size transfers then the buffered & blocked I/O functionality of fwrite() can enhance the performance of the application by reducing the number of actual operating system calls and increasing the size of the transfers when operating system calls are made. This test is writing a new file so again the overhead of the metadata is included in the measurement.
Frewrite
This test measures the performance of writing a file using the library function fwrite().
This is a library routine that performs buffered & blocked write operations. The buffer is within the user’s address space. If an application were to write in very small size transfers then the buffered & blocked I/O functionality of fwrite() can enhance the performance of the application by reducing the number of actual operating system calls and increasing the size of the transfers when operating system calls are made.
This test is writing to an existing file so the performance should be higher as there are no metadata operations required.
Random Read/Write
This test measures the performance of reading/writing a file with accesses being made to random locations within the file. The performance of a system under this type of activity can be impacted by several factors such as: Size of operating system’s cache, number of disks, seek latencies, and others.
Async I/O
Another mechanism that is supported by many operating systems for performing I/O is POSIX async I/O. The application uses the POSIX standard async I/O interfaces to accomplish this. Example: aio_write(), aio_read(), aio_error(). This test measures the performance of the POSIX async I/O mechanism.
Mmap
Many operating systems support the use of mmap() to map a file into a user’s address space. Once this mapping is in place, stores to this location in memory will result in the data being stored going to a file. This is handy if an application wishes to treat files as chunks of memory. An example would be to have an array in memory that is also being maintained as a file in the file system.
The semantics of mmap files are somewhat different from those of normal files. If a store to the memory location is done, no actual file I/O may occur immediately. Calls to msync() with the flags MS_SYNC and MS_ASYNC control the coherency of the memory and the file. A call to msync() with MS_SYNC will force the contents of memory to the file and wait for it to be on storage before returning to the application. A call to msync() with the flag MS_ASYNC tells the operating system to flush the memory out to storage using an asynchronous mechanism so that the application may return into execution without waiting for the data to be written to storage.
This test measures the performance of using the mmap() mechanism for performing I/O.
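A store through a mapping looks like this in Python. This is a minimal sketch, assuming a non-empty file already exists; store_via_mmap is a hypothetical helper, and treating mmap.flush() as the counterpart of a synchronous msync() is an assumption about how CPython behaves on POSIX systems.

```python
import mmap


def store_via_mmap(path, offset, payload):
    """Store `payload` into an existing file through a memory mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mapping:
            # A plain memory store: no write() call is involved.
            mapping[offset:offset + len(payload)] = payload
            # Ask for the modified pages to be written back to the file.
            mapping.flush()
```

Without the flush() call the data would still reach the file eventually, but the application would have no say in when.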
For Xen disk performance testing, I chose the following command:
# iozone -azR -i 0 -i 1 -i 2 -n 1G -g 4G -r 64K -M -b result_file.xls -f test_file
For the meaning of each option, see the main parameters section.
Note:
If you wish to get accurate results for the entire range of performance for a platform you need to make sure that the maximum file size that will be tested is bigger than the buffer cache. If you don’t know how big the buffer cache is, or if it is a dynamic buffer cache then just set the maximum file size to be greater than the total physical memory that is in the platform.
In general you should be able to see three or four plateaus.
+ File size fits in processor cache.
+ File size fits in buffer cache
+ File size is bigger than buffer cache.
You may see another plateau if the platform has primary and secondary processor caches. If you don’t see at least 3 plateaus then you probably have the maximum file size set too small. Iozone will default to a maximum file size of 512 Mbytes. This is generally sufficient but for some very large systems you may need to use the -g option to increase the maximum file size. See the Run_rules document in the distribution for further information.
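The rule of thumb "make the maximum file size bigger than physical memory" can be turned into a small helper for picking a -g value. This is a sketch under assumptions: the function names are hypothetical, the os.sysconf() names are those available on Linux, and rounding up to a power of two is a choice made here, not something iozone requires.

```python
import os


def physical_memory_bytes():
    """Total physical memory, using sysconf names available on Linux."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def suggested_max_file_size(mem_bytes):
    """Return a -g value in Mbytes that is strictly larger than mem_bytes.

    The size is rounded up to the next power of two.
    """
    mbytes = 1
    while mbytes * 1024 * 1024 <= mem_bytes:
        mbytes *= 2
    return f"{mbytes}M"


if __name__ == "__main__":
    print("-g", suggested_max_file_size(physical_memory_bytes()))
```

For a machine with exactly 4 Gbytes of memory this suggests 8192M, since the file has to be larger than memory, not equal to it.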
Additional notes on how to make the graphs
Iozone sends Excel compatible output to standard out. This may be redirected to a file and then processed with Excel. The normal output for Iozone as well as the Excel portion are in the same output stream. So to get the graphs one needs to scroll down to the Excel portion of the file and graph the data in that section.
There are several sets of graph data. “Writer report” is one example. When importing the file be sure to tell Excel to import with “delimited” and then click next, then click on the “space delimited” button. To graph the data just highlight the region containing the file size and record size and then click on the graph wizard.
The type of graph used is “Surface”. When the next dialog box pops up you need to select “Columns”.
After that the rest should be straightforward.
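# iozone -azR -i 0 -i 1 -i 2 -n 1G -g 4G -r 64K -M -b result_file.xls -f test_file
The sample output below comes from a shorter run of the same command with -n 1M -g 10M (see its "Command line used" line):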
Iozone: Performance Test of File I/O
Version $Revision: 3.414 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
Vangel Bojaxhi, Ben England.
Run began: Mon Apr 20 07:36:07 2015
Auto Mode
Cross over of record size disabled.
Excel chart generation enabled
Using minimum file size of 1024 kilobytes.
Using maximum file size of 10240 kilobytes.
Record Size 64 KB
Machine = Linux dhcp-66-73-122.englab.nay.redhat.com 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 18:37:38 EST 201
Command line used: iozone -azR -i 0 -i 1 -i 2 -n 1M -g 10M -r 64K -M -b result_file.xls -f test_file
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                       random   random     bkwd   record   stride
      KB  reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
    1024      64   564176  1943594  5813392  5813392  4589593  2358827
    2048      64   594302  2081497  5986827  3136379  4295384  2090107
    4096      64   591658  2084478  5926533  6195843  5634950  2345757
    8192      64   592850  1848808  4917099  5776876  5326425  1677328
iozone test complete.
Excel output is below:
"Writer report"
"64"
"1024" 564176
"2048" 594302
"4096" 591658
"8192" 592850
"Re-writer report"
"64"
"1024" 1943594
"2048" 2081497
"4096" 2084478
"8192" 1848808
"Reader report"
"64"
"1024" 5813392
"2048" 5986827
"4096" 5926533
"8192" 4917099
"Re-Reader report"
"64"
"1024" 5813392
"2048" 3136379
"4096" 6195843
"8192" 5776876
"Random read report"
"64"
"1024" 4589593
"2048" 4295384
"4096" 5634950
"8192" 5326425
"Random write report"
"64"
"1024" 2358827
"2048" 2090107
"4096" 2345757
"8192" 1677328
Open result_file.xls with an office tool. Each report is named after the file read/write method and lists the throughput (Kbytes/sec) for every combination of file size (rows) and record size (columns).
Throughput is the disk performance indicator: the larger, the better. On a final note, you can generate graphs from the iozone output with the Generate_Graphs and gengnuplot.sh scripts located under iozone3XXX/src/current/.
# cd iozone3XXX/src/current/
# iozone -azR -i 0 -i 1 -i 2 -n 1G -g 4G -r 64K -M -b result_file.xls -f test_file > test_results.txt
# ./Generate_Graphs test_results.txt
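If you would rather process the numbers yourself, the Excel portion of the output shown earlier is easy to parse. The sketch below assumes the layout seen in the sample output (a quoted report title, a line of quoted record sizes, then one line per file size); parse_reports and the regular expressions are hypothetical helpers, not part of iozone.

```python
import re

TITLE = re.compile(r'^\s*"([^"]*report)"\s*$')
HEADER = re.compile(r'^\s*("\d+"\s*)+$')
ROW = re.compile(r'^\s*"(\d+)"((?:\s+\d+)+)\s*$')


def parse_reports(text):
    """Return {report: {file size KB: {record size KB: Kbytes/sec}}}."""
    reports = {}
    current = None
    reclens = []
    for line in text.splitlines():
        if m := TITLE.match(line):
            # A new report starts, for example "Writer report".
            current = reports.setdefault(m.group(1), {})
            reclens = []
        elif current is not None and HEADER.match(line):
            # The line of quoted record sizes that labels the columns.
            reclens = [int(v) for v in re.findall(r'"(\d+)"', line)]
        elif current is not None and (m := ROW.match(line)):
            # A quoted file size followed by one value per record size.
            values = [int(v) for v in m.group(2).split()]
            current[int(m.group(1))] = dict(zip(reclens, values))
    return reports


if __name__ == "__main__":
    sample = '"Writer report"\n"64"\n"1024" 564176\n"2048" 594302\n'
    print(parse_reports(sample))
```

Lines that do not match any of the three patterns, such as the normal iozone output before the Excel portion, are ignored.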
-i # - Used to specify which tests to run. (0=write/rewrite, 1=read/re-read, 2=random-read/write, 3=Read-backwards, 4=Re-write-record, 5=stride-read, 6=fwrite/re-fwrite, 7=fread/Re-fread, 8=random mix, 9=pwrite/Re-pwrite, 10=pread/Re-pread, 11=pwritev/Re-pwritev, 12=preadv/Re-preadv).
One will always need to specify 0 so that any of the following tests will have a file to measure.
-i # -i # -i # is also supported so that one may select more than one test.
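When iozone is driven from a script, the "always include test 0" rule is easy to forget. The helper below is a hypothetical sketch that builds the -i arguments from a list of test numbers and adds 0 when it is missing; the 0 to 12 range comes from the list above.

```python
def select_tests(tests):
    """Build the -i arguments for the given test numbers.

    Test 0 (write/rewrite) is always added so that the other tests have
    a file to measure.
    """
    selected = sorted(set(tests) | {0})
    for number in selected:
        if not 0 <= number <= 12:
            raise ValueError(f"unknown iozone test number: {number}")
    args = []
    for number in selected:
        args += ["-i", str(number)]
    return args


if __name__ == "__main__":
    # Prints: -i 0 -i 1 -i 2
    print(" ".join(select_tests([2, 1])))
```

The returned list can be appended to the rest of the command line, for example before passing it to subprocess.run().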