.\"  Hey, Emacs, edit this file in -*- nroff-fill -*- mode
.\"-
.\" Copyright (c) 1997, 1998
.\"	Nan Yang Computer Services Limited.  All rights reserved.
.\"
.\"  This software is distributed under the so-called ``Berkeley
.\"  License'':
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by Nan Yang Computer
.\"      Services Limited.
.\" 4. Neither the name of the Company nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"  
.\" This software is provided ``as is'', and any express or implied
.\" warranties, including, but not limited to, the implied warranties of
.\" merchantability and fitness for a particular purpose are disclaimed.
.\" In no event shall the company or contributors be liable for any
.\" direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute
.\" goods or services; loss of use, data, or profits; or business
.\" interruption) however caused and on any theory of liability, whether
.\" in contract, strict liability, or tort (including negligence or
.\" otherwise) arising in any way out of the use of this software, even if
.\" advised of the possibility of such damage.
.\"
.\" $Id: rawio.1,v 1.5 1999/07/21 02:14:32 grog Exp grog $
.Dd November 21, 1999
.Dt RAWIO 1
.Os UNIX
.Sh NAME
.Nm rawio
.Nd test performance of low-level storage devices
.Sh SYNOPSIS
.Nm
.Op Fl A Ar alignment
.Op Fl a
.Op Fl c Ar transfer-count
.Op Fl F
.Op Fl f
.Op Fl h
.Op Fl I Ar name
.Op Fl n Ar record-count
.Op Fl p Ar process-count
.Op Fl R
.Op Fl r
.Op Fl S
.Op Fl s Ar size
.Op Fl v Ar verbosity
.Op Fl W Op Ar percentage
.Op Fl w Op Ar percentage
.Ar special
.Sh DESCRIPTION
.Nm
tests the speed of the low-level character I/O device
.Ar special
in a concurrent environment.  It is intended for comparisons of storage devices
on a single system, and is not suited for cross-platform performance testing.
.Pp
By default,
.Nm
spawns eight processes, each of which performs the same test.  Four tests are
available:
.Bl  -tag -width indent
.It Nm Random Read
The random read test reads varying length records from the specified
device
.Ar special ,
starting at random positions within the file.  The offset is necessary to
protect the disk label and any possible future extensions.
.It Nm Sequential Read
The sequential read test reads constant length records from the specified
device
.Ar special ,
starting at an offset of 32 sectors from the beginning of the file.
.It Nm Random Write
The random write test writes varying length records to the specified device
.Ar special ,
starting at random positions within the file.  \fBTHIS TEST OVERWRITES DATA ON
THE SPECIAL DEVICE.  DO NOT USE IT ON A DRIVE WHICH CONTAINS IMPORTANT DATA.\fP
.It Nm Sequential Write
The sequential write test writes constant length records to the specified device
.Ar special ,
starting at an offset of 32 sectors from the beginning of the file.  The offset
is necessary to protect the disk label and any possible future extensions.
\fBTHIS TEST OVERWRITES DATA ON THE SPECIAL DEVICE.  DO NOT USE IT ON A DRIVE
WHICH CONTAINS IMPORTANT DATA.\fP
.El
.Pp
If no tests are specified with the options
.Fl R ,
.Fl r ,
.Fl W
or
.Fl w ,
.Nm
performs the random read and the sequential read tests, which are
non-destructive.
.Pp
.Nm
resembles
.Xr bonnie 1
in some of the things it does.  It differs strongly from
.Xr bonnie 1
by using a raw disk device, which bypasses the buffer cache.  As a result, some
of the tests that
.Xr bonnie 1
performs, for example character I/O, would be meaningless for
.Nm .
.Sh OPTIONS
.Bl -tag -width indent
.It Fl A Ar alignment
Perform all I/O with buffers aligned on a boundary divisible by
.Ar alignment .
Some devices, notably cheap RAID controllers, require this option.
.It Fl a
Perform all tests (Random Read, Sequential Read, Random Write, Sequential
Write).
.It Fl c Ar transfer-count
Specify the length of sequential transfers or the average length of random
transfers.  By default, the length of a random transfer can be up to twice this
value, though this can be changed with the
.Fl F
option.  Values less than 512 are interpreted as a number of sectors; values of
512 or more are interpreted as a number of bytes, which must be an integral
number of sectors.  The default is 16384 bytes (32 sectors).  The maximum value
is system-dependent.  On FreeBSD it is 256 sectors (131072 bytes).
.Pp
The actual average transfer count for random transfers may differ from the value
specified because of edge effects.  Since the random values are fitted to the
range 1 to 256 sectors, the actual average transfer size will be larger than
requested for very small values of
.Ar transfer-count ,
and smaller than requested for values of
.Ar transfer-count
over 128 sectors.
.It Fl F
When performing random transfers, make all transfers the same length as
sequential transfers.  By default, the length of a random transfer is between
1 sector and double the length of the sequential transfers.  See also the
.Fl c
option.
.It Fl f
When performing sequential transfers, start all the transfers at the same offset
into the device.  This has a dramatic effect on the throughput.  See
.Sx INTERPRETING THE RESULTS
below for a discussion of this flag.
.It Fl h
Suppress headings in the output.  This can be useful to create output files
intended for processing by plotting utilities.
.It Fl I Ar name
Specify a name to be written in the results to identify the test.  If this
option is omitted,
.Nm
uses the name of the drive.
.It Fl n Ar record-count
Specify the total number of records to transfer.  The default value is 16384.
If the number is not divisible by the number of processes, the first
.Ar remainder
processes transfer one extra record.
.Pp
The sequential transfer tests will stop at the end of the device, which may
result in fewer records than indicated being transferred.
.It Fl p Ar process-count
Specify the number of processes to start.  The default value is 8.
.It Fl R
Perform a Random Read test, identified as 
.Ar RR
in the output.  This flag may be used in combination with other test
specification flags.
.It Fl r 
Perform a Sequential Read test, identified as 
.Ar SR
in the output.  This flag may be used in combination with other test
specification flags.
.It Fl S
When performing the random tests, use pseudo-random data from an internal table
instead of calling the random number generator.  This makes the results more
predictable and thus more repeatable.  This can be of use when comparing results
on different platforms, though it is not clear that differences in the
generated random numbers play a significant role in the differences between
consecutive measurements.
.It Fl s Ar size
Specify the size of
.Ar special 
in bytes.
.Nm
tries several different ways to determine the size of the device, but it is
possible that all will fail.  In this case, you should specify it manually.
This value will override any attempt to determine the size programmatically, so
it can also be of use to restrict the part of the device used by the random seek
tests.
.It Fl v Ar verbosity
Specify that more verbose output is desired.
.Ar verbosity
is an integer specifying the amount of information desired.
.It Fl W Op Ar percentage
Perform a Random Write test, identified as
.Ar RW
in the output.  This flag may be used in combination with other test
specification flags.  If you specify the optional
.Ar percentage
argument, this test will interleave read and write accesses, performing
.Ar percentage %
writes and 
.Ar (100 - percentage)%
reads.  \fBTHIS TEST OVERWRITES DATA ON THE SPECIAL DEVICE.  DO NOT USE IT ON A
DRIVE WHICH CONTAINS IMPORTANT DATA.\fP
.It Fl w Op Ar percentage
Perform a Sequential Write test, identified as 
.Ar SW
in the output.  This flag may be used in combination with other test
specification flags.  If you specify the optional
.Ar percentage
argument, this test will interleave read and write accesses, performing
.Ar percentage %
writes and 
.Ar (100 - percentage)%
reads.  \fBTHIS TEST OVERWRITES DATA ON THE SPECIAL DEVICE.  DO NOT USE IT ON A
DRIVE WHICH CONTAINS IMPORTANT DATA.\fP
.El
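.Pp
The rules given above for
.Fl c
and
.Fl n
can be sketched as follows.  This is an illustration only, not code taken from
.Nm :
the names
.Ar SECTOR ,
.Ar MAX_SECTORS
and the three functions are hypothetical, the 256 sector limit is the FreeBSD
value quoted above, and the uniform distribution of random lengths between 1 and
twice the requested average, capped at the limit, is an assumption made to show
the edge effects.
.Bd -literal
```python
SECTOR = 512          # bytes per sector, as used throughout this page
MAX_SECTORS = 256     # FreeBSD limit quoted under -c (assumed here)


def transfer_bytes(count):
    """Interpret a -c argument: below 512 means sectors, otherwise bytes."""
    if count <= 0:
        raise ValueError("transfer count must be positive")
    nbytes = count * SECTOR if count < SECTOR else count
    if nbytes % SECTOR != 0:
        raise ValueError("byte count must be a whole number of sectors")
    if nbytes > MAX_SECTORS * SECTOR:
        raise ValueError("transfer count exceeds the assumed maximum")
    return nbytes


def expected_random_sectors(avg_sectors):
    """Mean length when lengths are uniform in 1..2*avg, capped at 256.

    The uniform distribution and the capping are assumptions made for
    illustration; the page only says values are fitted to 1..256.
    """
    lengths = [min(n, MAX_SECTORS) for n in range(1, 2 * avg_sectors + 1)]
    return sum(lengths) / len(lengths)


def records_per_process(record_count, process_count):
    """Split -n records over -p processes; the first few get one extra."""
    base, remainder = divmod(record_count, process_count)
    return [base + 1 if i < remainder else base for i in range(process_count)]


if __name__ == "__main__":
    print(transfer_bytes(32), transfer_bytes(16384))
    print(expected_random_sectors(1), expected_random_sectors(200))
    print(records_per_process(100, 8))
```
.Ed
.Pp
Under these assumptions a requested average of 1 sector gives an expected
average of 1.5 sectors, and a requested average of 200 sectors gives 174.4
sectors, which matches the direction of the edge effects described under
.Fl c .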
.Sh OUTPUT FORMAT
.Nm
can produce three different styles of output.  Without the
.Fl v
option,
.Nm
produces the following kind of output:
.Bd -literal
        Random read     Sequential read Random write    Sequential write
ID          K/sec  /sec    K/sec  /sec     K/sec  /sec     K/sec  /sec
rawdisk      81.0     5    149.4     5      88.1     6     129.1     4
.Ed
.Pp
Each test produces only a single line of output.  The values for each test are
the throughput in kilobytes per second, and the number of transfers per second.
.Pp
With the
.Fl v Ar 1
option,
.Nm
produces the following style of output:
.Bd -literal
Test    ID          K/sec       %User    %Sys  %Total  I/Os
RR     da0c        1200.8         0.2     2.4     2.6  800
SR     da0c         944.0         0.1     0.7     0.9  800
RW     da0c        1380.3         0.2     2.8     3.0  800
SW     da0c         947.8         0.0     0.8     0.9  800
.Ed
.Pp
The first column is an abbreviation for the test.  See the test descriptions
above.  The second column is the identifier for the test, which in this example
defaults to the name of the disk,
.Ar da0c ,
because none was specified with the
.Fl I
option.
.Pp
The third column shows the aggregate data transfer speed.  The fields
.Ar %User ,
.Ar %Sys 
and
.Ar %Total
show the percentage user, system and total (user + system) CPU time used by the
processes.  The field
.Ar I/Os
shows the number of I/O requests issued to
.Ar special .
.Pp
The verbose output prints the following information:
.Bd -literal
Test name:                sample
Transfer count:            32768
Record count:                100
Process count:                 8
Device size:          1648000000
Test    ID         Time    K/sec   %User    %Sys  %Total  Reads   Writes
RR   sample   10.143686   1243.1     0.1     2.6     2.8  800     0
SR   sample   27.822950    942.1     0.0     0.8     0.9  800     0
RW   sample    9.470939   1321.3     0.1     2.8     2.8  0       800
SW   sample   27.719114    945.7     0.0     0.9     0.9  0       800
.Ed
.Pp
The
.Ar Time
field shows the elapsed time in seconds for the complete test, and the columns
.Ar Reads
and
.Ar Writes
show itemized information about the I/Os.  This format is likely to change and
become more useful.
.Pp
Both verbose formats are intended to make it easy to extract test information
from a log file.  For example, to extract the results of a specific test on
different devices, you can enter:
.Bd -literal
$ \fBgrep ^RR stripe.1.log \fP
RR      s1k         220.8         0.0     2.2     2.2  512
RR      s2k         386.2         0.1     2.0     2.0  512
RR      s4k         620.6         0.2     2.2     2.3  512
RR      s8k         881.3         0.2     2.4     2.6  512
RR     s16k        1083.6         0.2     2.7     2.9  512
RR     s32k        1192.2         0.3     2.5     2.8  512
RR     s64k        1239.9         0.1     2.9     3.1  512
RR    s128k        1306.8         0.4     2.8     3.1  512
RR    s256k        1346.0         0.8     2.5     3.2  512
RR    s512k        1360.8         0.0     3.0     3.0  512
RR      s1m        1363.7         0.0     3.2     3.2  512
RR      s2m        1387.9         0.1     2.8     2.9  512
RR      s4m        1357.6         0.2     2.8     3.1  512
.Ed
.Pp
To extract the results of all tests for a specific identifier, enter:
.Bd -literal
$ \fBgrep s128k stripe.1.log\fP
SR    s128k        1719.1         0.0     1.7     1.7  512
RR    s128k        1306.8         0.4     2.8     3.1  512
SW    s128k        2126.9         0.1     2.1     2.2  512
RW    s128k        1516.4         0.0     3.4     3.4  512
.Ed
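.Pp
When
.Xr grep 1
is not enough, the same log lines can be read by a short script.  The following
sketch assumes the seven column layout shown above for
.Fl v Ar 1 ;
the function name
.Ar parse_results
and the dictionary keys are hypothetical and are not part of
.Nm .
.Bd -literal
```python
def parse_results(text):
    """Parse lines in the -v 1 layout into a list of dictionaries."""
    fields = ("kps", "user", "sys", "total")
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data lines have 7 columns and start with a test abbreviation;
        # headings and blank lines are skipped.
        if len(parts) != 7 or parts[0] not in ("RR", "SR", "RW", "SW"):
            continue
        row = {"test": parts[0], "id": parts[1], "ios": int(parts[6])}
        row.update(zip(fields, map(float, parts[2:6])))
        rows.append(row)
    return rows


SAMPLE = """
Test    ID          K/sec       %User    %Sys  %Total  I/Os
SR    s128k        1719.1         0.0     1.7     1.7  512
RR    s128k        1306.8         0.4     2.8     3.1  512
SW    s128k        2126.9         0.1     2.1     2.2  512
RW    s128k        1516.4         0.0     3.4     3.4  512
"""

if __name__ == "__main__":
    rows = parse_results(SAMPLE)
    best = max(rows, key=lambda r: r["kps"])
    print(best["test"], best["id"], best["kps"])
```
.Ed
.Pp
Keeping the parser tolerant of heading lines means that logs produced with and
without
.Fl h
can be processed in the same way.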
.Sh INTERPRETING THE RESULTS
.Nm
is designed to simulate the load placed on real-world storage devices in some
common situations.  When analysing the results, it is important to understand
these situations.
.Bl -tag -width indent
.It Nm Random file access
Nearly random access patterns, such as those produced by the random read and
write tests, occur when serving web pages.  Many database applications also
behave in this manner.  In each case, the issue is complicated by directory or
index access, so this test is idealized.
.Pp
In this test, set the number of processes to the approximate number of
requestors.  Performance usually improves as the number of requestors
increases.  The improvement lessens as more processes are added, and
performance may drop when the number is large.  This drop can be attributed to
many factors, not the least of which is natural measurement error.
.It Nm Sequential file access
True sequential file access occurs when the disk subsystem reads sequential
blocks off a disk.  It is extremely rare in a large system: it implies that only
one process is doing the reading.  As soon as two processes read, even if they
are reading the same file sequentially, the access is no longer purely
sequential as seen by the disk.  Instead, multiple read requests are issued for
the same spatially related blocks.
.Pp
Even this kind of access is relatively rare.  First, the file system buffer
cache will generally resolve these issues, so only one read will be issued.
Second, typical ``sequential access'' is better represented by an ftp server or
streaming video server, where multiple processes read relatively large files in
a sequential manner.  This is the model which the
.Nm
sequential access tests perform by default.  If you want to test multiple
sequential access to the same area of disk, use the
.Fl f
flag.
.Pp
The normal sequential access tests show a marked decrease in performance with
increasing number of requestor processes.  This is because the access becomes
more and more random with increasing number of requestors.  On the other hand,
with the
.Fl f
flag, performance improves dramatically with increasing number of requestors,
since now the on-device cache can satisfy most requests.  With 8 requestors,
performance improvements of 3000% to 4000% can be expected.  This is the
scenario that RAID array vendors like to show, since it can show really dramatic
performance.  Unfortunately, the figures are almost meaningless.
.It Nm The effect of transfer size
Modern storage devices transfer data at between 10 MB/s and 80 MB/s.  A typical
transfer of 8 kB thus takes between 100 \(*us and 800 \(*us.  By contrast,
typical positioning latency is in the order of 8 ms, between 10 and 80 times as
long.  Obviously the size of the transfer strongly affects the throughput.
.Pp
Unfortunately, it is often difficult to influence the transfer size.  Text web
pages, for example, tend to be less than 16 kB in size.
.Cm ftp
files and image data are larger, but it's often difficult to persuade the system
to transfer in larger quantities; a lot depends on the program performing the
access.  It's beyond the scope of this man page to discuss methods of improving
performance, but
.Nm
provides the mechanism for measuring the potential differences.  A typical
average transfer size for
.Nm ufs
is between 6 kB and 7 kB.
.El
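.Pp
The arithmetic behind the discussion of transfer size can be sketched as
follows.  The model is deliberately simple and is an assumption, not a
description of how
.Nm
works: every transfer pays one positioning delay of 8 ms, nothing overlaps, and
1 kB is taken as 1000 bytes.  The function names are hypothetical.
.Bd -literal
```python
def transfer_time_us(size_bytes, rate_mb_per_s):
    """Microseconds needed to move size_bytes at rate_mb_per_s."""
    # 1 MB/s is 10**6 bytes per second, that is 1 byte per microsecond.
    return size_bytes / rate_mb_per_s


def effective_throughput(size_bytes, rate_mb_per_s, latency_ms=8.0):
    """Throughput in MB/s when every transfer pays one positioning delay."""
    total_us = latency_ms * 1000 + transfer_time_us(size_bytes, rate_mb_per_s)
    return size_bytes / total_us


if __name__ == "__main__":
    for size_kb in (8, 64, 512):
        size = size_kb * 1000
        print(size_kb,
              round(effective_throughput(size, 10), 2),
              round(effective_throughput(size, 80), 2))
```
.Ed
.Pp
Under this model an 8 kB transfer takes 800 microseconds at 10 MB/s and 100
microseconds at 80 MB/s, and in both cases the effective throughput stays below
1 MB/s because the positioning time dominates.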
.Sh GOTCHAS
.Nm
measures I/O system performance, so you should use it against the raw disk
device.  It will work against block devices, but you will then be measuring the
performance of the buffer cache, not the underlying device.
.Pp
The
.Nm
write tests \fBoverwrite the data on the device\fP.  Don't use them on devices
containing data you care about.
.Sh SEE ALSO
.Xr bonnie 1 ,
.Xr iozone 1 ,
.Xr iostat 8
.Sh AUTHORS
.An Greg Lehey Aq grog@lemis.com