Monitoring bonnie++ Disk I/O

Abstract

Have you ever asked yourself what bonnie++, the disk performance measurement tool, does in the background while you stare at the command line? I monitored one cycle of bonnie++ and got nice graphs that I want to discuss in this blog post.

The Tool

When it comes to judging storage system performance, bonnie++ is among the tools that set the standard benchmarks. It measures different aspects such as throughput and latency under different conditions: random and sequential writes and reads of data and files. What exactly it tests depends on you; you pick the tests that come closest to your application and your use cases. For instance, a mail storage system with Maildir storage organisation will create, read and delete a lot of small files in many directories.

This is how it works:

  1. First bonnie++ measures the throughput of the storage to determine how many bytes per second you can write to and read from that device.
  2. Then it checks how many files it can create, re-read and delete per second to determine the latency of the storage system.
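
If you want to script such a run, the following Python sketch (my own addition, not from the original post) simply calls the bonnie++ binary via subprocess. The target directory /mnt/test, the user nobody, the file count and the host label are assumptions you would adapt to your own setup.

import subprocess

# Run one bonnie++ cycle: the throughput tests first, then the file
# creation/stat/delete tests. All values below are example choices.
result = subprocess.run(
    [
        "bonnie++",
        "-d", "/mnt/test",   # directory on the storage under test (assumed)
        "-u", "nobody",      # unprivileged user to run the tests as (assumed)
        "-n", "128",         # file count for the file tests, in multiples of 1024
        "-m", "testhost",    # label that shows up in the report (assumed)
    ],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)   # human-readable report plus the CSV summary line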

Monitoring

To monitor what bonnie++ is really doing on the storage system, the net-snmp agent offers the diskIO table. For every partition you get four key figures:

diskIONRead
The number of bytes read from this device since boot.
diskIONWritten
The number of bytes written to this device since boot.
diskIOReads
The number of read accesses from this device since boot.
diskIOWrites
The number of write accesses to this device since boot.
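
As an illustration (my own sketch, not part of the original post), the following Python snippet reads these four counters for one device with the net-snmp command line tool snmpget. The host name, the community string and the table index 1 are assumptions, and the symbolic OIDs only resolve if the UCD-DISKIO-MIB is installed on the monitoring host.

import subprocess

HOST = "storage-host"   # assumed host running the net-snmp agent
COMMUNITY = "public"    # assumed SNMPv2c read community
INDEX = 1               # assumed diskIOTable index of the device to watch

COUNTERS = ["diskIONRead", "diskIONWritten", "diskIOReads", "diskIOWrites"]

def read_counters():
    """Return the four diskIO counters of one device as a dict of ints."""
    values = {}
    for name in COUNTERS:
        oid = "UCD-DISKIO-MIB::%s.%d" % (name, INDEX)
        output = subprocess.check_output(
            ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", HOST, oid],
            text=True,
        )
        values[name] = int(output.strip())
    return values

print(read_counters())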

Note

These figures are Counter32 values: you have to subtract two consecutive measurements from each other and divide the result by the time step to get the throughput in Bytes/s or the I/O operations per second (IOPS). If you use SNMPv2 it is better to read the 64-bit counters diskIONReadX and diskIONWrittenX instead of the normal Counter32 objects, because these counters will not overflow during your measurement.
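
To make the arithmetic concrete, here is a small Python helper (my own addition, not from the original article) that turns two consecutive counter readings into a per-second rate and compensates for a single wrap of a Counter32:

def rate(old, new, interval, bits=32):
    """Per-second rate from two counter samples taken `interval` seconds apart.

    SNMP counters only ever increase, so a smaller `new` value means the
    counter wrapped around once between the two samples.
    """
    delta = new - old
    if delta < 0:               # counter wrapped between the samples
        delta += 2 ** bits
    return delta / interval

# Example: diskIONWritten sampled twice, 30 seconds apart (wrapped Counter32)
write_bps = rate(4123456789, 23456789, 30)
# Example: diskIOWrites delta of 3000 accesses over 30 seconds -> 100 IOPS
write_iops = rate(1500000, 1503000, 30)
print(write_bps, write_iops)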

Every network management system (OpenNMS, Zabbix, Nagios) can retrieve and store these values via SNMP, and they can also draw nice graphs. What remains is to interpret the graphs. Sample outputs are displayed in figures 1 and 2.

Figure 1: Disk I/O operations per second (IOPS) during a bonnie++ run.

Figure 2: Throughput in Bytes/s during a bonnie++ run.

As you can see, I divided the graphs into seven sections. In the following I want to discuss these sections and link them to the specific tasks bonnie++ performs. As you will see, you can easily read the performance limits of your setup from the graphs.

Section 1

In the first section bonnie++ performs the write tests, per byte and block-wise. The output is:

Writing a byte at a time...done
Writing intelligently...done

As you can see, this test causes a huge write throughput but not many I/O operations. Practically all of the graph was generated during the "intelligent" phase, which writes a lot of data in large blocks onto the disk.

Sections 2 and 3

In the second section bonnie++ reads the data it has just written to disk and writes it out again. The monitoring now shows data throughput in both directions. As expected, the performance in each direction is half the write throughput.

In the third section bonnie++ only reads the data. The monitoring shows that the disk performs read operations only, and again you get the full throughput.

Section 4

During the measurement of section 4 bonnie++ performed the following operations:

Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.

As you can see, sequential file operations cause a high I/O load. Since this test works sequentially, the throughput is also quite high.

Section 5

While my monitoring system recorded the data in section 5, bonnie++ created files in random order. It seems that there is not much difference in the performance parameters between sequential and random file creation; the performance patterns look quite similar.

Sections 6 and 7

In sections 6 and 7 bonnie++ reads and deletes the files, again in random order. As the monitoring shows, there is a huge performance drop here. The output of the tool told me that in this run the performance of reading and deleting the files randomly was only about 20% of the performance of creating them.

Interpretation

Depending on your application you have to check which performance data fits your use case best. A mail server, for instance, will create and delete a lot of files randomly, so the performance limit will surely be the figures from sections 6 and 7. A database server will write its data to disk in a more sequential manner.

Michael Schwartzkopff, 08. April 2013
