Postfix Deliverability Monitoring with Zabbix

Abstract

Administrators operating a Postfix mail server expect more from a monitoring system than an alert when the server is down. The monitoring system should also collect some key data that gives an impression of the mail server's performance. As a very simple approach, I want to present a solution for the Zabbix system. It gathers the numbers of active and deferred mails from the postqueue command and the numbers of sent and bounced mails from the mail log file.

Monitoring Postfix Statistics with Zabbix

Administrators operating a Postfix mail server should expect more from a monitoring system than a check of whether the service is up or down. A sophisticated monitoring service also tells the operator about the mail server's deliverability status: How many mails are about to be delivered? How many have been deferred? How many messages have bounced?

As a best-practice approach, I am going to present a solution for the Zabbix system. It gathers the sizes of the active and deferred mail queues via Postfix's postqueue command and the numbers of sent and bounced mails from the mail log file.

Active and Deferred Mails

Getting the number of messages currently held in Postfix's active and deferred queues is easy because Postfix ships with management tools for its queues. One simply runs the postqueue -p command and counts the lines with the appropriate IDs. In the listing, the queue ID of an active mail is followed by a *, while the ID of a deferred mail is followed by white space.

So you get the number of all active mails with the command:

$ postqueue -p | grep -Ec "^[0-9A-F]{10}[*]"

and the number of all deferred mail with:

$ postqueue -p | grep -Ec "^[0-9A-F]{10}[^*]"

Caution!

My examples use Postfix's default short queue IDs. Check whether your setup uses long queue IDs (the enable_long_queue_ids parameter). The regex for long IDs is: "^[0-9bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ]{15}". Do not forget to append "[*]" or "[^*]" at the end.
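
The fixed-length patterns above depend on the queue ID length of the particular setup. As a minimal sketch, assuming only that a queue entry starts at the beginning of a line with an alphanumeric ID of at least six characters, the following functions count entries by the character that follows the ID. The function names and the sample listing are made up for illustration. The "!" marker for messages in the hold queue is described in the postqueue manual page; unlike the "[^*]" pattern, the deferred count in this sketch skips such entries.

```shell
#!/bin/sh
# Sketch: count queue entries in `postqueue -p` style output read from stdin.
# Assumption: an entry line starts in column 0 with an alphanumeric queue ID
# of at least 6 characters; the exact length is not hard-coded.

# Active entries: the queue ID is directly followed by "*".
count_active() {
    grep -Ec '^[0-9A-Za-z]{6,}\*' || true
}

# Deferred entries: the queue ID is directly followed by whitespace.
# Entries marked "!" (hold queue) are not counted.
count_deferred() {
    grep -Ec '^[0-9A-Za-z]{6,}[[:space:]]' || true
}

# Made-up sample listing, for illustration only.
sample='-Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
3F2A81C0E7*    1234 Tue Aug  6 10:15:01  alice@example.com
                                         bob@example.net

7B9D42A113     2048 Tue Aug  6 09:58:44  carol@example.com
(connect to mx.example.org[192.0.2.10]:25: Connection timed out)
                                         dave@example.org

9C01D5E2F4!     512 Tue Aug  6 09:40:12  erin@example.com
                                         frank@example.net

-- 3 Kbytes in 3 Requests.'

printf '%s\n' "$sample" | count_active     # prints 1
printf '%s\n' "$sample" | count_deferred   # prints 1
```

On a real server the input would come from the queue listing instead of the sample, for example: postqueue -p | count_deferred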

Sent and Bounced Mails

There are no Postfix management tools that report the numbers of sent and bounced mails, so you have to parse the mail log file. Luckily, the logtail command helps with the job. Every time it runs on a log file, it stores the position where it finished in a configurable offset file, and the next run continues from there. I therefore use two separate offset files, one for the sent check and one for the bounced check. The command lines look like this:

$ logtail -f /var/log/mail.log -o /var/local/zabbix/sent.logtail | \
      grep -c "postfix/smtp.*status=sent"
$ logtail -f /var/log/mail.log -o /var/local/zabbix/bounce.logtail | \
      grep -c "postfix/smtp.*status=bounced"
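
To make the offset mechanism concrete, here is a simplified stand-in for the idea, not logtail's actual implementation. The function name count_new_matches and its state file format (a single byte offset) are assumptions for this sketch, and it assumes that tail -c and head -c are available. The real logtail does more than this and should be preferred in production.

```shell
#!/bin/sh
# Sketch of the "remember where I stopped" idea behind logtail.
# Usage: count_new_matches LOGFILE STATEFILE PATTERN
# Prints how many lines appended since the previous call match PATTERN.
count_new_matches() {
    log=$1
    state=$2
    pattern=$3

    # Byte offset reached by the previous run (0 on the first run).
    offset=0
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi

    # Current size in bytes; the arithmetic expansion strips any padding
    # that some wc implementations print.
    size=$(( $(wc -c < "$log") ))

    # A file smaller than last time suggests rotation: start from the top.
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi

    # Read only the bytes between the old offset and the size just measured,
    # so lines written while we run are left for the next call.
    tail -c +"$((offset + 1))" "$log" | head -c "$((size - offset))" |
        grep -c -- "$pattern" || true

    # Remember the new position for the next run.
    printf '%s\n' "$size" > "$state"
}
```

Calling the function twice in a row with the same state file returns the number of new matches the first time and 0 the second time, which is the behaviour the two separate offset files rely on.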

Using the Zabbix Agent

You can simply add these four commands as UserParameter entries to the configuration file of the Zabbix agent running on the mail server. My configuration also sets the EnableRemoteCommands option to 1. Note that this option governs remote commands sent by the Zabbix server (system.run items), so the UserParameter entries themselves should not depend on it.

My zabbix_agentd.conf file looks like this:

EnableRemoteCommands=1
UserParameter=postfix.deferred[*],/usr/sbin/postqueue -p | grep -Ec "^[0-9A-F]{10}[^*]"
UserParameter=postfix.active[*],/usr/sbin/postqueue -p | grep -Ec "^[0-9A-F]{10}[*]"
UserParameter=postfix.sent[*],logtail -f /var/log/mail.log -o /var/local/zabbix/sent.logtail | grep -c "postfix/smtp.*status=sent"
UserParameter=postfix.bounce[*],logtail -f /var/log/mail.log -o /var/local/zabbix/bounce.logtail | grep -c "postfix/smtp.*status=bounced"

The item check for active mail is of the type Zabbix agent, and its key is postfix.active, matching the definition in the Zabbix agent configuration file. I use the unit msgs and retrieve the value every 300 seconds. For the sent and bounced items I prefer to see mails per minute rather than per check interval. Each check returns the number of matching log lines written since the previous run, that is, per 300 seconds, so I use a custom multiplier of 0.2 (60/300).

Note

Please be sure to use a float information type. Otherwise you would get a flat zero line for low-traffic mail servers.
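
The arithmetic behind the multiplier and the float type can be checked with a short calculation. The helper name per_minute is hypothetical and serves only as an illustration; Zabbix applies the multiplier itself.

```shell
#!/bin/sh
# Usage: per_minute COUNT INTERVAL_SECONDS
# Converts a count collected over one check interval into a per-minute rate.
# For a 300 second interval this equals multiplying by 0.2.
per_minute() {
    LC_ALL=C awk -v n="$1" -v interval="$2" \
        'BEGIN { printf "%.1f\n", n * 60 / interval }'
}

per_minute 37 300   # 37 mails in 5 minutes -> prints 7.4
per_minute 3 300    # 3 mails in 5 minutes  -> prints 0.6
```

The second call shows the low-traffic case: a rate of 0.6 mails per minute is only visible if the item can store fractional values.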

All four items, together with a simple TCP check on port 25, belong to a new application called postfix.

Zabbix Config

In addition to the simple items measuring the performance of my mail system, I defined a trigger that warns me if too many mails are waiting in the deferred queue. You could also define a trigger that fires if a considerable share of the mails you send out bounce. Graphs of these items give you a simple overview of the activity of your mail server.

Figure 1: Mails sent by a mail server with several instances of postfix. Every instance of postfix gets its own color.

If you have any further questions, please mail me: ms@sys4.de

Michael Schwartzkopff, 06. August 2013