Monitoring Squid Proxy with Zabbix

Abstract

Anyone who operates a service should also know about the service's performance. This is the foundation of all monitoring. In this article I describe a way to monitor Squid proxy performance with the Zabbix network monitoring system.

If somebody operates a service, they should also know about the service's performance. This is the foundation of all monitoring. In this article I am going to describe a way of monitoring the performance of a Squid proxy using the Zabbix network monitoring system.

I'll use Squid's Simple Network Management Protocol (SNMP) agent, because it is the best way to transfer all necessary data into Zabbix - without scripting. Basically every good application comes with its own SNMP agent that can be used for monitoring.

SNMP in Squid

I won't cover the SNMP agent setup in this article, since it has already been covered very well elsewhere.

Note

I found the wiki site of the project helpful.

Please set up the ACLs of your agent in such a way that only monitoring stations (i.e. the Zabbix machine) have access to Squid's built-in SNMP agent. After a restart the Squid agent is ready to answer requests.

Now download the Squid MIB file from the documentation directory of the Squid server or from the source code of the project itself. Copy that file into the MIB directory of your Zabbix server. On SUSE and Red Hat systems the directory is /usr/share/snmp/mibs - on Debian it is /usr/share/mibs.

Now you can request all interesting information about the proxy by walking through the MIB of the server -- the command for that is called snmpwalk. You need the IP address of the proxy server and the community string that you defined in the Squid configuration:

$ snmpwalk -v 1 -c <community> -m +ALL -M+<MIB_DIRECTORY> <proxy>:3401 enterprises.3495

If you run the command you should see that the SNMP agent collects a lot of status and performance data. I'll use this data to monitor the proxy in Zabbix.
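
If you want to script the walk, for example to test access from the Zabbix machine, the command above can be wrapped in a few lines of Python. This is only a minimal sketch: the function names build_snmpwalk_cmd and walk_squid are made up for illustration, and the sketch assumes that the net-snmp snmpwalk binary is installed and on the PATH.

```python
import subprocess


def build_snmpwalk_cmd(community, mib_dir, proxy, port=3401,
                       subtree="enterprises.3495"):
    """Build the argument list for the snmpwalk call shown above.

    All parameter names are illustrative; the options themselves are
    the ones used in the article's command line.
    """
    return [
        "snmpwalk", "-v", "1", "-c", community,
        "-m", "+ALL", f"-M+{mib_dir}",
        f"{proxy}:{port}", subtree,
    ]


def walk_squid(community, mib_dir, proxy):
    """Run snmpwalk and return its output lines (needs net-snmp installed)."""
    cmd = build_snmpwalk_cmd(community, mib_dir, proxy)
    result = subprocess.run(cmd, capture_output=True, text=True,
                            check=True, timeout=30)
    return result.stdout.splitlines()
```

Passing the arguments as a list avoids shell quoting problems with the community string. walk_squid raises an exception if the agent does not answer or the ACLs reject the request.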

SNMP in Zabbix

Adding an SNMP item to Zabbix is quite simple: Just open the item list for the proxy and click the Create item button at the top right. As an example I will add the cacheCpuTime item. It reports the amount of CPU seconds consumed by the proxy.

Please use the names that the MIB defines for other items as well. It makes debugging easier.

Select SNMPv1 agent as the item type and enter cacheCpuTime as the key for the new item as well. I prefer to add the OID in numeric form, because then you do not need to add the squid.mib to the MIB storage of Zabbix. You can easily get the numeric OID if you change the above command a little bit and ask directly for cacheCpuTime with snmpget:

$ snmpget -v 1 -c <community> -On -m +ALL -M+<MIB_DIRECTORY> <proxy>:3401  cacheCpuTime.0

The option -On tells the command to print the OID in numeric form instead of the human-readable name. Please also note the .0 at the end. It is necessary because snmpget fetches exactly one node from the SNMP tree, and .0 addresses the single instance of a scalar object. You could also snmpwalk for cacheCpuTime.

Please enter the SNMP community string you defined before. But perhaps it makes more sense to use a macro {$SNMP_COMMUNITY} that you set when defining the host.

Please add the port number 3401 to the item configuration, since the Squid SNMP agent does not use the standard port 161. Please set the units to match the value, „s“ for cacheCpuTime (CPU seconds) or „%“ for a percentage such as cacheCpuUsage, and save the value as it is.
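
To pick the numeric OID out of the command output in a script, a small parser is enough. The following is a minimal sketch that assumes the usual net-snmp output form "OID = TYPE: value". The function names are hypothetical, and the OID in the usage note is a made-up placeholder below Squid's enterprise subtree, not the real OID of cacheCpuTime.

```python
import re

# One line of net-snmp output: "<oid> = <TYPE>: <value>"
_LINE = re.compile(r"^(?P<oid>\S+)\s+=\s+(?P<type>[\w-]+):\s*(?P<value>.*)$")

# Numeric form of enterprises.3495 (enterprises is 1.3.6.1.4.1)
SQUID_ENTERPRISE = ".1.3.6.1.4.1.3495"


def parse_snmp_line(line):
    """Split one output line into (oid, type, value) strings."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unexpected snmp output: {line!r}")
    return match.group("oid"), match.group("type"), match.group("value")


def is_squid_oid(oid):
    """Check that a numeric OID lies inside Squid's enterprise subtree."""
    return oid == SQUID_ENTERPRISE or oid.startswith(SQUID_ENTERPRISE + ".")
```

For a placeholder line such as ".1.3.6.1.4.1.3495.9.9.9.0 = INTEGER: 42", parse_snmp_line returns the OID, the type and the value as three strings. The first of them is what goes into the OID field of the Zabbix item.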

Note

The update interval should not be less than 300 seconds unless you have a good reason to torture your proxy with frequent SNMP requests.

Looking at the latest data after 5 minutes you should see the first results of your measurement. On that basis you could define triggers that fire when the CPU load caused by Squid is too high, or make Zabbix draw nice graphs of the CPU usage.
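
Since cacheCpuTime counts consumed CPU seconds, a load figure has to be derived from the difference between two samples. The following is a minimal sketch of that arithmetic and of a simple threshold check. It is not how Zabbix evaluates triggers internally. The function names and the 80 percent threshold are assumptions for illustration, and the result is relative to one CPU core.

```python
def cpu_percent(prev_cpu_s, curr_cpu_s, interval_s):
    """CPU share in percent between two cacheCpuTime samples.

    Returns None if the value went backwards, which most likely
    means that the proxy was restarted between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_cpu_s - prev_cpu_s
    if delta < 0:
        return None
    return 100.0 * delta / interval_s


def cpu_too_high(prev_cpu_s, curr_cpu_s, interval_s, threshold=80.0):
    """True if the CPU share in the interval exceeds the threshold."""
    pct = cpu_percent(prev_cpu_s, curr_cpu_s, interval_s)
    return pct is not None and pct > threshold
```

With the 300 second interval recommended above, an increase of 150 CPU seconds between two samples corresponds to 50 percent of one core.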

The Zabbix Template

To make the whole job easier for you I collected some useful values from the Squid MIB and put together some items, triggers and graphs that represent the performance of the Squid server. You can download the template from my private website.

Items

When the template is applied to a host, Zabbix collects the following items:

cacheClientHttpRequests
Number of HTTP requests per second received.
cacheCpuUsage
The percentage use of the CPU.
cacheHttpAllSvcTime.5
HTTP all service time. Median time over the last 5 minutes.
cacheHttpHitSvcTime.5
HTTP hit service time. Median time over the last 5 minutes.
cacheHttpMissSvcTime.5
HTTP miss service time. Median time over the last 5 minutes.
cacheMaxResSize
Maximum Resident Size in KB.
cacheMemPercent
Calculated from cacheMemUsage and cacheMaxResSize.
cacheMemUsage
Total memory accounted for in KB.
cacheNumObjCount
Number of objects stored by the cache.
cacheRequestByteRatio.5
Byte Hit Ratio. Median over the last 5 minutes.
cacheRequestHitRatio.5
Request Hit Ratio. Median over the last 5 minutes.
cacheRequests
All protocol cache requests per second.
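
Two of the items above are derived values: cacheMemPercent is calculated from two other items, and the per-second items come from counters that only ever grow. The sketch below shows the arithmetic under two assumptions that the template description does not spell out: that cacheMemPercent is cacheMemUsage as a percentage of cacheMaxResSize, and that the request counters are 32-bit SNMP counters that wrap around. The function names are illustrative.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps to 0 after 2**32 - 1


def mem_percent(mem_usage_kb, max_res_size_kb):
    """cacheMemUsage as a percentage of cacheMaxResSize (assumed formula)."""
    if max_res_size_kb <= 0:
        return None  # no meaningful ratio without a resident size
    return 100.0 * mem_usage_kb / max_res_size_kb


def per_second(prev_count, curr_count, interval_s):
    """Rate between two counter samples, tolerating one wrap-around."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (curr_count - prev_count) % COUNTER32_MODULUS
    return delta / interval_s
```

A counter that restarts from zero after a proxy restart looks the same as a wrap-around to this code, so the first rate after a restart can be far too high.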

Triggers

At the moment there are no triggers defined. Please mail me if you have good suggestions.

Graphs

The template defines three graphs:

  1. Squid_Ratio shows the ratio of hits/non-hits for all requests and the bytes delivered. It gives you an impression of the overall impact the Squid software has. Sometimes the byte hit ratio is (much) larger than 100%. That seems to be a bug in the SNMP agent of Squid, and I did not find a way to limit the measured value to a maximum of 100.
  2. The Squid_Timing graph shows you the timing for HTTP objects that were delivered after a cache hit, a cache miss and the mean value over all HTTP objects. In this graph you will see that objects from the cache are delivered much faster than objects that have to be fetched from the internet. The average time should be in between both curves. Sometimes you will see peaks in the graphs. This happens when large objects (that take their time) have to be fetched from the internet and very few other (easily retrieved) objects are requested.
  3. The Squid_Usage graph shows you the memory and CPU usage (left axis) versus the incoming cache requests (right axis). This graph gives you a nice impression of the performance of your system.
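
If the ratio values are processed outside of Zabbix, for example in a script that reads the collected data, out-of-range values can be clamped there. This is a minimal sketch of such post-processing and not a Zabbix feature. The function name and the lower bound of 0 are assumptions.

```python
def clamp_ratio(value, low=0.0, high=100.0):
    """Limit a hit ratio in percent to the range low..high."""
    return max(low, min(high, value))
```

A reported byte hit ratio of 137.5 becomes 100.0, while values already inside the range pass through unchanged.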
Michael Schwartzkopff, 16. November 2012