DNS requests dropped at the firewall

Abstract

My customer saw very strange behaviour of DNS packets: some of them were dropped at the firewall. Since this caused a short delay, I was asked to do some analysis.

It turned out I had to debug an IPv4/IPv6 glibc problem. But first things first…

What exactly happened? Name resolution on the Linux host always showed a mysterious delay. To test it I used the getent command like this:

# getent passwd

The following figure illustrates the problem the host ran into:

Figure 1: Timing of the packets on the network.

The packets in detail:

  1. The Linux host sent two DNS requests to the DNS server behind a firewall, asking for the AAAA and the A records of another host.
  2. The DNS server answered both requests. The firewall let the first answer packet pass but dropped the second one.
  3. After one second the Linux host sent the two requests again. This time, however, it queried for the AAAA record, waited for the answer, and only then sent the request for the A record. This time the firewall let both answer packets pass.

All in all, very strange behaviour of the firewall. But only at first glance!

A detailed investigation showed that the Linux host sent all packets from the same source port. The source and destination IP addresses and the destination port (udp/53) were, of course, identical as well.
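
Why the source port would be identical can be reproduced on the loopback interface. The following is only a minimal sketch, assuming that both queries leave through one and the same UDP socket; the function name and the payloads are made up for illustration, and no real DNS traffic is involved.

```python
import socket


def send_two_queries_from_one_socket():
    """Send two datagrams from one UDP socket and return the source
    addresses the receiver sees for each of them."""
    # Stand-in for the DNS server: a UDP socket on loopback, port chosen by the OS.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)

    # Stand-in for the resolver: a single socket used for both queries
    # (assumption made for this sketch).
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        client.sendto(b"AAAA query", server.getsockname())
        client.sendto(b"A query", server.getsockname())
        sources = []
        for _ in range(2):
            _payload, source = server.recvfrom(512)
            sources.append(source)
        return sources
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    first, second = send_two_queries_from_one_socket()
    # Both datagrams arrive from the same (IP, port) pair.
    print(first, second, first == second)
```

The socket is bound implicitly to an ephemeral port on the first send and keeps that port for the second one, so a device in between sees two packets it cannot tell apart by address and port alone.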

So when the first request packet passed the firewall, the firewall added an entry to its state table with these four criteria. The second request packet did not trigger a second entry in the state table, since it did not differ from the existing entry. The firewall let the first answer packet pass and then deleted the entry from the state table. The second answer packet no longer matched any entry and was, correctly, dropped by the firewall.
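
The state table logic can be sketched as a toy model. This is not the code of any real firewall: the class, the function names and the addresses (taken from the documentation ranges) are hypothetical, and the only rules modelled are "one entry per address/port tuple" and "entry deleted after the first answer".

```python
class StateTable:
    """Toy stateful UDP filter: one entry per flow, removed on the first reply."""

    def __init__(self):
        self.entries = set()

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        # Identical flows collapse into a single entry.
        self.entries.add((src_ip, src_port, dst_ip, dst_port))

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        # A reply matches the entry with source and destination swapped.
        key = (dst_ip, dst_port, src_ip, src_port)
        if key in self.entries:
            self.entries.remove(key)  # entry is deleted after the first answer
            return "pass"
        return "drop"


HOST = ("192.0.2.10", 40000)     # hypothetical Linux host and source port
SERVER = ("198.51.100.53", 53)   # hypothetical DNS server, udp/53


def simulate_parallel_lookup():
    """AAAA and A requests leave back to back, answers arrive afterwards."""
    fw = StateTable()
    fw.outbound(*HOST, *SERVER)  # AAAA request
    fw.outbound(*HOST, *SERVER)  # A request, same four values
    return [fw.inbound(*SERVER, *HOST), fw.inbound(*SERVER, *HOST)]


def simulate_sequential_lookup():
    """Each request waits for its answer before the next one is sent."""
    fw = StateTable()
    results = []
    for _record_type in ("AAAA", "A"):
        fw.outbound(*HOST, *SERVER)
        results.append(fw.inbound(*SERVER, *HOST))
    return results


if __name__ == "__main__":
    print(simulate_parallel_lookup())    # ['pass', 'drop']
    print(simulate_sequential_lookup())  # ['pass', 'pass']
```

The parallel case reproduces step 2 of the list above, the sequential case step 3.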

After a little investigation (thanks Robert!) I found that glibc offers an option to disable this behaviour:

single-request (since glibc 2.10)
                   sets RES_SNGLKUP in _res.options.  By default, glibc performs
                   IPv4 and IPv6 lookups in parallel since version 2.9.  Some
                   appliance DNS servers cannot handle these queries properly
                   and make the requests time out.  This option disables the
                   behavior and makes glibc perform the IPv6 and IPv4 requests
                   sequentially (at the cost of some slowdown of the resolving
                   process).

Adding the single-request option to resolv.conf solved the problem.
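
In resolv.conf the option goes on an "options" line. The small sketch below shows such a line inside a sample file and checks for it; the nameserver address and the helper name resolver_options are made up for illustration, and the parser is a simplification that only understands "options" lines and full-line comments.

```python
def resolver_options(resolv_conf_text):
    """Collect all keywords from 'options' lines in resolv.conf text."""
    options = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Lines starting with '#' or ';' are comments.
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if fields[0] == "options":
            options.extend(fields[1:])
    return options


# Hypothetical resolv.conf content; the address is from a documentation range.
SAMPLE = """\
# sample only
nameserver 198.51.100.53
options single-request
"""

if __name__ == "__main__":
    print("single-request" in resolver_options(SAMPLE))  # True
```

To check a real host, the same function could be fed the contents of /etc/resolv.conf.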

Michael Schwartzkopff, 15. March 2013
