Why are DNS lookups from certain pods so slow?
Certain containers that are based on Alpine Linux, such as NGINX, have issues with DNS query handling that can cause them to not process valid DNS responses. A small percentage of DNS lookups from these pods can take over 5 seconds instead of the typical <80ms. If the timeout that is set for your application is less than 5 seconds, the DNS lookups can time out.
To test whether your pod has this problem, log in to it and verify that curl is installed. You can install curl with the apk add curl command. Then, inside the pod, run the following command.
for i in $(seq 1 50); do curl -k -w "time_namelookup: %{time_namelookup}\n" -so /dev/null "https://www.ibm.com/"; done
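Fifty lines of timings can be tedious to scan by eye. As a minimal sketch, the helper below summarizes them. The function name summarize_lookups and its output format are hypothetical and not part of any IBM tooling. It assumes an awk implementation is available in the container, such as the BusyBox awk that ships with Alpine.

```shell
# Summarize curl "time_namelookup: <seconds>" lines read from stdin.
# Usage: <curl loop> | summarize_lookups [threshold_seconds]
# The threshold defaults to 5 seconds, the delay described in this topic.
summarize_lookups() {
  awk -v limit="${1:-5}" '
    $1 == "time_namelookup:" {
      total++                             # count every measured lookup
      if ($2 + 0 >= limit) slow++         # count lookups at or above the threshold
      if ($2 + 0 > max) max = $2 + 0      # track the slowest lookup seen
    }
    END {
      printf "lookups=%d slow=%d max=%.3f\n", total, slow, max
    }
  '
}
```

To use it, pipe the output of the loop above into summarize_lookups. A nonzero slow count suggests that the pod is affected.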
This issue might be caused by the way DNS responses are handled in these pods. For example, when an IPv6 (AAAA record) response and an IPv4 (A record) response are returned at nearly the same time, one of the two responses might not be processed, so the DNS client never registers it. After a 5-second delay, the DNS client sends out a new request.
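One way to probe this explanation is to restrict curl to a single address family. The -4 and -6 flags are standard curl options, and each should limit name resolution to one record type. The helper name lookup_times is hypothetical. This is a rough diagnostic sketch, not a definitive test: if the multi-second lookups disappear when only one family is requested, that is consistent with the parallel A and AAAA explanation, but it does not prove it.

```shell
# Print one name-lookup time (in seconds) per request.
# Usage: lookup_times <-4|-6> <url> [count]
# Example: lookup_times -4 https://www.ibm.com/ 50 | sort -n | tail -1
lookup_times() {
  family="$1"        # -4 resolves IPv4 only, -6 resolves IPv6 only
  url="$2"
  count="${3:-50}"   # number of requests, 50 by default
  i=0
  while [ "$i" -lt "$count" ]; do
    curl "$family" -k -s -o /dev/null -w '%{time_namelookup}\n' "$url"
    i=$((i + 1))
  done
}
```

The example in the comment prints the slowest of 50 IPv4-only lookups. Compare it with the slowest time from the unrestricted loop.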
You can resolve this issue in one of the following ways.
- (Preferred) Use a base image for your container that does not have this problem, such as Alpine 3.18. You can also replace Alpine with ubi-minimal.
- Add the options single-request-reopen or options single-request line to the client pod's /etc/resolv.conf file. For example, use the following postStart command to add the option to your pod. These options tell the DNS client to send out only one request (A or AAAA) at a time, which avoids the problem where two responses come back at nearly the same time.
lifecycle:
  postStart:
    exec:
      command:
      - /bin/sh
      - -c
      - "/bin/echo 'options single-request-reopen' >> /etc/resolv.conf"
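The echo command above appends the line every time it runs. If the hook can run more than once against the same file, for example after a container restart (an assumption worth checking in your cluster), the line might be duplicated. The following sketch adds the option only when it is missing. The function name ensure_resolv_option is hypothetical, and the second argument exists only so that the function can be tried against a scratch file.

```shell
# Append "options <option>" to a resolv.conf file unless that exact line exists.
# Usage: ensure_resolv_option <option> [file]
ensure_resolv_option() {
  option="$1"
  file="${2:-/etc/resolv.conf}"   # defaults to the pod's resolver configuration
  # -x matches the whole line and -F treats the pattern as a fixed string,
  # so "single-request" does not match "single-request-reopen".
  if ! grep -qxF "options $option" "$file" 2>/dev/null; then
    printf 'options %s\n' "$option" >> "$file"
  fi
}
```

After the pod starts, you can confirm the result by running cat /etc/resolv.conf inside the pod.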