DNS Resolution Delay: The Silent Killer That Blocks Your Threads

Posted by Designer_Bug9592@reddit | programming | View on Reddit | 16 comments

The Blocking Problem Everyone Forgets

Here’s the thing about DNS lookups that catches people off guard. When your service needs to connect to another service, it has to resolve the hostname to an IP address. In most programming languages, this happens through a synchronous system call like getaddrinfo(). That means the thread making the request just sits there, doing nothing, waiting for the DNS response.

Normally this takes 2-5 milliseconds and nobody notices. You have a thread pool of 200 threads, each request takes maybe 50ms total, and you’re processing thousands of requests per second without breaking a sweat. The occasional DNS lookup is just noise in the overall request time.

But when DNS gets slow, everything changes. Imagine your DNS resolver is now taking 300ms to respond. Every thread that needs to establish a new connection is now blocked for 300ms just waiting for DNS. During that time, incoming requests pile up in the queue. More threads pick up queued requests, and they also need new connections, so they also get stuck on DNS. Before you know it, your entire thread pool is blocked waiting for DNS responses, and your service is effectively dead even though your CPU is at 15% and you have plenty of memory.

https://howtech.substack.com/p/dns-resolution-delay-the-silent-killer

https://github.com/sysdr/howtech/tree/main/dns_resolution