The Domain Name System (DNS) protocol is used primarily to find the IP address of a computer, given its domain or host name. Because of DNS, people can refer to computers by meaningful names instead of remembering the IP address of each one.
DNS can also be used for other purposes, such as getting the domain name of a computer from its IP address (reverse lookup, PTR record), finding the mail server responsible for a domain name (MX record), getting the canonical host name of a machine given its alias (CNAME record), and load balancing among a set of web servers serving the same domain. Many application layer protocols, such as HTTP, SMTP and FTP, rely on DNS to translate domain names into IP addresses.
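As a small illustration of the reverse lookup mentioned above: a reverse query does not send the raw IP address, it asks about a specially constructed domain name under in-addr.arpa (IPv4) or ip6.arpa (IPv6). The sketch below only builds that name using Python's standard ipaddress module and does not contact any server. The function name reverse_lookup_name and the sample address (from a documentation-only range) are assumptions made for this example.

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Return the domain name a reverse (PTR) lookup would ask about."""
    # reverse_pointer reverses the address components and appends the
    # special suffix: in-addr.arpa for IPv4, ip6.arpa for IPv6.
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    print(reverse_lookup_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
```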
Basic Principle of Operation of DNS
DNS is a simple client-server application layer protocol, backed by a hierarchical and distributed database. It runs primarily over UDP (TCP is used for large responses and zone transfers). DNS clients send DNS query messages, normally over UDP, to remote DNS servers to resolve host names into IP addresses.
Organization of DNS Servers and DNS DataBase
The DNS database, which maps domain names to IP addresses for millions of computers, is spread across a very large number of DNS servers organized in a hierarchical fashion. The DNS server hierarchy consists primarily of four levels of servers, namely:
- Root DNS servers
- Top level Domain (TLD) servers
- Authoritative DNS servers
- Authoritative sub-domain DNS servers
The DNS server hierarchy is shown in the diagram given below:
As shown in the diagram above,
- At the highest level, Root DNS servers contain DNS records pertaining to the DNS servers of all the TLDs like .com, .edu, .org etc.
- At the second level, TLD servers contain the DNS records of all the Authoritative DNS servers for a specific TLD. For example, the .com TLD server contains the records of all the authoritative DNS servers of the .com domain (e.g. google.com authoritative DNS server details, yahoo.com authoritative DNS server details etc.).
- The third level in the DNS data base hierarchy consists of individual domain level authoritative DNS servers, that are responsible for resolving the DNS entries corresponding to a specific domain (e.g. google.com, yahoo.com, stanford.edu domains).
- The fourth level in the DNS data base hierarchy consists of sub-domain level authoritative DNS servers, that are responsible for resolving the DNS entries corresponding to specific sub-domains (e.g. the cs.stanford.edu authoritative DNS server is responsible for resolving the host names of all computers belonging to the computer science department of Stanford University). The sub-domain level can itself be divided further, and the DNS protocol is flexible enough to allow any depth of hierarchy. For example, there could be a further sub-domain like research.cs.stanford.edu pertaining to the research wing of the computer science department of Stanford University.
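The levels described above can be read directly off a host name by walking its labels from right to left. The following sketch lists the zones involved, starting at the root. It is only an illustration: the function name zone_chain is hypothetical, and in practice not every label boundary has its own set of DNS servers.

```python
from typing import List


def zone_chain(hostname: str) -> List[str]:
    """List the zones from the root down to the full host name."""
    labels = hostname.rstrip(".").split(".")
    chain = ["."]  # "." stands for the root of the hierarchy
    # Walk from the rightmost label (the TLD) towards the leftmost one.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]))
    return chain


if __name__ == "__main__":
    for zone in zone_chain("research.cs.stanford.edu"):
        print(zone)
```

For research.cs.stanford.edu this prints the root, then edu, stanford.edu, cs.stanford.edu and finally the full name, mirroring the root, TLD, authoritative and sub-domain levels.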
For reliability, load balancing and redundancy purposes, each DNS server is replicated by at least one more machine. For example, there are 13 named root DNS servers (a.root-servers.net through m.root-servers.net), and each of them is itself replicated across many machines in different geographical locations around the globe.
DNS Message Exchange
When an application on the end computer wants to resolve a host name, it hands the request to the DNS client software on that computer. The DNS client software then sends a DNS query message to its configured local DNS server (typically the ISP’s DNS server), using UDP as the underlying transport protocol. DNS servers usually listen on UDP port number 53.
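To make the query step concrete, here is a minimal sketch of what a DNS client might put on the wire, following the message layout defined in RFC 1035: a 12-byte header, then the host name encoded as length-prefixed labels, then the query type and class. The function names build_query and send_query, the query ID 0x1234 and the 2-second timeout are assumptions chosen for this example. A real client would also pick a random ID and parse the reply.

```python
import socket
import struct


def build_query(hostname: str, query_id: int = 0x1234, qtype: int = 1) -> bytes:
    """Build a DNS query for one host name (qtype 1 = A record, i.e. IPv4 address)."""
    flags = 0x0100  # only the RD (recursion desired) bit is set
    # Header: ID, flags, then the question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Each label is sent as one length byte followed by the label text;
    # a zero byte terminates the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN (Internet)
    return header + question


def send_query(packet: bytes, server_ip: str, timeout: float = 2.0) -> bytes:
    """Send the packet to UDP port 53 of server_ip and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server_ip, 53))
        reply, _ = sock.recvfrom(512)
        return reply


if __name__ == "__main__":
    packet = build_query("google.com")
    print(len(packet), packet.hex())
```

Calling send_query(packet, "<address of your local DNS server>") would perform the actual exchange. It is left out of the demo above so that the sketch runs without network access.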
If the local DNS server already has the resolved entry in its cache and that entry has not yet expired (it is not a stale entry), then the local DNS server answers directly with a DNS reply message that contains the IP address corresponding to the queried host name.
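The freshness check described above is usually driven by the time-to-live (TTL) value that accompanies each DNS record. Below is a minimal sketch of such a cache. The class name DnsCache and its methods are hypothetical, the clock is injectable only to make the behaviour easy to test, and the address used is from a documentation-only range.

```python
import time


class DnsCache:
    """Tiny host name -> IP cache whose entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # host name -> (ip, expiry timestamp)

    def put(self, hostname, ip, ttl_seconds):
        self._entries[hostname.lower()] = (ip, self._clock() + ttl_seconds)

    def get(self, hostname):
        key = hostname.lower()  # host names are case-insensitive
        entry = self._entries.get(key)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            # Stale entry: drop it so the caller falls back to a real query.
            del self._entries[key]
            return None
        return ip


if __name__ == "__main__":
    cache = DnsCache()
    cache.put("example.com", "192.0.2.1", ttl_seconds=300)
    print(cache.get("example.com"))
```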
If the local DNS server does not have the entry in its local cache, then it queries one of the root DNS servers. Based on the queried domain, the root DNS server sends back the IP address of the next level TLD server to the local DNS server. For example, if the query is for the host name google.com, then the root server returns the IP address of the TLD server corresponding to the .com domain. The local DNS server then sends a new DNS query to the .com TLD server. The .com TLD server then sends back the IP address of the Authoritative DNS server of google.com to the local DNS server. The local DNS server then sends the DNS query to the Authoritative DNS server of google.com and gets the google.com hostname resolved to its IP address.
Once it gets the hostname resolved, the local DNS server replies to the DNS client with the resolved IP address.
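The walk from the root server down to the authoritative server can be sketched with a toy, in-memory model. Nothing here touches the network: the SERVERS table, the server names and the IP address (taken from a documentation-only range) are all made up for illustration, and real servers return referrals as NS records rather than simple strings.

```python
# Toy stand-ins for real DNS servers: each one either answers or refers
# the caller to the next server down the hierarchy.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"google.com": "auth-google"}},
    "auth-google": {"answers": {"google.com": "192.0.2.44"}},
}


def query(server, hostname):
    """Return ("answer", ip), ("referral", next_server) or ("nxdomain", None)."""
    data = SERVERS[server]
    if hostname in data.get("answers", {}):
        return "answer", data["answers"][hostname]
    for suffix, next_server in data.get("referrals", {}).items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return "referral", next_server
    return "nxdomain", None


def resolve(hostname, cache):
    """Resolve like a local DNS server; also return the servers contacted."""
    if hostname in cache:
        return cache[hostname], []  # answered from cache, nobody contacted
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, hostname)
        if kind == "answer":
            cache[hostname] = value
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value  # follow the referral one level down


if __name__ == "__main__":
    cache = {}
    print(resolve("google.com", cache))  # walks root, TLD, authoritative
    print(resolve("google.com", cache))  # second call is served from cache
```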
An example DNS query
The figure given below illustrates the typical steps involved in resolving the hostname “matlab.math.mit.edu”.
As indicated in the figure, the process of resolving the hostname “matlab.math.mit.edu” for an end user involves a total of 10 DNS messages, sent to DNS servers located in different places. At each level, the local DNS server is referred to the corresponding domain/sub-domain server, until the query finally reaches the DNS server responsible for resolving the complete hostname.
In the above example, the first DNS query sent by the end computer to the local DNS server is an example of a recursive DNS query, because the end computer asks the local DNS server to resolve the hostname completely on its behalf, querying other DNS servers as needed. The rest of the DNS queries are all sent by the local DNS server and are examples of iterative DNS queries: each queried server does not resolve the name itself but replies with a referral to the next server, and the local DNS server sends the next query itself, one after the other. The type of the DNS query (recursive/iterative) is requested through a flag in the header of the DNS query message, known as the Recursion Desired (RD) bit.
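The message count in this example can be reproduced with a short sketch: one recursive query and its final reply between the end computer and the local DNS server, plus one iterative query and one reply for every server the local DNS server contacts. The function name message_trace and the party labels are hypothetical, and the sketch assumes one DNS server per level of the name (root, edu, mit.edu, math.mit.edu), which holds for this example but not for every domain.

```python
def message_trace(hostname, client="client", local="local DNS server"):
    """List (sender, receiver, kind) for every DNS message in the resolution."""
    labels = hostname.rstrip(".").split(".")
    # Servers contacted in turn: the root, then one per enclosing zone.
    # The full host name itself is not a server, so index 0 is skipped.
    servers = ["root"] + [
        ".".join(labels[i:]) for i in range(len(labels) - 1, 0, -1)
    ]
    trace = [(client, local, "recursive query")]
    for server in servers:
        trace.append((local, server, "iterative query"))
        trace.append((server, local, "reply"))
    trace.append((local, client, "reply"))
    return trace


if __name__ == "__main__":
    trace = message_trace("matlab.math.mit.edu")
    for number, (sender, receiver, kind) in enumerate(trace, start=1):
        print(number, sender, "->", receiver, ":", kind)
```

For matlab.math.mit.edu the trace contains 10 entries: 2 messages between the end computer and the local DNS server, and 8 between the local DNS server and the four servers it contacts.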
Since DNS normally uses UDP, which is unreliable, as the underlying transport layer protocol, lost DNS messages are not recovered by the transport layer. It is the responsibility of the DNS software, or of the applications that use DNS, to detect the loss (typically with a timeout) and retransmit the DNS message.
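A retransmission strategy of this kind can be sketched as a simple timeout-and-retry loop. The names query_with_retries and udp_transport, the limit of three attempts and the 2-second timeout are assumptions for this example. Real resolvers typically also rotate between several servers and lengthen the timeout between attempts.

```python
import socket


def query_with_retries(send_and_receive, packet, attempts=3):
    """Call send_and_receive(packet) until a reply arrives or attempts run out.

    send_and_receive must raise socket.timeout when no reply is received.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return send_and_receive(packet)
        except socket.timeout as exc:
            last_error = exc  # query or reply was lost: send it again
    raise TimeoutError(f"no reply after {attempts} attempts") from last_error


def udp_transport(server_ip, timeout=2.0):
    """Return a send_and_receive function that talks to server_ip over UDP."""
    def send_and_receive(packet):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(packet, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
            return reply
    return send_and_receive
```

Passing the transport in as a function keeps the retry logic independent of the network, so it can be exercised with a fake transport that drops the first few messages.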
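As for the message format, every DNS message starts with a fixed 12-byte header, followed by four variable-length sections: question, answer, authority and additional. A flag bit in the header distinguishes a query from a reply. Each record in the answer, authority and additional sections carries a name, a type (A, CNAME, MX etc.), a class, a time-to-live, a length field and the record data itself, such as the actual IP address, canonical name or mail server name.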
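As a rough illustration of the message layout, the sketch below decodes the fixed 12-byte header that, per RFC 1035, begins every DNS message. The function name parse_header and the dictionary keys are hypothetical, and the sample bytes are constructed by hand rather than captured from a real server. Decoding the sections that follow the header is left out.

```python
import struct


def parse_header(message: bytes) -> dict:
    """Decode the 12-byte header at the start of a DNS message."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,                                # matches a reply to its query
        "is_response": bool(flags & 0x8000),        # QR bit: 0 = query, 1 = reply
        "recursion_desired": bool(flags & 0x0100),  # RD bit
        "rcode": flags & 0x000F,                    # 0 means no error
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


if __name__ == "__main__":
    # Hand-built header: a reply with ID 0xBEEF, one question, two answers.
    sample = struct.pack("!HHHHHH", 0xBEEF, 0x8180, 1, 2, 0, 0)
    print(parse_header(sample))
```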