Getting to Know DNS! Part 1

It’s been a while since I created this blog, but until now I haven’t really talked about one of my favorite topics where networking and computers in general are concerned: the Domain Name System, better known as DNS. I remember being asked recently just how the Internet works. Many users know that the Internet is very big, but they don’t have a clue how a computer on one side of the planet can talk to a computer on the other side. To put it in broader terms, do you have any idea what the heck happens between the time you enter a URL in your favorite browser and press the Enter key and the time the webpage actually appears on your screen? Have you ever wondered how your computer, sitting somewhere in the world, was able to communicate with a web server belonging to Facebook or Twitter, which could also be located anywhere in the world? If so, then this article series is definitely for you! While DNS is definitely a huge topic, this three-article series is not meant to turn you into a DNS administrator or expert by the time you’re finished with it! What I promise is that you will get a much clearer picture of just how computers manage to find each other in the biggest computer network in existence today, the big ol’ Internet!

Humans Are Stupid

One of the first things you should understand about DNS is how computers actually communicate with each other. You see, computer systems believe that we humans are dumb. Why? Because computers locate each other via numbers, which they believe is much more efficient than using a mix of letters, symbols and numbers. However, we humans are very bad at remembering large quantities of numbers. Heck, I’m the type of guy who has a hard time remembering a single phone number! For us, remembering a name such as Yahoo.com is so much easier than trying to remember 206.190.36.45! That long string of numbers is called an IP address, and it actually maps to the Yahoo.com domain name. So, when a user types Yahoo.com into their browser, what the computer actually needs to do is translate, or map, that domain name to an IP address. This process is called name resolution. Well, that was just one example. There are literally millions and millions of web pages out there, and each and every time you enter a URL, name resolution is performed.
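To make the idea of name resolution a bit more concrete, here is a minimal Python sketch. The lookup table and the resolve_name helper are made up for illustration, and the single entry simply reuses the Yahoo.com address mentioned above rather than performing a live lookup.

```python
# A toy name-resolution table. The entry below reuses the example address
# from the text; it is illustrative and not fetched from a real server.
NAME_TABLE = {
    "yahoo.com": "206.190.36.45",
}


def resolve_name(name):
    """Return the IP address mapped to a name, or None if it is unknown."""
    # Domain names are not case sensitive, so normalize before looking up.
    return NAME_TABLE.get(name.lower())


print(resolve_name("Yahoo.com"))    # 206.190.36.45
print(resolve_name("example.org"))  # None
```

Real name resolution involves far more than a dictionary, of course, but the job is the same: a name goes in and an IP address comes out.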

So now you’re probably thinking that solving this problem is very simple. Just create a master database with all the name and IP address mappings listed within and call it a day. Anytime a computer needs to resolve a name, just have it consult this master database file. Surely that will work, right? The answer is yes, it definitely will. In fact, that’s how it originally worked, as I explain when I talk about the HOST file in the next section. The next question you then have to ask yourself is just who in the world will manage this database?! Due to the sheer size of the Internet, it would be next to impossible to keep this master database file updated. I’m sure that every single second there is some sort of change within the thousands and thousands of individual computer networks that have a presence on the Internet. Good luck getting someone to interview for that job!

The Dreaded HOST File

To put things in perspective and give you an idea of just why a system such as DNS was very much needed, you have to go back to the late ’60s and early ’70s, when the military successfully created one of the first computer networks. This network was called the ARPANET. What this network did is of no great importance where DNS is concerned. What is important in our discussion here, however, is to understand that at this time, there weren’t many computers that the administrators needed to keep track of because the network was considered private. The Internet as we know it today obviously didn’t exist at that time. If there were only 15 computers on the network at any given time, communicating with them was as simple as using a file to map each computer’s host name to its known IP address. In fact, that’s exactly what they did! This simple database or file was called the HOST file, and its main job was to allow a computer to find the IP address of another computer on the network via its host name. For example, if my computer needed to find the IP address for a host computer called COMPUTER01, then it would look inside the HOST file. Each entry in this file holds two pieces of information: an IP address and the host name that goes with it. So, having found the entry for COMPUTER01 within the HOST file, the computer would also learn its IP address! Therefore, if you had 15 computers on the network, then you would have 15 entries within the HOST file. Simple but efficient. Most importantly, it actually worked great... at first.
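Here is a rough Python sketch of the lookup just described. The sample contents, the addresses, and the parse_host_file helper are all invented for illustration; the layout assumed is one entry per line, with the IP address first and the host name after it.

```python
# Sample contents in an "IP address, then host name" layout.
# The addresses and host names here are invented for illustration.
SAMPLE_HOST_FILE = """\
# lines starting with '#' are comments
10.0.0.1    COMPUTER01
10.0.0.2    COMPUTER02
"""


def parse_host_file(text):
    """Build a {host name: IP address} table from HOST file text."""
    table = {}
    for line in text.splitlines():
        # Drop anything after '#' and skip lines that end up empty.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip_address, *names = line.split()
        for name in names:
            table[name.lower()] = ip_address
    return table


hosts = parse_host_file(SAMPLE_HOST_FILE)
print(hosts["computer01"])  # 10.0.0.1
```

Two computers on the network means two entries in the table, exactly as with the 15-computer example above.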

For the curious, the HOST file on a Windows computer is usually located at:

C:\Windows\System32\Drivers\etc

The file itself is named hosts and has no file extension. You can use Notepad to open it. Below is a picture of what an unaltered HOST file looks like:

HOST File

Ever wonder what happens when a computer is given only the IP address and needs to look up the host name, instead of the other way around? Well, the same thing happens. A name resolution is required. This process, however, rather than being labeled a “forward lookup”, is labeled a “reverse lookup” instead. The focal point of this article and the next is solely on forward lookups because that type of lookup is what DNS servers all over the globe have to perform the majority of the time.
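The difference between the two directions can be sketched in a few lines of Python. The table entries and function names below are made up for illustration, and the sketch only covers a simple table rather than how DNS servers handle reverse lookups.

```python
# Forward table: host name -> IP address (invented example values).
FORWARD = {
    "computer01": "10.0.0.1",
    "computer02": "10.0.0.2",
}

# Reverse table: IP address -> host name, built by flipping each pair.
REVERSE = {ip: name for name, ip in FORWARD.items()}


def forward_lookup(name):
    """Name in, IP address out."""
    return FORWARD.get(name.lower())


def reverse_lookup(ip_address):
    """IP address in, name out."""
    return REVERSE.get(ip_address)


print(forward_lookup("COMPUTER01"))  # 10.0.0.1
print(reverse_lookup("10.0.0.1"))    # computer01
```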

Where Did it All Go Wrong?!

I guess the main theme with the ARPANET was simplicity. Everything needed to be as simple as possible. Why go through the trouble of creating a complex communication system when only a few computers were joined to the network? Now that you know just what exactly the HOST file is and what it looks like, your next question should be just how did they manage it? The answer, as you might have guessed already, is by hand! A person responsible for the HOST file made sure that any computer hosts that joined the network, were removed from it, or had their host name and/or IP address changed were also reflected in the HOST file. The file would then either be placed on a central server or distributed manually to all the other computers on the network. If COMPUTER01 changed its IP address, then the administrator of the master HOST file had to make this change as quickly as possible, otherwise other computers on the network wouldn’t be able to communicate with it! Even if the administrator quickly made the change to the file, there still could be problems because a computer might not have updated its copy of the HOST file to the newest version! As you can quickly see, even on a small network such as the ARPANET, a much more efficient method for name resolution was needed. As the ARPANET grew in size (again, it’s not imperative to know why or how it grew, just the fact that it grew much bigger is enough), so did the need for a more efficient system for computers to map host names to IP addresses!
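The stale-copy problem is easy to show with a tiny Python sketch. The addresses and variable names are invented for illustration; the point is only that a copy made before a change keeps the old answer.

```python
# The master HOST table kept by the administrator (invented values).
master = {"computer01": "10.0.0.1"}

# Each machine works from its own copy, made at distribution time.
copy_on_computer02 = dict(master)

# COMPUTER01 gets a new address and the administrator updates the master...
master["computer01"] = "10.0.0.99"

# ...but COMPUTER02 has not fetched the new file yet, so its copy is stale.
print(master["computer01"])              # 10.0.0.99
print(copy_on_computer02["computer01"])  # 10.0.0.1
```

Until COMPUTER02 gets the new file, it keeps trying to reach COMPUTER01 at an address that no longer works.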

Make it Go Away!

Well, by now, you should realize that having a person manually update the HOST file for hundreds, let alone millions, of computer hosts on a network is literally asking that person to commit career suicide. It’s just not feasible, nor is it probably even possible! Well, luckily, in 1983 a computer scientist named Paul Mockapetris created the Domain Name System, and you can thank whichever deity you choose that he did! It’s a brilliant system and it works extremely well. The fact that we still use it today is a testament to the system’s reliability and, more importantly, its scalability. At this point, you’d expect me to completely drop the subject of the dreaded HOST file, but that is where you are wrong, my dear readers. You see, the HOST file is actually still in use today even with DNS succeeding it! For backward compatibility purposes and for very specific scenarios, the HOST file still exists within our systems. In fact, this may come as a shock to some, but as I’ll explain in the next article, a computer resolving a host name actually looks at the HOST file for the answer first before making an attempt at using DNS!
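As a simplified sketch of that lookup order, consider the Python below. The resolve function, the one-entry HOST table, and the fake_dns_lookup stand-in are all hypothetical; a real resolver has more steps than this, but the "HOST file first, DNS second" idea is the part being shown.

```python
def resolve(name, host_file, dns_lookup):
    """Check the HOST file table first; fall back to DNS only on a miss."""
    name = name.lower()
    if name in host_file:
        return host_file[name], "HOST file"
    return dns_lookup(name), "DNS"


# Stand-ins for illustration: a one-entry HOST table and a fake DNS function.
host_file = {"computer01": "10.0.0.1"}


def fake_dns_lookup(name):
    # 192.0.2.0/24 is reserved for documentation examples.
    return {"www.example.com": "192.0.2.10"}.get(name)


print(resolve("COMPUTER01", host_file, fake_dns_lookup))
print(resolve("www.example.com", host_file, fake_dns_lookup))
```

The first call is answered from the HOST table and never touches DNS; the second misses the table and falls through to the DNS stand-in.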

The Domain Name System

Alright, enough talk about the HOST file. No talk about DNS is complete without mentioning the HOST file, because it is important to understand how name resolution worked in the past to really understand how DNS saves the day. You can think of DNS as a hierarchical system with many different levels and branches. Think of it like this. If you have a really big task to accomplish, wouldn’t it make sense to break that task up into smaller portions and delegate them to different groups of individuals? Well, this is the building block of DNS. As the network grew, it simply was not possible to have one governing body to rule them all. Instead, DNS breaks the namespace into more manageable chunks so that different organizations each manage a specific portion of the namespace. Well, OK, what I just said wasn’t really all that true. There is actually a governing body that rules over the Internet namespace, sort of.

To more easily picture this, think of DNS as a pyramid with many different levels. A single period or dot separates each level of the hierarchy. At the very top of the hierarchy, you have the root domain. The root domain is the topmost level of DNS and is represented as just a single “.” or period. Right below this root domain are what are called the top level domains. This is what many of you are familiar with, I’m sure. Top level domains include .com, .net, .info, .biz, .org, .edu, .mil and a host of others. In fact, each country has its own country domain based on its country code. For the United States, we have the .us domain. Hong Kong and China have the .hk and .cn domains, respectively. As you might have guessed already, each top level domain has a specific purpose, or at least that’s how it’s supposed to go. The .com domain is mainly meant for commercial businesses. Educational institutions use the .edu domain. Military websites usually end with the .mil top level domain. There are definitely exceptions to these rules, so they are not set in stone. In fact, I just broke one of the rules myself! Here at www.anotherwindowsblog.com, I am definitely not a commercial business, but I can still use the .com domain. Yes folks, a bit of money (or a lot in some cases) usually goes toward making this possible!
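To see the levels of the pyramid in a name, here is a small Python sketch. The domain_levels helper is made up for illustration; it simply splits a name on its dots and lists the pieces from the root downward.

```python
def domain_levels(name):
    """List a name's labels from the root down: root, top level, and so on."""
    # Strip an optional trailing dot, split on dots, and reverse the order
    # so the label closest to the root comes first.
    labels = name.rstrip(".").lower().split(".")
    return ["."] + labels[::-1]


print(domain_levels("anotherwindowsblog.com"))
# ['.', 'com', 'anotherwindowsblog']
```

Reading the result left to right walks down the pyramid: the root, then the .com top level domain, then the name registered beneath it.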

DNS Pyramid

Immediately below the top level domains, we have the second level domains, and here is where things get more interesting. Second level domains are where mere mortals like us actually get to own a piece of the Internet, so to speak. Each level of the DNS pyramid is maintained by different organizations. If this weren’t the case, I’m sure mass chaos would ensue! The root domain is maintained by a very special group of people. They in turn delegate authority over .com, .net, .info, .mil and all the other top level domains to other organizations. These organizations in turn delegate authority over second level domains to normal businesses and companies that want to have a public presence on the Internet. In most cases, this level is also where Internet Service Providers (ISPs) reside.
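The chain of delegation can be pictured with a short Python sketch. Everything here is hypothetical: the DELEGATIONS table, the chain_of_authority helper, and especially the manager descriptions, which are placeholders rather than the names of real organizations.

```python
# Who answers for what, as a simple table. The manager descriptions are
# placeholders for illustration, not the names of real organizations.
DELEGATIONS = {
    ".": "root operators",
    "com.": "organization running .com",
    "microsoft.com.": "Microsoft",
    "cnn.com.": "CNN",
}


def chain_of_authority(domain):
    """Return the managers from the root down to the given domain."""
    labels = domain.rstrip(".").lower().split(".")
    chain = [DELEGATIONS["."]]
    # Build "com.", then "microsoft.com.", and so on, one level at a time.
    for i in range(len(labels) - 1, -1, -1):
        zone = ".".join(labels[i:]) + "."
        if zone in DELEGATIONS:
            chain.append(DELEGATIONS[zone])
    return chain


print(chain_of_authority("microsoft.com"))
# ['root operators', 'organization running .com', 'Microsoft']
```

Each entry in the result only needs to know about the level directly beneath it, which is what keeps the whole thing manageable.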

What are some of the second level domains, you ask? Here are some examples: CNN, Facebook, Twitter, Yahoo, Microsoft, ESPN and a host of others. I’m sure you get the idea. If that still doesn’t ring a bell, how about looking at it from this angle: cnn.com, facebook.com, twitter.com, yahoo.com and microsoft.com. Looks more familiar, right? Well of course it does! This is how we get to websites within our browsers every day! What this means is that those companies actually took the time to register their company names within the .com top level domain. They either paid a yearly fee directly to the organization that manages the .com top level domain or went through some other third party organization. This allows them to have a public presence on the Internet because whenever a client wants to reach a server located within the Microsoft.com domain, the .com DNS servers have the necessary information to point the user to the right location. This will be much clearer in my next article.

Although these companies have registered their domain names publicly on the Internet, there is nothing stopping me from creating a test network or lab using the same name! For example, I could easily create my own local network with a domain name of microsoft.com, and I know for sure I won’t be receiving any letters in the mail from Microsoft asking to see me in court. The problem with this approach arises whenever I need a public presence on the Internet. As you might have suspected, Internet registrars will not let me register the Microsoft.com domain name because it has already been registered by Microsoft themselves.

Subdomains and FQDNs

By now, you should have a better understanding of DNS, even if just a bit. Continuing on, your next question would probably be just where the heck does the WWW portion come into play? So far, I’ve talked about the root, top level and second level domains. So is WWW another domain level on the DNS pyramid? To better understand the answer, we now focus our attention on subdomains and fully qualified domain names. Let’s use Microsoft as the example here. If Microsoft registered the Microsoft domain name within the .com top level domain, wouldn’t it make sense for Microsoft to be in charge of any other domains they want to create under the Microsoft.com parent domain name? Of course it does! When Microsoft wants to create a new domain under Microsoft.com, what they are doing is creating a subdomain, or child domain. For example, Microsoft could decide to create a new domain within their company for their sales department and name it Sales. The Sales child domain would now fall under its parent domain of Microsoft.com. Together, the entire domain would be sales.microsoft.com. Microsoft doesn’t really need permission to create this child domain. They just need to make sure that users can reach it. If users can connect to Microsoft.com, which is the “root” domain at Microsoft headquarters, then it is the responsibility of Microsoft themselves to make sure that users can also reach computers within the sales.microsoft.com domain. The .com domain is just responsible for directing users to Microsoft.com, in most cases.
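The parent and child relationship can be checked with a few lines of Python. The is_subdomain helper below is a made-up illustration; it compares whole labels rather than raw text, so a name that merely ends with the same letters is not mistaken for a child.

```python
def is_subdomain(child, parent):
    """True if `child` sits somewhere below `parent` in the hierarchy."""
    child_labels = child.rstrip(".").lower().split(".")
    parent_labels = parent.rstrip(".").lower().split(".")
    # The child must be longer and must end with all of the parent's labels.
    return (
        len(child_labels) > len(parent_labels)
        and child_labels[-len(parent_labels):] == parent_labels
    )


print(is_subdomain("sales.microsoft.com", "Microsoft.com"))  # True
print(is_subdomain("microsoft.com", "com"))                  # True
print(is_subdomain("notmicrosoft.com", "microsoft.com"))     # False
```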

Subdomains

The last piece of the DNS puzzle is the computer hosts themselves and how they fit into DNS. This part is very important to understand because it forms the basis of name resolution. Continuing the Microsoft example, they can have a number of hosts within the Microsoft domain and likewise within their Sales domain. If a physical computer in the parent domain (Microsoft) is labeled Alice, what do you think this computer’s label within the DNS hierarchy would look like? Simple. Once again, we just add another dot after the label to separate the different levels of the DNS pyramid. So, the complete computer name for a computer named Alice within the Microsoft.com domain would be: Alice.Microsoft.com. When labeled this way, this can also be considered the fully qualified domain name (FQDN) of the computer. An FQDN is basically a computer’s name from the very bottom of the DNS pyramid all the way up to the root domain of DNS. In other words, a simple look at an FQDN tells you where that specific computer host sits within the DNS pyramid. One look at the FQDN I’ve just given immediately lets me know that there is a computer named Alice within the Microsoft domain, which is registered under the .com domain, and of course, that in turn is under the root domain.

Going with our child domain example, how would the FQDN of a computer host named Bob look within the Sales domain? Simple. Once again, we just tack on the extra information. So, the FQDN would look like: Bob.Sales.Microsoft.com. Once again, given this information, we can easily see how this specific computer fits into the DNS hierarchy from way down at the bottom all the way back up to the root.
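Tacking a host name onto its domain is about as simple as it sounds. A tiny Python sketch, with a made-up build_fqdn helper, using the Alice and Bob examples from above:

```python
def build_fqdn(host, domain):
    """Join a host name and its domain into one dotted name."""
    return f"{host}.{domain}"


print(build_fqdn("Alice", "Microsoft.com"))      # Alice.Microsoft.com
print(build_fqdn("Bob", "Sales.Microsoft.com"))  # Bob.Sales.Microsoft.com
```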

You should remember that the topmost domain in the DNS pyramid, the root, is an actual domain and is not there just to look pretty! When talking about FQDNs, the root domain actually gets appended to the label as well. Because the root domain is represented as just a single dot, an FQDN should always end with a dot as well. Microsoft.com is incomplete. Microsoft.com. is the actual FQDN. However, you almost never need to type this special “.” yourself. Browsers and the resolver software behind them treat it as implied when you enter a URL, because while many users know about top level domains such as .com and .net, they most likely have no clue about the root domain, which sits above the top level domains! You most likely don’t belong in this category anymore after reading this article! Hey, you’re now considered smarter than the average Joe where name resolution is concerned!
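A short Python sketch shows the idea of filling in the root dot. The to_absolute helper is invented for illustration and is not how any particular browser does it; it just makes sure a name ends with exactly one trailing dot.

```python
def to_absolute(name):
    """Return the name in lowercase with exactly one trailing root dot."""
    return name.rstrip(".").lower() + "."


print(to_absolute("Microsoft.com"))   # microsoft.com.
print(to_absolute("Microsoft.com."))  # microsoft.com.
print(to_absolute("Microsoft.com") == to_absolute("microsoft.com."))  # True
```

With or without the dot typed in, both spellings end up pointing at the same place in the pyramid.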

Hold Up, Wait a Minute…

By now you may have noticed a simple pattern when looking at an FQDN. The leftmost portion (or the beginning) of the FQDN represents an actual computer host name. In other words, it represents an actual computer on a network. By now something should have struck you as very odd and peculiar. If what I just said is true, then am I actually telling you that when you type in a URL of www.cnn.com, the www part is actually a real computer behind the scenes? Well, yes, that is exactly what I’m saying! When you enter a URL such as www.cnn.com, what your computer actually does is request the IP address for the computer named www within the cnn.com domain. In almost all cases, the address returned is the IP address for the computer named www, which in all likelihood is a web server of some sort. This isn’t always the case, as companies deploy many security solutions to protect their resources, but for the purposes of this discussion, you can go ahead and believe just that to simplify things. In the next article, I will go into more detail on the name resolution process so you can see exactly what happens.
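Pulling the host name off the front of an FQDN takes one line of Python. The split_host helper below is a made-up illustration that treats everything before the first dot as the host and everything after it as the domain.

```python
def split_host(fqdn):
    """Split a name into (host label, parent domain)."""
    host, _, domain = fqdn.rstrip(".").partition(".")
    return host, domain


print(split_host("www.cnn.com"))               # ('www', 'cnn.com')
print(split_host("Bob.Sales.Microsoft.com."))  # ('Bob', 'Sales.Microsoft.com')
```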

Coming Up Next…

In the next article, I’ll be explaining what name servers are and the data that is stored within them. In this article, I’ve laid down the very basics of the DNS structure and namespace. This was obviously not meant to be a technical article, and I’ve tried my best to make things as simple as possible without overloading you with terminology. Here are some of the key pieces of information you need to understand from this article before continuing on to the next:

  • Understand how computers communicate at a very high level. The key takeaway is that humans use names such as www.cnn.com while computers use IP addresses, or numbers, such as 192.168.1.1, to represent the same piece of information. This ultimately leads to a need for name resolution.
  • Understand how the HOST file works. Although this file is rarely used outside of very specific scenarios and circumstances, it gives you a good understanding of why a system such as DNS was sorely needed.
  • Understand what the DNS pyramid, or hierarchy to be more precise, looks like. You should understand that the system is broken down into different levels, which can be managed by different organizations.
  • Understand what an FQDN looks like and how it is used to map a specific host from the very bottom of the DNS hierarchy all the way back up to the root domain and vice versa.

Once you are confident in your knowledge, you can safely move on to the next article where things get a bit more technical!

Comments

  1. hey simon ur articles are interesting .. i always wondered about how internet works and this article clarifies me alot but still i wanna about the authorities who manages the root level domain like .com ? how they do it and why i can’t create a new root level domain and a top level domain and have fully control over my site ……..(well this is a stupid ques. but plz can u tell me )…how they whole dam thing work ?

    • Hey Anadi. I’m glad you found the articles interesting. As to your questions, the answer is very simple: delegation. If you read the DNS articles, you should know by now the idea behind the HOST file. If you, me and our next door neighbors are able to create our own domains any way we see fit, then who will be responsible for them throughout the entire Internet which of course is utilized around the world by millions and millions of people every second? There will only be mass chaos! The “who” is not important here but more so the “why”. To control something as big as the Internet, there must be some sort of hierarchy and flow of delegation from the top all the way to the bottom otherwise things will just not work.

      As to the “how” part, I suggest you read the second article in this series and learn about resource records. As for having full control of your site, you actually can make this possible but I can guarantee you that if you don’t know what you’re doing, you won’t get very far! You could technically register for a domain name and have the name resolve to a DNS server that you personally own. That server in turn will point to a web server that you also have absolute full control over. In your home network, this means having these servers turned on 24/7 otherwise no one will be able to visit your site. Most users don’t do this because it’s not practical. Personally I (among thousands of other people) pay $10 a year for a domain name. We then pay about $10 a month for a company to host our site and provide the DNS services. All we then have to do is configure our website however we see fit.

      • … thanks very much….i have read all the 3 parts i learned about records . my most fav. is the process of name resolution request….thanks now i know better …..!

        • That’s great to hear! I understand that for a beginner, there will be many many questions in the beginning. If you read all three articles and understood it, then you are way more knowledgeable than the average people out there. Makes you feel special, eh?! If you have any more questions regarding this topic, feel free to write them here as a comment or email them to me via the link to my contact form which you can find at the very top of this blog.

  2. Nice and informative article as always. Although I am still confused on 1 issue:
    There are so many DNS providers. Does that mean each have its own dynamic database. So, if I purchase a domain , its entry gets added in somewhere. So, how all DNS providers know of this ? Is it continuous syncing going on every time. this seems similar to hosts file concept but just automatic.

    • Good question Ankur! The whole idea behind DNS is its distributed nature. However, while distributed, it still functions and behaves like one big integrated database.

      At a very high level, when you register a domain name, your registrar creates a record for that domain name on their name servers. Because the registrar has a special relationship with the top level domains (.com, .net, .biz, etc.), they are allowed to help your domain name get propagated throughout those servers in turn. What this basically does is tell the top level domain name servers that if someone wants to find the IP address for your domain name, they should head over to this name server instead. This name server will usually be located within the company where you registered your domain in the first place, although this is definitely not always the case.

      For example, I registered my name via GoDaddy. However, I’m using Hostgator as my web host. Therefore, every time someone wants to visit my website, they get the IP information from the Hostgator DNS servers instead of GoDaddy’s. However, each year, I still have to pay GoDaddy to renew my domain name. Which name servers I use is of no importance to them, nor do they probably care. However, I myself need to make sure that I am pointing at the right name servers because GoDaddy is the registrar I am using to help me get “noticed”. If I give them the wrong name servers, then GoDaddy will register that wrong information in the .com domain name servers and users will not be able to visit my site.

      I will actually go into more detail in answering this question in my next article. You just beat me to the punch! Once you understand how DNS name resolution works, then I promise you that everything will be much more clear.
