Domain Name System
Purpose
The purpose of the domain name system is to translate domain names into
IP numbers. Secondary purposes include translating IP numbers
back
into domain names, and keeping track of MAIL HOSTS for each domain
name.
It is possible to have an IP number and yet have no domain
name.
Our lab printer is an example of such a machine.
Organization
The domain name system is a hierarchical system, just like a file
system.
Names are read right to left, from the most general part to the most specific. It must be
hierarchical
since otherwise the administration becomes unwieldy. There is no
canonical list for the same reason.
The top level has the names .com .org .gov .mil .net .edu and the
country
names, and a bunch of others like .aero. The generic names are all managed for the next year by a
company
in Ann Arbor. The country names are managed by organizations based
in that country. (Weird fact: the Federated States of
Micronesia
and Armenia have both sold rights to radio stations.)
Some nations split their country names into categories like .edu and .gov, as we
do.
Japan has .ac.jp (academic) and .go.jp (government). Other nations, like the Netherlands, do not.
Names
Domain names are not limited to Roman letters; internationalized names can use most of Unicode (they are converted to an ASCII form behind the scenes). Other parts of the URL generally
allow whatever the host's OS allows.
Maximum component length is 63 bytes, and max total length is 255
bytes.
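A minimal Python sketch of these limits, assuming Python's built-in idna codec as the Unicode-to-ASCII step. The function name to_ascii_labels is made up for illustration, and the way the 255-byte total is counted (one length byte per component plus a final zero byte) is an assumption about how the limit is measured.

```python
def to_ascii_labels(name):
    """Split a domain name into ASCII components and enforce the length limits.

    Raises ValueError (UnicodeError is a subclass of it) when a limit is broken.
    """
    labels = []
    for label in name.rstrip(".").split("."):
        if not label:
            raise ValueError("empty component in " + repr(name))
        # The idna codec turns non-ASCII components into their "xn--" form.
        ascii_label = label.encode("idna")
        if len(ascii_label) > 63:
            raise ValueError("component longer than 63 bytes: " + label)
        labels.append(ascii_label)
    # Assumed accounting: one length byte per component, plus a final zero byte.
    total = sum(len(part) + 1 for part in labels) + 1
    if total > 255:
        raise ValueError("name longer than 255 bytes")
    return labels
```

For example, to_ascii_labels("bücher.example") returns [b"xn--bcher-kva", b"example"], while a name with a 64-character component raises ValueError.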
THERE IS NO CONNECTION BETWEEN DOMAIN NAMES AND PHYSICAL
LOCATIONS.
There is no connection between domain names and physical
networks.
a.a.com and a.b.com may or may not be in the same building and the same
LAN. They could even be the same machine.
Each domain name usually corresponds to one IP address (though a name can have several), and more than one
name
can correspond to the same address.
There is no canonical list of domain names. No one knows all
the
names on the internet.
There is no way to tell, just by looking at a name, whether it describes a
subdomain
or a specific host. What is cs.purdue.edu? cs.nmu.edu?
Generally each host will have a default domain name that it appends
to any unqualified names. Some hosts have more than one, and try
each one in order.
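The default-domain rule can be sketched in a few lines of Python. This is only an illustration: candidate_names is a made-up helper, and treating any name that contains a dot as already qualified is a simplifying assumption (real resolvers have more detailed rules).

```python
def candidate_names(name, search_domains):
    """Return the full names a host would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as already complete.
        return [name[:-1]]
    if "." in name:
        # Assumption: a name containing a dot is treated as qualified.
        return [name]
    # Unqualified name: append each default domain in turn.
    return [name + "." + domain for domain in search_domains]
```

With the default domains nmu.edu and cs.nmu.edu, candidate_names("euclid", ["nmu.edu", "cs.nmu.edu"]) gives ["euclid.nmu.edu", "euclid.cs.nmu.edu"].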
DNS Software
DNS servers (sometimes called nameservers) are programs
that
listen on UDP port 53 (TCP port 53 is used as well, typically for large responses). Note that
since they use UDP, delivery is unreliable and packets can arrive out of
order.
Resolvers are pieces of software that clients use to look up
names.
Often resolvers are shared libraries that the client can link to, and
that
offer functions like gethostbyname(). All resolvers and
nameservers
speak the same DNS protocol. If a packet is lost on the internet,
resolvers generally use a timeout and retransmit mechanism.
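The timeout-and-retransmit idea can be sketched as follows. This is a bare-bones illustration rather than how any particular resolver library is written: the function names, the 2 second timeout, the three attempts, and the fixed query ID are all arbitrary choices, and the reply is returned as raw bytes without being parsed.

```python
import socket
import struct

def encode_name(name):
    """Encode 'a.b.com' as length-prefixed components ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("bad component: " + repr(label))
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name, query_id, qtype=1):
    """Build a DNS query packet (qtype 1 asks for an A record)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question

def query_with_retry(server_ip, name, query_id=0x1234, timeout=2.0, attempts=3):
    """Send the query over UDP, resending whenever the timeout expires."""
    packet = build_query(name, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _ in range(attempts):
            sock.sendto(packet, (server_ip, 53))
            try:
                reply, _sender = sock.recvfrom(512)
            except socket.timeout:
                continue  # packet or reply was lost: retransmit
            if reply[:2] == packet[:2]:  # the ID matches our query
                return reply
    return None  # gave up after all attempts
```

A caller would pass the IP number of its nameserver, for example query_with_retry(nameserver_ip, "xinu.cs.purdue.edu"), and get back either the raw reply or None.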
Each DNS domain is required to have both a primary nameserver and
a backup nameserver. These must have different access paths to
the
internet (they cannot share the same cable modem). Lots of companies will
provide this service for you if you want.
DNS Lookup
Different locations can translate the same name to different IP numbers.
Netflix.com might translate to one server in California and a different one in Connecticut.
There are two ways to do a DNS lookup. You can either use
recursive
mode, or iterative mode. In practice most clients send a recursive query to their local nameserver, which then works through the hierarchy iteratively.
Recursive mode:
You ask the nameserver to find the answer. If it does not know,
it asks a second nameserver, which might ask a third, and so on.
The answers propagate backwards until the final answer gets to
you.
For euclid to look up xinu.cs.purdue.edu ...
-
Euclid asks lisa.nmu.edu (our nameserver)
-
Lisa does not know, so it asks 'dot'. Lisa knows dot since all
nameservers know dot.
-
Dot does not know, so it asks the nameserver for .edu (which it knows
since
.edu is directly under dot).
-
The nameserver for .edu does not know, so it asks the nameserver for
purdue.edu
(which it is required to know).
-
The nameserver for purdue.edu does not know, so it asks the nameserver
for cs.purdue.edu (which it is required to know).
-
This nameserver knows the answer.
-
The nameserver for cs.purdue.edu replies with the answer to purdue.edu
-
The nameserver for purdue.edu replies with the answer to .edu
-
The nameserver for .edu replies with the answer to 'dot'.
-
The nameserver for 'dot' replies with the answer to lisa.nmu.edu
-
Lisa.nmu.edu replies with the answer to euclid
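The recursive walkthrough above can be mimicked with a toy model in Python. Everything here is invented for illustration: the SERVERS table stands in for real nameservers, the "fallback" entry models lisa handing the question to dot, and 192.0.2.10 is a placeholder address, not xinu's real one.

```python
# Toy model of the nameservers in the walkthrough. The address is made up.
SERVERS = {
    "lisa": {"answers": {}, "children": [], "fallback": "dot"},
    "dot": {"answers": {}, "children": ["edu"]},
    "edu": {"answers": {}, "children": ["purdue.edu"]},
    "purdue.edu": {"answers": {}, "children": ["cs.purdue.edu"]},
    "cs.purdue.edu": {"answers": {"xinu.cs.purdue.edu": "192.0.2.10"},
                      "children": []},
}

def recursive_lookup(server, name, trace):
    """Ask `server`; if it does not know, it asks the next server itself."""
    trace.append(server)
    info = SERVERS[server]
    if name in info["answers"]:
        return info["answers"][name]
    # Pass the question down to the child responsible for this name.
    for zone in info["children"]:
        if name == zone or name.endswith("." + zone):
            return recursive_lookup(zone, name, trace)
    # No matching child: hand the question to the fallback server, if any.
    fallback = info.get("fallback")
    if fallback is not None:
        return recursive_lookup(fallback, name, trace)
    return None  # nobody knows (host not found)
```

Calling recursive_lookup("lisa", "xinu.cs.purdue.edu", trace) with an empty list for trace fills it with the servers in the order they were asked, and the return values unwind back through the same chain, just as the replies do in the list above.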
Iterative mode:
You ask the nameserver for the answer. If it does not know, it
points you to a nameserver that is closer to the answer, and you ask that one yourself. Repeat until some nameserver knows.
-
Euclid asks lisa.nmu.edu (our nameserver).
-
Lisa does not know, but does tell us the address for 'dot'.
-
We ask dot for the IP number for the nameserver of 'edu'
-
Dot tells us the IP number for edu.
-
We ask edu for the IP number of the nameserver of purdue.edu.
-
Edu's nameserver answers.
-
We ask the nameserver of purdue.edu for the IP number of the
nameserver
of cs.purdue.edu
-
The nameserver answers
-
We ask the nameserver for cs.purdue.edu for the IP number of
xinu.cs.purdue.edu
-
That nameserver answers. We have an answer!!
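The iterative steps above can be sketched with a toy model in Python. As before, this is only an illustration: the SERVERS table, the "fallback" entry for lisa, the function names, and the address 192.0.2.10 are all made up. The difference from recursive mode is that each server only returns a referral, and the client does all the asking.

```python
# Toy model of the nameservers in the walkthrough. The address is made up.
SERVERS = {
    "lisa": {"answers": {}, "children": [], "fallback": "dot"},
    "dot": {"answers": {}, "children": ["edu"]},
    "edu": {"answers": {}, "children": ["purdue.edu"]},
    "purdue.edu": {"answers": {}, "children": ["cs.purdue.edu"]},
    "cs.purdue.edu": {"answers": {"xinu.cs.purdue.edu": "192.0.2.10"},
                      "children": []},
}

def ask(server, name):
    """One question to one server: returns an answer, a referral, or unknown."""
    info = SERVERS[server]
    if name in info["answers"]:
        return ("answer", info["answers"][name])
    for zone in info["children"]:
        if name == zone or name.endswith("." + zone):
            return ("referral", zone)
    if "fallback" in info:
        return ("referral", info["fallback"])
    return ("unknown", None)

def iterative_lookup(first_server, name, max_steps=10):
    """The client follows referrals itself until some server answers."""
    asked = []
    server = first_server
    for _ in range(max_steps):
        asked.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, asked
        if kind == "unknown":
            return None, asked
        server = value  # referral: ask the suggested server next
    return None, asked
```

iterative_lookup("lisa", "xinu.cs.purdue.edu") returns the address together with the list of servers the client contacted, in order.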
Ways to Make this Faster
Caching. Keep track of recently asked questions, and the
associated
answers. Also keep track of negative answers (host not
found).
This works even better for recursive queries, since the cache becomes a
centralized resource for the whole organization/network/group.
All
cached entries are marked non-authoritative, and also have a time-to-live
field to prevent stale data from lasting forever.
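A minimal sketch of such a cache in Python, assuming each entry simply stores an expiry time. The class name DnsCache and its methods are made up for illustration; a negative answer (host not found) is stored as None, and the clock can be swapped out so the expiry logic is easy to test.

```python
import time

class DnsCache:
    """Remember recent answers, including negative ones, until their TTL expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address or None, expiry time)

    def put(self, name, address, ttl):
        # address=None records a negative answer (host not found).
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return ("hit", address) or ("miss", None)."""
        entry = self._entries.get(name)
        if entry is None:
            return ("miss", None)
        address, expires = entry
        if self._clock() >= expires:
            del self._entries[name]  # stale data must not last forever
            return ("miss", None)
        return ("hit", address)
```

A ("hit", None) result means the cache remembers that the host does not exist, so no new query is needed. Answers served this way would be marked non-authoritative.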
Collapse the hierarchy. Have dot store not just the big six
plus
country codes, but also the answers for all of .com. Have
the
nameserver for purdue.edu store all the answers for the whole
university
(which also reduces the total number of nameservers, which might help
in
the world of $$).
Reverse Name Lookups
To look up an IP number aaa.bbb.ccc.ddd and get the hostname,
simply
run a standard query on ddd.ccc.bbb.aaa.in-addr.arpa
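Building the reversed name is simple string work. Here is a small Python sketch; reverse_name is a made-up helper, and 192.0.2.5 is just an example address. The standard ipaddress module is used only to reject malformed input.

```python
import ipaddress

def reverse_name(ipv4):
    """Turn 'aaa.bbb.ccc.ddd' into 'ddd.ccc.bbb.aaa.in-addr.arpa'."""
    # IPv4Address raises ValueError if the text is not a valid IPv4 number.
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"
```

For example, reverse_name("192.0.2.5") gives "5.2.0.192.in-addr.arpa". The ipaddress module also offers the same result directly as the reverse_pointer attribute of an address object.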
Other Things the DNS Can Tell You
A DNS lookup can tell you:
Record type | Contents
A | IPv4 number for the host
AAAA | IPv6 number for the host
CNAME | The canonical hostname (useful if it's an alias)
HINFO | CPU and OS type
MINFO | Mailbox or mailing list information
MX | The mail exchange (what machine receives mail for this name)
NS | An authoritative nameserver for this domain
SOA | Start of authority: the primary nameserver, administrator contact, serial number, and timers for the zone
TXT | Notes (free-form text)
SPF | Sender Policy Framework, an anti-fake-email idea (now normally published in a TXT record)
DNSSEC
DNSSEC is a cryptographic signing of the DNS records. Without it, anyone who can
affect your packets can send you wrong answers. It uses public-key cryptography
and a complicated system of signed records to prove that the answers are correct.
DNSSEC is used by some but not all organizations.
Fake DNS
One easy way to implement a censorship system is to force the DNS server
to provide a bad IP number for the sites you don't want people to see. This only allows
you to block whole web sites, and not individual web pages (Wikipedia has both wholesome
and icky content). It can be defeated by having people use IP numbers
instead of names.
Charter used to answer every non-existent
DNS lookup with the IP number of one of their own servers, so they could show you ads. If you opted out of it, they would
still give you an IP number for every non-existent DNS lookup, just a different one that
went to a "web site not found" page. This messed up email.
Interesting Questions
-
Why should a nameserver know the IP number of its parent in the
hierarchy?
Would the name be enough?
-
How would you get a list of all names in the DNS?
-
Would it make sense to allow queries like a*l.com?
Links
International domain names at http://www.nunames.nu/lldemo/default.htm.
World's longest domain names at http://www.oreillynet.com/onlamp/blog/2005/06/the_worlds_longest_domain_name.html.
List of all top level domains at http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains.
How many hostnames there are at http://www.domaintools.com/internet-statistics/.