On the Internet, the Domain Name System (DNS) associates various sorts of
information with so-called domain names; most importantly, it serves as the
"phone book" for the Internet by translating human-readable computer hostnames,
e.g. en.wikipedia.org, into the IP addresses, e.g. 66.230.200.100, that
networking equipment needs for delivering information. It also stores other
information such as the list of mail exchange servers that accept email for a
given domain. In providing a worldwide keyword-based redirection service, the
Domain Name System is an essential component of contemporary Internet use.
Uses
The most basic use of DNS is to translate hostnames to IP addresses, much as a
phone book translates names to numbers. For example, if you want to know the
Internet address of en.wikipedia.org, the Domain Name System can tell you that
it is 145.97.39.155. DNS also has other important uses.
Preeminently, DNS makes it possible to assign Internet destinations to the human
organization or concern they represent, independently of the physical routing
hierarchy represented by the numerical IP address. Because of this, hyperlinks
and Internet contact information can remain the same, whatever the current IP
routing arrangements may be, and can take a human-readable form (such as "wikipedia.org")
which is rather easier to remember than an IP address (such as 66.230.200.100).
People take advantage of this when they recite meaningful URLs and e-mail
addresses without caring how the machine will actually locate them.
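As a minimal sketch of this basic use, the Python standard library function socket.getaddrinfo asks the operating system's resolver for the addresses of a hostname. The helper name lookup_addresses and its injectable resolver parameter are assumptions made for illustration, and the addresses returned depend on the network and on the DNS data at the time of the query.

```python
import socket

def lookup_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses reported for hostname."""
    # getaddrinfo yields (family, type, proto, canonname, sockaddr) tuples;
    # the first element of sockaddr is the IP address as a string.
    results = resolver(hostname, None)
    return sorted({sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in results})

# Example call (needs network access, output varies over time):
#   lookup_addresses("en.wikipedia.org")
```

The caller only ever supplies the name; which addresses come back is decided by whoever administers the domain.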
The Domain Name System (DNS) distributes the responsibility for assigning domain
names and mapping them to IP networks by allowing an authoritative server for
each domain to keep track of its own changes, avoiding the need for a central
registrar to be continually consulted and updated.
History
The practice of using a name as a more human-legible abstraction of a machine's
numerical address on the network predates even TCP/IP, and goes back to the
ARPANET era. Back then, however, a different system was used, as DNS was not
invented until 1983, shortly after TCP/IP was deployed. With the older system, each
computer on the network retrieved a file called HOSTS.TXT from a computer at SRI
(now SRI International). The HOSTS.TXT file mapped numerical addresses to names.
A hosts file still exists on most modern operating systems, either by default or
through configuration, and allows users to specify an IP address (e.g.
192.0.34.166) to use for a hostname (e.g. www.example.net) without checking DNS.
As of 2006, the hosts file serves primarily for troubleshooting DNS errors or
for mapping local addresses to more memorable names. Systems based on a hosts
file have an inherent limitation: every time a given computer's address
changes, every computer that seeks to communicate with it needs an update to
its hosts file.
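The mapping held in a hosts file can be sketched in a few lines of Python. This assumes the common layout of one address per line followed by one or more names, with "#" starting a comment; the function name parse_hosts, the sample text, and the first-entry-wins rule are illustrative assumptions.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        # Anything after '#' is a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment-only line, or address without a name
        address, names = fields[0], fields[1:]
        for name in names:
            # Assumption: the first entry for a name wins.
            table.setdefault(name.lower(), address)
    return table

sample = """
# address      hostname
127.0.0.1      localhost
192.0.34.166   www.example.net   example-alias   # checked before DNS
"""
hosts = parse_hosts(sample)
print(hosts["www.example.net"])   # 192.0.34.166
```

Every machine carries its own copy of such a table, which is exactly why a change to one address has to be repeated everywhere.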
The growth of networking called for a more scalable system: one that recorded a
change in a host's address in one place only. Other hosts would learn about the
change dynamically through a notification system, thus completing a globally
accessible network of all hosts' names and their associated IP Addresses.
At the request of Jon Postel, Paul Mockapetris invented the Domain Name System
in 1983 and wrote the first implementation. The original specifications appear
in RFC 882 and 883. In November 1987, the publication of RFC 1034 and RFC 1035
updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several
more-recent RFCs have proposed various extensions to the core DNS protocols.
In 1984, four Berkeley students — Douglas Terry, Mark Painter, David Riggle and
Songnian Zhou — wrote the first UNIX implementation, which was maintained by
Ralph Campbell thereafter. In 1985, Kevin Dunlap of DEC significantly re-wrote
the DNS implementation and renamed it BIND (Berkeley Internet Name Domain,
previously: Berkeley Internet Name Daemon). Mike Karels, Phil Almquist and Paul
Vixie have maintained BIND since then. BIND was ported to the Windows NT
platform in the early 1990s.
Due to BIND's long history of security issues and exploits, several alternative
nameserver/resolver programs have been written and distributed in recent years.
How DNS works in theory
[Figure: Domain names, arranged in a tree, cut into zones, each served by a
nameserver.]
The domain name space consists of a tree of domain names. Each node
or leaf in the tree has one or more resource records, which hold information
associated with the domain name. The tree sub-divides into zones. A zone
consists of a collection of connected nodes authoritatively served by an
authoritative DNS nameserver. (Note that a single nameserver can host several
zones.)
When a system administrator wants to let another administrator control a part of
the domain name space within his or her zone of authority, he or she can
delegate control to the other administrator. This splits a part of the old zone
off into a new zone, which comes under the authority of the second
administrator's nameservers. The old zone becomes no longer authoritative for
what goes under the authority of the new zone.
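The effect of delegation can be illustrated by finding the most specific zone that encloses a name. The sketch below is a simplification under stated assumptions: zones are given as a plain dictionary keyed by zone name, and the function name closest_zone and the sample data are hypothetical.

```python
def closest_zone(name, zones):
    """Return the most specific zone in `zones` that contains `name`, or None."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then drop labels from the left: the longest
    # matching suffix is the most specific enclosing zone.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return "." if "." in zones else None

# Hypothetical zone cuts: the root delegates org, which delegates wikipedia.org.
zones = {
    ".": "root nameservers",
    "org": "org nameservers",
    "wikipedia.org": "wikipedia.org nameservers",
}
print(closest_zone("en.wikipedia.org", zones))   # wikipedia.org
print(closest_zone("example.org", zones))        # org
```

Once wikipedia.org is split off as its own zone, the org servers no longer answer authoritatively for names beneath it; they only refer queries onward.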
A resolver looks up the information associated with nodes. A resolver knows how
to communicate with name servers by sending DNS requests, and heeding DNS
responses. Resolving usually entails iterating through several name servers to
find the needed information.
Some resolvers function simplistically and can only communicate with a single
name server. These simple resolvers rely on a recursing name server to perform
the work of finding information for them.
Parts of a domain name
A domain name usually consists of two or more parts (technically labels),
separated by dots. For example wikipedia.org.
The rightmost label conveys the top-level domain (for example, the address
en.wikipedia.org has the top-level domain org).
Each label to the left specifies a subdivision or subdomain of the domain above
it. Note that "subdomain" expresses relative dependence, not absolute
dependence: for example, wikipedia.org comprises a subdomain of the org domain,
and en.wikipedia.org comprises a subdomain of the domain wikipedia.org. In
theory, this subdivision can go down to 127 levels deep, and each label can
contain up to 63 characters, as long as the whole domain name does not exceed a
total length of 255 characters. But in practice some domain registries have
shorter limits than that.
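The limits above can be checked mechanically. The following sketch assumes that the 255 limit counts octets in the wire format of RFC 1035, where every label carries one length octet and the name ends with a zero-length root label; under that assumption 127 one-character labels fit exactly. The function names are illustrative.

```python
MAX_LABEL_LENGTH = 63    # octets per label
MAX_NAME_LENGTH = 255    # octets for the whole name in wire format (assumption)

def split_labels(name):
    """Split a domain name into labels; the top-level label comes last."""
    return name.rstrip(".").split(".")

def check_name(name):
    """Return a list of limit violations for an ASCII name (empty if none)."""
    problems = []
    labels = split_labels(name)
    for label in labels:
        if not label:
            problems.append("empty label")
        elif len(label) > MAX_LABEL_LENGTH:
            problems.append("label too long")
    # One length octet per label, plus the terminating root label.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    if wire_length > MAX_NAME_LENGTH:
        problems.append("name too long")
    return problems

print(split_labels("en.wikipedia.org"))   # ['en', 'wikipedia', 'org']
print(check_name("en.wikipedia.org"))     # []
```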
A hostname refers to a domain name that has one or more associated IP addresses.
For example, the en.wikipedia.org and wikipedia.org domains are both hostnames,
but the org domain is not.
The Domain Name System consists of a hierarchical set of DNS servers. Each
domain or subdomain has one or more authoritative DNS servers that publish
information about that domain and the name servers of any domains "beneath" it.
The hierarchy of authoritative DNS servers matches the hierarchy of domains. At
the top of the hierarchy stand the root nameservers: the servers to query when
looking up (resolving) a top-level domain name (TLD).
Iterative and recursive queries:
An iterative (non-recursive) query is one where the DNS server may provide a
partial answer to the query (or give an error). DNS servers must support
non-recursive queries.
A recursive query is one where the DNS server will fully answer the query (or
give an error). DNS servers are not required to support recursive queries; the
resolver (or another DNS server acting recursively on behalf of a resolver) and
the server negotiate the use of recursive service using bits in the query
headers.
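The header bits involved are "recursion desired" (RD), set by the client, and "recursion available" (RA), set by the server in its response. The sketch below packs the 12-octet header defined in RFC 1035 with Python's struct module; the function names are illustrative, and the question section and all other flags are left out for brevity.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired", set by the client in a query
RA_FLAG = 0x0080  # "recursion available", set by the server in a response

def build_header(query_id, recursion_desired, question_count=1):
    """Pack the 12-octet DNS header for a standard query."""
    flags = RD_FLAG if recursion_desired else 0
    # Fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each,
    # network byte order).
    return struct.pack("!HHHHHH", query_id, flags, question_count, 0, 0, 0)

def recursion_available(header):
    """Report whether a response header has the RA bit set."""
    _query_id, flags = struct.unpack("!HH", header[:4])
    return bool(flags & RA_FLAG)

print(build_header(0x1234, True).hex())   # 123401000001000000000000
```

A client that sets RD but receives a response without RA knows it must iterate through referrals itself.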
Address resolution mechanism
(This description deliberately uses the fictional .example TLD in accordance
with the DNS guidelines themselves.)
In theory a full host name may have several name segments (e.g.
ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, in the experience
of the majority of public users of Internet services, full host names will
frequently consist of just three segments (ahost.inadomain.example, and most
often www.inadomain.example).
For querying purposes, software interprets the name segment by segment, from
right to left, using an iterative search procedure. At each step along the way,
the program queries a corresponding DNS server to provide a pointer to the next
server which it should consult.
[Figure: A DNS recursor consults three nameservers to resolve the address
www.wikipedia.org.]
As originally envisaged, the process was as simple as:
The local system is pre-configured with the known addresses of the root servers
in a file of root hints, which the local administrator needs to update
periodically from a reliable source to keep up with the changes which occur
over time.
The resolver queries one of the root servers to find the server authoritative
for the next level down (so in the case of our simple hostname, a root server
would be asked for the address of a server with detailed knowledge of the
example top level domain).
It then queries this second server for the address of a DNS server with
detailed knowledge of the second-level domain (inadomain.example in our
example).
It repeats the previous step to progress down the name, until the final step
which, rather than yielding the address of the next DNS server, returns the
final address sought.
The diagram illustrates this process for the real host www.wikipedia.org.
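The steps above can be sketched as a small simulation. No network traffic is involved: the SERVERS table, the server names in it, and the address 192.0.2.10 are all hypothetical stand-ins for real nameservers and their referral data.

```python
# Hypothetical referral data: each "server" maps a name suffix either to the
# next server to ask or, at the last step, to the final address.
SERVERS = {
    "root": {"example": ("referral", "tld-server")},
    "tld-server": {"inadomain.example": ("referral", "domain-server")},
    "domain-server": {"www.inadomain.example": ("answer", "192.0.2.10")},
}

def resolve_iteratively(hostname, servers=SERVERS, start="root"):
    """Follow referrals from the root until some server returns an address."""
    labels = hostname.rstrip(".").split(".")
    server, path = start, [start]
    # Ask about progressively longer suffixes, right to left:
    # "example", then "inadomain.example", and so on.
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        kind, value = servers[server].get(suffix, (None, None))
        if kind == "answer":
            return value, path
        if kind == "referral":
            server = value
            path.append(server)
        else:
            raise LookupError(f"{server} knows nothing about {suffix}")
    raise LookupError(f"no address found for {hostname}")

address, path = resolve_iteratively("www.inadomain.example")
print(address)   # 192.0.2.10
print(path)      # ['root', 'tld-server', 'domain-server']
```

Every lookup in this naive form starts at the root entry of the table, which makes the load problem of the unoptimized scheme easy to see.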
The mechanism in this simple form has a difficulty: it places a huge operating
burden on the root servers, with each and every search for an address starting
by querying one of them. Because the root servers are so critical to the overall
function of the system, such heavy use would create an insurmountable bottleneck
for the trillions of queries placed every day. The section "In practice" below
describes how this is addressed.
Circular dependencies and glue records
Name servers in delegations appear listed by name, rather than by IP address.
This means that a resolving name server must issue another DNS request to find
out the IP address of the server to which it has been referred. Since this can
introduce a circular dependency if the nameserver referred to is under the
domain for which it is authoritative, it is occasionally necessary for the
nameserver providing the delegation to also provide the IP address of the next
nameserver. This record is called a glue record.
For example, assume that the sub-domain en.wikipedia.org contains further
sub-domains (such as something.en.wikipedia.org) and that the authoritative
nameserver for these lives at ns1.something.en.wikipedia.org. A computer trying
to resolve something.en.wikipedia.org will thus first have to resolve
ns1.something.en.wikipedia.org. Since ns1 is also under the
something.en.wikipedia.org subdomain, resolving something.en.wikipedia.org
requires resolving ns1.something.en.wikipedia.org which is exactly the circular
dependency mentioned above. The dependency is broken by the glue record in the
nameserver of en.wikipedia.org that provides the IP address of
ns1.something.en.wikipedia.org directly to the requestor, enabling it to
bootstrap the process by figuring out where ns1.something.en.wikipedia.org is
located.
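Whether a delegation runs into this circular dependency can be decided from the names alone: glue is needed when the nameserver's own name lies inside the zone being delegated. A minimal sketch, with illustrative function names:

```python
def is_subdomain(name, zone):
    """True if `name` equals `zone` or lies beneath it."""
    name, zone = name.rstrip(".").lower(), zone.rstrip(".").lower()
    return name == zone or name.endswith("." + zone)

def needs_glue(delegated_zone, nameserver):
    """A delegation needs a glue record when the nameserver's own name
    lies inside the zone being delegated."""
    return is_subdomain(nameserver, delegated_zone)

print(needs_glue("something.en.wikipedia.org",
                 "ns1.something.en.wikipedia.org"))   # True
print(needs_glue("something.en.wikipedia.org",
                 "ns1.example.net"))                  # False
```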
In practice
When an application (such as a web browser) tries to find the IP address of a
domain name, it doesn't necessarily follow all of the steps outlined in the
Theory section above. We will first look at the concept of caching, and then
outline the operation of DNS in "the real world."
Caching and time to live
Because of the huge volume of requests generated by a system like DNS, the
designers wished to provide a mechanism to reduce the load on individual DNS
servers. To this end, the DNS resolution process allows for caching (i.e. the
local recording and subsequent consultation of the results of a DNS query) for a
given period of time after a successful answer. How long a resolver caches a DNS
response (i.e. how long a DNS response remains valid) is determined by a value
called the time to live (TTL). The TTL is set by the administrator of the DNS
server handing out the response. The period of validity may vary from just
seconds to days or even weeks.
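A cache that honours the TTL can be sketched as follows. The class name TTLCache and the injectable clock are assumptions made so that the behaviour can be demonstrated without waiting; real resolvers keep considerably more state per record.

```python
import time

class TTLCache:
    """Cache answers until their time to live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can be tested
        self._entries = {}       # name -> (expiry time, answer)

    def put(self, name, answer, ttl_seconds):
        self._entries[name] = (self._clock() + ttl_seconds, answer)

    def get(self, name):
        """Return the cached answer, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        expires_at, answer = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: forget it and force a new query
            return None
        return answer
```

A get that returns None is the signal to query a name server again and to store the fresh answer with its new TTL.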
Caching time
As a noteworthy consequence of this distributed and caching architecture,
changes to DNS do not always take effect immediately and globally. This is best
explained with an example: If an administrator has set a TTL of 6 hours for the
host www.wikipedia.org, and then changes the IP address to which
www.wikipedia.org resolves at 12:01pm, the administrator must consider that a
person who cached a response with the old IP address at 12:00pm will not consult
the DNS server again until 6:00pm. The period between 12:01pm and 6:00pm in this
example is called caching time, which is best defined as a period of time that
begins when you make a change to a DNS record and ends after the maximum amount
of time specified by the TTL expires. This essentially leads to an important
logistical consideration when making changes to DNS: not everyone is necessarily
seeing the same thing you're seeing. RFC 1537 helps to convey basic rules for
how to set the TTL.
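The arithmetic of the example above can be written out with Python's datetime module. The calendar date is arbitrary and the function name stale_until is illustrative.

```python
from datetime import datetime, timedelta

def stale_until(cached_at, ttl):
    """Latest moment a resolver that cached a record at `cached_at` may use it."""
    return cached_at + ttl

ttl = timedelta(hours=6)
cached_at = datetime(2006, 1, 1, 12, 0)    # old address cached at 12:00pm
changed_at = datetime(2006, 1, 1, 12, 1)   # record changed at 12:01pm
expiry = stale_until(cached_at, ttl)
print(expiry.strftime("%H:%M"))   # 18:00
print(expiry - changed_at)        # 5:59:00
```

A resolver that happened to cache the old answer just before 12:01pm would keep it for almost the full six hours after the change, which is the worst case to plan for.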
Note that the term "propagation", although very widely used in this context,
does not describe the effects of caching well. Specifically, it implies two
things that are not true: that a DNS change somehow spreads to all other DNS
servers (in fact, other DNS servers check in with yours as needed), and that you
have no control over how long a record is cached (in fact, you control the TTL
values for all DNS records in your domain, except your NS records and the
records for any authoritative DNS servers that use your domain name).
Some resolvers may override TTL values, as the protocol supports caching for up
to 68 years or no caching at all. Negative caching (caching the fact that a
record does not exist) is controlled by the name servers authoritative for a
zone, which MUST include the SOA record when reporting that no data of the
requested type exists. The TTL for the negative answer is the smaller of the
MINIMUM field of the SOA record and the TTL of the SOA record itself (see RFC
2308).
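Both figures in this paragraph follow from simple arithmetic. The 68-year ceiling is the largest value of a signed 32-bit TTL expressed in years, and RFC 2308 takes the negative-caching TTL as the smaller of the SOA record's own TTL and its MINIMUM field. The function name and the sample values below are illustrative.

```python
MAX_TTL_SECONDS = 2**31 - 1   # largest value of a signed 32-bit TTL

def negative_ttl(soa_ttl, soa_minimum):
    """TTL for caching a negative answer: the smaller of the SOA record's
    own TTL and its MINIMUM field (RFC 2308)."""
    return min(soa_ttl, soa_minimum)

years = MAX_TTL_SECONDS / (365.25 * 24 * 3600)
print(round(years, 1))             # 68.0
print(negative_ttl(86400, 3600))   # 3600
```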
Many people incorrectly refer to a mysterious 48-hour or 72-hour propagation
time after a DNS change. When you change the NS records for your domain or the
IP addresses for hostnames of authoritative DNS servers using your domain (if
any), there can be a lengthy period of time before all DNS servers use the new
information. This is because those records are handled by the zone parent DNS
servers (for example, the .com DNS servers if your domain is example.com), which
typically serve those records with a TTL of 48 hours. However, those DNS changes
will be immediately available to any DNS servers that do not have them cached.
And any DNS changes on your domain other than the NS records and authoritative
DNS server names can be nearly instantaneous, if you choose for them to be (by
lowering the TTL once or twice ahead of time, and waiting until the old TTL
expires before making the change).
In the real world
[Figure: DNS resolving from program to OS-resolver to ISP-resolver to greater
system.]
Users generally do not communicate directly with a DNS resolver. Instead
DNS-resolution takes place transparently in client-applications such as
web-browsers, mail-clients, and other Internet applications. When an application
makes a request which necessitates a DNS lookup, such programs send a resolution
request to the local DNS resolver in the local operating system, which in turn
handles the communications required.
The DNS resolver will almost invariably have a cache (see above) containing
recent lookups. If the cache can provide the answer to the request, the resolver
will return the value in the cache to the program that made the request. If the
cache does not contain the answer, the resolver will send the request to one or
more designated DNS servers. In the case of most home users, the Internet
service provider to which the machine connects will usually supply this DNS
server: such a user will either have configured that server's address manually
or allowed DHCP to set it; however, where systems administrators have configured
systems to use their own DNS servers, their DNS resolvers point to separately
maintained nameservers of the organization. In any event, the name server thus
queried will follow the process outlined above, until it either successfully
finds a result or does not. It then returns its results to the DNS resolver;
assuming it has found a result, the resolver duly caches that result for future
use, and hands the result back to the software which initiated the request.
Broken resolvers
An additional level of complexity emerges when resolvers violate the rules of
the DNS protocol. Some people have suggested that a number of large ISPs have
configured their DNS servers to violate rules (presumably to allow them to run
on less-expensive hardware than a fully-compliant resolver), such as by
disobeying TTLs, or by indicating that a domain name does not exist just because
one of its name servers does not respond.
As a final level of complexity, some applications (such as web-browsers) also
have their own DNS cache, in order to reduce the use of the DNS resolver library
itself. This practice can add extra difficulty when debugging DNS issues, as it
obscures the freshness of data, and/or what data comes from which cache. These
caches typically use very short caching times — of the order of one minute.
Internet Explorer offers a notable exception: recent versions cache DNS records
for half an hour.
Other applications
The system outlined above provides a somewhat simplified scenario. The Domain
Name System includes several other functions:
Hostnames and IP addresses do not necessarily match on a one-to-one basis. Many
hostnames may correspond to a single IP address: combined with virtual hosting,
this allows a single machine to serve many web sites. Alternatively a single
hostname may correspond to many IP addresses: this can facilitate fault
tolerance and load distribution, and also allows a site to move physical
location seamlessly.
There are many uses of DNS besides translating names to IP addresses. For
instance, Mail transfer agents use DNS to find out where to deliver e-mail for a
particular address. The domain to mail exchanger mapping provided by MX records
accommodates another layer of fault tolerance and load distribution on top of
the name to IP address mapping.
Rather than creating their own record types, Sender Policy Framework and
DomainKeys were designed to take advantage of an existing DNS record type, the
TXT record.
To provide resilience in the event of computer failure, multiple DNS servers are
usually provided for coverage of each domain, and at the top level, thirteen
very powerful root servers exist, with additional "copies" of several of them
distributed worldwide via Anycast.
DNS primarily uses UDP on port 53 to serve requests. Almost all DNS queries
consist of a single UDP request from the client followed by a single UDP reply
from the server. TCP comes into play only when the response data size exceeds
512 bytes, or for such tasks as zone transfer. Some operating systems such as
HP-UX are known to have resolver implementations that use TCP for all queries,
even when UDP would suffice.
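The transport rule described above can be sketched as a small decision function. The second helper shows the two-octet length prefix that RFC 1035 places in front of each message sent over TCP. The function names are illustrative, and the 512-octet limit applies to plain DNS without extensions.

```python
import struct

DNS_PORT = 53
MAX_UDP_PAYLOAD = 512   # octets, without protocol extensions

def choose_transport(response_size, zone_transfer=False):
    """Pick the transport following the rule described above."""
    if zone_transfer or response_size > MAX_UDP_PAYLOAD:
        return "tcp"
    return "udp"

def frame_for_tcp(message):
    """Prefix a DNS message with the two-octet length field used over TCP."""
    return struct.pack("!H", len(message)) + message

print(choose_transport(200))   # udp
print(choose_transport(513))   # tcp
```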
Extensions to DNS
EDNS is an extension of the DNS protocol which enhances the transport of DNS
data in UDP packets, and adds support for expanding the space of request and
response codes. It is described in RFC 2671.
Standards
RFC 882 Domain Names - Concepts and Facilities (obsoleted by RFC 1034)
RFC 883 Domain Names - Implementation and Specification (obsoleted by RFC 1035)
RFC 920 Specified original TLDs: .arpa, .com, .edu, .org, .gov, .mil and
two-character country codes
RFC 1032 Domain administrators guide
RFC 1033 Domain administrators operations guide
RFC 1034 Domain Names - Concepts and Facilities
RFC 1035 Domain Names - Implementation and Specification
RFC 1101 DNS Encodings of Network Names and Other Types
RFC 1123 Requirements for Internet Hosts -- Application and Support
RFC 1183 New DNS RR Definitions
RFC 1706 DNS NSAP Resource Records
RFC 1876 Location Information in the DNS (LOC)
RFC 1886 DNS Extensions to support IP version 6
RFC 1912 Common DNS Operational and Configuration Errors
RFC 1995 Incremental Zone Transfer in DNS
RFC 1996 A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)
RFC 2136 Dynamic Updates in the domain name system (DNS UPDATE)
RFC 2181 Clarifications to the DNS Specification
RFC 2182 Selection and Operation of Secondary DNS Servers
RFC 2308 Negative Caching of DNS Queries (DNS NCACHE)
RFC 2317 Classless IN-ADDR.ARPA delegation
RFC 2671 Extension Mechanisms for DNS (EDNS0)
RFC 2672 Non-Terminal DNS Name Redirection (DNAME record)
RFC 2782 A DNS RR for specifying the location of services (DNS SRV)
RFC 2845 Secret Key Transaction Authentication for DNS (TSIG)
RFC 2874 DNS Extensions to Support IPv6 Address Aggregation and Renumbering
RFC 3403 Dynamic Delegation Discovery System (DDDS) (NAPTR records)
RFC 3696 Application Techniques for Checking and Transformation of Names
RFC 4398 Storing Certificates in the Domain Name System
RFC 4408 Sender Policy Framework (SPF) (SPF records)
Types of DNS records
Important categories of data stored in DNS include the following:
An A record or address record maps a hostname to a 32-bit IPv4 address.
An AAAA record or IPv6 address record maps a hostname to a 128-bit IPv6 address.
A CNAME record or canonical name record is an alias of one name to another. The
A record to which the alias points can be either local or remote - on a foreign
name server. This is useful when running multiple services (like an FTP and a
webserver) from a single IP address. Each service can then have its own entry in
DNS (like ftp.example.com. and www.example.com.)
An MX record or mail exchange record maps a domain name to a list of mail
exchange servers for that domain.
A PTR record or pointer record maps an IPv4 address to the canonical name for
that host. Setting up a PTR record for a hostname in the in-addr.arpa. domain
that corresponds to an IP address implements reverse DNS lookup for that
address. For example (at the time of writing), www.icann.net has the IP address
192.0.34.164, but a PTR record maps 164.34.0.192.in-addr.arpa to its canonical
name, referrals.icann.org.
An NS record or name server record maps a domain name to a list of DNS servers
authoritative for that domain. Delegations depend on NS records.
An SOA record or start of authority record specifies the DNS server providing
authoritative information about an Internet domain, the email of the domain
administrator, the domain serial number, and several timers relating to
refreshing the zone.
An SRV record is a generalized service location record.
A TXT Record allows an administrator to insert arbitrary text into a DNS record.
For example, this record is used to implement the Sender Policy Framework and
DomainKeys specifications.
NAPTR records ("Naming Authority Pointer") are a newer type of DNS record that
support regular expression based rewriting.
Other types of records simply provide information (for example, a LOC record
gives the physical location of a host), or experimental data (for example, a WKS
record gives a list of servers offering some well known service such as HTTP or
POP3 for a domain).
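The name queried for a PTR record is built by reversing the octets of the address and appending in-addr.arpa. Python's standard ipaddress module exposes this as the reverse_pointer attribute; the wrapper name reverse_name is illustrative.

```python
import ipaddress

def reverse_name(address):
    """Return the in-addr.arpa (or ip6.arpa) name used for a PTR lookup."""
    return ipaddress.ip_address(address).reverse_pointer

print(reverse_name("192.0.34.164"))   # 164.34.0.192.in-addr.arpa
```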
When sent over the internet, all records use the common format specified in RFC
1035 shown below.
RR (Resource Record) fields
Field     Description                                       Length (octets)
NAME      Name of the node to which this record pertains.   (variable)
TYPE      Type of RR. For example, MX is type 15.           2
CLASS     Class code.                                       2
TTL       Signed time in seconds that RR stays valid.       4
RDLENGTH  Length of RDATA field.                            2
RDATA     Additional RR-specific data.                      (variable)
For a complete list of DNS Record types consult IANA DNS Parameters.
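The field layout can be made concrete by packing one record by hand. This is a sketch under stated assumptions: names are encoded as length-prefixed labels without compression, the TTL is packed as a non-negative 32-bit value, and the sample record (an A record, type 1, class IN = 1, for the documentation address 192.0.2.1) is made up for illustration.

```python
import struct

def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:                       # skip the empty label of the bare root
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"

def encode_rr(name, rr_type, rr_class, ttl, rdata):
    """Pack NAME, TYPE, CLASS, TTL, RDLENGTH and RDATA in that order."""
    fixed = struct.pack("!HHIH", rr_type, rr_class, ttl, len(rdata))
    return encode_name(name) + fixed + rdata

record = encode_rr("www.example.net", 1, 1, 3600, bytes([192, 0, 2, 1]))
print(len(record))   # 31
```

The 31 octets break down as 17 for the encoded name, 10 for the fixed-length fields, and 4 for the address in RDATA.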
Internationalized domain names
While domain names technically have no restrictions on the characters they use
and can include non-ASCII characters, the same is not true for host names. Host
names are the names most people see and use for things like e-mail and web
browsing. Host names are restricted to a small subset of the ASCII character set
that includes the Roman alphabet in upper and lower case, the digits 0 through
9, the dot, and the hyphen. (See RFC 3696 section 2 for details.) This prevented
the representation of names and words of many languages natively. ICANN has
approved the Punycode-based IDNA system, which maps Unicode strings into the
valid DNS character set, as a workaround to this issue. Some registries have
adopted IDNA.
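Both halves of this paragraph can be tried out in Python. The regular expression below is a simplified reading of the host name rules (letters, digits and hyphen, with no hyphen at either end of a label), and the conversion uses Python's built-in "idna" codec, which implements the original IDNA specification; newer revisions of IDNA may map some strings differently. The function names are illustrative.

```python
import re

# Letters, digits and hyphen; a label may not start or end with a hyphen.
LDH_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def is_valid_hostname(name):
    """Check that every label uses only the restricted host name characters."""
    labels = name.rstrip(".").split(".")
    return all(LDH_LABEL.match(label) for label in labels)

def to_ascii(name):
    """Convert a Unicode name to its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")

# "b\u00fccher" is the German word for "books", written with u-umlaut.
encoded = to_ascii("b\u00fccher.example")
print(encoded)                      # xn--bcher-kva.example
print(is_valid_hostname(encoded))   # True
```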
Security issues
DNS was not originally designed with security in mind, and thus has a number of
security issues. DNS responses are traditionally not cryptographically signed,
leading to many attack possibilities; DNSSEC modifies DNS to add support for
cryptographically signed responses. There are various extensions to support
securing zone transfer information as well.
Even cryptographic signing does not rule out the possibility that a DNS server
is compromised (by malware or, for that matter, by a disgruntled employee) so
that the names it serves are redirected to a malicious address with a long TTL.
This could have a far-reaching impact on potentially millions of Internet users
if busy DNS servers cache the bad IP data. Recovery would require manual purging
of all affected DNS caches, since the long TTL (up to 68 years) would otherwise
keep the bad data alive.
Some domain names can spoof other, similar-looking domain names. For example,
"paypal.com" and "paypa1.com" are different names, yet users may be unable to
tell the difference when the user's typeface (font) does not clearly
differentiate the letter l and the number 1. This problem is much more serious
in systems that support internationalized domain names, since many characters
that are different, from the point of view of ISO 10646, appear identical on
typical computer screens.
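One crude way to spot the internationalized variant of this problem is to check whether the letters of a label come from more than one script. The sketch below guesses the script from the first word of each character's Unicode name, which is a rough heuristic and not an authoritative script lookup; it also does nothing about look-alikes within a single script, such as the letter l and the digit 1. The function names are illustrative.

```python
import unicodedata

def scripts_in(label):
    """Rough guess at the scripts used in a label, taken from the first word
    of each letter's Unicode character name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def looks_mixed(label):
    """Flag labels whose letters come from more than one script."""
    return len(scripts_in(label)) > 1

print(looks_mixed("paypal"))        # False
print(looks_mixed("p\u0430ypal"))   # True: U+0430 is CYRILLIC SMALL LETTER A
print(looks_mixed("paypa1"))        # False: digits are not checked
```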
Legal users of domains
Registrant
Most of the NICs in the world receive an annual fee from a legal user in order
for the legal user to utilize the domain name (i.e. a sort of leasing
agreement exists, subject to the registry's terms and conditions). Depending on
the naming conventions of the various registries, legal users are commonly
known as "registrants" or as "domain holders".
ICANN holds a complete list of domain registries in the world. One can find the
legal user of a domain name by looking in the WHOIS database held by most domain
registries.
For most of the more than 240 country code top-level domains (ccTLDs), the
domain registries hold the authoritative WHOIS (registrant, name servers, expiry
dates, etc.). For instance, DENIC, the German NIC, holds the authoritative WHOIS
for a .DE domain name.
However, some domain registries, such as for .COM, .ORG, .INFO, etc., use a
registry-registrar model. There are hundreds of Domain Name Registrars that
actually perform the domain name registration with the end user (see lists at
ICANN or VeriSign). By using this method of distribution, the registry only has
to manage the relationship with the registrar, and the registrar maintains the
relationship with the end users, or 'registrants'. For .COM and .NET domain
names, the domain registry, VeriSign, holds a basic WHOIS (registrar and name
servers, etc.). One can find the detailed WHOIS (registrant, name servers,
expiry dates, etc.) at the registrars.
Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted a
so-called "thick" registry approach, i.e. keeping the authoritative WHOIS with
the various registries instead of the registrars.
Administrative contact
A registrant usually designates an administrative contact to manage the domain
name. In practice, the administrative contact usually has the most immediate
power over a domain. Management functions delegated to the administrative
contacts may include (for example):
the obligation to conform to the requirements of the domain registry in order to
retain the right to use a domain name
authorization to update the physical address, e-mail address and telephone
number etc. in WHOIS
Technical contact
A technical contact manages the name servers of a domain name. The many
functions of a technical contact include:
making sure the configuration of the domain name conforms to the requirements
of the domain registry
updating the domain zone
providing the 24×7 functionality of the name servers (that leads to the
accessibility of the domain name)
Billing contact
The party whom a NIC invoices.
Name servers
The authoritative name servers that host the zone of a domain name.
Politics
Critics commonly claim abuse of domains by monopolies or near-monopolies, such
as VeriSign, Inc. Particularly noteworthy was the VeriSign Site Finder system
which redirected all unregistered .com and .net domains to a VeriSign webpage.
For example, at a public meeting with VeriSign to air technical concerns about
SiteFinder, numerous people active in the IETF and other technical bodies
explained how they were surprised by VeriSign's changing the fundamental
behavior of a major component of Internet infrastructure, not having obtained
the customary consensus. SiteFinder, at first, assumed every Internet query was
for a website, and it monetized queries for incorrect domain names, taking the
user to VeriSign's search site. Unfortunately, other applications, such as many
implementations of email, treat a lack of response to a domain name query as an
indication that the domain does not exist, and that the message can be treated
as undeliverable. The original VeriSign implementation broke this assumption for
mail, because it would always resolve an erroneous domain name to that of
SiteFinder. While VeriSign later changed SiteFinder to have this behavior only
in response to true Web queries, there was still widespread protest about
VeriSign's action being more in its financial interest than in the interest of
the Internet infrastructure component for which VeriSign was the steward.
Despite widespread criticism, VeriSign only reluctantly removed it after the
Internet Corporation for Assigned Names and Numbers (ICANN) threatened to revoke
its contract to administer the root name servers. ICANN published the extensive
set of letters exchanged, committee reports, and ICANN decisions.
There is also significant disquiet regarding the United States' political
influence over ICANN. This was a significant issue in the attempt to create a
.xxx top-level domain and sparked greater interest in alternative DNS roots that
would be beyond the control of any single country.
Truth in Domain Names Act
In the United States, the "Truth in Domain Names Act" (actually the
"Anticybersquatting Consumer Protection Act"), in combination with the PROTECT
Act, forbids the use of a misleading domain name with the intention of
attracting people into viewing a visual depiction of sexually explicit conduct
on the Internet.