The Basic Process
Your browser formed a connection to a Web server, requested a page and received it.
The browser broke the URL (“http://www.arpittak.com/index.htm”) into three parts:
- The protocol (“http”)
- The server name (“www.arpittak.com”)
- The file name (“index.htm”)
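As a rough illustration, the three-way split described above can be reproduced with Python's standard urllib.parse module. This is only a sketch, not what any particular browser does internally; the function name split_url is made up for this example, and the URL is the one used in the GET request step of this section.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the (protocol, server name, file name) parts of a URL."""
    parts = urlsplit(url)
    # parts.path keeps its leading "/", so strip it to get the bare file name.
    return parts.scheme, parts.hostname, parts.path.lstrip("/")


protocol, server_name, file_name = split_url("http://www.arpittak.com/index.htm")
print(protocol)     # the protocol
print(server_name)  # the server name
print(file_name)    # the file name
```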
The browser communicated with a name server to translate the server name “www.arpittak.com” into an IP address, which it used to connect to the server machine. The browser then formed a connection to the server at that IP address on port 80.
Following the HTTP protocol, the browser sent a GET request to the server, asking for the file “http://www.arpittak.com/index.htm”. (Note that cookies may be sent from browser to server with the GET request.)
The server then sent the HTML text for the Web page to the browser. (Cookies may also be sent from server to browser in the header for the page.) The browser read the HTML tags and formatted the page onto your screen.
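The whole exchange can be sketched on a single machine, without touching the Internet. In the hedged example below, a tiny stand-in Web server runs on the loopback address and a hand-written client plays the part of the browser: it connects, sends a GET request, and reads back the headers and the HTML text. The names PageHandler, fetch and demo_fetch, the page contents, and both cookie values are assumptions invented for this sketch. The server listens on a port chosen by the operating system rather than port 80, which usually requires administrator rights.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello</h1></body></html>"  # hypothetical page


class PageHandler(BaseHTTPRequestHandler):
    """Stand-in Web server: answers every GET with one fixed HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.send_header("Set-Cookie", "visited=yes")  # hypothetical server-to-browser cookie
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path):
    """Connect, send a GET request, and return (status line, header lines, body)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        request = (
            f"GET {path} HTTP/1.0\r\n"
            f"Host: {host}\r\n"
            "Cookie: theme=dark\r\n"  # hypothetical browser-to-server cookie
            "\r\n"
        )
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:  # with HTTP/1.0 the server closes the connection when done
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    return status_line, header_lines, body


def demo_fetch():
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/index.htm")
    finally:
        server.shutdown()
        server.server_close()


status_line, header_lines, body = demo_fetch()
print(status_line)
print(body.decode("ascii"))
```

A real browser would go on to parse the HTML tags in the body and lay out the page; this sketch stops once the text has arrived.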
Clients and Servers
In general, all of the machines on the Internet can be categorized as two types: servers and clients. Those machines that provide services (like Web servers or FTP servers) to other machines are servers. And the machines that are used to connect to those services are clients. When you connect to Yahoo! at http://www.yahoo.com to read a page, Yahoo! is providing a machine (probably a cluster of very large machines), for use on the Internet, to service your request. Yahoo! is providing a server. Your machine, on the other hand, is probably providing no services to anyone else on the Internet. Therefore, it is a user machine, also known as a client. It is possible and common for a machine to be both a server and a client, but for our purposes here you can think of most machines as one or the other.
A server machine may provide one or more services on the Internet. For example, a server machine might have software running on it that allows it to act as a Web server, an e-mail server and an FTP server. Clients that come to a server machine do so with a specific intent, so clients direct their requests to a specific software server running on the overall server machine. For example, if you are running a Web browser on your machine, it will most likely want to talk to the Web server on the server machine. Your Telnet application will want to talk to the Telnet server, your e-mail application will talk to the e-mail server, and so on…
To keep all of these machines straight, each machine on the Internet is assigned a unique address called an IP address. IP stands for Internet Protocol, and these addresses are 32-bit numbers, normally expressed as four “octets” in a “dotted decimal number.” A typical IP address looks like this: 18.104.22.168
The four numbers in an IP address are called octets because each one is eight bits long, so each can have a value between 0 and 255, which is 2^8 (256) possibilities per octet.
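To make the relationship between the dotted decimal form and the underlying 32-bit number concrete, here is a small sketch that packs four octets into one integer and unpacks them again. The function names ip_to_int and int_to_ip are made up for this example; Python's standard ipaddress module offers the same conversion for real use.

```python
def ip_to_int(dotted):
    """Pack a dotted-decimal address into its 32-bit integer value."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid dotted-decimal address: {dotted!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift left 8 bits, then append the next octet
    return value


def int_to_ip(value):
    """Unpack a 32-bit integer back into dotted-decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


address = "18.104.22.168"
number = ip_to_int(address)
print(number)
print(int_to_ip(number))
```

Each octet contributes eight bits, so the first octet is multiplied by 2^24, the second by 2^16, the third by 2^8, and the last is added as is.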
Every machine on the Internet has a unique IP address. A server has a static IP address that does not change very often. A home machine that is dialing up through a modem often has an IP address that is assigned by the ISP when the machine dials in. That IP address is unique for that session — it may be different the next time the machine dials in. This way, an ISP only needs one IP address for each modem it supports, rather than for each customer.
Because most people have trouble remembering the strings of numbers that make up IP addresses, and because IP addresses sometimes need to change, all servers on the Internet also have human-readable names, called domain names. For example, www.arpittak.com is a permanent, human-readable name.
The name www.arpittak.com actually has three parts:
- The host name (“www”)
- The domain name (“arpittak”)
- The top-level domain name (“com”)
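For a name of exactly this shape, the three parts can be separated by splitting on the dots. The sketch below is deliberately naive: the function name split_name is invented here, and it assumes a name with exactly three dot-separated labels, which is not true of every name on the Internet.

```python
def split_name(full_name):
    """Split a three-label name into (host name, domain name, top-level domain)."""
    labels = full_name.split(".")
    if len(labels) != 3:
        raise ValueError(f"expected exactly three labels, got {len(labels)}")
    host, domain, top_level = labels
    return host, domain, top_level


host, domain, top_level = split_name("www.arpittak.com")
print(host, domain, top_level)
```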
A set of servers called domain name servers, which together form the Domain Name System (DNS), maps the human-readable names to the IP addresses. These servers are simple databases, and they are distributed all over the Internet. Most individual companies, ISPs and universities maintain small name servers to map host names to IP addresses. There are also central name servers that use data supplied by VeriSign to map domain names to IP addresses.
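The idea of a name server as a simple database can be sketched with a lookup table. Everything in the table below is hypothetical: the second host name is invented, and the addresses come from 192.0.2.0/24, a block reserved for documentation, so they do not belong to any real machine. The helper resolve_with_system shows the standard-library call that asks the operating system's resolver for a real answer; it needs network access and is not called here.

```python
import socket

# Hypothetical records; the addresses are from the documentation-only 192.0.2.0/24 block.
NAME_TABLE = {
    "www.arpittak.com": "192.0.2.10",
    "mail.arpittak.com": "192.0.2.25",
}


def lookup(name, table=NAME_TABLE):
    """Return the IP address recorded for a name, or None if there is no record."""
    # Names are case-insensitive, and a trailing dot is allowed.
    return table.get(name.lower().rstrip("."))


def resolve_with_system(name):
    """Ask the operating system's resolver instead (requires network access)."""
    return socket.gethostbyname(name)


print(lookup("www.arpittak.com"))
print(lookup("nowhere.example"))
```

A real name server also forwards questions it cannot answer to other name servers; that step is left out of this sketch.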
Any server machine makes its services available to the Internet using numbered ports, one for each service that is available on the server. For example, if a server machine is running a Web server and an FTP server, the Web server would typically be available on port 80, and the FTP server would be available on port 21. Clients connect to a service at a specific IP address and on a specific port.
Each of the most well-known services is available at a well-known port number. Here are some common port numbers:
- echo 7
- daytime 13
- qotd 17 (Quote of the Day)
- ftp 21
- telnet 23
- smtp 25 (Simple Mail Transfer, meaning e-mail)
- time 37
- domain 53 (DNS, the name server service)
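A client needs both pieces of information, the IP address and the port, before it can connect. The sketch below stores the port numbers from the list above in a table, adds port 80 for the Web server as mentioned earlier, and combines a port with an address. The names WELL_KNOWN_PORTS and service_address are invented for this example, "domain" is used as the label for the name service on port 53, and the address shown is a documentation-only placeholder.

```python
# Port numbers taken from the list above, plus port 80 for the Web server.
WELL_KNOWN_PORTS = {
    "echo": 7,
    "daytime": 13,
    "qotd": 17,
    "ftp": 21,
    "telnet": 23,
    "smtp": 25,
    "time": 37,
    "domain": 53,  # the name server service (DNS)
    "http": 80,
}


def service_address(ip_address, service):
    """Return the (IP address, port) pair a client would connect to."""
    try:
        port = WELL_KNOWN_PORTS[service]
    except KeyError:
        raise ValueError(f"unknown service: {service!r}") from None
    return ip_address, port


print(service_address("192.0.2.10", "http"))
print(service_address("192.0.2.10", "ftp"))
```

On systems that ship a services database, the standard-library call socket.getservbyname("ftp") performs a similar lookup without a hand-written table.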
Once a client has connected to a service on a particular port, it accesses the service using a specific protocol. The protocol is the pre-defined way that someone who wants to use a service talks with that service. The “someone” could be a person, but more often it is a computer program like a Web browser. Protocols are often plain text, and simply describe how the client and server will have their conversation.
Perhaps the simplest protocol is the daytime protocol. If you connect to port 13 on a machine that supports a daytime server, the server will send you its impression of the current date and time and then close the connection. The protocol is, “If you connect to me, I will send you the date and time and then disconnect.”
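The daytime conversation is short enough to sketch in full. The example below runs a one-shot daytime-style server on the loopback address and a client that connects, reads whatever arrives, and notices the disconnect. It is a sketch under several assumptions: the function names are invented, the server listens on a port chosen by the operating system rather than port 13 (which usually requires administrator rights), and the timestamp format is an arbitrary choice.

```python
import socket
import threading
from datetime import datetime


def serve_daytime_once(listener):
    """Accept one client, send the date and time, then disconnect."""
    conn, _addr = listener.accept()
    with conn:  # leaving this block closes the connection
        stamp = datetime.now().isoformat(sep=" ", timespec="seconds")
        conn.sendall((stamp + "\r\n").encode("ascii"))


def read_daytime(host, port):
    """Connect and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=5) as conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:  # empty read: the server has disconnected
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii").strip()


def demo_daytime():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_daytime_once, args=(listener,), daemon=True)
    worker.start()
    try:
        return read_daytime("127.0.0.1", port)
    finally:
        worker.join(timeout=5)
        listener.close()


reply = demo_daytime()
print(reply)
```

Note that the client sends nothing at all: the act of connecting is the entire request.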