Connection-Oriented Versus Connectionless Networks
Simply stated, a connection-oriented protocol establishes a dedicated connection between a host and a client computer before any data is exchanged. You can picture that connection as a single, specific path for data to travel from one computer to the other, as shown in Figure 1.
Figure 1: A connection-oriented session
The advantage here is that data is guaranteed to be delivered in the order in which it was sent, because every packet travels over that one connection. So if the packets shown in Figure 1 include a single character each and are sent in the order “H”, “E”, “L”, “L”, “O”, then the receiving computer is guaranteed to receive the characters in the same order.
TCP, or Transmission Control Protocol, is one networking protocol you have almost certainly heard of before. TCP uses a “windowing” system to transmit and acknowledge packets. Here, the sender keeps no more than an agreed-upon “window” of data, such as 64 or 128 kilobytes, in flight at one time. As packets are received, an acknowledgment is sent back to the original sender so it knows it’s okay to send more. If enough time goes by without an acknowledgment, the sender assumes that the packet never got there and resends the lost data. This is a simplified explanation of how TCP error-checking works. TCP is a connection-oriented protocol, so you know that packets are delivered in the order in which they were sent. I make this distinction because, as you will see, connectionless protocols make no guarantees as to the order in which packets are received!
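To make the ordering guarantee concrete, here is a minimal sketch of a TCP session in Java. It is only an illustration under some assumptions of my own: the class name TcpSketch and the method sendOverTcp are made up for this example, both ends run on the same machine over the loopback address, and the message is plain ASCII. The windowing and acknowledgments described above happen inside the operating system's TCP implementation, so none of that appears in the code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical example class; not part of any library.
class TcpSketch {

    // Sends each character of an ASCII message over a loopback TCP
    // connection and returns the characters in the order they were read.
    static String sendOverTcp(String message) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            StringBuilder received = new StringBuilder();

            // The receiving side: accept one connection and read until it closes.
            Thread receiver = new Thread(() -> {
                try (Socket peer = server.accept();
                     InputStream in = peer.getInputStream()) {
                    int b;
                    while ((b = in.read()) != -1) {
                        received.append((char) b);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            receiver.start();

            // The sending side: connect, then write one character at a time.
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 OutputStream out = client.getOutputStream()) {
                for (char c : message.toCharArray()) {
                    out.write(c);
                    out.flush();
                }
            }

            receiver.join(); // wait until the receiver has seen end-of-stream
            return received.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendOverTcp("HELLO"));
    }
}
```

Because TCP delivers bytes in order, the string that comes back should match the string that was sent, no matter how the individual writes were packaged into packets along the way.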
Figure 2: A connectionless UDP session
Another protocol, called UDP, or User Datagram Protocol, is a commonly used connectionless protocol. Like TCP, UDP defines how data is sent across a network. Unlike TCP, UDP does not set up any particular connection for data to flow over, making it possible to broadcast data to an entire network of computers rather than to a single connected computer. A simple UDP session may look like that depicted in Figure 2.
In Figure 2, the characters that make up the string “HELLO” are sent in a specific order, but not through any specific route. It is up to the data packets themselves to find their way to their final destination. Therefore, they arrive in the order in which they get there, which is not necessarily the order in which they were sent.
For instance, say the “H” packet takes a certain route to its destination computer. Halfway there, it gets bogged down in traffic and must be rerouted. Packets “E”, “L”, and “L” are then sent on their way as well. By the time the “O” packet is sent, traffic has cleared up on the original route taken by “H”, and it has a clean trip to its destination. Immediately after “O” makes it to its destination, “E”, “L”, and “L” arrive. Finally, the leading “H” packet arrives, after being rerouted halfway across the world. The receiving computer can now reassemble the message into its original form, provided each packet carries enough information, such as a sequence number, to show where it belongs. UDP itself does not do this reordering, so when order matters it falls to the software built on top of UDP. We don’t need to get into the details of how this is done here. What you will need to know is how the various network protocols can be established and used.
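One common way to put out-of-order packets back together is to tag each one with a sequence number before it is sent. The sketch below assumes exactly that; the Packet class, its seq field, and the method names are hypothetical and exist only for this illustration. It replays the arrival order from the story above (“O”, “E”, “L”, “L”, “H”) and sorts the packets back into their original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical example class; not part of any library.
class ReorderSketch {

    // One packet: a sequence number assigned by the sender, plus one character.
    static final class Packet {
        final int seq;
        final char data;

        Packet(int seq, char data) {
            this.seq = seq;
            this.data = data;
        }
    }

    // Rebuilds the original message by sorting packets on their sequence number.
    static String reassemble(List<Packet> arrived) {
        List<Packet> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator.comparingInt(p -> p.seq));
        StringBuilder message = new StringBuilder();
        for (Packet p : sorted) {
            message.append(p.data);
        }
        return message.toString();
    }

    // Replays the arrival order described in the text: O, E, L, L, H.
    static String demo() {
        List<Packet> arrived = Arrays.asList(
                new Packet(4, 'O'),
                new Packet(1, 'E'),
                new Packet(2, 'L'),
                new Packet(3, 'L'),
                new Packet(0, 'H'));
        return reassemble(arrived);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This sketch assumes every packet eventually arrives. A real receiver would also have to decide how long to wait for a missing sequence number before giving up or asking for it again.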
Note that data sent over UDP will generally arrive faster, since there is no connection to set up and no acknowledgments to wait for. However, UDP is also less reliable, since packets can be lost and the protocol itself neither acknowledges nor resends them.
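For comparison with a connection-oriented session, here is a minimal sketch of sending a single UDP datagram in Java. As before, the class and method names are my own, both sockets live on the same machine, and the 1024-byte buffer and two-second timeout are arbitrary choices for the example. Notice that no connection is ever opened: the sender simply addresses a packet and lets it go.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of any library.
class UdpSketch {

    // Sends one datagram over the loopback interface and returns its contents.
    static String sendOverUdp(String message) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {

            // UDP gives no delivery guarantee, so do not wait forever.
            receiver.setSoTimeout(2000);

            // Each datagram carries its own destination address and port.
            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(
                    payload, payload.length, loopback, receiver.getLocalPort()));

            // Anything longer than the buffer would be truncated.
            byte[] buffer = new byte[1024];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming); // throws SocketTimeoutException on timeout

            return new String(incoming.getData(), incoming.getOffset(),
                    incoming.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendOverUdp("HELLO"));
    }
}
```

On a single machine the datagram will almost always arrive, but nothing in the protocol promises that. Across a real network, the timeout is the only way this receiver would ever find out that a packet was lost.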
Now that you know some of the protocols for sending data across a network, you need a way to actually address and deliver that data. Since we are dealing with networking within the Internet, it only makes sense to use the Internet Protocol, or IP. IP specifies both the format of the data to be sent and the addressing system. IP data packets are also referred to as datagrams, and include both the data to be sent and the destination address. Think of it in terms of the postal system: a piece of mail is a packet containing both data (the message) and the destination address (usually the name, street, city, state, and ZIP code of the receiver). Mail is then sent through a specific carrier, whether it is FedEx, UPS, or the good old-fashioned government mailing service.
You are already very familiar with the addressing scheme used by IP, whether you know it or not. IP uses both dotted quad and domain naming schemes to address data packets. Dotted quad notation contains 32 bits of information about the recipient and comes in the form xxx.xxx.xxx.xxx. Here, each xxx refers to an eight-bit number ranging from 0 to 255. Therefore, 208.186.46.20 is one valid example of an address in dotted quad format. I won’t get into what these numbers mean here; as long as you can get the current Internet address from the host computer, you will have what you need to make the connection.
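To see how four eight-bit numbers add up to 32 bits of address, the following sketch packs a dotted quad string into a single number. The class and method names are hypothetical. A long is used rather than an int only because Java has no unsigned 32-bit integer type.

```java
// Hypothetical example class; not part of any library.
class DottedQuadSketch {

    // Packs a dotted quad such as "208.186.46.20" into one 32-bit value,
    // returned in a long so that it is never negative.
    static long toLong(String dottedQuad) {
        // The -1 limit keeps trailing empty parts, so "1.2.3.4." is rejected.
        String[] parts = dottedQuad.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four parts: " + dottedQuad);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part); // NumberFormatException if not a number
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet; // shift in the next eight bits
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toLong("208.186.46.20")); // prints 3501862420
    }
}
```

A value such as 256 in any position is rejected, since it does not fit in eight bits.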
As I mentioned above, IP addressing also comes in a domain name flavor. The Internet address java.sun.com is one example of a domain name. Since it is generally easier for humans to remember words than numbers, domain names exist purely as a convenience for people. Domain names can be converted into machine-readable dotted quad form by a DNS, or Domain Name System, server. If one server does not know how to resolve an address, it will ask the next one, and so on until the address is resolved. So when you type “http://java.sun.com/” into your Web browser, the DNS will automatically convert it to http://192.18.97.71 for you. Try typing in both of these address formats in separate Web browsers; you’ll find that they refer to the exact same place.
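You can ask Java to perform the same lookup for you. The sketch below uses the standard InetAddress class; the class name DnsSketch and the method resolve are made up for this example. Keep in mind that the address a name resolves to can change over time, so what you see for a given domain name today may not match the dotted quad printed in the text.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical example class; not part of any library.
class DnsSketch {

    // Returns the textual IP address for a host name. If the argument is
    // already a literal IP address, it is simply parsed and handed back.
    static String resolve(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    public static void main(String[] args) throws Exception {
        // Pass a domain name on the command line, or fall back to localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host));
    }
}
```

If the name cannot be resolved, for instance because the machine is offline, getByName throws an UnknownHostException rather than returning a result.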