Computer Networks: The Dry, Invisible Magic of the Internet - Part 1: The Hourglass

Publish date: Jun 3, 2023
Last updated: Sep 16, 2023

The Dry, Invisible Magic of the Internet

Over the spring of 2023 I took a course on computer networks. In this series I explain for a lay audience how some of the most mundane, yet magical, aspects of the internet work and why you should care.

Part 1: The Hourglass Shape of the Internet

What happens when you send an image on the internet

The Internet is shaped like an hourglass. But what could that possibly mean?

Before digging into the so-called hourglass shape of the internet, it’s necessary to talk a bit about how computers store and transmit data. Frequently, network engineers will refer to either the OSI model or the TCP/IP model to describe how a piece of information stored on one computer is sent to another. These models represent a “stack” of technologies, each of which is concerned with a specific job. A layer in the stack doesn’t know or care what the other layers are doing. It is only responsible for taking its input, doing its work on it, and passing it as output to another layer. This is a common approach to building complex systems with software and computers. By encapsulating functionality in this way, it becomes easier (in theory) to make a change to how one layer accomplishes its task, without making changes to the other layers. Engineers can change how a layer does its job, for example to improve performance or security, and as long as the next layer gets the input that it needs, everything will still work.

To explain how the Internet works in this way, imagine an image sitting on your computer like a Bingo card. A digital image could be thought of as an X by Y table that contains a value or set of values for each cell. Each value corresponds to a color or brightness on the screen when it’s displayed. Very inaccurately and roughly, it could be stored as something like below.

I	M	A	G	E
3	8	5	2	5
1	4	2	1	6
9	7	0	4	1
3	5	9	0	6
7	2	4	5	0

At the beginning there is a header with some information about the rest of the data, such as what type of image the data should be interpreted as and its dimensions, which the computer uses to display the image correctly. Then there is the actual data of the image itself, which represents the individual pixels to be displayed on the screen. For more information on a real image format rather than my fictional Bingo format, have a look at Wikipedia’s breakdown of the infamous GIF file format.
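
For readers who like to see things in code, here is a minimal Python sketch of the fictional Bingo format. Everything in it is made up for illustration: the 5-byte `IMAGE` tag, the one-byte width and height fields, and the function names `encode_bingo` and `decode_bingo` are my own assumptions, not part of any real image format.

```python
# A made-up "Bingo" image format, for illustration only (not a real file format).
# Header: the 5-byte tag b"IMAGE", then 1 byte for width and 1 byte for height.
# Body: one byte per pixel, stored row by row.

def encode_bingo(rows):
    """Pack a list of equal-length rows of pixel values (0-255) into bytes."""
    height = len(rows)
    width = len(rows[0])
    header = b"IMAGE" + bytes([width, height])
    body = bytes(value for row in rows for value in row)
    return header + body


def decode_bingo(blob):
    """Read the header first, then use it to interpret the pixel data."""
    if blob[:5] != b"IMAGE":
        raise ValueError("not a Bingo image")
    width, height = blob[5], blob[6]
    pixels = blob[7:7 + width * height]
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]


card = [
    [3, 8, 5, 2, 5],
    [1, 4, 2, 1, 6],
    [9, 7, 0, 4, 1],
    [3, 5, 9, 0, 6],
    [7, 2, 4, 5, 0],
]
blob = encode_bingo(card)
print(len(blob))                   # 7 header bytes + 25 pixel bytes = 32
print(decode_bingo(blob) == card)  # True
```

The point is only that the header tells the reader how to interpret the bytes that follow it. Without the width and height, the 25 pixel values would just be an undifferentiated run of numbers.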

So we have an image on a computer. The job of the network stack is to take that image data, chop it into little pieces, and forward each of the pieces to the other computer. Then, on the other side, the network stack performs the work to reassemble the image, piece by piece, into the same format as the original file on the other machine.
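
The chop-and-reassemble idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not how a real network stack is written: the function names and the piece size of 5 bytes are hypothetical, and the shuffle simply stands in for pieces arriving out of order.

```python
import random


def split_into_pieces(data, piece_size):
    """Return a list of (sequence_number, chunk) pairs."""
    return [
        (seq, data[start:start + piece_size])
        for seq, start in enumerate(range(0, len(data), piece_size))
    ]


def reassemble(pieces):
    """Put pieces back in order by sequence number and join them."""
    return b"".join(chunk for _, chunk in sorted(pieces))


image_data = bytes(range(32))   # stand-in for the bytes of an image file
pieces = split_into_pieces(image_data, piece_size=5)
random.shuffle(pieces)          # simulate pieces arriving out of order
restored = reassemble(pieces)
print(restored == image_data)   # True
```

Numbering each piece is what makes reassembly possible: the receiver does not need the pieces to arrive in order, only to know where each one belongs.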

As mentioned, there are two ways that this process is commonly described by engineers. The first is called the Open Systems Interconnection Model (OSI). The OSI model doesn’t strictly represent how things work in the real world, but provides a conceptual reference for how layered communication systems can be built. Even though the OSI model doesn’t break down exactly to the technologies used in the real world, it’s very helpful to think about what work must be performed, or is often performed, when network communication happens. The table below breaks down the OSI model from a receiver’s perspective. It’s important to note that these steps take place in order from bottom to top (1-7) on the receiver side. The top of the table is the last step in receiving the image. On the sender side, the process goes top to bottom (7-1). The sender starts at the application level (7) and passes the data down the stack.

The OSI Model

Layer	Function
(7) Application	Your application has received all of the decrypted and decompressed pieces of the image and is able to assemble it into a whole, which is an exact copy of the image on the other computer. You are now able to view the image on your screen.
(6) Presentation Layer	Your computer application has received all of the pieces of the image and does some work to prepare it for you. It might be decompressing the image from a smaller format that was used to save space while sending the image to you, or it might be decrypting the image if the application that sent it to you used encryption to ensure that other parties on the internet could not intercept and view the image.
(5) Session Layer	Your computer application indicates that it is ready to send and receive an image from another computer. This is like opening a door or a window in the building and waving a flag that says, “I’m ready to send!” or “I’m ready to receive!”
(4) Transport Layer	This is your computer’s mail room. On the sending side, the image will be disassembled into small pieces before sending. On the receiving side, the Transport Layer tracks each of the pieces as they come in and ensures it has received all of the pieces. If a piece goes missing, the Transport Layer can request that it be resent. The Transport Layer is best known for protocols like the Transmission Control Protocol (TCP) and User Datagram Protocol (UDP), which are essentially two different modes for running the mail room. In TCP the mail room is responsible for ensuring all the packets are received and ensuring that the mail room and roads are fairly shared between different customers. In UDP, the mail room is only responsible for getting the package out the door: send it and forget it! The Application Layer on the other side must then do the work of sorting out if everything has arrived and is intact.
(3) Network Layer	In the Network Layer, the sending computer takes a piece of the image, marks it with its source and destination address, and decides what the next best hop is in order to push that piece along on its way to the destination. When the package reaches the next hop, that “station” does the same thing, selecting what it thinks is the next best hop, until finally the package reaches its destination. The Network Layer is best known for IP (Internet Protocol) addresses, which are the key to identifying the source and destination as the piece of information is sent.
(2) Data Link Layer	The Data Link Layer carries the information between two “stations” or nodes. It ensures that the physical medium, in the Physical Layer, can be efficiently shared by multiple users, and it may also detect and correct errors that happen as the piece of information is sent across the physical medium. In some sense, the Data Link layer is like traffic lights or air traffic control. The traffic light doesn’t control where the cars go, but it prevents them from colliding when using the shared roadways. It provides a framework to create orderly flows of traffic in a shared and limited physical space.
(1) Physical Layer	The Physical Layer prepares the piece of the image for the physical medium it’s being sent on. To continue the analogy, this might be like readying the package to be carried by an airplane, or a boat, or a truck. The Physical Layer ensures that the piece of the image can be properly sent with the medium. Each leg of the route that the Network Layer selects might use a different technology for the Physical Layer. The image might pass through an optical cable medium, an electrical cable medium, or radio waves at different times during its trip. The classic example of a Physical Layer technology is the “modulator-demodulator,” better known as the modem. The modem sends out electrical signals, light pulses, or radio waves in a pattern that can be decoded by another modem on the other end. When you signal with Morse code using a flashlight, you are a modem.

Worth noting is that parts of this stack occur multiple times while sending information between computers. When one piece of information is sent out the door from your computer to another, it will go up and down between layers 1 and 3 multiple times as it hops between “stations” (routers) on its way to the destination. It will likely pass through several physical media, and different pieces may take different routes depending on the conditions of the network and the decisions that the “stations” make.
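
To make the hop-by-hop idea concrete, here is a small Python sketch. The station names and the `NEXT_HOP` table are entirely hypothetical; real routers build and update their tables with routing protocols that this toy ignores. What it does show is that no single station knows the whole route, only the next step.

```python
# Hypothetical network: each "station" only knows the next hop for a destination.
NEXT_HOP = {
    "laptop": {"server": "home-router"},
    "home-router": {"server": "isp-router"},
    "isp-router": {"server": "backbone-router"},
    "backbone-router": {"server": "server"},
}


def forward(source, destination, max_hops=16):
    """Follow next-hop decisions one station at a time and return the path."""
    path = [source]
    current = source
    while current != destination:
        if len(path) > max_hops:
            raise RuntimeError("too many hops; giving up")
        current = NEXT_HOP[current][destination]
        path.append(current)
    return path


route = forward("laptop", "server")
print(" -> ".join(route))
# laptop -> home-router -> isp-router -> backbone-router -> server
```

If a station changed its table entry, later pieces would take a different path without the sender ever knowing, which is why different pieces of the same image may travel different routes.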

In general, all of these functions are performed using the TCP/IP stack, which unlike the OSI model is composed of concrete, real-world technologies.

Down the Stack and Up the Stack (TCP/IP)

Sender		Receiver
↓Application		↑Application
↓Transport Layer		↑Transport Layer
↓Internet Layer		↑Internet Layer
↓Link Layer	↔	↑Link Layer

The functions in the real world performed by the TCP/IP stack are similar to the OSI model, with the Data Link and Physical layers condensed into the Link Layer, and the Session, Presentation, and Application layers condensed into the single Application Layer. This corresponds to the hardware and software typically found in a single computer. Typically, the application will handle collecting data from a user, encrypting it, and creating a session to send and receive data. Each subsequent layer adds a header (a small piece of information at the start of the data, the same concept discussed in the Bingo image above) which is used by the computer to perform the functions of that layer throughout transmission and on each end of the connection.

Layer	State of Data	Main Features
Application	Data	Contains the data being sent over the Internet
Transport	TCP/UDP Header + Data	Contains the source and destination ports, some error checking
Internet	IP Header + TCP/UDP Header + Data	Contains the source and destination IP address
Link Layer	Link Layer Header + IP Header + TCP/UDP Header + Data	Contains the source and destination MAC address (when using Ethernet)
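
The header stacking in the table above can be sketched in Python. This is a deliberate simplification: real headers are compact binary structures, not text lines, and the `wrap` and `unwrap` helpers, the port numbers, and the addresses (taken from ranges reserved for documentation) are all assumptions made for the example.

```python
def wrap(header, payload):
    """Prefix a payload with a header line (simplified: real headers are binary)."""
    return header.encode() + b"\n" + payload


def unwrap(packet):
    """Split off the first header line and return (header, remaining payload)."""
    header, payload = packet.split(b"\n", 1)
    return header.decode(), payload


# Sender: down the stack, each layer adds its own header in front.
data = b"piece 1 of the image"
segment = wrap("TCP src_port=50000 dst_port=443", data)
packet = wrap("IP src=192.0.2.10 dst=198.51.100.7", segment)
frame = wrap("ETH src=aa:aa:aa:aa:aa:aa dst=bb:bb:bb:bb:bb:bb", packet)

# Receiver: up the stack, each layer removes and reads only its own header.
eth_header, packet_in = unwrap(frame)
ip_header, segment_in = unwrap(packet_in)
tcp_header, data_in = unwrap(segment_in)

print(eth_header)
print(ip_header)
print(tcp_header)
print(data_in == data)  # True
```

Notice that each layer treats everything after its own header as opaque cargo. That is the encapsulation that lets one layer change without disturbing the others.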

There is some more nuance and variability in the process than described here. However, for the most part this is a reasonable way of understanding how the Internet works. Data is split, labeled, and then sent in small pieces across a variety of paths and physical media from one source address to a final destination address.

We are now ready to get to the main topic: the hourglass shape of the Internet.

What is the Hourglass?

How is the internet an hourglass? Is it shaped like a figure eight or an infinity sign? Are there lots of loops in the paths, with a few intersections? Are there broader loops at the edges with a main connection point in the middle?

Not quite.

The “hourglass” shape refers to the protocols that are used to perform the work of the Internet - exactly the same layers of the TCP/IP stack discussed above. As was noted previously, there are many different physical media used to transmit data via the Internet: electronic, optical, radio, etc. And for each of these physical media, there are many Data Link technologies and specifications used to send signals across them. This means that the Link Layer base of the TCP/IP stack is very broad, encompassing thousands of technologies that have emerged, evolved, and faded away over the years. There is a lot of activity, with individuals and companies working very quickly to improve the reliability and efficiency of this layer of the Internet. You remember, possibly, your family replacing modems or wireless access points every few years. You likely witnessed the evolution of cellphones through the various Gs: 2G, 3G, 4G, 5G, and so on. Right now, you are witnessing the emergence of the Starlink global satellite network. Each of these examples showcases the intense activity, physical construction, and pace of innovation at the Link Layer of the Internet. This innovation results in a wide set of technologies at the bottom of the TCP/IP stack.

Meanwhile, at the top of the stack, in the Application Layer, there is also significant breadth. Every tile on your phone’s home screen is an application, and the protocols that these applications use vary widely depending on the services that the application provides and are subject to quick evolution. A web browser, like Firefox or Chrome, uses Hyper Text Transfer Protocol (HTTP) to allow users to access HTML pages and other types of data over the Internet. Email uses Simple Mail Transfer Protocol (SMTP) to send messages and protocols such as Post Office Protocol (POP) to retrieve them. Streaming services have gone through a variety of protocols to allow users to play videos over an Internet connection, including HTTP Live Streaming (HLS), Dynamic Adaptive Streaming over HTTP (MPEG-DASH), WebRTC, and many more. As in the Link Layer, there is healthy growth, innovation, and selection among protocols that consistently expands the quality and variety of features that can be provided. This results in the wide top of the Internet’s hourglass shape.

Browser	Soundcloud	Netflix	Email	Discord
HTTP	HLS	MPEG-DASH	SMTP	WebRTC
TCP	TCP	TCP	TCP	UDP
		IP
	Ethernet	802.11	DOCSIS
Coaxial Cable	Twisted Pair	Optical Fiber	CDMA	TDMA

Finally, we come to the waist, or the belly, of the hourglass: the Internet Protocol (IP). Every path from the wide top to the wide bottom passes through IP, most commonly IPv4, which is responsible for providing addresses to computers and other hosts on the Internet, and for routing packets from source to destination across a series of machines.

What’s the Matter with the Hourglass?

Unlike the protocols and technologies at the top and bottom of the TCP/IP stack, IP has changed very little since it was created, despite some significant challenges and inefficiencies. What challenges and inefficiencies? Well, we ran out of IPv4 addresses. IPv4, the most commonly used version of IP, uses a 32-bit address made up of four octets (an octet is one byte, or 8 bits). An address typically looks something like 17.5.7.3, where each octet is written as a base 10 number between 0 and 255. Because each bit has two possible states, 32 bits allow 2^32 = 4,294,967,296 addresses, or roughly four billion, which covers only a little over half the global population. But there are many more computers and internet-connected machines than there are people, which results in a significant shortage of addresses.
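
The arithmetic is easy to check in Python. The sketch below treats the example address 17.5.7.3 as what it really is underneath, a single 32-bit number, and uses the standard library’s `ipaddress` module to confirm the result. The variable names are my own.

```python
import ipaddress

# The four octets of the example address 17.5.7.3, each between 0 and 255.
octets = [17, 5, 7, 3]

# Each octet is 8 bits, so shift them into place to build one 32-bit number.
as_int = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
print(as_int)

# The standard library agrees with the hand-rolled conversion.
print(int(ipaddress.IPv4Address("17.5.7.3")) == as_int)  # True

# 32 bits, two states per bit: the total number of possible IPv4 addresses.
total_addresses = 2 ** 32
print(f"{total_addresses:,}")  # 4,294,967,296
```

The dotted notation is only a convenience for humans. To the network, an IPv4 address is one number out of about 4.3 billion, and that ceiling is fixed by the 32-bit size of the field.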

The solution that engineers devised was to cordon off local area networks (LANs), which are essentially home or business networks, and allow computers on the local network to communicate with other machines on the Internet over a single IP address. This is done via a method called Network Address Translation (NAT). Essentially, NAT allows many hosts on the same local network to share an IP address on the public Internet, thus reducing consumption of the public addresses.
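
Here is a rough Python sketch of how a NAT device might keep track of who is who. It models one common variant, in which the router also rewrites port numbers so that replies can be matched back to the right machine. The `Nat` class, its method names, the starting port of 40000, and the addresses are all assumptions for illustration, and a real NAT also handles timeouts, protocols, and much else.

```python
import itertools


class Nat:
    """Toy port-based NAT: many private hosts share one public IP address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the shared public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            public_port = next(self._ports)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Find which private host a reply on this public port belongs to."""
        return self.inbound[public_port]


nat = Nat("203.0.113.5")
laptop = nat.translate_out("192.168.1.10", 51000)
phone = nat.translate_out("192.168.1.11", 51000)
print(laptop)                   # ('203.0.113.5', 40000)
print(phone)                    # ('203.0.113.5', 40001)
print(nat.translate_in(40001))  # ('192.168.1.11', 51000)
```

From the outside, the laptop and the phone look like the same address. Only the router’s table knows which reply belongs to which device.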

Another issue with the Internet Protocol is that it does not provide any built-in mechanisms for handling congestion, a problem that today is handled primarily by TCP. For example, when many computers are streaming data from YouTube and Netflix, these flows of data compete for bandwidth on the Internet. Currently TCP, in the Transport Layer (4) of the OSI model, has to do the work to manage congestion and ensure that the network is efficiently or fairly shared. However, TCP can only do so by guessing that the network is congested based on behavior it observes from the perspective of a single computer connected to the network. There is no central monitoring or control feature in IP or TCP that can manage congestion from a single network-wide view.
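
One classic strategy for this kind of guessing is called additive increase, multiplicative decrease (AIMD). I haven’t named it above, so treat this Python sketch as an illustration of the general idea rather than a description of any particular TCP implementation. The function name `aimd` and the parameter values are hypothetical.

```python
def aimd(loss_events, start_window=1.0, increase=1.0, decrease=0.5):
    """Simulate a sender's window: grow slowly, cut sharply when a loss is seen.

    loss_events is a list of booleans, one per round trip: True means the
    sender noticed a lost packet and takes it as a sign of congestion.
    """
    window = start_window
    history = []
    for lost in loss_events:
        if lost:
            window = max(1.0, window * decrease)  # back off quickly
        else:
            window += increase                    # probe for more bandwidth
        history.append(window)
    return history


history = aimd([False, False, False, True, False, False, True])
print(history)  # [2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 2.0]
```

The sender never sees the network itself. All it has is its own record of what got through and what didn’t, which is exactly the single-computer perspective described above.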

From a security standpoint the Internet Protocol is also subject to packet sniffing, whereby the intermediaries that carry your information across the global network can inspect and read data that you send (unless you encrypt it). Those intermediaries can also modify the packets, or they can pretend to have the same IP address as you and receive packets that should be going to your computer. Thus, security features that provide privacy, identity verification, and integrity of data are often built in via the Application Layer, and the IP layer is not trusted to provide these guarantees.

Fixing IP

Given the above issues with IP exhaustion, traffic congestion, and security, there is good reason for researchers and industry engineers to invest in improving the IP layer of the Internet. However, because of the crucial and singular function that it performs at the waist of the hourglass, it is very difficult to modify the IP protocol without introducing breaking changes to the rest of the stack.

What would it take to fix or replace IP? It’s unclear. A replacement known as IPv6 has been around for decades but has failed to gain enough traction to completely replace IPv4. IPv6 solves the address exhaustion problem and provides several other improvements to efficiency.
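
To get a feel for how thoroughly IPv6 solves the address problem, here is a short Python comparison. It relies on a fact not stated above, namely that IPv6 addresses are 128 bits long. The example address comes from a range reserved for documentation.

```python
import ipaddress

ipv4_total = 2 ** 32    # 32-bit addresses
ipv6_total = 2 ** 128   # 128-bit addresses

# Every single IPv4 address could be replaced by 2^96 IPv6 addresses.
print(ipv6_total // ipv4_total == 2 ** 96)  # True

# The same standard library module handles both versions.
addr = ipaddress.ip_address("2001:db8::1")
print(addr.version)   # 6
print(addr.exploded)  # 2001:0db8:0000:0000:0000:0000:0000:0001
```

Moving from 32 to 128 bits does not merely quadruple the space. Every added bit doubles it, which is why exhaustion is not a practical concern for IPv6.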

Likewise, IP has had competitors over the years, such as Novell’s IPX, the X.25 network protocol used in Frame Relay, and the ATM network layer signaling protocol (Akhshabi & Dovrolis).

Based on simulations of the evolution of the Internet’s protocols, Saamer Akhshabi and Constantine Dovrolis argue that the dominance of IP is not due to being technically superior, more user friendly, or older than other protocols. Instead, they suggest that neighboring protocols like TCP and UDP can provide a buffer that protects the middle of the stack from challengers, and that the evolution of the Internet’s protocol stack tends to veer toward an hourglass shape, irrespective of which protocol sits at the waist. To truly challenge IP, they suggest, a new protocol must provide a radically different experience with not just better quality than IP, but a set of functions that is entirely impossible within the older paradigm.

So far we haven’t seen that occur, and IPv4 has been able to provide many new services and experiences to Internet users via changes in the underlying Physical Layer or the upper Application Layer. However, it’s possible to imagine infrastructure, economic, or political changes so profound that they would give rise to a radically different networking protocol. Quantum networks, for example, could present a disruptive and altogether different way of designing computer networks which would provide a completely new experience for Internet users. Economic factors could make replacing IPv4 more critical as the Internet continues to grow. Finally, political pressures around national control of the Internet, security, and censorship could also push for changes to the Internet Protocol. In short, in order to see a wholesale replacement of IPv4 it is possible that many other facets of everyday life would need to change first.