Routing

Table of Contents

Basic Routing Concepts
Routing Table
Interior Gateway Protocol
External Gateway Protocol
Non-Routable Address Space

Basic Routing Concepts

The Internet is an incredibly impressive technological achievement. It meshes together millions of individual networks and allows communications to flow between them. From almost anywhere in the world, you can now access data from almost anywhere else. Often in just fractions of a second. The way communications happen across all these networks, allowing you to access data from the other side of the planet, is through routing. From a very basic standpoint,

A router is a network device that forwards traffic depending on the destination address of that traffic. A router is a device that has at least two network interfaces, since it has to be connected to two networks to do its job.

Basic routing has just a few steps:

A router receives a packet of data on one of its interfaces.
The router examines the destination IP of this packet.
The router then looks up the destination network of this IP in its routing table.
The router forwards that out though the interface that's closest to the remote network. As determined by additional info within the routing table.

These steps are repeated as often as needed until the traffic reaches its destination. Let's imagine a router connected to two networks. We'll call the first network, Network A and give it an address space of 192.168.1.0/24. We'll call the second network, Network B and give it an address space of 10.0.0.0/24. The router has an interface on each network. On Network A, it has an IP of 192.168.1.1 and on Network B, it has an IP of 10.0.254. Remember, IP addresses belong to networks, not individual nodes on a network. A computer on Network A with an IP address of 192.168.1.100 sends a packet to the address 10.0.0.10. This computer knows that 10.0.0.10 isn't on its local subnet. So it sends this packet to the MAC address of its gateway, the router.

The router's interface on Network A receives the packet because it sees that destination MAC address belongs to it. The router then trips away the data-link layer encapsulation, leaving the network layer content, the IP datagram.

Now, the router can directly inspect the IP datagram header for the destination IP field. It finds the destination IP of 10.0.0.10. The router looks at it's routing table and sees that Network B, or the 10.0.0.0/24 network, is the correct network for the destination IP. It also sees that, this network is only one hop away. In fact, since it's directly connected, the router even has the MAC address for this IP in its arc table.

Next, the router needs to form a new packet to forward along to Network B. It takes all of the data from the first IP datagram and duplicates it. But decrements the TTL field by one and calculates a new checksum.

Then it encapsulates this new IP datagram inside of a new Ethernet frame. This time, it sets its own MAC address of the interface on network B as the source MAC address. Since it has the MAC address of 10.0.0.10 in its arc table, it sets that as the destination MAC address. Lastly, the packet is sent out of its interface on Network B and the data finally gets delivered to the node living at 10.0.0.10.That's a pretty basic example of how routing works.

But let's make it a little more complicated and introduce a third network. Everything else is still the same. We have network A whose address space is 192.168.1.0/24. We have network B whose address space is 10.0.0/24. The router that bridges these two networks still has the IPs of 192.168.1.1 on Network A and 10.0.0.254 on Network B.

But let's introduce a third network, Network C. It has an address space of 172.16.1.0/23. There is a second router connecting network B and network C. It's interface on network B has an IP of 10.0.0.1 and its interface on Network C has an IP of 172.16.1.1. This time around our computer at 192.168.1.100 wants to send some data to the computer that has an IP of 172.16.1.100. We'll skip the data-link layer stuff, but remember that it's still happening, of course. The computer at 192.168.1.100 knows that 172.16.1.100 is not on its local network, so it sends a packet to its gateway, the router between Network A and Network B. Again, the router inspects the content of this packet. It sees a destination address of 172.16.1.100 and through a lookup of its routing table, it knows that the quickest way to get to the 172.16.1.0/23 network is via another router. With an IP of 10.0.0.1. The router decrements the TTL field and sends it along to the router of 10.0.0.1. This router then goes through the motions, knows that the destination IP of 172.16.1.100 is directly connected and forwards the packet to its final destination. That's the basics of routing. The only difference between our examples and how things work on the Internet is scale. Routers are usually connected to many more than just two networks. Very often, your traffic may have to cross a dozen routers before it reaches its final destination. And finally, in order to protect against breakages, core Internet routers are typically connected in a mesh, meaning that there might be many different paths for a packet to take. Still, the concepts are all the same. Routers inspect the destination IP, look at the routing table to determine which path is the quickest and forward the packet along the path.

Routing Table

Routing itself is pretty simple concept and you'll find that routing tables aren't that much more complicated. The earliest routers were just regular computers of the era. They had two network interfaces, bridge to networks, and auto-routing table that was manually updated. In fact, all major operating systems today, still have a routing table that they consolt before transmitting data. You could still build your own router today, if you had a computer with two network interfaces and it manually updated routing table. Routing tables can vary a ton depending on the make and class of the router, but they all share a few things in common. The most basic routing table will have four columns.

Destination network, this column would contain a row for each network that the router knows about, this is just the definition of the remote network, a network ID, and the net mask. These could be stored in one column inside a notation, or the network ID and net mask might be in a separate column. Either way, it's the same concept, the router has a definition for a network and therefore knows what IP addresses might live on that network. When the router receives an incoming packet, it examines the destination IP address and determines which network it belongs to. A routing table will generally have a catchall entry, that matches any IP address that it doesn't have an explicit network listing for.

Next hop, this is the IP address of the next router that should receive data intended for the destination networking question or this could just state the network is directly connected and that there aren't any additional hops needed.
Total hops, this is the crucial part to understand routing and how routing tables work, on any complex network like the Internet, there will be lots of different paths to get from point A to point B. Routers try to pick the shortest possible path at all times to ensure timely delivery of data but the shortest possible path to a destination network is something that could change over time, sometimes rapidly, intermediary routers could go down, links could become disconnected, new routers could be introduced, traffic congestion could cause certain routes to become too slow to use. We'll get to know how routers know the shortest path in an upcoming video. For now, it's just important to know that for each next hop and each destination network, the router will have to keep track of how far away that destination currently is. That way, when it receives updated information from neighboring routers, it will know if it currently knows about the best path or if a new better path is available.
Interface, the router also has to know which of its interfaces it should for traffic matching the destination network out of.

In most cases, routing tables are pretty simple. The really impressive part is that, many core Internet routers have millions of rows in the routing tables. These must be consulted for every single packet that flows through a router on its way to its final destination.

Interior Gateway Protocol

We've covered the basics of how routing works and how routing tables are constructed, and they're both really pretty basic concepts. The real magic of routing is in the way that routing tables are always updated with new information about the quickest path to destination networks. The protocols we'll be learning will help you identify routing problems on any network you might support. In order to learn about the world around them, routers use what are known as routing protocols. These are special protocols the routers use to speak to each other in order to share what information they might have. This is how a router on one side of the planet can eventually learn about the best path to a network on the other side of the planet.

Routing protocols fall into two main categories, interior gateway protocols, and exterior gateway protocols.

Interior gateway protocols are used by routers to share information within a single autonomous system. In networking terms, an autonomous system is a collection of networks that all fall under the control of a single network operator.

The best example of an autonomous system would be a large corporation that needs to route data between their many offices an each of which might have their own local area network.

Another example is the many routers employed by an Internet service provider who's reaches are usually national in scale. You can contrast this with exterior gateway protocols, which are used for the exchange of information between independent autonomous systems.

Interior gateway protocols are further split into two categories, link state routing protocols and distance-vector protocols.

Their goals are super similar, but the routers that employ them share different kinds of data to get the job done.

Distance-vector protocol

Distance-vector protocols are an older standard. A router using a distance-vector protocol basically just takes its routing table, which is a list of every network known to it and how far away these networks are in terms of hops. Then the router sends this list to every neighboring router, which is basically every router directly connected to it. In computer science, a list is known as a vector. This is why a protocol that just sends a list of distances to networks is known as a distance-vector protocol. With a distance-vector protocol, routers don't really know that much about the total state of an autonomous system, they just have some information about their immediate neighbors. For a basic glimpse into how distance vector protocols work, let's look at how two routers might influence each other's routing tables.

Router A has a routing table with a bunch of entries. One of these entries is for 10.1.1.0/24 network, which we'll refer to as Network X. Router A believes that the quickest path to Network X is through its own interface 2, which is where Router C is connected. Router A knows that sending data intended for Network X through interface 2 to Router C means it'll take four hops to get to the destination. Meanwhile, Router B is only two hops removed from Network X, and this is reflected in its routing table. Router B using a distance vector protocol sends the basic contents of its routing table to Router A. Router A sees that Network X is only two hops away from Router B even with the extra hop to get from Router A to Router B. This means that Network X is only three hops away from Router A if it forwards data to Router B instead of Router C. Armed with this new information, Router A updates its routing table to reflect this. In order to reach Network X in the fastest way, it should forward traffic through its own interface 1 to Router B.

Link State Protocol

Now distance vector protocols are pretty simple, but they don't allow for a router to have much information about the state of the world outside of their own direct neighbors. Because of this, a router might be slow to react to a change in the network far away from it. This is why link state protocols were eventually invented. Routers using a link state protocol taking more sophisticated approach to determining the best path to a network. Link state protocols get their name because each router advertises the state of the link of each of its interfaces. These interfaces could be connected to other routers, or they could be direct connections to networks. The information about each router is propagated to every other router on the autonomous system. This means that every router on the system knows every detail about every other router in the system. Each router then uses this much larger set of information and runs complicated algorithms against it to determine what the best path to any destination network might be.

External Gateway Protocol

Exterior gateway protocols are used to communicate data between routers representing the edges of an autonomous system.

Since routers sharing data using interior gateway protocols are all under control of the same organization. Routers use exterior gateway protocols when they need to share information across different organizations. Exterior gateway protocols are really key to the Internet operating how it does today. So, thanks exterior gateway protocols.

The Internet is an enormous mesh of autonomous systems. At the highest levels, core Internet routers need to know about autonomous systems in order to properly forward traffic.

Since autonomous systems are known and defined collections of networks, getting data to the edge router of an autonomous system is the number one goal of core Internet routers.

The Internet Assigned Numbers Authority(IANA), is a non-profit organization that helps manage things like IP address allocation.

The Internet couldn't function without a single authority for these sorts of issues. Otherwise, anyone could try and use any IP space they wanted, which would cause total chaos online.

Along with managing IP address allocation, the IANA is also responsible for ASN, or Autonomous System Number allocation. ASNs are numbers assigned to individual autonomous systems. Just like IP addresses, ASNs are 32-bit numbers. But, unlike IP addresses, they're normally referred to as just a single decimal number, instead of being split out into readable bits. There are two reasons for this. First, IP addresses need to be able to represent a network ID portion and a host ID portion for each number. This is more easily accomplished by splitting the number in four sections of eight bits, especially back in the day when address classes ruled the world. An ASN, never needs to change in order for it to represent more networks or hosts. Its just the core Internet routing tables that need to be updated to know what the ASN represents. Second, ASNs are looked at by humans, far less often, than IP addresses are. So because it can be useful to be able to look at the IP 9.100.100.100 and know that 9.0.0.0/8 address space is owned by IBM, ASNs represent entire autonomous systems. Just being able to look up the fact that AS19604 belongs to IBM is enough.

Unless you one day end up working at an Internet service provider, understanding more details about how exterior gateway protocols work, is out of scope for most people in IT. But grasping the basics of autonomous systems, ASNs, and how core Internet routers route traffic between them, is important to understand some of the basic building blocks of the Internet.

Non-Routable Address Space

And now, a brief history lesson. Even as far back as 1996, it was obvious that the internet was growing at a rate that couldn't be sustained. When IP was first defined, it defined an IP address as a single 32-bit number. A single 32-bit number can represent 4,294,967,295 unique numbers which definitely sounds like a lot, but as of 2017, there are an estimated 7.5 billion humans on earth. This means that the IPv4 standard doesn't even have enough IP addresses available for every person on the planet. It also can account for entire data centers filled with thousands and thousands of computers required for large scale technology companies to operate. So, in 1996, RFC 1918 was published.

RFC stands for Request for Comments, and has a long standing way for those responsible for keeping the internet running to agree upon the standard requirements to do so.

RFC 1918, outlined a number of networks that would be defined as non-routable address space. Non-routable address space is basically exactly what it sounds like. They are ranges of IPs set aside for use by anyone that cannot be routed to. Not every computer connected to the internet needs to be able to communicate with every other computer connected to the internet.

Non-routable address space allows for nodes on such a network to communicate with each other but no gateway router will attempt to forward traffic to this type of network.

This might sound super limiting, and in some ways, it is. In a future module, we'll cover a technology known as NAT or Network Address Translation. It allows for computers on non-routable address space to communicate with other devices on the internet. But for now, let's just discuss non-routable address space in a vacuum. RFC 1918 defined three ranges of IP addresses that will never be routed anywhere by co-routers. That means that they belong to no one and that anyone can use them. In fact, since they are separated from the way traffic moves across the internet, there's no limiting to how many people might use these addresses for their internal networks. The primary three ranges of non-routable address space are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. These ranges are free for anyone to use for their internal networks. It should be called out that interior gateway protocols will route these address spaces. So, they are appropriate for use within an autonomous system but exterior gateway protocols will not.

References:
https://www.coursera.org/learn/computer-networking/lecture/eCwJA/basic-routing-concepts
https://www.coursera.org/learn/computer-networking/lecture/BVuUA/routing-tables
https://www.coursera.org/learn/computer-networking/lecture/QvF9H/interior-gateway-protocols
https://www.coursera.org/learn/computer-networking/lecture/6zn7s/exterior-gateway-protocols
https://en.wikipedia.org/wiki/Border_Gateway_Protocol
https://www.coursera.org/learn/computer-networking/lecture/5S8Qc/non-routable-address-space

week2-03_Routing