Chapter 5: Network Layer: Control Plane
Welcome to Last Minute Lecture.
This free chapter overview is designed to help students review and understand key concepts.
These summaries supplement, not replace, the original textbook and may not be redistributed or resold.
For complete coverage, always consult the official text.
Welcome to the Deep Dive where we crack open complex topics armed with the best research and reveal the surprisingly simple insights hidden within.
Today we're tackling something that feels like magic every single day, the internet.
You type in a web address, hit enter, and poof, content appears.
But how?
Behind that seemingly instant connection is a massive intricate brain making constant decisions.
That's right.
And we're diving into the network layer's control plane.
Think of it as the internet's sophisticated GPS.
It's the part that determines exactly how your data travels across the globe.
And you've given us a fantastic guide for this exploration, a chapter from Computer Networking: A Top-Down Approach by Kurose and Ross, really solid stuff.
Our mission is to distill the most vital insights from this text, make those complex concepts clear and practical, and stick to that top-down approach, from the apps you use right down to the infrastructure.
Right.
So when we've talked about the network layer before, we often focus on the data plane.
That's the physical act of forwarding packets, right?
Like a mail carrier literally delivering a letter based on the address.
It's the execution part.
Exactly.
And if the data plane is the mail carrier just following instructions, the control plane is the entire postal service's operations center, the brain.
It figures out where to send that packet in the first place.
It computes, maintains, and installs the forwarding rules, those tables that the data plane then uses.
It sets the directions.
Okay.
So this brain can work in a couple of fundamentally different ways.
The book lays out two core approaches for how these control decisions get made.
First, there's the traditional method, per-router control.
Yeah, this is the classic model.
A routing algorithm runs inside each individual router.
Think of each router like a driver with their own map, maybe talking only to the cars right next to them, sharing little bits of traffic info.
Each one builds its own piece of the bigger network map based on these local chats.
This is how the internet has mostly worked for decades.
Protocols like OSPF and BGP.
They're built on this idea.
Decentralized, then.
Each router for itself, almost.
Pretty much, yeah, in terms of making the routing decisions.
And then there's the newer approach, which sounds quite different.
Logically centralized control.
This is where software-defined networking, or SDN, comes in.
Ah, yes.
SDN.
Here's where things get really interesting, I think.
Instead of every router figuring things out individually, you have a single central mastermind.
Now, it might be physically distributed across several servers for reliability, but logically it's one brain.
It calculates all the forwarding rules for all the routers.
Like a central air traffic controller directing all the planes.
Exactly like that.
A perfect analogy.
This central controller then pushes those optimized rules down to all the individual routers, which become simpler forwarding devices.
And it's not just theory.
Google uses this in their global B4 network to connect data centers super efficiently.
AT&T is also moving heavily into virtualization based on these ideas.
It's a big shift.
Okay, so whether it's centralized or per router, the main goal is still finding the best path, right?
The least cost path.
Absolutely.
That's the fundamental problem routing tries to solve.
Find the most efficient way from A to B.
And cost here can mean different things.
It could be physical distance, link speed, maybe even a monetary cost assigned by an admin.
Think tolls or speed limits on a map.
Routing algorithms are the tools for solving this, and they get classified in a few ways.
First, that centralized link state or LS versus decentralized distance vector or DV split we just touched on.
So link state is the all-knowing controller view and distance vector is more like local gossip.
That's a good way to put it.
LS assumes the algorithm has a complete global view of the whole network map.
Everyone shares everything.
DV is decentralized.
No single node knows everything.
They just talk to their direct neighbors and iteratively exchange these distance vectors, estimates of costs, and gradually figure out the paths, like friends sharing their best routes, constantly updating based on what they hear.
Okay, makes sense.
But the book mentioned something surprising.
Most modern internet routing algorithms are load-insensitive.
They don't really react to traffic jams.
It seems counterintuitive, doesn't it?
You'd think optimizing for current congestion would be priority number one.
Yeah, why wouldn't they?
Well, early attempts did try to be load-sensitive, but it caused major problems.
Imagine routers constantly trying to reroute traffic away from a busy link.
They might all switch at once, causing the new link to become congested.
Then they'd switch back.
It led to oscillations and instability, kind of like traffic sloshing back and forth unpredictably.
So for overall network stability, predictability often wins out over trying to react to every little change in load.
A key insight.
Fascinating.
Okay, let's dig into those two main algorithm types.
First up, link state, LS routing using Dijkstra's algorithm.
Right, so with LS, step one, each router figures out its local connections.
Who am I directly connected to and what's the cost of each link?
That's its local link state.
Step two, it broadcasts this information to every other router in the network domain.
Flooding, essentially.
So after this flooding, every single router has an identical complete map of the network topology and link costs.
Everyone gets the same map.
Exactly.
Then step three, each router independently runs Dijkstra's algorithm on that map.
Dijkstra's iteratively finds the shortest paths from that router to all other possible destinations in the network.
It's like every taxi driver having the exact same perfectly updated city map and each one calculating their own best routes from where they are.
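To make step three concrete, here is a minimal Python sketch of Dijkstra's algorithm as one router might run it on its copy of the map. The six-node topology, the node names, and the link costs are illustrative assumptions, not something a protocol prescribes, and a real OSPF implementation involves much more than this.

```python
import heapq

# Illustrative six-node topology; node names and link costs are assumptions.
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}


def dijkstra(graph, source):
    """Return (dist, next_hop): least cost to each node and the first hop used."""
    dist = {source: 0}
    next_hop = {}
    done = set()
    heap = [(0, source, None)]  # (cost so far, node, first hop from source)
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry; a cheaper path was already finalized
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if neighbor not in done and new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # Leaving the source, the first hop is the neighbor itself.
                hop = neighbor if node == source else first
                heapq.heappush(heap, (new_cost, neighbor, hop))
    return dist, next_hop


dist, next_hop = dijkstra(GRAPH, "u")
print(dist)      # least cost from u to every node
print(next_hop)  # forwarding table for u: destination -> first hop
```

The `next_hop` dictionary is the part that would end up in the forwarding table: the router only needs to know which neighbor to hand the packet to, not the whole path.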
But you mentioned potential problems.
The book talks about oscillations if link costs do reflect congestion, like traffic jams bouncing around.
Yeah, that's the pathology.
If link costs are based on current traffic load, you can get into these unstable loops.
Router A sees congestion, reroutes through B.
Now B gets congested, so it reroutes through A. Back and forth.
One way to help mitigate this, as the book notes, is to introduce some randomization.
Maybe routers don't all recalculate or advertise at the exact same time.
Staggering things helps prevent self-synchronization.
Okay.
Now, the other side of the coin.
Distance vector, DV routing, based on Bellman-Ford.
Right.
DV is fundamentally different.
It's distributed, asynchronous, iterative.
No global map here.
Each node only knows about its direct neighbors and what those neighbors claim their best path costs are to other destinations.
The core calculation uses the Bellman-Ford equation.
Essentially, a node figures out its lowest cost to a destination by checking with each neighbor.
It asks, what's your cost to get there?
Then it adds its own cost to reach that neighbor and picks the neighbor that offers the lowest total.
It's like asking your friends for directions again.
What's your best time to the restaurant?
Plus, my time to get to your house.
Precisely.
And they do this iteratively, exchanging these distance vectors periodically, or when costs change, gradually converging on the best paths.
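The update just described can be sketched in a few lines of Python. This is a rough illustration of one recomputation at a single node, assuming made-up neighbor names and costs; the function name `dv_update` is hypothetical, and a real protocol would also handle timers, message exchange, and failures.

```python
INF = float("inf")


def dv_update(link_cost, neighbor_vectors, destinations):
    """One Bellman-Ford recomputation at a node.

    link_cost:        {neighbor: cost of the direct link to that neighbor}
    neighbor_vectors: {neighbor: {destination: cost the neighbor advertises}}
    Returns {destination: (best cost, neighbor to forward through)}.
    """
    table = {}
    for dest in destinations:
        best_cost, best_via = INF, None
        for neighbor, cost_to_neighbor in link_cost.items():
            # "What's your cost to get there?" plus my cost to reach you.
            total = cost_to_neighbor + neighbor_vectors[neighbor].get(dest, INF)
            if total < best_cost:
                best_cost, best_via = total, neighbor
        table[dest] = (best_cost, best_via)
    return table


# Assumed example: a node with neighbors y (link cost 2) and z (link cost 7).
link_cost = {"y": 2, "z": 7}
neighbor_vectors = {
    "y": {"y": 0, "z": 1},  # y says: I reach z at cost 1
    "z": {"z": 0, "y": 1},  # z says: I reach y at cost 1
}
table = dv_update(link_cost, neighbor_vectors, ["y", "z"])
print(table)
```

In this example the direct link to z costs 7, but going through y costs 2 + 1 = 3, so the node picks y as its next hop toward z.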
Sounds more organic, maybe, but it has that big issue, the count to infinity problem.
Ah, yes.
This is DV's Achilles heel.
It happens because bad news, like a link cost increasing or a link failing completely, travels very slowly through the network.
Good news propagates quickly.
But bad news?
Not so much.
The book gives an example, Figure 5.7, I think.
Imagine router Y routes to X via Z.
If the link cost between Y and Z goes way up, Y might hear from another neighbor, say W, that W can reach X via Y.
Now Y thinks it can reach X via W, and W thinks it can reach X via Y.
They form a routing loop.
Packets bounce back and forth, and the cost estimates keep creeping upward with every exchange, counting to infinity.
Ouch.
That sounds like it could break things badly.
Is there a fix?
Well, partial fix.
It's called poisoned reverse.
It's quite clever, actually.
Poisoned reverse?
Sounds dramatic.
You lie to your neighbor.
Basically, yes.
If router Z routes packets destined for X through its neighbor Y, then Z will advertise to Y that its own distance to X is infinite.
This lie prevents Y from ever trying to route packets for X back through Z.
Because Z is telling it, don't even think about coming this way to get to X.
It effectively prevents those simple two-node loops.
But it doesn't solve loops involving three or more routers.
So it helps, but it's not a complete solution.
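Here is a small Python simulation of that two-node case, with and without poisoned reverse. It is only a sketch under assumed numbers: three nodes x, y, and z, where y used to reach x directly at cost 4, z reached x through y at cost 5, and the y-to-x link cost has just jumped to 60. The node names, costs, and the strict y-then-z update order are all assumptions made for illustration.

```python
INF = float("inf")

# Assumed link costs after the change: the y-x link has just jumped to 60.
C_YX, C_YZ, C_ZX = 60, 1, 50


def advertised(entry, to_neighbor, poisoned_reverse):
    """Cost to x that a node reports to `to_neighbor`."""
    cost, via = entry
    if poisoned_reverse and via == to_neighbor:
        return INF  # the "lie": I route through you, so don't route through me
    return cost


def rounds_to_converge(poisoned_reverse, max_rounds=1000):
    # Stale entries from before the change (assumed): (cost to x, next hop).
    y, z = (4, "x"), (5, "y")
    for round_no in range(1, max_rounds + 1):
        before = (y, z)
        # y recomputes from its direct link and z's latest advertisement.
        y = min((C_YX, "x"), (C_YZ + advertised(z, "y", poisoned_reverse), "z"))
        # z then recomputes from its direct link and y's new advertisement.
        z = min((C_ZX, "x"), (C_YZ + advertised(y, "z", poisoned_reverse), "y"))
        if (y, z) == before:
            return round_no - 1, y, z  # nothing changed in this round
    raise RuntimeError("did not converge")


plain = rounds_to_converge(poisoned_reverse=False)
poisoned = rounds_to_converge(poisoned_reverse=True)
print("without poisoned reverse:", plain)
print("with poisoned reverse:   ", poisoned)
```

Both runs end in the same place, with z using its direct link to x and y going through z, but the run without poisoned reverse needs many more rounds because y and z keep nudging each other's estimates up a little at a time.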
Okay, so comparing LS and DV, it sounds like LS is maybe faster to converge, more robust to errors, but has higher message overhead initially.
That's a good summary.
LS floods information everywhere, so more initial messages.
But once everyone has the map, calculations are local and usually fast.
It's less prone to loops and bad information spreading far.
DV has less overhead per exchange, just talking to neighbors.
But convergence can be slow, especially with bad news.
And it's vulnerable to loops and the count to infinity issue.
The book mentions a real-world ISP meltdown in 1997, caused by a DV protocol issue, that took hours to fix.
So no single winner.
Nope, both have their place.
And as we'll see, the real internet actually uses both, but in different contexts.
Right, which brings us perfectly to routing in the real internet.
Why the need for two separate protocols like OSPF and BGP?
Why not just one big global algorithm?
Great question.
The book points to two huge reasons.
Scale and administrative autonomy.
First, scale.
The internet has hundreds of millions, maybe billions of routers now.
Having every router know about every other router and exchange link states or distance vectors globally, it's just computationally infeasible.
The overhead would be insane.
Too much information.
Way too much.
And second, administrative autonomy.
The internet isn't one monolithic network, it's a network of networks.
Each network, usually run by an internet service provider, ISP, or a large organization, wants to manage its own infrastructure, choose its own internal routing policies, use its preferred routing protocols internally, and, frankly, hide its internal complexity from the outside world.
They need control.
So the solution is to carve the internet up into these autonomous systems, or AS.
Exactly.
An AS is basically a chunk of the internet under a single administrative control.
Think of your ISP's network, or Google's network, or a university's network.
Each AS gets a unique number, an ASN.
And this structure leads directly to the two types of routing protocols.
Okay, so routing within an AS.
That's intra-AS routing.
And the protocol often used there is OSPF.
Correct.
OSPF stands for Open Shortest Path First.
It's the most common intra-AS protocol.
And guess what?
OSPF is a link state protocol.
Ah, so inside their own network, an AS uses the everyone gets the map approach.
Precisely.
Routers within a single AS flood their link state information only to other routers within that same AS.
Then they all independently run Dijkstra's algorithm to calculate the best paths within their own AS.
Network admins configure the link costs, maybe based on speed or just policy, to guide traffic flow inside their network.
OSPF also has features like security, meaning it authenticates updates; using multiple paths if they have equal cost; and, importantly, hierarchy.
Large ASs can be divided into areas to limit the scope of flooding and improve scalability further.
Makes sense.
Keep the complexity manageable within your own domain, but then you need to connect these ASs together.
That's where BGP comes in.
Yes.
BGP, the Border Gateway Protocol, is the inter-AS routing protocol.
It's often called the glue of the internet because it holds all these independent ASs together.
So BGP operates between the ASs.
What does it actually do?
It has two main jobs.
First, prefix reachability.
An AS uses BGP to tell its neighboring ASs, hey, I exist, and here are the IP address prefixes, subnets, that you can reach through me.
This is how routes are advertised across the internet.
Second, best route determination.
When a router learns about multiple ways to reach a particular prefix, maybe through different neighboring ASs, BGP provides the rules and information needed to choose the best path.
And how do these AS actually talk BGP to each other?
They establish connections between specific routers, often called border or gateway routers.
BGP actually runs over TCP for reliability.
When the connection is between routers in different ASs, it's called eBGP, external BGP.
When routers within the same AS use BGP to share the external reachability information they've learned, so all internal routers know how to get out, that's called iBGP, internal BGP.
Often, all BGP-speaking routers within an AS have iBGP connections to each other in a full mesh.
Okay, and BGP advertisements carry attributes.
The AS path sounds really important.
Critically important.
The AS path attribute is literally the list of AS numbers that a route advertisement has passed through to reach you.
Its primary purpose is loop detection.
If a router receives a BGP update and sees its own AS number already in the AS path, it knows that path represents a loop and it immediately rejects it.
Simple, but super effective.
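The loop check is simple enough to sketch in a few lines of Python. The function names here are hypothetical, and the AS numbers are taken from the range reserved for documentation; a real BGP speaker does far more than this when it processes an update.

```python
MY_ASN = 64500  # assumed AS number, from the documentation-reserved range


def accept_update(my_asn, as_path):
    """Reject any route whose AS path already contains our own AS number."""
    return my_asn not in as_path


def readvertise(my_asn, as_path):
    """Prepend our AS number before passing the route on to a neighboring AS."""
    return [my_asn] + list(as_path)


clean_path = [64501, 64502]          # never passed through us: fine
looped_path = [64501, 64500, 64502]  # already passed through us: a loop

print(accept_update(MY_ASN, clean_path))
print(accept_update(MY_ASN, looped_path))
print(readvertise(MY_ASN, clean_path))
```

Because every AS prepends its own number when it passes a route along, any advertisement that circles back carries the evidence of the loop with it.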
Another key attribute is NEXT-HOP.
This tells the router the IP address of the first router in the next AS, along the path towards the destination prefix.
This is crucial because it connects the inter -AS decision to the intra -AS routing.
How do I get to that specific NEXT-HOP router using OSPF?
Right, it bridges the two worlds.
Now, route selection.
I love the term hot potato routing.
Just get the packet out of my network, ASAP.
Yeah, that's evocative.
And it is part of BGP route selection.
It means if you have multiple paths out of your AS to reach a destination, choose the path that lets you hand off the packet to the next AS with the lowest intra-AS cost.
Get it off your plate quickly.
But crucially, hot potato isn't the first thing BGP considers.
The selection process is actually a sequence of steps.
First, it checks local preference.
An AS administrator can assign higher preference values to certain paths learned via BGP, often based on business agreements, like preferring a customer route over a peer route.
Policy rules here.
Then, if local preference is equal, it looks for the path with the shortest AS path.
This is key.
It means BGP often inherently prefers paths that traverse fewer networks, which generally correlates with better performance and less complexity.
Only if the paths are still tied after local preference and AS path length does it fall back to hot potato routing.
There are further tiebreakers too, but those are the main ones.
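That ordering, local preference first, then AS path length, then hot potato, maps neatly onto a sort key. The Python sketch below covers only those three steps; the `Route` class, its field names, the addresses, and the numbers are all assumptions for illustration, and the further tiebreakers are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    next_hop: str       # address of the first router in the next AS
    local_pref: int     # set by local policy; higher is preferred
    as_path: List[int]  # AS numbers the advertisement has traversed
    igp_cost: int       # intra-AS cost to reach next_hop (hot potato)


def select_best(routes):
    """Apply the three main steps in order: highest local preference,
    then shortest AS path, then lowest intra-AS cost to the next hop."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.igp_cost))


# Four hypothetical routes to the same prefix.
candidates = [
    Route("192.0.2.1", local_pref=100, as_path=[64501, 64510], igp_cost=5),
    Route("192.0.2.2", local_pref=200, as_path=[64502, 64503, 64510], igp_cost=9),
    Route("192.0.2.3", local_pref=200, as_path=[64504, 64510], igp_cost=7),
    Route("192.0.2.4", local_pref=200, as_path=[64505, 64510], igp_cost=3),
]
best = select_best(candidates)
print(best.next_hop)
```

The first route loses on local preference even though its intra-AS cost is low, the second loses on AS path length, and hot potato only decides between the last two.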
So it's not purely selfish.
Shortest AS path sounds pretty globally beneficial.
It often is, yes.
It balances the AS's local policy needs with a preference for globally shorter paths.
And BGP enables cool things like IP Anycast, right?
Where multiple servers can share one IP address.
Exactly.
With Anycast, you advertise the same IP prefix from multiple locations around the world using BGP.
BGP's normal route selection process, especially shortest AS path, naturally causes routers to direct user traffic towards the closest instance of that IP address, topologically speaking.
It's used heavily for things like DNS root servers.
There are only 13 root IP addresses, but hundreds of physical servers worldwide use Anycast to respond to those addresses.
Your request just goes to the nearest one.
Some CDNs also use it, though maybe less so for actual content delivery due to TCP connection state potentially shifting mid-session if routes change.
That makes sense.
And routing policy is clearly a huge driver in BGP.
The book talks about ISPs using BGP advertisements carefully to enforce business deals.
Absolutely fundamental.
Imagine a large backbone ISP, let's call it W, has two customers, ISP X and ISP Y.
W will advertise routes to X and Y and advertise routes from X and Y to the rest of the internet.
But W will typically not advertise routes learned from one of its peers or providers to another peer or provider.
Why?
Because W doesn't want other networks sending traffic through W's network just to reach each other, unless they're paying specifically for that transit service.
BGP advertisements, or lack thereof, are the mechanism for enforcing these policies.
So to sum up why we have separate inter-AS protocols like BGP and intra-AS protocols like OSPF.
It boils down to policy: inter-AS is driven by business, intra-AS less so.
Scale: inter-AS has to handle the global internet, intra-AS just one network.
And performance: intra-AS can optimize purely for performance metrics, while inter-AS might choose a policy-compliant path even if it's technically longer or slower.
They solve different problems.
Got it.
And the book ties it all together with a practical example.
How does a new company actually get online?
Yeah, it's a great way to see it in action.
You get an IP address block from an ISP, get a domain name, you set up DNS.
But how does the rest of the world know how to reach your new servers at those IP addresses?
Your ISP uses BGP.
They configure their border routers to start advertising your specific IP prefix, linked to their AS, to their BGP neighbors.
Those neighbors propagate the advertisement and soon, through the magic of BGP, the entire internet learns a path to reach you.
It all comes back to BGP gluing the networks together.
Amazing.
Okay, so that's the traditional world.
But we keep hinting at this big change, software-defined networking, SDN.
Right.
Let's shift gears back to that.
SDN flips the script by moving towards that logically centralized control model.
The book highlights four key characteristics.
First, flow-based forwarding.
This is huge.
Instead of just looking at the destination IP, SDN switches can match packets based on any combination of header fields, MAC, source and destination IP, port numbers, VLAN tags, you name it.
Then apply actions like forward, drop, or modify.
Way more flexible.
So much finer-grained control over traffic.
Exactly.
Second, the clear separation of the data plane and the control plane.
The switches, the data plane, become simple, fast hardware just executing rules.
The intelligence, the control plane, moves off the switch into software controllers.
Third, naturally following from that, control functions are external to the switches.
The brains run on standard servers, not specialized hardware.
And fourth, it makes the network programmable.
You can write applications that interact with the controller via APIs to define network behavior, just like writing software for a computer.
The network itself becomes programmable.
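The match-plus-action idea from the first characteristic can be sketched as a tiny flow table in Python. The field names, priorities, and action strings below are illustrative assumptions, not OpenFlow's actual match-field or action names.

```python
# Field names and actions are illustrative, not OpenFlow's actual identifiers.
FLOW_TABLE = [
    {"priority": 200, "match": {"ip_dst": "10.0.0.5", "tcp_dst": 22}, "action": "drop"},
    {"priority": 100, "match": {"ip_dst": "10.0.0.5"}, "action": "forward:port2"},
    {"priority": 50, "match": {"vlan": 30}, "action": "forward:port3"},
]


def lookup(flow_table, packet):
    """Return the action of the highest-priority rule whose fields all match.

    A rule only constrains the header fields it lists; anything else is a
    wildcard. If nothing matches, hand the packet to the controller.
    """
    for rule in sorted(flow_table, key=lambda r: r["priority"], reverse=True):
        if all(packet.get(field) == value for field, value in rule["match"].items()):
            return rule["action"]
    return "send_to_controller"


ssh_packet = {"ip_src": "10.0.0.9", "ip_dst": "10.0.0.5", "tcp_dst": 22}
web_packet = {"ip_src": "10.0.0.9", "ip_dst": "10.0.0.5", "tcp_dst": 80}
unknown_packet = {"ip_src": "10.0.0.9", "ip_dst": "10.0.0.77", "tcp_dst": 80}

print(lookup(FLOW_TABLE, ssh_packet))
print(lookup(FLOW_TABLE, web_packet))
print(lookup(FLOW_TABLE, unknown_packet))
```

Two packets to the same destination IP get different treatment because one rule also matches on the TCP port, which destination-based forwarding alone could not express.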
That sounds powerful.
It really is.
This unbundling allows for much faster innovation.
The core component is the SDN controller.
Remember, it's logically centralized, but often physically distributed for resilience and scale.
Functionally, it has layers.
At the bottom, a communication layer uses protocols like OpenFlow, the southbound API, to talk down to the switches: it sends rules and gets status updates.
Above that, a network-wide state management layer keeps track of everything, links, switches, hosts, current rules, statistics, the big picture.
And at the top, the network control application layer, using the northbound API.
This is where the actual intelligence lives.
Routing logic, firewalls, load balancers, they're just applications running on the controller, using its view of the network to make decisions and instruct the switches.
And OpenFlow is that key protocol connecting the controller and the switches.
It's the most well -known southbound protocol, yes.
It defines messages for the controller to configure the switch, modify its flow table rules, and read statistics, and messages for the switch to send back events, like port X just went down, or I received a packet that doesn't match any rule.
What should I do?
That last one is the packet-in message.
You mentioned Google's B4 network using this.
How does it help them?
Massively.
Because Google controls both the network and the applications generating the traffic between its data centers, they can use SDN to do incredibly sophisticated traffic engineering.
They can calculate optimal paths based on real-time demand and link capacity, programming the OpenFlow switches accordingly.
The result, as the book mentions, is very high link utilization, sometimes approaching 70% or more, which is much higher than typical ISP networks manage.
SDN is a great fit there.
Okay, let's walk through an example from the book.
A link fails in an SDN network.
What happens?
Say the link between switch S1 and S2 fails.
One, switch S1 detects the failure, a port goes down, and sends an OpenFlow port status message up to the SDN controller.
Two, the controller's state management layer receives this, updates its internal map of the network topology.
Link S1, S2 is now marked down.
Three, this change triggers the relevant application, maybe the routing application, which might be running Dijkstra's or something else.
Four, the routing app recalculates all affected paths based on the new topology.
Five, it tells the controller's flow table manager about the necessary rule changes.
Six, the controller uses OpenFlow messages to push the updated flow rules down to the affected switches.
S1, S2, maybe others like S4, depending on the new paths.
And packets immediately start flowing along the new optimized routes.
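The six steps above can be sketched as a toy controller in Python. Everything here is an assumption made for illustration: the `ToyController` class, the four-switch topology, and the use of a fewest-hops search in place of whatever the routing application really runs. It does not speak OpenFlow; it just returns the rule sets that would need to be pushed.

```python
from collections import deque


class ToyController:
    def __init__(self, links):
        # Network-wide state: an adjacency map built from (switch, switch) links.
        self.adj = {}
        for a, b in links:
            self.adj.setdefault(a, set()).add(b)
            self.adj.setdefault(b, set()).add(a)
        self.flow_tables = self.compute_rules()

    def compute_rules(self):
        """Routing app: for each switch, the next hop toward every other switch."""
        rules = {}
        for dest in self.adj:
            # Breadth-first search outward from dest; parents point back toward it.
            parent = {dest: None}
            queue = deque([dest])
            while queue:
                node = queue.popleft()
                for neighbor in sorted(self.adj[node]):
                    if neighbor not in parent:
                        parent[neighbor] = node
                        queue.append(neighbor)
            for switch, hop in parent.items():
                if hop is not None:
                    rules.setdefault(switch, {})[dest] = hop
        return rules

    def on_port_status(self, a, b):
        """Handle a link-down report: update state, recompute, diff the rules."""
        self.adj[a].discard(b)
        self.adj[b].discard(a)
        new_rules = self.compute_rules()
        changed = {
            switch: new_rules.get(switch, {})
            for switch in self.adj
            if new_rules.get(switch, {}) != self.flow_tables.get(switch, {})
        }
        self.flow_tables = new_rules
        return changed  # only these switches need updated flow tables


links = [("s1", "s2"), ("s1", "s4"), ("s2", "s3"), ("s2", "s4"), ("s3", "s4")]
controller = ToyController(links)
changed = controller.on_port_status("s1", "s2")  # the S1-S2 link fails
print(sorted(changed))
print(controller.flow_tables["s1"])
```

The switches never run a routing algorithm here. They report the event, and the controller works out which flow tables actually changed and would push updates only to those switches.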
Wow.
So the intelligence is centralized, the reaction is orchestrated, and you're pushing down simple rules.
It seems much cleaner than every router trying to figure it out individually.
It is.
In many ways.
The real power is that if you want to change how routing works or implement a new security policy, you just change the application software on the controller.
You don't need to update the firmware on thousands of switches.
That agility is a huge selling point for SDN.
And it's still evolving, integrating with things like network functions virtualization, NFV, running network services like firewalls as software on standard hardware.
And people are even exploring SDN concepts for inter -AS routing, though that's complex politically.
Okay, one last area.
Actually managing these complex networks day to day, keeping the lights on.
Yeah, network management.
Hugely important.
Often unsung. You've got potentially thousands of devices, multiple layers of protocols.
It's complex.
The basic framework involves a managing server, usually in a network operations center, or NOC; the managed devices themselves, routers, switches, servers, even IoT things; the data about those devices, configuration and stats; an agent running on each device; and a network management protocol for communication.
And ICMP plays a role here, right?
The internet control message protocol.
It's like the network's little messenger service.
Exactly.
ICMP is technically part of the network layer carried inside IP datagrams.
It's used mainly for error reporting, destination unreachable, time exceeded, but also for essential diagnostic tools, like ping.
When you ping a host, your computer sends an ICMP echo request message, and the target sends back an ICMP echo reply.
Great for checking basic connectivity and round trip time.
And traceroute.
That seems almost magical, mapping the path.
Traceroute is brilliant.
It uses ICMP in a really clever way.
Your machine sends out a series of packets.
The first packet has a time-to-live, TTL, field set to one.
The first router it hits decrements the TTL to zero, discards the packet, and crucially sends back an ICMP TTL exceeded message to you.
This message includes the router's IP address.
Then your machine sends a packet with TTL two.
The first router passes it.
The second router decrements TTL to zero and sends back the ICMP message.
It keeps doing this, incrementing the TTL each time, effectively getting each router along the path to identify itself via those ICMP error messages, until the packet finally reaches the destination, which usually sends back an ICMP port unreachable message, signaling the end.
It maps the route hop by hop.
Very clever use of error messages.
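The TTL trick is easy to see in a small simulation. The Python sketch below does not send real packets (that would need raw sockets and elevated privileges); it models a path as a list of addresses, which are made-up documentation addresses, and the function names are hypothetical.

```python
def send_probe(path, ttl):
    """Simulate one probe along `path`, a non-empty list of addresses whose
    last entry is the destination. Returns (ICMP reply type, replying address)."""
    for hop_number, address in enumerate(path, start=1):
        if hop_number == len(path):
            # The probe reached the destination, which answers differently.
            return ("port_unreachable", address)
        ttl -= 1  # each router decrements TTL before forwarding
        if ttl == 0:
            return ("ttl_exceeded", address)  # router discards and reports back
    raise ValueError("path must not be empty")


def traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... and collect who answers each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply_type, address = send_probe(path, ttl)
        hops.append(address)
        if reply_type == "port_unreachable":
            break  # the destination answered, so the trace is complete
    return hops


# Assumed path: two routers, then the destination host.
path = ["192.0.2.1", "198.51.100.1", "203.0.113.10"]
print(send_probe(path, ttl=1))
print(traceroute(path))
```

Each probe is "wasted" on purpose: it is the error message coming back, not the probe arriving, that reveals the next router on the path.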
Okay, beyond diagnostics, how do operators configure devices and monitor them?
Traditionally, three main ways.
The oldest is the CLI, the command line interface: connecting directly to a device, like via SSH, and typing specific, often vendor-proprietary commands.
Works, but doesn't scale well for managing hundreds or thousands of devices.
Right, too manual.
Then there's SNMP.
Simple Network Management Protocol.
This is an application layer protocol specifically for network management.
A managing server sends SNMP requests, like get request to read a value, set request to change one, to an agent running on the managed device.
The device holds its management data in a structured database called a MIB, Management Information Base.
SNMP can also send unsolicited alerts, called traps, from the agent to the server if something critical happens, like a link failure.
Interestingly, SNMP usually runs over UDP, which is a bit surprising, but it's designed for efficiency.
Okay, SNMP sounds better, but maybe still device-focused.
Is there a more modern way?
Yes.
The book highlights NETCONF, the Network Configuration Protocol, and YANG, Yet Another Next Generation, as the modern approach, especially favored in SDN environments, but also used more broadly now.
NETCONF is focused heavily on robust configuration management.
It uses structured XML-encoded remote procedure calls, RPCs, usually over a secure connection-oriented transport, like SSH or TLS.
And YANG is the crucial data modeling language.
It allows vendors and operators to precisely define the structure, syntax, and semantics of all the configuration and operational data on a device.
It includes constraints, data types, everything needed for clarity and validation.
So YANG defines the data model, and NETCONF manipulates the data based on that model.
Exactly.
The big advantage is that NETCONF/YANG allows for more abstract, network-wide operations.
You can define configurations based on the Yang model, validate them, and apply them atomically across multiple devices.
It lets operators manage the network more as a cohesive system, rather than just poking at individual boxes.
It's much better suited for automation.
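To give a feel for what one of those XML-encoded RPCs looks like, here is a Python sketch that builds an edit-config request using only the standard library. It assumes a device that supports the candidate datastore and the standard ietf-interfaces YANG model; the function name, interface name, and description text are made up, and nothing is actually sent over SSH or TLS.

```python
import xml.etree.ElementTree as ET

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"


def build_edit_config(message_id, if_name, description):
    """Build a NETCONF <edit-config> RPC that sets an interface description.

    Assumes the target device implements ietf-interfaces and a candidate
    datastore; this only constructs the XML text, it does not send it.
    """
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": str(message_id)})
    edit = ET.SubElement(rpc, f"{{{NETCONF_NS}}}edit-config")
    target = ET.SubElement(edit, f"{{{NETCONF_NS}}}target")
    ET.SubElement(target, f"{{{NETCONF_NS}}}candidate")
    config = ET.SubElement(edit, f"{{{NETCONF_NS}}}config")
    # The payload follows the structure the YANG model defines.
    interfaces = ET.SubElement(config, f"{{{IF_NS}}}interfaces")
    interface = ET.SubElement(interfaces, f"{{{IF_NS}}}interface")
    ET.SubElement(interface, f"{{{IF_NS}}}name").text = if_name
    ET.SubElement(interface, f"{{{IF_NS}}}description").text = description
    return ET.tostring(rpc, encoding="unicode")


request_xml = build_edit_config(101, "eth0", "uplink to core")
print(request_xml)
```

The split of roles shows up directly in the code: the outer `rpc` and `edit-config` elements are NETCONF, while the shape of everything inside `config` comes from the YANG model.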
What a journey.
Okay, let's recap.
We started with the core idea of the control plane, the internet's brain.
We looked at the two main approaches, per-router control, the way OSPF and BGP run, and logically centralized control, the heart of SDN.
We dove into the fundamental routing algorithms, link state with Dijkstra and distance vector with Bellman-Ford, with their trade-offs and quirks, like oscillations and count to infinity.
Then we saw how these play out in the real world.
OSPF for routing within an autonomous system using link state principles, and BGP, the crucial inter-AS protocol, using path vectors and complex policies to glue the entire internet together, enabling things like Anycast.
We explored the paradigm shift of SDN, separating control from data, making the network programmable via controllers and protocols like OpenFlow.
And finally, we covered the essential network management tools, ICMP for diagnostics like ping and traceroute, and the evolution from CLI to SNMP and MIBs, and now to the more robust NETCONF/YANG for configuration.
It really covers the gamut, from foundational theory to cutting-edge practice.
It shows the immense complexity, but also the elegance, of the systems that make the global internet work.
And as networking experts like Jennifer Rexford often point out, the ultimate goal is often to make all this complexity disappear, to make the network as invisible as the air we breathe, just a seamless platform for innovation.
Which leads to a fascinating thought for you, our listeners.
As these control plane technologies continue to evolve, especially with the programmability and abstraction offered by SDN and approaches like NETCONF/YANG, how close are we to that vision?
How might these ongoing changes reshape our digital experience, truly making the network fade into the background, becoming an assumed, invisible enabler rather than something we ever have to think about?
It's the direction things are heading, making connectivity powerful, adaptable, but ultimately completely seamless for the end user.
That's the future we're building.
Thank you for joining us on this deep dive into the brain of the internet.
We hope you feel a bit more informed about the incredible engineering that happens behind the screen every time you connect.
ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.