Chapter 3: Transport Layer

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Have you ever like really thought about what happens when you hit send on an email

or start streaming a movie or even just browsing a webpage?

I mean, how does your device actually talk to some server,

maybe thousands of miles away and get exactly the data it needs?

Right, and how does it land in the right application window on your screen?

Exactly.

It feels kind of like magic, doesn't it?

It absolutely does, but behind that magic, there's this incredibly complex layer of the internet's architecture working away.

Constantly making sure your data finds its way.

Precisely.

And today, we're taking a deep dive into that unsung hero, the transport layer.

It's really the crucial middleman making all our daily internet stuff possible.

So our mission today is to unpack its core ideas, look at the main protocols, you know, UDP and TCP, and appreciate the architectural brilliance that lets our apps talk to the network and ultimately to each other.

Think of this as like a shortcut to understanding what's really happening under the hood of your digital life.

Hopefully you'll have some aha moments.

And maybe pick up some surprising facts along the way.

Definitely.

So let's unpack this.

At its heart, the transport layer gives us what we call logical communication.

Logical communication between application processes.

That's important.

Not just between machines, but between the specific apps running on them.

Right, so from the app's point of view, it feels like it's directly connected to the other app, even if they're on opposite sides of the planet.

Yeah, it abstracts away all the messy details of the underlying network.

The application doesn't need to worry about routers or different link types.

To make this clearer,

think about two big houses.

One on the East Coast, one on the West Coast.

And each house has, say, a dozen kids inside, cousins, and they love sending letters back and forth.

Got it, lots of mail.

Now in this analogy, the network layer is like the postal service.

Its job is just moving mail from house A to house B.

It only cares about the main address on the envelope, right?

The house address.

Exactly.

It doesn't care who the letter is for inside the house.

Okay, so where does the transport layer fit in?

Well, imagine inside each house, there's one designated person.

Let's call them Anne on the West Coast and Bill on the East.

Anne and Bill, the mail organizers.

Yep, their job is to collect outgoing letters from all their siblings, give them to the postal carrier, and when mail arrives, they sort it and deliver it to the correct cousin within their own house.

Anne and Bill are the transport layer here.

Ah, I see.

They handle the person-to-person delivery inside the house, making that logical connection between cousins.

Precisely.

And here's a really key point.

This transport layer function, this Anne and Bill role, it only exists in the end systems.

Meaning only in the houses themselves, like on your computer or the server or your phone.

Exactly.

It's not happening in the intermediate network routers, the mail trucks and sorting centers in our analogy.

Right.

Routers only care about the network layer, the house address.

They don't need to know about Anne or Bill or which cousin gets the letter.

Makes sense.

So when your application, say your email client, sends a message...

The transport layer grabs it.

It might need to break it into smaller chunks.

Then it adds a special header to each chunk.

Now, these are called segments.

Segments.

Got it.

These segments are then handed down to the network layer.

The postal service.

Right.

Which puts its own header on, turning them into IP datagrams.

Those are the envelopes ready for the journey.

And then on the receiving end, the whole process just happens in reverse.

The network layer delivers the datagram, the transport layer opens it up, gets the segment, maybe reassembles chunks and hands the original message to the right application.

And on the internet, there are basically two main ways this transport layer works.

Two main protocols,

UDP and TCP.

Each offers different kinds of service.

Okay.

This brings us to a really vital job the transport layer has, multiplexing and demultiplexing.

Sounds complicated.

Well, think about your computer right now.

You might be streaming music, browsing the web, maybe a file download is running in the background.

Multiple things happening at once.

Right.

So when all of this data comes pouring in from the network layer, how does the transport layer figure out, okay, this chunk of data is for the music player, that chunk is for the web browser?

Yeah.

How does it sort the incoming mail for the right application?

That's the problem multiplexing and demultiplexing solve.

Your applications use these things called sockets to send and receive data.

Sockets, like interfaces or doorways.

Exactly.

Think of them as numbered doors.

Each application using the network has its own door.

Okay.

So when a segment arrives at your computer,

demultiplexing is the process where the transport layer looks at some special info in the segment's header.

The destination port number, usually.

Right.

It looks at that port number to figure out which door, which socket the data should be delivered through.

It gets the data to the correct application.

Makes sense.

And multiplexing is just the opposite process at the sender's end.

Yep.

Gathering data chunks from different sockets, adding the right header information, including the destination port number so the other side knows where it's going, and handing those segments down to the network layer.

So these port numbers are the key identifiers.

They are.

They're 16-bit numbers, so they range from zero up to 65535.

They uniquely identify a specific socket, a specific door on a host machine.

And some of these numbers are reserved, right, for common applications.

Yeah, the well-known port numbers.

Like, HTTP for web traffic almost always uses port 80.

Secure Web HTTPS uses 443.

FTP, File Transfer Protocol, uses port 21.

And when your web browser connects to a server, the server is listening on port 80, but your browser gets assigned a temporary, higher-numbered port for its end of the conversation.

Exactly, an ephemeral port.

Now, UDP and TCP actually handle this demultiplexing a bit differently.

Oh, how so?

For UDP, a socket is basically identified by just two things.

The destination IP address and the destination port number.

Okay, just where it's going.

Right, which means if, say, two different clients send UDP segments to the same server IP and port,

those segments will both be delivered to the exact same socket on the server.

Wait, so how does the server app know which client sent which message?

Ah, because the UDP segment also contains the source IP and source port.

So the application can look at that information inside the segment to tell the clients apart.

But the transport layer itself just sends everything for that port to the one socket.

Okay, that's UDP, simple enough.

How's TCP different?

TCP is more specific.

A TCP socket isn't just identified by the destination IP and port.

It's identified by a four-tuple.

Four things, what are they?

Source IP address, source port number, destination IP address, and destination port number.

All four together.

Ah, so it includes the sender's info in the socket identification itself.

Precisely, and this is absolutely crucial.

It's what allows a busy web server listening on port 80 to handle connections from thousands of different clients simultaneously.

How does the four-tuple enable that?

Because when client A connects to the web server on port 80, the server creates a new socket specifically for that connection.

That socket is defined by client A's IP address and port, plus the server's IP address and port 80.

Okay.

When client B connects to the same server on port 80, the server creates another completely separate socket because client B has a different source IP or source port.

So even though both clients are talking to port 80, the server uses the unique four-tuple to keep each conversation tied to its own dedicated socket.

Exactly.

It keeps all those simultaneous HTTP sessions perfectly distinct.

It's not just one big funnel into port 80.

It's many separate parallel conversations managed by distinct sockets.
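To make the contrast concrete, here is a minimal sketch of the two lookup rules described above. The table names, socket labels, and IP addresses are hypothetical and chosen only for illustration; real operating systems use more elaborate structures.

```python
# Hypothetical demultiplexing tables (illustrative only).
udp_sockets = {}  # (dst_ip, dst_port) -> socket label
tcp_sockets = {}  # (src_ip, src_port, dst_ip, dst_port) -> socket label


def demux_udp(src_ip, src_port, dst_ip, dst_port):
    """UDP: only the destination IP and port select the socket."""
    return udp_sockets.get((dst_ip, dst_port))


def demux_tcp(src_ip, src_port, dst_ip, dst_port):
    """TCP: the full four-tuple selects the connection socket."""
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))


# One UDP socket on the server, shared by every client that sends to port 53.
udp_sockets[("10.0.0.1", 53)] = "dns_socket"

# Two TCP clients talking to port 80 each get their own socket.
tcp_sockets[("192.0.2.10", 50001, "10.0.0.1", 80)] = "conn_client_a"
tcp_sockets[("192.0.2.20", 50001, "10.0.0.1", 80)] = "conn_client_b"

print(demux_udp("192.0.2.10", 40000, "10.0.0.1", 53))  # same socket for any sender
print(demux_udp("192.0.2.20", 40001, "10.0.0.1", 53))
print(demux_tcp("192.0.2.10", 50001, "10.0.0.1", 80))  # one socket per client
print(demux_tcp("192.0.2.20", 50001, "10.0.0.1", 80))
```

Note that the UDP lookup ignores the source fields entirely, which is why the application itself has to read them from the segment to tell clients apart.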

That makes a lot more sense for something like a web server.

And this whole idea of ports, it actually has some real-world security implications, doesn't it?

Like port scanning.

Understanding ports is fundamental to port scanning.

Tools like Nmap are used for this all the time.

Right.

Administrators use it to check their own systems, see what services are running, what ports are open.

And unfortunately, attackers use it too to probe for vulnerabilities.

So how does something like Nmap work basically?

Well, in its simplest form, it sends TCP SYN segments, the first step of that handshake we'll talk about later, to a range of ports on a target machine.

Okay.

If it gets a SYNACK response back from a port.

That means something's listening.

The port is open.

Exactly.

If it gets back a RST segment, a reset.

Nothing there, port's closed.

Right.

And if it gets nothing back at all.

Maybe a firewall is blocking it.

Could be.

So yeah, just by sending these simple probes and watching the replies, you can map out the open doors on a machine.

It's a very practical use of understanding these transport layer details.
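The three outcomes just described can be summarized as a small decision rule. This sketch sends no packets; it only classifies a reply that some probing tool is assumed to have already observed. The function name and the string labels are hypothetical.

```python
def classify_port(reply):
    """Map the reply to a SYN probe onto a port state.

    reply is "SYNACK", "RST", or None when nothing came back
    before the prober gave up (labels are assumptions for this sketch).
    """
    if reply == "SYNACK":
        return "open"      # something is listening on the port
    if reply == "RST":
        return "closed"    # host is reachable, but nothing is listening
    if reply is None:
        return "filtered"  # possibly dropped by a firewall
    return "unknown"


for observed in ("SYNACK", "RST", None):
    print(observed, "->", classify_port(observed))
```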

Fascinating.

So we see how data gets sorted, but you mentioned UDP and TCP offer different services.

Let's start with UDP.

You said it's simpler.

Yeah, UDP, the User Datagram Protocol, is often called the no-frills protocol.

It does pretty much the bare minimum beyond that multiplexing and demultiplexing we just talked about.

And maybe some very basic error checking.

And it's connectionless, right?

Totally connectionless.

No setup handshake needed before sending data.

You just create a segment and fire it off towards the destination.

It's very lightweight.

Okay, so why would you choose UDP

if TCP offers more features, like reliability?

Well, there are several good reasons.

First, applications get much finer control over when data gets sent.

What do you mean?

TCP has built-in congestion control, which we'll get into.

It might deliberately slow down sending if it thinks the network is busy.

UDP doesn't have that.

The application just says send, and UDP sends.

Ah, so for real-time things, like maybe online gaming or voice calls, where delay is worse than occasionally losing a tiny bit of data.

Exactly.

Some data loss might be acceptable, but lag is killer.

So bypassing TCP's potential delays is a big plus.

Makes sense.

What else?

No connection establishment delay.

TCP needs that three-way handshake to set things up.

UDP just sends.

For applications that do quick back-and-forth transactions like, say, DNS lookups.

Domain Name System, yeah.

Turning website names into IP addresses.

Right.

If every single DNS query required a full TCP handshake first, the web would feel noticeably slower.

UDP's speed here is crucial.

Okay, faster setup.

Anything else?

No connection state needs to be maintained by the server for UDP.

What does that mean?

Well, a TCP server has to keep track of sequence numbers, buffers, timers for every single active connection.

That takes memory and processing power.

A UDP server doesn't track connections like that.

So a single UDP server can handle way more active clients at the same time, theoretically.

Because it's not burdened with tracking each one individually.

Exactly.

Much more scalable in that sense.

And finally, there's small header overhead.

A UDP header is only eight bytes.

TCP's header is 20 bytes minimum.

Less overhead means slightly more room for actual data.

So UDP is good for DNS, network management stuff maybe, and definitely multimedia streaming, internet phone, video conferencing, applications where speed is key and they can handle a bit of loss or maybe build their own reliability on top if needed.

Correct.

But there's a big catch with UDP.

Let me guess, the lack of reliability.

That's part of it, but also its lack of congestion control.

Ah, you mentioned TCP has that.

UDP doesn't try to slow down if the network is clogged.

Nope.

It just keeps blasting data at whatever rate the application tells it to.

Which sounds great for that app, but.

But if too many UDP applications do that, they could just overwhelm the network, right?

And crowd out the more polite TCP traffic that is trying to back off.

Precisely.

Uncontrolled high bitrate UDP flows could potentially lead to what's called congestion collapse, where the network gets so jammed up with lost packets and retransmissions that useful throughput drops dramatically for everyone.

It's a real danger.

So using UDP comes with a responsibility for the application designer to be somewhat network friendly.

Ideally, yes.

Now, the UDP segment structure is super simple.

Just four fields.

What are they?

Source port, destination port.

We know those for multiplexing, demultiplexing.

Then there's a length field saying how long the segment is in bytes.

Okay.

And finally a checksum.

Ah, the basic error checking you mentioned.

How does that work?

It's a mathematical calculation done over the header and data.

The sender calculates it, puts it in the header.

The receiver recalculates it.

If the results don't match, it means some bits probably got flipped during transmission.

The segment is corrupted.

Right.

It's just for detection though.

UDP itself doesn't try to fix the error.

It usually just discards the bad segment.

Maybe notifies the application.
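As a rough illustration of the four header fields and the checksum, here is a sketch using the standard 16-bit one's-complement Internet checksum. It is simplified on purpose: a real UDP checksum also covers a pseudo-header containing IP addresses, which is omitted here. The function names are hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # wrap any carry around
    return ~total & 0xFFFF


def build_udp_segment(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Pack source port, destination port, length, checksum, then the data."""
    length = 8 + len(payload)  # the header is always 8 bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = internet_checksum(header + payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def is_intact(segment: bytes) -> bool:
    """Receiver side: summing everything, checksum included, must give zero."""
    return internet_checksum(segment) == 0


segment = build_udp_segment(50000, 53, b"hello")
corrupted = bytes([segment[0] ^ 0x01]) + segment[1:]  # flip a single bit
print(len(segment), is_intact(segment), is_intact(corrupted))
```

The check only detects corruption. Nothing in this sketch attempts to repair the segment, which matches the detect-and-discard behavior described above.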

But wait, don't lower layers, like Ethernet, also have checksums?

Why does UDP need one too?

That's a great question.

It goes back to something called the end-to-end principle in system design.

Okay.

The idea is while a specific link like Ethernet might check for errors on that link, errors could still happen between links.

Like data could get corrupted while sitting in a router's memory buffer before being sent out on the next link.

Ah, so the checks at the lower layers don't guarantee perfect data integrity all the way from the source application to the destination application.

Exactly.

So having an end-to-end checksum at the transport layer provides an extra layer of protection, ensuring data integrity across the entire path.

Got it.

So UDP gives us speed and low overhead, but no reliability guarantees and no congestion control.

Pretty much.

Which leads us to the big question.

How do we build reliable data transfer on top of an inherently unreliable network layer, like IP, which can lose, corrupt, or reorder packets?

This is where things get more complex, I imagine.

Definitely.

This is the core problem that protocols like TCP solve using mechanisms often grouped under the term

ARQ protocols, for automatic repeat request.

ARQ.

Okay, what's involved?

Several key ingredients.

First, error detection, like that checksum we just discussed.

You need a way to tell if received data is corrupted.

Makes sense.

Second, receiver feedback.

The receiver needs to tell the sender what happened.

Usually this involves sending positive acknowledgements or ACKs for data received correctly.

Yep, got packet number five.

Right.

Sometimes protocols might also use negative acknowledgements, NAKs, saying, hey, packet number six looked corrupted.

Okay, ACKs and maybe NAKs.

Third, retransmission.

If the sender doesn't get an ACK back for a packet after a certain amount of time, or if it gets a NAK, it needs to send that packet again.

Assuming it got lost or damaged.

Exactly.

And fourth, sequence numbers.

Each packet needs a unique identifier so the receiver can tell if it's received a duplicate or if packets have arrived out of order.

Right, to put things back in the correct sequence.

So using these building blocks, error detection, feedback, retransmission, sequence numbers, you can construct a reliable data transfer protocol.

Early versions of these protocols were quite simple, right?

Like stop and wait.

Yeah, the most basic reliable protocol is stop and wait.

The sender sends one packet, then stops.

And waits.

And waits for the ACK for that specific packet before sending the next one.

Super simple, guarantees order, handles loss through timeouts and retransmissions.

Seems reliable.

It is functionally reliable, but oh boy, is it slow.

The performance is terrible, especially over networks with long delays.

Why is that?

Because the sender spends most of its time just sitting idle, waiting for that ACK to travel back across the network. The channel utilization is abysmal.

Can you give an example?

Sure.

Remember our one gigabit per second link between the US coasts?

That's a huge pipe.

Round trip time, let's say, is 30 milliseconds.

Okay.

Now imagine you're sending relatively small packets, maybe a thousand bytes each.

How long does it actually take to transmit a thousand bytes over a 1 Gbps link?

Very little time.

A thousand bytes is 8,000 bits.

1 Gbps is one billion bits per second.

So 8,000 divided by one billion, that's eight microseconds.

Tiny.

Exactly.

Eight microseconds to push the packet onto the wire, but then you have to wait the full 30 milliseconds, which is 30,000 microseconds, for the ACK to come back before you can send the next packet.

Oh wow.

So you transmit for eight microseconds, then wait for 30,000 microseconds.

Yep.

Do the math on the utilization.

It's something like eight divided by 30,000.

It's about 0.00027, or 0.027%.

Less than one tenth of one percent utilization of your expensive gigabit link.

That's dismal.

Dismal is the right word.

Your effective throughput isn't 1 Gbps.

It's more like 267 kilobits per second.

It's a massive bottleneck caused entirely by the stop and wait protocol.
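The arithmetic above can be checked with a few lines of code. This sketch assumes the figures from the example (1,000-byte packets, a 1 Gbps link, a 30 ms round trip) and ignores the size of the ACK itself. The function name is hypothetical.

```python
def stop_and_wait(packet_bits, link_bps, rtt_s):
    """Return (transmit time, utilization, throughput) for stop-and-wait."""
    t_transmit = packet_bits / link_bps   # time to push one packet onto the wire
    cycle = rtt_s + t_transmit            # one packet is sent per cycle
    utilization = t_transmit / cycle      # fraction of time the sender is busy
    throughput_bps = packet_bits / cycle  # useful bits delivered per second
    return t_transmit, utilization, throughput_bps


t, u, thr = stop_and_wait(packet_bits=8_000, link_bps=1e9, rtt_s=0.030)
print(f"transmit time: {t * 1e6:.1f} microseconds")
print(f"utilization:   {u * 100:.4f} %")
print(f"throughput:    {thr / 1e3:.1f} kbps")
```

Dividing by the full cycle (round trip plus transmit time) rather than by the round trip alone changes the result only slightly, which is why the rough "eight divided by 30,000" estimate works.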

Okay, so stop and wait is clearly not practical for high speed long distance networks.

What's the solution?

The solution is pipelining.

Pipelining.

Like sending multiple packets down the pipe at once.

Exactly.

Instead of sending just one packet and waiting, the sender is allowed to send multiple packets, maybe packet one, then two, then three, before getting the ACK back for packet one.

Ah, so you fill the pipe.

Keep the network link busy transmitting data while earlier packets and their ACKs are still in transit.

Precisely.

This dramatically increases utilization and throughput.

But it must make things more complex, right?

If you have multiple packets in flight at once.

It does.

You definitely need a wider range of sequence numbers to keep track of all those packets.

And both the sender and receiver need to be able to buffer more packets.

The sender needs to remember packets it has sent but hasn't received ACKs for yet, in case they need retransmitting.

The receiver might get packets out of order and needs to buffer them.

Okay, so if you're pipelining and an error does occur, say packet three gets lost, but packets four, five, and six arrive.

Okay, how do you recover?

Good question.

There are two main approaches to handling errors in pipelined protocols.

Go-Back-N, GBN, and selective repeat, SR.

Go-Back-N and selective repeat.

Okay, what's Go-Back-N?

In Go-Back-N, the sender is allowed to have up to N unacknowledged packets in the pipeline at any time.

That's its window.

Okay, a window of size N.

It sends packets one, two, three up to N.

Now let's say packet number K gets lost.

The receiver will get packet K minus one, then maybe K plus one, K plus two, et cetera.

Out of order.

Right.

With GBN, the receiver typically just discards any out of order packets, K plus one, K plus two.

It keeps sending ACKs only for the last packet it received correctly in order, which was K minus one.

It just throws away perfectly good packets.

Simplicity at the receiver.

It doesn't need complex buffering for out of order stuff.

Meanwhile, the sender has a timer running for the oldest unacknowledged packet, packet K.

When that timer expires.

It assumes packet K was lost.

Right, and with Go-Back-N, the sender's response is blunt.

It retransmits packet K and packets K plus one, K plus two all the way up to the last packet it had sent within its window N.

Whoa, so it goes back to packet K and resends everything from that point onwards, even if packets K plus one, K plus two, et cetera, actually arrived okay the first time, but were discarded by the receiver.

Yep, it's like saying, okay, something went wrong around packet K, let's just resend the whole batch starting from K to be sure.

It's potentially wasteful of bandwidth, but simpler to implement.
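Here is a small sketch of the Go-Back-N behavior just described: a receiver that discards anything out of order and keeps re-ACKing the last in-order packet, and a sender that resends everything from the lost packet onward. Packet numbers and function names are hypothetical, and timers and window limits are left out for brevity.

```python
def gbn_receive(arrivals, first_expected=1):
    """Go-Back-N receiver: deliver in-order packets, discard the rest."""
    expected = first_expected
    delivered, acks = [], []
    for seq in arrivals:
        if seq == expected:
            delivered.append(seq)
            expected += 1
        # Anything else is thrown away, even if it arrived intact.
        acks.append(expected - 1)  # cumulative ACK for the last in-order packet
    return delivered, acks


def gbn_retransmit(lost, last_sent):
    """On timeout for `lost`, resend it and every later packet already sent."""
    return list(range(lost, last_sent + 1))


# Packets 1..6 are sent and packet 3 is lost in transit.
delivered, acks = gbn_receive([1, 2, 4, 5, 6])
print(delivered)             # only packets 1 and 2 reach the application
print(acks)                  # the receiver keeps acknowledging packet 2
print(gbn_retransmit(3, 6))  # the sender resends 3, 4, 5 and 6
```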

Okay, that's Go-Back-N.

What about the other one?

Selective repeat sounds more selective.

It is. Selective repeat, SR, tries to be much more efficient.

When the sender's timer for packet K expires, it only retransmits packet K.

Just the one it suspects is lost, makes sense.

And the receiver's behavior is different too.

If it receives packets K plus one, K plus two before K, it doesn't discard them.

It buffers them, holds onto them.

Exactly, it buffers the out of order packets, and critically, it sends an individual ACK back for each correctly received packet, even if it's out of order.

So it would ACK K plus one and K plus two, even though K is still missing.

Right, this tells the sender, hey, I got K plus one and K plus two, okay?

You don't need to resend those, but I'm still waiting for K.

Ah, much smarter.

Then when the retransmitted packet K finally arrives.

The receiver can fill the gap in its buffer.

Now it has K, K plus one, K plus two, and it can deliver that whole sequence of in order packets up to the application.
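For comparison, a minimal sketch of a selective repeat receiver: it ACKs each packet individually, buffers out-of-order arrivals, and delivers a contiguous run once the gap is filled. The function name is hypothetical, and the receive window bounds that a real implementation must enforce are deliberately omitted.

```python
def sr_receive(arrivals, first_expected=1):
    """Selective repeat receiver: buffer out-of-order packets, ACK each one."""
    expected = first_expected
    buffered = set()
    delivered, acks = [], []
    for seq in arrivals:
        acks.append(seq)  # individual ACK, even when out of order
        if seq >= expected:
            buffered.add(seq)
        # Deliver every packet that is now contiguous with what came before.
        while expected in buffered:
            buffered.remove(expected)
            delivered.append(expected)
            expected += 1
    return delivered, acks


# Packet 3 is lost at first; 4, 5 and 6 arrive, then the retransmitted 3.
delivered, acks = sr_receive([1, 2, 4, 5, 6, 3])
print(delivered)  # everything is delivered in order once 3 fills the gap
print(acks)       # each packet was acknowledged on arrival
```

Only packet 3 had to be sent twice here, which is the bandwidth saving over Go-Back-N; the cost is the extra buffering logic at the receiver.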

Selective repeat sounds way more efficient, especially if losses are frequent or bandwidth is precious.

It generally is, but it requires more complexity, particularly at the receiver with buffering and managing individual ACKs for potentially out of order segments.

There's also a subtle issue with sequence number wraparound if the window size isn't chosen carefully relative to the sequence number space, but that's a deeper detail.

Okay, so GBN is simple, but potentially wasteful.

SR is efficient, but more complex, which leads us perfectly into the star of the show.

TCP, the transmission control protocol, the internet's heavy duty reliable workhorse.

Where does it fit in?

Is it like GBN or SR?

It's actually kind of a hybrid, borrowing ideas from both, but we'll get to that.

First,

unlike UDP, TCP is fundamentally connection-oriented.

Meaning it has to establish a connection before sending data, like making a phone call before you talk.

Exactly.

Before any application data can flow, the TCP entities on the two end systems must handshake.

They exchange some initial control segments to establish the connection and agree on starting parameters, like initial sequence numbers.

They set up some state for this specific conversation.

Right, but it's important to remember this is a logical connection.

The intermediate routers in the network core, they have no idea about these TCP connections.

They just forward datagrams based on IP addresses.

The connection state only exists in the two end systems.

Got it, end to end connection.

TCP also provides a full duplex service.

Data can flow in both directions over the same connection simultaneously.

Like a two way street.

Yep, and it's always point to point.

One sender, one receiver.

TCP doesn't directly support multicasting, sending to multiple receivers at once.

Okay, so how does this connection setup, this handshake, actually work?

It's famously a three-way process, right?

It's a three-way handshake, and it works like this.

Step one.

The client machine wants to connect.

It sends a special TCP segment with the SYN flag set.

SYN stands for synchronize.

This segment contains a randomly chosen initial sequence number for the client's byte stream.

Let's call it client_isn.

Okay, client sends SYN with its starting number.

What does the server do?

The server receives the SYN.

If it's willing to accept the connection, it allocates some resources, like buffers and variables for this connection, and sends back its own special TCP segment.

This one has both the SYN flag set and the ACK flag set.

The SYNACK segment.

Right, the SYN part contains the server's own randomly chosen initial sequence number, server_isn.

The ACK part acknowledges the client's SYN by setting the acknowledgement number field to client_isn plus one.

So it confirms the client's SYN and proposes its own starting number.

Step two done, what's step three?

The client receives the SYNACK.

It also allocates resources for the connection.

Then it sends one more segment back to the server.

This segment has the ACK flag set, and the acknowledgement number is set to server_isn plus one, confirming receipt of the server's SYN.

Client acknowledges the server's SYN, and now the connection is established.

Yes, after the server receives this final ACK, both sides know the connection is up, and they're ready to exchange application data.

Interestingly, this final ACK segment from the client can actually carry the first piece of application data if needed.

It doesn't have to be empty.

It's like a confirmation call.

You ready?

Yeah, I'm ready, you ready?

Yeah, I'm ready too, let's go.

Yeah, something like that.

It ensures both sides are set up before data really starts flowing.
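The numbers exchanged in the three steps can be laid out in a short sketch. It only models the sequence and acknowledgement fields as dictionaries; no real sockets are involved, and the function and key names are hypothetical.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit values


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the SYN, SYNACK and ACK segments as simple dictionaries."""
    if client_isn is None:
        client_isn = random.randrange(SEQ_SPACE)
    if server_isn is None:
        server_isn = random.randrange(SEQ_SPACE)

    # Step 1: the client proposes its initial sequence number.
    syn = {"flags": {"SYN"}, "seq": client_isn}
    # Step 2: the server acknowledges it and proposes its own.
    synack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,
    }
    # Step 3: the client acknowledges the server's initial sequence number.
    ack = {
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,
    }
    return syn, synack, ack


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(sorted(segment["flags"]), segment["seq"], segment.get("ack"))
```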

Okay, connection established.

How does data flow then?

TCP takes the stream of data from the application, breaks it into chunks.

Limited by the maximum segment size, MSS, right?

Which is usually derived from the underlying network's MTU, like maybe 1460 bytes for Ethernet.

Exactly, it takes a chunk no larger than the MSS, sticks a TCP header on it to make a segment, and hands it down to the IP layer for delivery.

And that TCP header contains some crucial info.

We mentioned source and destination ports.

What else is key?

The 32-bit sequence number field is vital, and remember TCP sequence numbers count bytes in the data stream, not segments.

Right, so if you're sending a 5,000-byte file with an MSS of 1,000 bytes.

Your first segment might have sequence number zero, representing bytes 0 through 999.

The second segment would have sequence number 1,000, bytes 1,000 through 1,999, the third 2,000, and so on.

Okay, counts bytes.
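A short sketch of that byte-counting rule, using the 5,000-byte file and 1,000-byte MSS from the example. The function name is hypothetical, and the first sequence number is taken to be zero as in the dialogue, whereas a real connection starts from a random initial value.

```python
def segment_ranges(total_bytes, mss, first_seq=0):
    """Return (sequence number, last byte number) for each segment."""
    ranges = []
    seq = first_seq
    end = first_seq + total_bytes
    while seq < end:
        last = min(seq + mss, end) - 1  # the final segment may be shorter
        ranges.append((seq, last))
        seq = last + 1                  # next segment starts at the next byte
    return ranges


for seq, last in segment_ranges(total_bytes=5000, mss=1000):
    print(f"seq {seq}: bytes {seq} through {last}")
```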

What about acknowledgements?

There's the 32-bit acknowledgement number field.

This is how the receiver tells the sender which byte it's expecting next.

The next expected byte, so it's cumulative.

Yes, TCP uses cumulative acknowledgements.

If the receiver sends back an ACK with acknowledgement number Y, it implicitly confirms that it has received all data bytes correctly and in order up to byte Y minus one.

So one ACK can acknowledge multiple segments worth of data.

That seems efficient.

It is.

There are other important fields too, like the receive window, or rwnd, field for flow control, and various flag bits like SYN, ACK, FIN for closing connections, PSH for push, and RST for reset.

We'll probably touch on rwnd later with flow control.

Can you walk through a quick example of sequence and ACK numbers, like maybe typing a single character in a Telnet session?

Sure, let's say you typed the letter C on your client connected to a Telnet server.

Your client's TCP might send a segment containing just that one byte of data, C.

Let's say the client's sequence number for this byte is 42.

Segment: Seq 42, data C.

The server receives the segment.

It needs to acknowledge it.

It sends back an ACK segment.

What would the acknowledgement number be?

Well, it received byte 42, so it's expecting byte 43 next.

ACK number should be 43.

Perfect, ACK 43.

Now maybe the server also needs to echo that character C back to your screen, so its ACK segment might also contain the C as data.

Let's say the server's sequence number for its byte stream is currently 79.

Okay, so the server sends Seq 79, ACK 43, data C.

It's acknowledging the client's data and sending its own data in the same segment,

piggybacking the ACK.

Exactly.

Now the client receives this.

It sees the data C, which it displays, and it sees the ACK 43, confirming its own byte 42 was received.

Now the client needs to acknowledge the server's byte 79.

So the client sends back an ACK segment. It received byte 79, so it expects 80 next: ACK 80.

Right, and maybe the client has no data to send right now, so this segment just contains ACK 80, and so it goes back and forth, sequence numbers incrementing with data sent, acknowledgement numbers confirming data received.

It's a constant dialogue tracking the byte streams in both directions.
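The Telnet exchange boils down to one rule: the acknowledgement number is the sequence number of the received segment plus the number of data bytes it carried. A minimal sketch with the numbers from the example follows; the helper name and dictionary layout are hypothetical.

```python
def ack_for(seq, data):
    """Next byte expected after receiving `data` starting at `seq`."""
    return seq + len(data)


# Client types 'C': one byte of data at sequence number 42.
client_segment = {"seq": 42, "data": b"C"}

# Server echoes 'C' from its own stream (sequence number 79) and
# piggybacks the acknowledgement of the client's byte.
server_segment = {
    "seq": 79,
    "ack": ack_for(client_segment["seq"], client_segment["data"]),
    "data": b"C",
}

# Client has nothing more to send, so this segment only carries the ACK.
client_ack = {
    "seq": server_segment["ack"],
    "ack": ack_for(server_segment["seq"], server_segment["data"]),
    "data": b"",
}

print(server_segment["ack"], client_ack["seq"], client_ack["ack"])
```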

Precisely.

Now, crucial to TCP's reliability is its timeout and retransmission mechanism.

If an ACK isn't received for a sent segment within a certain time.

It retransmits.

Yeah.

But how does it determine that timeout value?

Network delays fluctuate wildly.

That's a huge challenge.

Setting the timeout too short leads to unnecessary retransmissions.

Setting it too long makes recovery from packet loss slow.

So what does TCP do?

It dynamically estimates the round trip time, RTT,

based on measurements of how long it takes to get ACKs back for segments that weren't retransmitted.

That's the SampleRTT.

Okay, measures the RTT.

But it doesn't just use the latest SampleRTT.

It calculates a smoothed average called EstimatedRTT, usually using an exponential weighted moving average, EWMA.

This gives more weight to recent samples, but smooths out temporary fluctuations.

Like averaging your speed over the last few minutes, not just the last second.

Good analogy.

It also tracks the variation in the RTT, DevRTT, again using an EWMA, to get a sense of how jittery the delay is.

So it knows the average delay and how much it tends to bounce around.

Right, and the actual timeout interval is typically calculated as EstimatedRTT plus four times DevRTT.

The four times the deviation adds a safety margin.

Makes sense.

Adapt the timeout based on measured conditions, with a buffer for variations.
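The smoothed average, the deviation, and the timeout formula can be sketched as follows. The weights 0.125 and 0.25 and the initialization of the deviation to half the first sample are not stated in the discussion above; they are the commonly recommended values and are assumptions here, as is the class name.

```python
ALPHA = 0.125  # weight of a new sample in the smoothed RTT (assumed value)
BETA = 0.25    # weight of a new sample in the deviation (assumed value)


class RttEstimator:
    """Tracks a smoothed RTT and its deviation with two EWMAs."""

    def __init__(self, first_sample):
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed initialization

    def update(self, sample_rtt):
        # Update the deviation first, using the previous smoothed estimate.
        self.dev_rtt = (1 - BETA) * self.dev_rtt + BETA * abs(
            sample_rtt - self.estimated_rtt
        )
        self.estimated_rtt = (1 - ALPHA) * self.estimated_rtt + ALPHA * sample_rtt

    def timeout_interval(self):
        # Smoothed RTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator(first_sample=0.100)
for sample in (0.200, 0.110, 0.105):
    estimator.update(sample)
    print(f"estimate {estimator.estimated_rtt:.4f} s, "
          f"timeout {estimator.timeout_interval():.4f} s")
```

A single slow sample (200 ms here) nudges the estimate only a little but widens the deviation, so the timeout grows mostly through the safety margin.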

And there's another important behavior.

When a timeout does occur and TCP retransmits a segment, it usually doubles the timeout interval for subsequent retransmissions of that same segment.

Doubles it, why?

It's a basic form of congestion control.

A timeout likely means the network is congested.

Doubling the interval causes TCP to back off more aggressively, reducing its sending rate exponentially if repeated losses occur.

Okay,

so timeouts are expensive and trigger a strong backoff.

But you mentioned earlier, there's a faster way to recover sometimes.

Yes, waiting for a full timeout can be really slow.

TCP has a brilliant optimization called fast retransmit.

Fast retransmit, how does that work?

Remember cumulative ACKs.

If a segment, say number 100 is lost, but subsequent segments, 101, 102, 103 do get there.

The receiver gets 101, 102, 103 out of order.

What ACK does it send?

Since it's still waiting for segment 100, or rather the bytes starting at sequence number 100, it keeps sending ACKs for the last in-order data it received.

Let's say that was segment 99.

So it sends ACK 100 when it gets 101, then it gets 102, sends ACK 100 again, gets 103, sends ACK 100 again.

It keeps repeating the same ACK number because there's a gap.

Exactly.

These repeated ACKs for the same sequence number are called duplicate ACKs.

And the rule for fast retransmit is, if the sender receives three duplicate ACKs for the same data, meaning four ACKs in total for the same sequence number, the original one, plus three duplicates, it takes this as a very strong hint that the segment immediately following that acknowledged data, segment 100 in our example, must have been lost.

Without waiting for the timer to expire.

Correct.

Upon receiving the third duplicate ACK, the sender immediately retransmits the missing segment, 100.

This often happens much, much faster than waiting for the full retransmission timeout.

It significantly speeds up recovery from single packet losses.
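
As a rough sketch of the sender-side rule, the function below counts duplicate ACKs and reports which sequence numbers would be fast-retransmitted. The name `fast_retransmits` is hypothetical, and a real sender would also track timers and window state.

```python
def fast_retransmits(ack_numbers, dup_threshold=3):
    """Return the sequence numbers a sender would fast-retransmit.

    ack_numbers is the stream of cumulative ACK numbers as they arrive.
    """
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in ack_numbers:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, on the third duplicate (fourth ACK overall).
            if dup_count == dup_threshold:
                retransmitted.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmitted


# ACK 100 arrives once normally, then three more times as a duplicate.
print(fast_retransmits([98, 99, 100, 100, 100, 100]))
```

Two duplicates are not enough: some reordering in the network is normal, so the threshold of three keeps the sender from retransmitting on every small shuffle.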

Wow, that's clever.

Using the pattern of incoming ACKs to infer loss quickly.

It really is.

So looking at TCP's overall reliability mechanism, you can see it's a sort of hybrid.

It uses cumulative ACKs, which feels a bit like Go-Back-N.

But with fast retransmit, it achieves something closer to selective retransmission for single losses, kind of like selective repeat.

And most modern TCP receivers do buffer out of order segments.

Selective acknowledgment, or, you know, SACK options help with this too.

Further aligning it with SR principles.

Exactly.

It's a sophisticated blend tailored for the internet.

Now, thinking about that three-way handshake, it actually creates a security vulnerability, doesn't it?

Ah, yes, the SYN flood attack.

A classic denial of service.

How does it exploit the handshake?

Remember, when the server receives that initial SYN from a client, step one, it allocates resources, buffers, state variables, in anticipation of the connection being completed.

Step two, sending the SYNACK.

Okay, it sets aside resources.

In a SYN flood, an attacker sends a massive barrage of SYN packets to the server, often from spoofed, non-existent IP addresses.

So the server gets all these SYNs.

And for each one, it dutifully allocates resources and sends back a SYNACK, step two.

But because the source IPs are fake or unreachable, the attacker never sends the final ACK, step three, to complete the handshake.

Ah, so the server is left holding onto resources for thousands of these half-open connections that will never be completed.

Exactly.

Eventually, the server's tables fill up, or it runs out of memory for new connections.

Legitimate users trying to connect get denied because the server's overwhelmed by these bogus half-open connections.

Nasty.

How do systems defend against this?

A very effective defense is using SYN cookies.

SYN cookies?

Like browser cookies?

Not quite, but similar concept.

With SYN cookies enabled, when a server receives a SYN, it does not allocate any state right away.

Oh, so it doesn't use up resources.

Not yet.

Instead, it crafts a special, cleverly calculated initial sequence number to send back in the SYNACK, step two.

This sequence number, the cookie, is generated using a cryptographic hash of the client's IP address and port, the server's IP address and port, and a secret value known only to the server.

Okay, sends back a special SYNACK with this computed cookie as its sequence number, but remembers nothing.

Right.

Now, if the SYN came from a legitimate client, that client will receive the SYNACK and send back the final ACK, step three, acknowledging the cookie sequence number, specifically sending cookie plus one as the acknowledgment number.

When the server receives this ACK, it can recompute what the cookie should have been based on the client's IP address and port, which are in the ACK packet, and its secret.

If the acknowledged value, cookie plus one, matches the recomputation, the server knows this ACK is a valid response to its SYNACK.

Only then does it allocate the full connection state and establish the connection.

Ah, so it delays resource allocation until it gets proof, the correct ACK, that the client is real and received the SYNACK.

If the ACK never comes back, like from a spoofed SYN flood source, the server hasn't wasted any resources.

Exactly, it's a brilliant way to validate the client before committing server resources, effectively neutralizing the standard SYN flood attack.
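
The cookie idea can be sketched with Python's standard `hmac` module. This is a simplified illustration, not how any particular operating system builds its cookies: production implementations also fold in a coarse timestamp and encode the negotiated MSS. The secret, the addresses, and the function names are all placeholders.

```python
import hashlib
import hmac

SECRET = b"server-only-secret"  # placeholder; a real server keeps this private


def make_cookie(client_ip, client_port, server_ip, server_port, secret=SECRET):
    """Derive a 32-bit initial sequence number from the connection 4-tuple."""
    msg = f"{client_ip}:{client_port}->{server_ip}:{server_port}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(ack_number, client_ip, client_port, server_ip, server_port,
                 secret=SECRET):
    """Recompute the cookie from the ACK's addresses; no stored state needed."""
    expected = (make_cookie(client_ip, client_port, server_ip, server_port,
                            secret) + 1) % 2**32
    return ack_number == expected


# Step two: the server sends the cookie as its sequence number, stores nothing.
cookie = make_cookie("203.0.113.7", 51000, "198.51.100.1", 443)
# Step three: a real client echoes cookie + 1, which the server can verify.
print(ack_is_valid((cookie + 1) % 2**32, "203.0.113.7", 51000, "198.51.100.1", 443))
```

Because the server can always recompute the cookie from the incoming packet and its secret, it has nothing to remember between step two and step three.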

Very clever.

Okay, shifting gears slightly.

We often hear about flow control and congestion control in TCP.

They sound similar, both involve managing the sending rate, but they're actually solving different problems, right?

Absolutely, it's a crucial distinction.

Flow control is about preventing the sender from overwhelming the receiver.

So matching the sender speed to how fast the receiving application can actually pull data out of its receive buffer.

Precisely.

Imagine the receiver has a buffer to hold incoming data before the application reads it.

If the sender sends data faster than the application consumes it, that buffer will overflow and data will be lost.

Flow control prevents that.

How does TCP do flow control?

Through that receive window field, rwnd, we mentioned in the TCP header.

The receiver constantly tells the sender how much free buffer space it currently has available by putting that value in the rwnd field of the segments it sends back.

So the receiver advertises its available space.

Right, and the sender tracks this advertised rwnd value and makes sure it never has more unacknowledged data in flight than the receiver's last reported available window size.

It throttles itself based on the receiver's capacity.
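
The sender's check boils down to one subtraction. Here is a minimal sketch; the variable names follow common textbook usage and are assumptions, not names taken from this discussion.

```python
def sendable_bytes(last_byte_sent, last_byte_acked, rwnd):
    """How many more bytes the sender may transmit right now.

    The sender keeps (last_byte_sent - last_byte_acked) <= rwnd.
    """
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


# 2000 bytes are unacknowledged and the receiver advertised 4096 bytes free.
print(sendable_bytes(last_byte_sent=5000, last_byte_acked=3000, rwnd=4096))
```

If the receiver's application stops reading, the advertised window shrinks toward zero and this function returns 0, so the sender pauses on its own.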

Okay, so rwnd manages the sender-receiver pipe.

What about congestion control then?

Congestion control is about preventing the sender, or really all senders collectively, from overwhelming the network itself, the intermediate routers and links.

Ah, so not about the receiver's buffer, but about the capacity of the path through the network.

Preventing router buffers from overflowing and packets getting dropped deep inside the network.

Exactly.

Congestion happens when too much traffic tries to go through a particular bottleneck link or router.

This causes delays to skyrocket and packets to be dropped, wasting bandwidth, and potentially leading to that congestion collapse we talked about.

And congestion is bad for everyone using that path.

Terribly bad.

Increased delays, wasted effort retransmitting dropped packets, and in severe cases the network can grind to a halt where almost no useful data gets through.

So how does TCP try to prevent this?

Does the network explicitly tell TCP senders to slow down?

Sometimes, but not typically in the classic approach.

TCP primarily uses end-to-end congestion control.

The sender infers that congestion is happening based on observed events, mainly packet loss.

How does it detect loss?

Primarily through those two mechanisms we discussed.

A timeout occurring or receiving three duplicate ACKs, triggering fast retransmit.

Both are taken as signals that the network is likely congested and dropped a packet.

So loss equals congestion in TCP's world.

Pretty much.

When no loss is detected, TCP assumes things are okay and tries to increase its sending rate.

This makes TCP self-clocking.

Self-clocking.

Yeah, think about it.

ACKs arriving back confirm that data got through and implicitly signal that there's capacity in the network for more.

The rate at which ACKs return naturally paces the sender's transmission rate.

Faster ACKs mean more bandwidth, slower ACKs mean less.

Okay, so it uses ACKs as a signal.

How does it adjust its rate based on detecting or not detecting loss?

The core principle behind classic TCP congestion control is AIMD, additive increase, multiplicative decrease.

AIMD, let's break that down, additive increase.

When things are going well, ACKs are arriving, no loss detected, TCP increases its sending rate, but it does so relatively cautiously, additively.

It typically increases its congestion window, cwnd, which limits how much data it can have in flight, similar to rwnd, but for congestion, by about one maximum segment size, MSS, per round trip time.

So it slowly ramps up, probing for more available bandwidth.

Yeah.

Like gently pressing the accelerator.

Right, but what happens when loss is detected, timeout or three duplicate ACKs?

That's the multiplicative decrease part.

Exactly.

TCP reacts much more drastically to perceived congestion.

It multiplicatively decreases its rate, typically by cutting its cwnd in half.

Slashes it in half.

So slow, gentle increase when things are good, but a big sharp decrease when the congestion hits, like hitting the brakes hard.

Precisely.

This AIMD behavior leads to TCP's characteristic sawtooth pattern if you plot the cwnd over time.

It gradually climbs, hits congestion, drops sharply, then starts climbing again.

It's constantly probing for the available bandwidth and quickly backing off when it exceeds it.
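
The sawtooth is easy to reproduce with a toy model. The sketch below assumes loss happens exactly when the window exceeds a fixed capacity, which is a big simplification of a real network; `aimd_trace` is a made-up name and the window is measured in MSS.

```python
def aimd_trace(capacity, rounds, start=1.0):
    """cwnd (in MSS) at the start of each round trip under a toy AIMD model."""
    cwnd = start
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:
            cwnd = cwnd / 2   # multiplicative decrease on inferred loss
        else:
            cwnd = cwnd + 1   # additive increase: one MSS per RTT
    return trace


print(aimd_trace(capacity=4, rounds=8, start=3.0))
```

Plotting the returned list shows the climb, drop, climb shape the speakers describe.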

It's trying to be fair, but also reactive.

Yeah, it's designed to share bandwidth relatively fairly among competing TCP flows while also protecting the network from collapse.

Now, the actual algorithm is a bit more nuanced and usually involves three main phases or components.

Okay, what are they?

First, there's slow start.

When a connection begins, TCP doesn't know the available path bandwidth.

So cwnd starts very small, usually just one MSS.

Slow start.

Sounds like it increases slowly.

Ironically, no.

In slow start, the cwnd typically doubles every RTT.

For every ACK received for new data, the cwnd increases by one MSS.

This leads to exponential growth initially.

Double every RTT.

That's fast.

Why call it slow start?

It's slow compared to jumping immediately to a potentially huge window, but yeah, the exponential growth is quite rapid.

The goal is to quickly find a rough estimate of the available bandwidth.

When does slow start end?

It ends either when a loss event occurs or when the cwnd reaches a certain value called the slow start threshold, ssthresh.

Okay, then what happens?

If slow start ends because cwnd reached ssthresh without loss, TCP transitions into congestion avoidance mode.

Congestion avoidance.

This must be where the additive increase happens.

Exactly.

In congestion avoidance, cwnd grows much more slowly, linearly increasing by just one MSS per RTT.

This is the more cautious probing phase when TCP thinks it's getting close to the network's capacity.

So exponential growth in slow start, linear growth in congestion avoidance.

What if loss occurs during congestion avoidance?

If loss occurs, timeout or three duplicate ACKs, TCP sets ssthresh to half the current cwnd, and then it either drops cwnd back to one MSS and reenters slow start, which is what happens after a timeout, or goes into the third phase, which is fast recovery.

This phase is typically entered specifically after detecting loss via three duplicate ACKs, the fast retransmit scenario, which is considered a less severe congestion signal than a full timeout.

Okay, so fast recovery is linked to fast retransmit.

What does it do?

In traditional fast recovery, like in TCP Reno, cwnd is halved, plus maybe a bit extra for the duplicate ACKs already received, and then cwnd is temporarily inflated for each additional duplicate ACK that arrives while waiting for the ACK of the retransmitted segment.

The idea is to try and keep data flowing if possible, rather than stalling completely, and to avoid dropping all the way back to slow start if the loss was likely just a single dropped packet.

So it tries to recover from the loss more quickly and less disruptively than a timeout would cause.

That's the goal.

Different versions of TCP handle these phases slightly differently.

For example, older TCP Tahoe always dropped back to slow start, with cwnd reset to one MSS, after any loss event.

TCP Reno introduced fast recovery for three duplicate ACKs.
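
The three phases and the Tahoe versus Reno difference can be summarized as a small state update. This is a simplified sketch: windows are in MSS, one "ack_round" stands for a full RTT of ACKs with no loss, slow start is capped at ssthresh, and Reno's temporary window inflation during fast recovery is left out. The function and event names are invented for illustration.

```python
def on_event(cwnd, ssthresh, event, variant="reno"):
    """Return the new (cwnd, ssthresh) after one event, in units of MSS."""
    if event == "ack_round":
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double per RTT
        else:
            cwnd = cwnd + 1                 # congestion avoidance: +1 MSS per RTT
    elif event in ("timeout", "triple_dup_ack"):
        ssthresh = max(cwnd / 2, 2)         # remember half the window at loss
        if event == "timeout" or variant == "tahoe":
            cwnd = 1                        # back to slow start
        else:
            cwnd = ssthresh                 # Reno: skip slow start after 3 dup ACKs
    else:
        raise ValueError(f"unknown event: {event}")
    return cwnd, ssthresh


print(on_event(12, 8, "triple_dup_ack", variant="reno"))
print(on_event(12, 8, "triple_dup_ack", variant="tahoe"))
```

The same three duplicate ACKs leave Reno at half its old window but send Tahoe all the way back to one MSS, which is the difference described above.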

And I bet things have evolved even further since then.

Oh, absolutely.

There are many newer flavors.

TCP Cubic is the default in Linux now.

It modifies the congestion avoidance phase: instead of pure linear increase, it uses a cubic function.

After a loss, it quickly ramps back up towards the previous maximum window size, then probes more cautiously around that point.

It's generally better behaved on high-bandwidth, long-delay networks, the so-called long fat pipes.
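
For the curious, the cubic curve can be written down directly. The form below and the constants C = 0.4 and beta = 0.7 come from the published CUBIC specification (RFC 8312), not from this discussion, and the sketch ignores details such as the TCP-friendly region.

```python
def cubic_window(t, w_max, c=0.4, beta=0.7):
    """Target window (in MSS) t seconds after the last loss event.

    w_max is the window size just before that loss.
    """
    # k is the time it takes to climb back to w_max.
    k = (w_max * (1 - beta) / c) ** (1 / 3)
    return c * (t - k) ** 3 + w_max


w_max = 100.0
for t in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(t, round(cubic_window(t, w_max), 1))
```

The window restarts at beta times the old maximum, rises quickly, flattens out near the old maximum, then accelerates again to probe for new capacity.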

Interesting.

Any other major approaches?

Well, there's explicit congestion notification, ECN.

This actually involves help from routers.

Ah, network assisted congestion control.

Right.

ECN aware routers, if they're becoming congested but haven't started dropping packets yet, can mark packets passing through them.

The receiver sees the mark and informs the sender via the TCP header.

So the sender gets an explicit warning before loss happens.

Exactly.

This allows TCP to react to congestion earlier and less drastically, often just by reducing its cwnd slightly, potentially avoiding packet drops altogether and leading to smoother throughput and lower delays.

That sounds like a definite improvement over just inferring congestion from loss.

It is, but it requires both endpoints and the intermediate routers to support ECN.

Then there are delay-based congestion control protocols.

Delay-based, like measuring the RTT.

Yeah, protocols like TCP Vegas, or more recently Google's BBR, bottleneck bandwidth and round trip propagation time, try to infer congestion not just from loss, but by carefully measuring changes in the RTT.

Does that help?

The idea is that as router queues start building up, a precursor to packet loss, the RTT will begin to increase slightly before packets are dropped.

These protocols try to detect that minimum RTT, the propagation delay without queuing, and the bottleneck bandwidth, and then adjust their sending rate to keep the pipe just full enough to utilize the available bandwidth, but not so full that it creates large queues and delays.
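
The "just full enough" target corresponds to the bandwidth-delay product. The sketch below shows only that arithmetic, plus the queuing delay implied by an RTT sample; it is not BBR's actual algorithm, and the function names are made up.

```python
def bdp_bytes(bottleneck_bw_bps, min_rtt_s):
    """Bytes that fit in the pipe with no queue: bandwidth times minimum RTT."""
    return bottleneck_bw_bps * min_rtt_s / 8


def queuing_delay(sample_rtt_s, min_rtt_s):
    """Extra delay beyond propagation, taken as a hint that queues are building."""
    return max(0.0, sample_rtt_s - min_rtt_s)


# A 100 Mbit/s bottleneck with a 40 ms minimum RTT holds about 500 kB in flight.
print(bdp_bytes(100e6, 0.040))
# An RTT sample of 55 ms suggests roughly 15 ms of queuing.
print(queuing_delay(0.055, 0.040))
```

Keeping the data in flight near the bandwidth-delay product uses the link fully while leaving router queues mostly empty.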

So they aim for high throughput and low delay rather than just maximizing throughput until loss occurs.

That's the goal.

BBR in particular is used extensively by Google on YouTube and across its backbone network, as it can perform much better than traditional loss-based TCP in networks with shallow buffers or random non-congestion-related packet loss.

Fascinating how many different strategies exist.

What about fairness?

Does TCP's AIMD naturally lead to fair sharing?

Ideally, yes.

If you have multiple TCP connections with similar RTTs competing for the same bottleneck link, the AIMD mechanism tends to make them converge towards an equal share of the bandwidth, R/K, if R is the link rate and K is the number of connections.

Flows that grab too much bandwidth experience loss sooner and cut back harder, while flows with less bandwidth increase until they hit the limit.
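
The convergence argument can be checked with a toy simulation of two flows. It assumes both flows have the same RTT and see loss in the same round whenever their combined rate exceeds the link capacity, which is an idealization; `two_flow_aimd` is a hypothetical name.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Rates of two AIMD flows sharing one bottleneck after some rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # both halve: their gap halves too
        else:
            x1, x2 = x1 + 1, x2 + 1   # both add one: their gap is unchanged
    return x1, x2


# One flow starts with eight times the rate of the other.
a, b = two_flow_aimd(80.0, 10.0, capacity=100.0, rounds=500)
print(a, b)
```

Additive increase never changes the gap between the flows and every multiplicative decrease halves it, so the two rates drift toward an equal split.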

But that's the ideal case.

What can mess up fairness?

Several things.

Connections with significantly shorter RTTs tend to grab bandwidth faster and more aggressively because their cwnd increases more frequently, once per RTT.

So flows going shorter distances might get more bandwidth.

Potentially, yes.

Also, as we discussed, UDP traffic doesn't play by TCP's congestion control rules.

A high rate UDP flow can absolutely starve competing TCP flows because it won't back off when congestion occurs.

The impolite flow takes over.

Right.

And another factor is applications opening multiple parallel TCP connections to the same server.

Web browsers often do this to download different parts of a webpage simultaneously.

So if one browser opens 10 connections and another opens only one, the first browser might get 10 times the bandwidth share through that bottleneck.

Roughly speaking, yes.

Each TCP connection tries to get its fair share.

So opening more connections allows an application to grab a larger slice of the total pie, which isn't necessarily fair to other applications using fewer connections.
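
As a back-of-the-envelope check, assume the bottleneck is split equally per connection. The link rate of 11 Mbit/s is chosen only to make the numbers come out round, and `app_share` is a made-up helper.

```python
def app_share(link_rate, app_connections, total_connections):
    """Bandwidth an application gets if every connection gets an equal slice."""
    return link_rate * app_connections / total_connections


link_rate = 11.0  # Mbit/s, illustrative
print(app_share(link_rate, 10, 11))  # browser with 10 connections
print(app_share(link_rate, 1, 11))   # browser with 1 connection
```

Per-connection fairness holds perfectly here, yet one application ends up with ten times the bandwidth of the other.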

Lots of nuances to congestion control and fairness.

So it's clear that while TCP and UDP are foundational, the transport layer isn't static.

It keeps evolving.

Definitely.

We've seen all these specialized TCP versions, but one of the most significant recent developments is arguably QUIC.

QUIC, what's that stand for?

Originally, Quick UDP Internet Connections.

And it's a really interesting shift.

How so?

QUIC is essentially a new full featured transport protocol providing reliability, congestion control, security, and more.

But it's implemented on top of UDP, typically as a user space library.

Wait, it runs over UDP, but implements TCP-like features?

Why do that?

Several reasons.

One major driver was to enable faster evolution.

Updating TCP itself requires changes to operating system kernels, which is a very slow process.

By building QUIC in user space on top of UDP, new features and improvements can be deployed much more quickly just by updating applications or libraries.

Ah, bypass the OS bottleneck for innovation.

Clever.

What kinds of innovations does QUIC bring?

Quite a few.

It has a faster connection handshake.

It usually combines the transport handshake, like TCP's, with the cryptographic handshake, like TLS for HTTPS, into a single round trip, or even zero round trips for returning connections.

This significantly reduces connection setup latency.

Faster startup, nice.

What else?

A huge one is built-in support for multiplexed streams over a single QUIC connection.

Multiplexed streams, like multiple independent logical flows within one connection.

Exactly.

Think of downloading a complex webpage with HTML, CSS, images, scripts.

With traditional HTTP/1.1 over TCP, if one packet for one image gets lost, it can block the delivery of all subsequent data for everything on that TCP connection until it's retransmitted.

This is called head-of-line, or HOL, blocking.

Right, one lost packet stalls the whole pipe.

QUIC avoids this at the transport layer because it manages multiple streams independently within one connection.

If a packet for stream A, say an image is lost, data for stream B, say some CSS, can still be delivered and processed without waiting for stream A's lost packet.
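
The difference can be illustrated with a toy delivery model. Packets are (stream, data) pairs in send order, one packet is lost, and the functions report what the application can read before the retransmission arrives. This is an illustration of the ordering rules only, not of QUIC's actual wire format; all names are invented.

```python
def deliverable_single_stream(packets, lost_index):
    """One ordered byte stream: nothing after the gap can be handed up."""
    return packets[:lost_index]


def deliverable_per_stream(packets, lost_index):
    """Independent streams: only the stream with the gap has to wait."""
    blocked_stream = packets[lost_index][0]
    delivered = []
    for i, (stream, data) in enumerate(packets):
        if i == lost_index:
            continue  # the lost packet itself
        if stream == blocked_stream and i > lost_index:
            continue  # later data on the same stream waits behind the gap
        delivered.append((stream, data))
    return delivered


packets = [("A", "img-1"), ("B", "css-1"), ("A", "img-2"), ("B", "css-2")]
print(deliverable_single_stream(packets, lost_index=0))
print(deliverable_per_stream(packets, lost_index=0))
```

With a single ordered stream, losing the first image packet holds back everything; with per-stream ordering, the CSS on stream B still gets through.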

Ah, no HOL blocking between streams.

That sounds like a big performance win for web traffic.

It is.

It's a core feature that powers HTTP/3, the latest version of the web protocol, which runs exclusively over QUIC.

So QUIC provides reliability and congestion control too, even though it's on UDP.

Yes, absolutely.

QUIC implements its own sophisticated mechanisms for reliable data transfer, similar in principle to TCP, but with some differences, and pluggable congestion control algorithms, often starting with something similar to TCP CUBIC.

It's a full transport service just built at a different layer.

So it's got encryption built in from the start, faster handshakes, stream multiplexing without HOL blocking, reliability, and congestion control, all running over UDP.

That's the gist of it.

And it's seeing significant deployment, largely driven by Google initially, but now becoming a standard.

It really represents the ongoing evolution, pushing more transport level intelligence, potentially out of the OS kernel and closer to the application.

Allowing for faster adaptation and innovation in how we move data across the internet.

What a journey we've taken, from just the basic idea of logical communication between apps.

Through the animal analogy.

To the lean and mean speed of UDP, its pros and cons.

Then diving deep into the intricacies of TCP, the handshake, sequence numbers, timeouts, fast retransmit.

The crucial difference between flow control and congestion control.

AIMD, slow start.

And finally, seeing how things are still evolving with protocols like QUIC, moving transport logic on top of UDP.

It's truly amazing how these layers, mostly invisible to us, orchestrate this complex dance to let applications halfway across the world talk to each other reliably and efficiently.

It really is.

And it makes you think, considering this trend with protocols like QUIC moving sophisticated transport functions higher up the stack, toward the application layer, what does that mean for the future?

Will those traditional lines between transport and application layers continue to blur?

Could we see even more network intelligence migrating towards the edge, implemented directly by applications or frameworks, rather than relying solely on the underlying OS or network infrastructure?

And what would that imply for overall network design, management, and even security?

Definitely something to ponder long after this deep dive ends.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary: What this audio overview covers
Transport layer protocols bridge the gap between application processes running on different hosts and the underlying network infrastructure that moves data across the internet. This layer provides logical communication channels that allow applications to send messages to one another as if they were directly connected, even though multiple intermediate systems and networks separate them physically. Two primary protocols dominate transport layer functionality: UDP and TCP, each designed with different priorities and trade-offs. UDP operates as a connectionless service with minimal overhead, forwarding datagrams quickly without establishing prior connections or guaranteeing delivery, making it suitable for applications that tolerate some data loss in exchange for speed. TCP, by contrast, implements connection-oriented communication that prioritizes reliability and correctness, ensuring all data arrives at the destination in the proper sequence and without corruption. Achieving this reliability requires sophisticated mechanisms including sequence numbers to track data order, acknowledgments to confirm successful delivery, retransmission algorithms to resend lost segments, and timers to manage waiting periods and detect failures. Beyond basic reliability, TCP must manage two critical performance issues: flow control, which prevents a sender from overwhelming a receiver's buffer capacity, and congestion control, which adapts transmission rates in response to network conditions to prevent overall network congestion. TCP's connection establishment relies on a three-way handshake that synchronizes state between client and server before data transfer begins, while graceful connection termination ensures both endpoints properly close the communication channel. 
Understanding multiplexing and demultiplexing operations reveals how the transport layer directs incoming segments to the correct application processes and forwards outgoing data from multiple applications to the network layer. These design principles reflect fundamental networking trade-offs between simplicity and robustness, speed and accuracy, and how modern internet protocols balance competing demands to enable reliable global communication.
