Chapter 16: Virtualization – VMs, Containers & Serverless

Welcome to Last Minute Lecture!

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to The Deep Dive.

Today we're exploring something absolutely fundamental to, well, almost everything you do online: virtualization.

Yeah, it's one of those technologies that makes the modern world tick.

And the chapter we're looking at kicks off with a great line.

It really does.

Virtual means never knowing where your next byte is coming from.

Captures that sort of mystery, doesn't it?

It perfectly sets the stage.

But, you know, to really get why this is such a big deal, we need a quick history lesson.

Think back to the 1960s.

Right.

Computers were these gigantic room-filling machines costing millions.

Millions.

And the kicker was most applications back then barely tickled the hardware.

Maybe used, what, 10% of the available power?

It was incredibly inefficient.

You had all this expensive potential just sitting there wasted because systems could only really handle one job at a time.

Exactly.

Just couldn't share the resources effectively.

So that economic pressure, that inefficiency really drove the need for something new.

And that something new eventually became virtual machines, or VMs, and later on containers.

The goal was always twofold, wasn't it?

Uh-huh.

First, you absolutely had to keep applications separate, isolated from each other.

Can't have one crashing and taking down everything else.

Or snooping on another app's data.

Right.

But second, you needed to cram as many of those isolated applications onto the same physical hardware as possible to actually use that expensive resource.

Isolation and maximum sharing.

So for you listening, whether you're an architect already or just learning the ropes, understanding this is, well, it's essential.

It really is.

Deploying to the cloud?

You're choosing between VM types and container options. Testing specialized hardware?

Virtualization lets you mimic it.

It's not just theory.

It's how we actually build, deploy, and importantly pay for computing power today.

Okay.

So let's unpack this.

If the big goals are sharing and isolation, what are we actually managing?

The fundamental things we need to manage.

CPU, memory, disk storage, and the network connection.

The building blocks.

Let's start with the CPU, the brain.

How do you prevent one app from hogging all the processing time?

That's where the operating system's thread scheduler comes in.

Think of it like a traffic controller for the CPU.

Okay.

It picks a little piece of work, a thread, and says, okay, you get the CPU for a tiny slice of time.

Or maybe until it finishes its current task or until something more important interrupts it.

So no single application just grabs the wheel.

Nope.

The scheduler is always in charge, making sure everyone gets a fair turn.

That guarantees both sharing and isolation at the processor level.
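
The time-slice idea can be sketched as a round-robin loop. This is a minimal illustration, not how any particular operating system schedules threads: the function name `round_robin`, the job names, and the slice length are all hypothetical, and it models only fixed time slices, not the priority interrupts mentioned above.

```python
from collections import deque

def round_robin(jobs, time_slice):
    """Simulate a round-robin scheduler (hypothetical, simplified).

    jobs: dict mapping a thread name to the units of CPU work it needs.
    Returns the (thread, units_run) slices in the order they were granted.
    """
    queue = deque(jobs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)  # run until the slice ends or the work is done
        timeline.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))  # unfinished work goes to the back of the line
    return timeline

if __name__ == "__main__":
    for name, ran in round_robin({"web": 5, "db": 2, "batch": 7}, time_slice=3):
        print(f"{name} ran for {ran} unit(s)")
```

No thread ever runs longer than one slice before the scheduler takes the CPU back, which is the property the conversation describes.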

Got it.

Now memory.

That sounds trickier.

Applications often need way more memory than the physical machine actually has installed.

How does that work?

Yeah.

Memory is complex.

This is where virtual memory technology is key, and it relies heavily on the actual hardware, the silicon.

Hardware support?

How so?

Well, imagine your application's memory is divided into little chunks called pages.

The hardware has special units that manage these pages.

They make sure your application only ever sees its own pages.

Like giving each app its own private notebook.

Exactly.

And if a page your app needs isn't in the fast physical memory right now, maybe it's been temporarily stored on the slower disk drive.

The filing cabinet analogy from the text.

Right.

The hardware automatically finds it, swaps it back into physical memory, maybe swaps something else out to make room.

The crucial thing is the hardware enforces the boundaries.

It prevents app A from accidentally stumbling into app B's memory space.
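
A toy model can make the paging and isolation ideas concrete. This sketch is an assumption-heavy simplification: the class `Machine`, its oldest-first eviction policy, and the use of Python dictionaries in place of real page tables are all illustrative choices, and real hardware does this work in the memory management unit rather than in software.

```python
from collections import OrderedDict

class Machine:
    """Toy model: a few physical frames plus a disk used as swap space."""

    def __init__(self, frames):
        self.frames = frames
        self.ram = OrderedDict()  # (app, page) -> data, oldest entry first
        self.disk = {}            # (app, page) -> data that was swapped out

    def _make_room(self):
        while len(self.ram) >= self.frames:
            victim, data = self.ram.popitem(last=False)  # evict the oldest page
            self.disk[victim] = data

    def write(self, app, page, data):
        key = (app, page)
        self.disk.pop(key, None)  # a fresh write supersedes any swapped-out copy
        if key not in self.ram:
            self._make_room()
        self.ram[key] = data

    def read(self, app, page):
        key = (app, page)  # every lookup carries the caller's identity
        if key in self.ram:
            return self.ram[key]
        if key in self.disk:  # page fault: bring the page back into RAM
            self._make_room()
            self.ram[key] = self.disk.pop(key)
            return self.ram[key]
        raise KeyError(f"{app} has no page {page}")
```

Because every lookup is keyed by the requesting application, app B asking for page 1 can never be handed app A's page 1, even when both exist.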

Okay.

That hardware enforcement makes sense.

What about disk storage?

Multiple apps writing to the same hard drive.

How do we keep files separate and secure?

Two layers there.

First, you have the disk controller, the physical interface, which just makes sure data streams don't get jumbled up.

It sequences things properly.

Okay.

Basic traffic management.

But the real security comes from the operating system using tagging.

Think user IDs, group permissions.

Every file, every folder, every running process gets tagged.

Like labels.

Precisely.

And the OS acts like a bouncer.

It checks the tag on the process trying to access the file against the tag on the file itself, and if they don't match the rules...

Access denied.

You can't even see files you don't have permission for.
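
The tag comparison can be sketched with Unix-style permission bits. This is a simplified illustration under stated assumptions: the names `Process`, `File`, and `may_access` are hypothetical, and the sketch ignores details such as the superuser and access control lists.

```python
from dataclasses import dataclass

READ, WRITE = 4, 2  # Unix-style permission bits

@dataclass(frozen=True)
class Process:
    uid: int         # user ID tag on the running process
    gids: frozenset  # group ID tags on the running process

@dataclass(frozen=True)
class File:
    owner: int  # user ID tag on the file
    group: int  # group ID tag on the file
    mode: int   # permission bits, for example 0o640

def may_access(proc, file, want):
    """Compare the process's tags with the file's tags and return True or False."""
    if proc.uid == file.owner:
        bits = (file.mode >> 6) & 7  # owner bits
    elif file.group in proc.gids:
        bits = (file.mode >> 3) & 7  # group bits
    else:
        bits = file.mode & 7         # everyone else
    return (bits & want) == want
```

With a file in mode `0o640`, the owner may read and write, a group member may only read, and any other process is refused.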

Simple, but effective.

And the last one, the network connection.

How can two different web servers on the same physical box look like completely separate entities to the outside world?

Through unique addresses and dedicated listening posts. Every virtual entity gets its own unique IP address.

That's its unique mailbox on the internet, basically.

That ensures requests from outside get routed correctly.

So the IP address gets you to the right virtual doorstep.

Exactly.

And then once you're at the right IP address, applications use ports.

Think of ports like different apartment numbers at the same street address.

Your web server listens on port 80.

Your email server listens on port 25.

The OS makes sure a message for port 80 only goes to the web server.
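
A small socket experiment shows the operating system delivering each message only to the socket bound to the matching port. This is a sketch with deliberate simplifications: it uses UDP on the loopback address and lets the OS pick free port numbers, because binding ports 80 and 25 normally requires elevated privileges. The function names are hypothetical.

```python
import socket

def open_listener():
    """Bind a UDP socket on localhost; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)  # fail fast instead of blocking forever
    return sock

def demo():
    """Send one message to each listener and return what each one received."""
    web, mail = open_listener(), open_listener()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"GET /", web.getsockname())  # same IP address, web's port
        sender.sendto(b"HELO", mail.getsockname())  # same IP address, mail's port
        return web.recvfrom(1024)[0], mail.recvfrom(1024)[0]
    finally:
        for s in (web, mail, sender):
            s.close()

if __name__ == "__main__":
    print(demo())
```

Both listeners share one IP address, yet neither ever sees the other's traffic.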

Okay.

So we've established we can securely isolate and share CPU, memory, disk, and network.

That technical foundation makes the whole idea of a virtual machine, a VM, possible.

Precisely.

A VM is essentially a complete computer simulation, the guest, running inside the physical host machine, all thanks to that careful resource management we just talked about.

And the software that makes this magic happen, that manages all these guest VMs, that's the hypervisor.

Yep, the hypervisor.

It's like the operating system for the operating systems running inside the VMs.

And there are two main flavors, architecturally speaking.

Right.

Type one and type two.

What's the key difference?

It really comes down to where it runs and the trust boundary.

Type one, or bare metal, runs directly on the physical hardware.

No host OS underneath it.

So it has ultimate control.

Total control.

That makes it super efficient, very secure.

This is what powers the big cloud data centers.

Think AWS, Azure, Google Cloud.

They're running on type one hypervisors.

And type two.

Type two is the hosted kind.

It runs like a regular application on top of your normal operating system, like running VirtualBox or VMware Workstation on your Windows laptop to run a Linux VM.

So more convenient for a developer maybe, but potentially less performant or secure?

Generally, yes.

Because it has that extra layer of the host OS underneath it, there's more overhead and the attack surface is larger.

Type one cuts out that middleman.

That trust boundary idea is important.

Now quickly, hypervisors versus emulators.

People sometimes mix these up.

Easy distinction.

A hypervisor needs the guest VM's code to use the same CPU instruction set as the physical host machine.

It's just managing access, not translating code.

Like x86 code running on an x86 processor.

Right.

An emulator though, that's different.

It actually simulates the instruction execution, often for a completely different type of processor.

Think running an old video game console, ARM code maybe, on your modern x86 PC.

Emulators translate, hypervisors orchestrate on the same architecture.

Got it.

So once a VM is up and running under a hypervisor, what are the hypervisor's main day-to-day jobs?

Two big things.

First, managing the code inside the VM.

When a VM tries to do something external, like access the network or disk, the hypervisor intercepts that request.

It tags it: this came from VM number three. Then it sends it to the physical hardware, gets the response, and makes sure it goes back only to VM number three.

It's the gatekeeper and router.
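
The tag-and-route job can be sketched in a few lines. Everything here is hypothetical: `ToyHypervisor` and its methods are illustrative names, and the `device` callable merely stands in for physical hardware. A real hypervisor does this by trapping privileged instructions, not through method calls.

```python
class ToyHypervisor:
    """Tags each I/O request with its VM and routes the response back to that VM only."""

    def __init__(self, device):
        self.device = device  # callable standing in for the physical hardware
        self.inboxes = {}     # vm_id -> list of responses delivered to that VM

    def create_vm(self, vm_id):
        self.inboxes[vm_id] = []

    def io_request(self, vm_id, payload):
        if vm_id not in self.inboxes:
            raise KeyError(f"unknown VM {vm_id}")
        tagged = (vm_id, payload)              # remember where the request came from
        origin, data = tagged
        response = self.device(data)           # forward to the shared hardware
        self.inboxes[origin].append(response)  # deliver only to the originating VM

if __name__ == "__main__":
    hv = ToyHypervisor(device=str.upper)
    hv.create_vm(1)
    hv.create_vm(3)
    hv.io_request(3, "read block 7")
    print(hv.inboxes)
```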

Okay.

Managing the I/O. What's the second job?

Managing the VMs themselves.

Creating them, destroying them, keeping an eye on their health, making sure they don't use more CPU or memory than you allocated.

Enforcing those resource limits is critical.

Makes sense.

Now, to start a VM, you need a VM image, basically the contents of its virtual hard drive.

What's the safe way for architects to handle these images?

Well, you can create them in a few ways.

Snapshot a running system, add software to an existing base image, or install everything from scratch.

But the huge warning sign here is, don't blindly trust images you find online, third-party images.

Why not?

Convenience seems tempting.

Because you have no idea what's really in there.

Outdated software, security holes, actual malware, weird configurations.

You lose all control and visibility if you didn't build it or get it from a highly trusted source.

It's a massive security risk.

So what's the standard safer practice then?

Especially to avoid having hundreds of slightly different giant images floating around.

The best practice is usually to create a minimal base image, just the core operating system, nothing else.

Then, after the VM boots up, you use automated configuration management tools to install the specific services and applications needed.

Ah, so you customize it on the fly.

Exactly.

It keeps the base images small and consistent. Need to update your web server software?

You just tweak the configuration script; you don't rebuild and distribute a whole 8-gigabyte image.

Much more manageable.
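
The "minimal base image plus configuration script" approach boils down to comparing what is installed with what is desired and applying only the difference. The sketch below assumes a hypothetical package model; `converge`, `apply_actions`, the package names, and the version numbers are illustrative and do not correspond to any particular configuration management tool.

```python
def converge(installed, desired):
    """Return the install actions needed to move `installed` to `desired`.

    Both arguments map a package name to a version string.
    """
    actions = []
    for name, version in sorted(desired.items()):
        if installed.get(name) != version:
            actions.append(("install", name, version))
    return actions

def apply_actions(installed, actions):
    """Return the new installed state after performing the actions."""
    result = dict(installed)
    for _, name, version in actions:
        result[name] = version
    return result

if __name__ == "__main__":
    base_image = {"linux": "6.1"}                   # minimal base: OS only
    desired = {"linux": "6.1", "webserver": "2.4"}  # hypothetical desired state
    print(converge(base_image, desired))
```

Running the script a second time against the updated state yields no actions, which is what makes it safe to rerun on every boot.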

Okay, that leads nicely into the big architectural takeaways for VMs.

Performance overhead is one, maybe 10% latency added by a good Type 1 hypervisor.

Yeah, there's always some cost of virtualization.

And Type 2 overhead is often significantly higher, by the way.

Right.

But what's the biggest architectural benefit beyond just cramming more apps onto one box?

Oh, it's the separation of concerns.

It's huge.

Suddenly, as an architect, you don't necessarily need to think about buying servers, racking them, cooling them, maintaining them.

The physical infrastructure becomes someone else's problem.

Pretty much.

Resources become a commodity you can provision on demand.

You can just defer all those hardware decisions to your cloud provider.

That fundamentally changed how systems are designed and deployed.

A massive shift.

But VMs, despite their power, had their own problems.

They were still heavy.

Yeah, that became the next big bottleneck.

An 8-gigabyte VM image, even on a fast network, takes time to transfer.

Minutes, realistically.

And then the OS inside still has to boot up.

More minutes.

Exactly.

If you're trying to scale quickly, react to load spikes, deploying new VM instances in minutes just wasn't fast enough.

Too slow, too cumbersome.

And that frustration paved the way for containers.

Enter containers.

They keep that crucial isolation property, but tackle the speed and size problem head on.

The key difference is what they virtualize.

VMs virtualize the hardware.

What do containers virtualize?

They virtualize at the operating system level.

All the containers running on a host machine share the same underlying OS kernel.

They run as isolated processes managed by something called a container runtime engine.

Docker is the famous one.

But there's also containerd and others.

If you picture the layers, it's containers on top of the runtime engine, on top of the host OS.

You got it.

And that sharing model is where the performance boost comes from.

What's the first performance win?

Related to the OS itself.

Simple.

You don't ship the OS in the container image.

Since all containers use the host's kernel, the image only needs your application code and its direct dependencies.

Your image size drops from gigabytes for a VM to maybe megabytes for a container.

Huge difference in transfer time right there.

Massive.
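
A quick back-of-the-envelope calculation shows the scale of that difference. The 8-gigabyte figure comes from the discussion; the 50-megabyte container image and the 1 Gbit/s link are assumptions chosen for illustration, and the formula ignores protocol overhead and congestion, so real transfers take longer.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time: bits to move divided by link speed."""
    return size_bytes * 8 / link_bits_per_second

GIGABIT = 1_000_000_000            # assumed link speed: 1 Gbit/s
VM_IMAGE = 8 * 1_000_000_000       # the 8-gigabyte VM image from the discussion
CONTAINER_IMAGE = 50 * 1_000_000   # assumed 50-megabyte container image

if __name__ == "__main__":
    print(transfer_seconds(VM_IMAGE, GIGABIT))         # 64.0 seconds
    print(transfer_seconds(CONTAINER_IMAGE, GIGABIT))  # 0.4 seconds
```

Even under ideal conditions the VM image needs about a minute on the wire before the guest OS can start booting, while the container image arrives in under a second.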

And the second performance win comes from layers.

Now, this isn't like architectural layers.

It's about how the image is built.

Okay, explain the layers concept.

The LAMP stack example was good here.

Right.

Think Linux, Apache, MySQL, PHP.

When you build that container image, it's done in steps.

Layer 1, base Linux.

Layer 2, add Apache.

Layer 3, add MySQL.

Layer 4, add your PHP code.

Each step creates a distinct layer.

And why does that layering matter for speed, especially for updates?

Because the runtime engine is smart about these layers.

Let's say you find a bug in your PHP code, layer 4.

You fix it and rebuild. When you deploy the update...

You don't send the whole stack again.

Exactly.

The runtime on the target machine sees it already has layers 1, 2, and 3.

Linux, Apache, MySQL.

It only needs to download the changed layer 4, the tiny PHP update.

Ah, that's incredibly efficient.

It's the magic.

That's why container startups and updates go from minutes for VMs down to seconds or even milliseconds.

It's a game changer for rapid deployment and scaling.
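
The layer trick can be sketched with content hashes. This is a simplified model, not the actual OCI image format: here each layer's ID is a hash of its parent's ID plus its own content, and the function names and layer contents are hypothetical.

```python
import hashlib

def layer_id(parent_id, content):
    """Derive a layer ID from the parent layer's ID and this layer's content."""
    return hashlib.sha256((parent_id + content).encode()).hexdigest()[:12]

def build(steps):
    """Build an image as an ordered list of layer IDs, one per build step."""
    ids, parent = [], ""
    for step in steps:
        parent = layer_id(parent, step)
        ids.append(parent)
    return ids

def layers_to_pull(image, cache):
    """Return only the layers the target machine does not already hold."""
    return [layer for layer in image if layer not in cache]

if __name__ == "__main__":
    v1 = build(["linux", "apache", "mysql", "php app v1"])
    v2 = build(["linux", "apache", "mysql", "php app v2"])  # only layer 4 changed
    print(layers_to_pull(v2, cache=set(v1)))
```

After the PHP fix, the first three layer IDs are unchanged, so the target machine pulls exactly one layer.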

And the industry realized this was powerful and needed standardization.

Thankfully, yes.

The Open Container Initiative, the OCI, standardized the format of container images and how the runtime interacts with them. That means you can build your container image using, say, Docker tools on your laptop and then run that exact same image reliably on a different runtime engine, like containerd, in your production cloud environment.

Portability is huge.

Okay, so now we have these super-fast portable containers, but you often have services that need to work closely together.

How do we manage groups of related containers?

That's where orchestration tools like Kubernetes come in, and they introduce the concept of a pod.

Think of a pod as the next level up.

It's a group of one or more containers that are deployed together and, crucially, share the same network namespace.

Share the same network namespace.

What does that mean, practically?

It means all the containers inside a single pod share the same IP address and can talk to each other using localhost.

They also share the same set of network ports.

And why group them like that?

What's the architectural advantage?

It's primarily about efficient communication for tightly coupled services.

Think of a service mesh sidecar pattern, for example.

Since containers in a pod are guaranteed to land on the same host machine, same VM or physical node, they don't need to use the relatively slow network stack to talk to each other.

Precisely.

They can use much faster interprocess communication, IPC methods, things like shared memory or semaphores, directly through the shared OS kernel.

This drastically cuts down latency and communication overhead for services that need to chat constantly.
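
Shared memory, one of the IPC methods mentioned, can be tried with Python's standard library. This sketch attaches two handles to the same named block inside a single process purely for illustration; in a pod, the writer and reader would be separate containers, and whether they can actually share memory depends on how the pod is configured, which this sketch does not model.

```python
from multiprocessing import shared_memory

def demo():
    """Write through one handle and read the same bytes through another."""
    writer = shared_memory.SharedMemory(create=True, size=16)
    try:
        writer.buf[:5] = b"hello"  # write directly into the shared block
        reader = shared_memory.SharedMemory(name=writer.name)  # attach by name
        try:
            return bytes(reader.buf[:5])  # no network stack involved
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # release the block once everyone is done

if __name__ == "__main__":
    print(demo())
```

The data never passes through a socket; both handles see the same bytes because the kernel maps the same memory for each.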

Okay, pods for grouping.

Now, if containers can start in milliseconds, the next logical question seems to be, why even leave them running when they're not actively doing something?

You're thinking exactly like the architects who came up with serverless.

If startup is virtually instantaneous, why not just create a brand new container instance for every single incoming request?

And then kill it immediately after.

And then tear it down the moment the request is finished.

The infrastructure handles this dynamic allocation and deallocation automatically.

That's the core idea behind serverless computing.

Though we should always add the caveat.

It's not really serverless, right?

There are definitely still servers involved somewhere.

Oh, absolutely.

Loads of servers.

A better name is probably function as a service, or FaaS.

The point is, you don't manage the servers.

The cloud provider does.

And they handle spinning things up and down just for the duration of your function's execution.

You pay only for the compute time you actually use.

But this incredibly short lifespan, this ephemeral nature, imposes a really significant constraint on how you design the application, doesn't it?

It's the single biggest constraint.

Right.

Your function, the code running in that container, must be stateless.

Meaning?

Meaning, it can't store any information it needs for the next request inside itself because it won't be there.

The container is destroyed.

Any state like user session data, ongoing transactions, anything persistent has to be stored externally.

In a database or a cache or some other cloud service?

Exactly.

Designing for statelessness is fundamental to using FaaS effectively.
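
A stateless handler can be sketched as a function that keeps nothing between calls and pushes all state to an outside store. The names `ExternalStore` and `handle_request` are hypothetical, and the in-memory dictionary merely stands in for a real database or cache.

```python
class ExternalStore:
    """Stand-in for a database or cache that outlives any single function instance."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=0):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value

def handle_request(store, user_id):
    """Count a visit. All state is read from and written back to the store."""
    visits = store.get(user_id) + 1
    store.put(user_id, visits)
    return {"user": user_id, "visits": visits}

if __name__ == "__main__":
    store = ExternalStore()
    print(handle_request(store, "ana"))  # first container instance
    print(handle_request(store, "ana"))  # a brand new instance still sees visit 1
```

Because the handler holds no state of its own, it does not matter that each call may run in a freshly created container.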

Are there other practical tradeoffs or limits when you go serverless with FaaS providers?

Yeah.

A few key ones you need to be aware of.

First, providers often restrict the base images or runtimes you can use.

They need things optimized for super fast loading, so your language or library choices might be limited.

Okay.

Less flexibility sometimes.

What else?

Second, the infamous cold start.

While subsequent requests might be instant, the very first request after a period of inactivity might take several seconds while the platform finds resources, downloads your code, and starts the container for the first time.

That latency spike can be an issue for some applications.

It can.

And third, providers impose strict limits on how long a single function execution can run, maybe a few seconds, maybe a few minutes, depending on the provider and tier.

If your task takes longer than that, it just gets terminated.

So FaaS is really best suited for short, quick, event-driven tasks.
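
The cold start and the execution time limit can both be captured in a toy latency model. Every number and name here is an assumption made for illustration: `ToyFaasPlatform`, the 2-second cold start, the 10-minute idle window, and the 30-second limit do not describe any real provider.

```python
class ToyFaasPlatform:
    """Toy model of one function on a FaaS platform (all numbers are made up)."""

    def __init__(self, cold_start_ms, idle_timeout_s, max_run_ms):
        self.cold_start_ms = cold_start_ms
        self.idle_timeout_s = idle_timeout_s
        self.max_run_ms = max_run_ms
        self.last_used_s = None  # None means no warm instance exists yet

    def invoke(self, now_s, run_ms):
        """Return the latency in milliseconds the caller sees for one request."""
        if run_ms > self.max_run_ms:
            raise TimeoutError("execution exceeded the platform's time limit")
        cold = (self.last_used_s is None
                or now_s - self.last_used_s > self.idle_timeout_s)
        self.last_used_s = now_s
        return run_ms + (self.cold_start_ms if cold else 0)

if __name__ == "__main__":
    platform = ToyFaasPlatform(cold_start_ms=2000, idle_timeout_s=600, max_run_ms=30_000)
    print(platform.invoke(now_s=0, run_ms=50))     # cold: 2050
    print(platform.invoke(now_s=10, run_ms=50))    # warm: 50
    print(platform.invoke(now_s=1000, run_ms=50))  # idle too long, cold again: 2050
```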

Wow.

Okay.

So we've really traced an incredible evolution here from sharing expensive mainframes.

Right.

To VMs using hardware virtualization for isolation and better utilization.

Then containers using OS virtualization for speed and portability.

Pods for efficient grouping of related containers using IPC.

And finally, to serverless or FaaS, where we dynamically allocate resources per request, demanding statelessness.

It's quite a journey, and it fundamentally underpins almost all modern application deployment.

Absolutely.

If you're building software today, especially cloud-native software, you are navigating these layers, these trade-offs constantly, even if it's abstracted away sometimes.

It's the landscape we operate in.

It truly defines how we think about, provision, and pay for compute resources now.

And to leave you with something to think about, we talked about isolation.

VMs running minutes or even hours apart on the same physical hardware.

What steps does that hypervisor absolutely have to take between those sessions to guarantee zero data leakage?

When VM A shuts down and VM B starts up on the same core later, how does the hypervisor ensure that VM B can't somehow access leftover bits of VM A's data in memory or on the disk sectors it gets assigned?

That's deep.

The scrubbing and sanitization process must be foolproof.

Foolproof and performant.

It's one of the most critical and complex security functions of the entire virtualization stack.

How much trust do you place in that process?

Something to ponder.

Definitely something to ponder.

Thank you so much for walking us through that complex world of virtualization today.

My pleasure.

It was fun diving in.

And thank you all for joining us on the Deep Dive.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary: What this audio overview covers

Virtualization technologies represent a foundational shift in how computing resources are allocated and managed across distributed systems, originating from the economic necessity to share expensive hardware infrastructure like processors, memory, storage, and network interfaces among multiple independent applications. Virtual machines achieve this isolation by simulating complete computing environments on a single physical host under the control of a hypervisor, a specialized layer that manages hardware access and resource allocation. The hypervisor exists in two distinct implementations: bare-metal hypervisors that operate directly on physical hardware and are typical in data center environments, and hosted hypervisors that run on top of a conventional operating system, making them practical for development and testing scenarios where developers need to simulate incompatible systems or reproduce production configurations. However, virtual machines carry inherent performance penalties: entire operating systems must be packaged within large image files, so transfer and initialization take minutes, and managing the isolation boundaries adds computational overhead.

Containers represent a more efficient alternative by virtualizing at the operating system level, allowing multiple containerized applications to share the same kernel while maintaining process-level isolation. This architectural distinction dramatically reduces the footprint of container images since the operating system need not be replicated, enabling startup times measured in milliseconds rather than minutes. Container images employ a layered construction methodology that enables efficient updates, where modifications to individual components require transferring only the changed layers into production environments, often through version-controlled automation scripts.

For applications requiring coordinated execution and managed communication, containers can be deployed as Pods within orchestration platforms like Kubernetes, ensuring co-location on the same node and enabling efficient resource sharing. Serverless computing, implemented through Function-as-a-Service architectures, leverages container instantiation speed to provision and deprovision execution environments for individual requests, removing the burden of infrastructure management from developers while imposing a critical architectural constraint: the functions it runs must remain stateless.
