Chapter 10: Virtual Memory: Demand Paging, Page Replacement, and Thrashing

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Have you ever, you know, just looked at your computer and wondered how it handles everything,

like dozens of browser tabs, maybe a huge game downloading, video editing, all at once?

And it just sort of works.

Yeah, it seems like magic sometimes, doesn't it?

Especially when you think about how much physical memory you actually have.

Exactly.

So today we're going to try and pull back the curtain on that illusion.

We're doing a deep dive into virtual memory.

The great enabler.

Right.

We'll explore how this really ingenious system operates, why it's so critical, and the clever tricks operating systems use.

And we're drawing from Operating System Concepts 10th Edition by Silberschatz, Galvin, and Gagne.

A classic text.

And it's worth saying, you know, if we connect this to the bigger picture, virtual memory isn't just some neat feature.

No.

No, it's truly one of those foundational, often invisible technologies.

It's what makes modern computing possible, really.

From your smartphone, all the way up to massive cloud servers, it's everywhere quietly doing its job.

Okay, so let's start right at the beginning.

What's the problem virtual memory actually solves?

Because historically, things were different, weren't they?

Oh, absolutely.

Back in the day, to run a program, the entire thing had to fit into your computer's physical memory, all of it.

Wow, like trying to fit an entire library onto just one small desk.

That's a great analogy.

Yeah.

Yeah.

And modern programs, they're often huge,

much, much bigger than the physical memory available.

Think about games or complex scientific software.

Full of features, error-handling code, stuff you might only use once in a blue moon or maybe never.

Precisely.

And loading all that just-in-case code was incredibly inefficient.

So virtual memory steps in. How does it change the game?

Fundamentally, virtual memory is basically a technique that lets a program start running, even if it's not completely loaded into physical memory.

Okay, so not everything needs to be on the desk at once.

Exactly.

And the key here is abstraction.

It creates this separation between logical memory, what the programmer works with, this vast, almost limitless space, and physical memory, which is the actual finite RAM chips in your machine.

And this raises an important question.

Why is this separation so crucial for software developers?

Oh, it's huge.

Suddenly, developers don't have to constantly worry about squeezing their programs into limited physical space.

They can just assume they have this enormous address space to work with.

It simplifies things tremendously.

Okay, so the benefits are obviously programs can be bigger than physical memory.

That's the big one.

What else?

Well, because you're not limited by physical RAM for the number of programs, you can run more processes simultaneously.

Ah, so better multitasking, which means your CPU is kept busier doing useful work. Better throughput overall.

Exactly.

And another thing, programs can start faster because you don't need to load the entire thing from the comparatively slow disk drive before it can begin.

You only load the initial bits needed.

Less I/O means faster startup.

That makes sense.

Load on demand.

Right.

And there's more.

Virtual memory also allows processes to share memory.

Think about common system libraries like, you know, the standard C library that tons of applications use.

Okay.

Instead of every single running program loading its own identical copy into physical RAM, they can all point to the same physical pages containing that library code.

Like checking out the same popular book from a library, only one physical copy needed.

Perfect analogy. It saves a ton of memory.

Yeah.

And it also makes creating new processes much faster, especially with something like the fork() system call in Unix-like systems.

How so?

Well, when a process forks, the new child process can initially just share all the memory pages of the parent.

It only needs to actually copy a page if one of them tries to change it.

It's called copy-on-write.

Very efficient.
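
To make the fork behaviour concrete, here is a minimal Python sketch for a Unix-like system. It only shows the visible effect: after fork(), a write in the child does not change what the parent sees. The page sharing and the copy-on-write itself happen inside the kernel and cannot be observed directly from this code. The function name fork_demo and the strings are made up for illustration.

```python
import os

def fork_demo():
    """Fork a child, let it modify its copy of `data`, and compare both views."""
    data = ["parent value"]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write only affects the child's address space.
        os.close(read_fd)
        data[0] = "child value"
        os.write(write_fd, data[0].encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: read what the child saw, then check our own copy.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_saw = reader.read()
    os.waitpid(pid, 0)
    return data[0], child_saw

print(fork_demo())  # ('parent value', 'child value')
```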

Okay.

So that's the what and the why.

The illusion and its benefits are pretty clear.

But how?

How does the operating system actually pull off this magic trick?

This feels like where it gets really interesting.

It absolutely is.

This brings us to the core mechanism.

Demand paging.

Demand paging.

Okay.

Yeah.

It's basically a just-in-time approach.

Pages, these small, fixed-size chunks of a program's code or data, are only loaded into physical memory when they are actually needed.

Not before.

So if you never call a specific function or never access a certain block of data, it never gets loaded into physical RAM.

It just sits out on the disk, not taking up valuable space.

Think of a huge software suite.

You might only use 10% of its features regularly.

Demand paging means only that 10% needs to be in memory most of the time.

Like that chef analogy you mentioned, only grabbing ingredients from the pantry right when you need them.

Exactly that.

But what happens when the program does try to use something like a piece of code or data that isn't currently in physical memory?

That must happen all the time.

It does.

And that triggers what's called a page fault.

Page fault.

Okay.

That sounds bad, like an error.

It sounds bad, but it's actually a routine part of how demand paging works.

It's not an error in the program usually.

It's just the system saying, hold on, I need to fetch that piece.

So what happens step by step during a page fault?

Okay.

So the program tries to access a memory address.

The hardware, specifically the memory management unit or MMU, looks in the program's page table.

Which is like a map, right?

Virtual addresses to physical ones.

Precisely.

And each entry in this map has a little flag, a valid-invalid bit.

If the bit says invalid, meaning not in physical memory, the MMU triggers a trap, basically an interrupt that hands control to the operating system.

Halts the program, calls the OS.

Right.

The OS takes over.

First, it checks if the program was actually allowed to access that memory.

Assuming it was, the OS then needs to find a free frame, an empty slot in physical memory.

Okay.

Find an empty parking spot.

Yep.

Then it schedules an I/O operation to read the needed page from the backing store, usually your SSD or hard disk, into that free frame.

Ah, the slow part.

The disk access.

That's the potential bottleneck, yes.

While the disk read is happening, the OS might switch to running another process to keep the CPU busy.

Once the page is loaded into the frame, the OS updates the page table entry for that page, marking it as valid and pointing it to the correct physical frame.

So the map gets updated.

Correct.

And then the final kind of cool step.

The OS restarts the very instruction that caused the page fault in the first place.

So the program just picks up right where it left off none the wiser.

Exactly.

As far as the program is concerned, the memory was always there.

The whole fault and load process is ideally transparent.
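
The steps just described can be sketched as a toy simulation. This is not how a real kernel is written; it is a minimal Python model in which dictionaries stand in for the page table, physical memory, and the backing store, and all names (PageTableEntry, access, free_frames) are invented for illustration. It assumes a free frame is always available.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PageTableEntry:
    valid: bool = False          # the valid-invalid bit
    frame: Optional[int] = None  # physical frame number once loaded

def access(page, page_table, free_frames, backing_store, memory):
    """Return (contents, faulted) for one memory access to `page`."""
    if page not in page_table:
        # Not part of the process's address space: a genuine error.
        raise LookupError(f"illegal access to page {page}")
    entry = page_table[page]
    faulted = False
    if not entry.valid:                          # the MMU would trap here
        faulted = True
        frame = free_frames.pop()                # find a free frame
        memory[frame] = backing_store[page]      # the slow "disk read"
        entry.frame, entry.valid = frame, True   # update the map
        # Falling through plays the role of restarting the instruction.
    return memory[entry.frame], faulted

page_table = {0: PageTableEntry(), 1: PageTableEntry()}
free_frames = [0, 1, 2]
backing_store = {0: "code", 1: "data"}
memory = {}

first = access(1, page_table, free_frames, backing_store, memory)
second = access(1, page_table, free_frames, backing_store, memory)
print(first, second)  # ('data', True) ('data', False)
```

The first access faults and loads the page; the second finds the valid bit set and goes straight to memory, which is the transparency the program relies on.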

That is quite an orchestration.

But you mentioned the disk access being slow.

So what does this all mean for how fast your computer feels?

Because that sounds like it could really drag things down if it happens too often.

That's the absolute key concern.

Performance?

You're right.

Accessing RAM is incredibly fast: nanoseconds.

Like unbelievably fast.

Blink and you miss it.

Faster even.

Right.

But accessing a disk takes milliseconds, and even a fast SSD is far slower than RAM.

With a hard disk, that's tens of thousands of times slower.

Whoa.

Okay.

So the effective access time your program sees is a weighted average.

It's mostly the super fast RAM access time, but occasionally it's the super slow page fault time.

The formula is basically: effective access time equals the probability of no fault times the RAM access time, plus the probability of a fault times the page fault service time.

And that probability of a fault needs to be tiny.

Extremely tiny.

The book has an example.

If RAM access is 200 nanoseconds and a page fault takes eight milliseconds, even if only one out of a thousand memory accesses causes a page fault, your average access time slows down by a factor of 40.

40 times slower.

Just from one fault in a thousand.
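
The arithmetic is easy to check. The sketch below computes the effective access time as (1 - p) × memory access time + p × page fault service time, using the textbook's figures of a 200-nanosecond memory access and an 8-millisecond fault service time. The function and parameter names are illustrative only.

```python
def effective_access_time(p, mem_ns=200.0, fault_ns=8_000_000.0):
    """Weighted average access time in nanoseconds for page-fault probability p."""
    return (1 - p) * mem_ns + p * fault_ns

eat = effective_access_time(0.001)   # one fault per thousand accesses
slowdown = eat / 200.0
print(f"{eat:.1f} ns, about {slowdown:.0f}x slower")  # 8199.8 ns, about 41x slower
```

About 8.2 microseconds per access instead of 0.2, which is the roughly 40-fold slowdown mentioned above.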

Yikes.

Yeah.

It shows how critical it is for the OS to minimize page faults.

We need that page fault rate to be vanishingly small for virtual memory to be efficient.

This is also where swap space comes in: dedicated disk space that's often faster than the regular file system for storing these pages.

Okay.

Minimizing faults is crucial.

But what happens if you just run out of free frames?

Like physical memory is totally full, but your program needs another page loaded.

You can't just stop, right?

Right.

If there are no free frames, the OS has to make space.

It has to choose a page currently in memory and kick it out.

You evict it.

This is called page replacement.

Making room for the new arrival.

Exactly.

And this is where another clever bit comes in: the modify bit, sometimes called the dirty bit.

Dirty bit.

Sounds intriguing.

It's simple, but very effective.

The hardware sets this bit for a page whenever the program writes to that page, changing its contents.

Okay.

So it tracks if the page has been modified since it was loaded from disk.

Precisely.

Now, when the OS is choosing a page to replace, if it picks a page whose dirty bit is not set, meaning it's clean, unchanged, it can just discard that page.

The copy on the disk is still perfectly valid.

Ah.

So no need to write it back to the slow disk.

Exactly.

Big time saver.

But if the chosen page is dirty, meaning the bit is set, the OS must write its modified contents back to the disk before it can reuse that frame.

Otherwise, those changes would be lost.

So avoiding writing back dirty pages is a big win.

Got it.

So the challenge then becomes, which page do you kick out?

If you pick one that the process needs again immediately, you'll just cause another page fault right away.

That seems like the core problem.

It is the core problem of page replacement.

There are many different algorithms designed to try and pick the best page to replace, usually meaning the one least likely to be needed soon.

And you evaluate these algorithms using like a sequence of memory requests, right?

A reference string.

Yep.

You feed them a sequence of page numbers that a program accesses and count how many page faults each algorithm causes.

And generally, as you'd expect, giving a program more physical memory frames usually reduces the number of page faults.

There's a curve showing faults decreasing as frames increase.

Usually.

Does that mean not always?

Ah.

Good catch.

With some algorithms, surprisingly, yes.

Let's look at a few common ones.

The simplest is FIFO: first in, first out.

Like a queue.

The oldest page in memory gets replaced.

Exactly.

Easy to implement.

But it doesn't care how often that old page is being used.

And crucially, FIFO suffers from something called Belady's Anomaly.

Belady's Anomaly.

Sounds ominous.

It's counterintuitive.

For FIFO, sometimes giving the process more memory frames can actually lead to more page faults for certain reference strings.

It's bizarre, but true.

That feels fundamentally wrong.

More resources lead to worse performance.

It's a known weird quirk of FIFO.

Doesn't happen often, but it can.
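
Here is a small Python sketch of FIFO replacement that reproduces the anomaly. The reference string is a standard textbook-style example chosen for illustration (it is not quoted in the audio), and fifo_faults is a made-up helper name.

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults for FIFO replacement with `num_frames` frames."""
    frames = deque()   # oldest resident page sits at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue               # hit: FIFO ignores how recently a page was used
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()       # evict the page that has been resident longest
        frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
```

With three frames this string causes 9 faults; with four frames it causes 10.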

Okay, so FIFO is simple, but flawed.

What's the theoretical best, the gold standard, even if you can't actually build it?

That would be the optimal algorithm, often called OPT or MIN.

It replaces the page that will not be used for the longest time in the future.

So it needs a crystal ball, predicts the future access pattern.

Right.

Impossible in practice, obviously.

But it gives us the absolute minimum number of page faults possible for any given reference string.

It's the benchmark we compare other real algorithms against.
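
Although OPT cannot run in a live system, it is easy to simulate once the whole reference string is known. The sketch below is one straightforward way to do that; the 20-entry reference string is an assumed sample, and opt_faults is an illustrative name.

```python
def opt_faults(refs, num_frames):
    """Count page faults for the optimal (farthest-future-use) policy."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)
            continue
        future = refs[i + 1:]

        def next_use(p):
            # Distance to the next reference; pages never used again rank last.
            return future.index(p) if p in future else len(future)

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults

sample_refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(opt_faults(sample_refs, 3))  # 9
```

Any real algorithm run on the same string with the same number of frames can be compared against this count.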

What's the best practical approach, the one that systems actually try to use or approximate?

That's typically LRU, least recently used.

Least recently used.

So kick out the page that hasn't been touched in the longest time.

Exactly.

The logic is based on locality of reference.

If a page hasn't been used recently, it's probably less likely to be needed again soon compared to pages used very recently.

And does LRU work well?

It generally works much better than FIFO.

And importantly, LRU does not suffer from Belady's Anomaly.

More frames always means fewer, or the same number of faults with LRU.
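
In software, exact LRU is simple to simulate even though it is costly in hardware. This sketch uses Python's OrderedDict to keep pages in recency order; the reference string is an assumed sample and lru_faults is an illustrative name.

```python
from collections import OrderedDict

def lru_faults(refs, num_frames):
    """Count page faults for exact least-recently-used replacement."""
    frames = OrderedDict()   # least recently used page is first
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)      # a hit makes the page most recent
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)    # evict the least recently used page
        frames[page] = True
    return faults

lru_refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(lru_refs, 3))  # 12
```

Running it with more frames never increases the fault count, which is the property FIFO lacks.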

Good.

But how do you implement LRU?

Tracking the exact last access time for every single page sounds expensive.

It is.

True LRU requires significant hardware support, like maintaining a timestamp for every page access or managing a complex linked list of pages ordered by access time.

It adds overhead.

So systems probably don't do true LRU then.

Often, no.

They use LRU approximations.

These try to get close to LRU behavior without all the hardware costs.

Clever workarounds.

Yeah.

A very common one uses a simple reference bit for each page.

The hardware sets this bit to one whenever a page is accessed, read or written.

Okay, just a simple used recently flag.

Right.

Then you can have algorithms like Second Chance, sometimes called the clock algorithm.

Imagine the pages arranged in a circle, like a clock face, with a pointer moving around.

When the OS needs a frame, the pointer starts scanning.

If it finds a page with a reference bit set to one, meaning recently used, it gives it a second chance.

It clears the bit to zero and moves the pointer on.

Spares it for now, but marks it as not recently used anymore.

Exactly.

If the pointer finds a page whose reference bit is already zero, that's the page it chooses for replacement.

It didn't get used since the last time the pointer swept past.

Ah, neat.

A simpler way to approximate least recently used.
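
A minimal sketch of the clock idea follows. It makes one assumption that varies between descriptions: a newly loaded page starts with its reference bit set to one. The names clock_faults, ref_bit, and hand are invented for this example.

```python
def clock_faults(refs, num_frames):
    """Count page faults for the second-chance (clock) policy."""
    frames = [None] * num_frames   # pages arranged around the clock face
    ref_bit = [0] * num_frames     # one reference bit per frame
    hand = 0
    faults = 0
    for page in refs:
        if page in frames:
            ref_bit[frames.index(page)] = 1   # hardware would set this on access
            continue
        faults += 1
        while ref_bit[hand] == 1:             # second chance: clear the bit, move on
            ref_bit[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page                   # bit was 0 (or frame empty): replace
        ref_bit[hand] = 1                     # assumption: loaded pages start referenced
        hand = (hand + 1) % num_frames
    return faults

print(clock_faults([1, 2, 3, 4, 2, 5, 2], 3))  # 5
```

On this short string, page 2 is referenced after its bit was cleared, so the hand spares it and the final access to 2 is a hit. Plain FIFO would have evicted page 2 and taken 6 faults.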

Yep.

And you can make it even smarter with the enhanced Second Chance algorithm.

This one looks at both the reference bit, was it used recently, and the dirty bit, was it modified.

Okay, so four possibilities now.

Right.

You get four classes of pages: (0, 0), not recently used and not modified; (0, 1), not recently used but modified; (1, 0), recently used but not modified; and (1, 1), recently used and modified.

The algorithm prefers to replace pages in the best category first, (0, 0), because they're old and clean, no disk write needed.

It tries hardest to keep pages that are recently used and modified.
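
The class ordering can be shown with a deliberately simplified sketch. A real implementation scans the circular list, possibly several times; this version only ranks pages by their (reference bit, dirty bit) pair and picks the lowest class. The page names and the pick_victim helper are hypothetical.

```python
def pick_victim(pages):
    """pages maps a page name to (referenced, dirty); return the best page to evict."""
    # Tuples compare in the order (0, 0) < (0, 1) < (1, 0) < (1, 1),
    # which matches the four classes from best to worst candidate.
    return min(pages, key=lambda name: pages[name])

pages = {"A": (1, 1), "B": (0, 1), "C": (1, 0), "D": (0, 0)}
print(pick_victim(pages))  # D

del pages["D"]
print(pick_victim(pages))  # B
```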

That sounds much more intelligent, trying to avoid disk writes and keep useful pages.

It is.

It's a very practical and widely used approach.

There are other types too, like counting-based ones or page buffering, but LRU approximations are very common.

Okay, so that's deciding which page within a process gets replaced.

But you also mentioned earlier that we have multiple processes running.

How does the system decide how many frames each process even gets in the first place?

Okay, let's unpack how processes get their slice of the memory pie.

Right, that's frame allocation.

First off, every process needs a minimum number of frames just to run.

This isn't arbitrary. It's dictated by the computer's architecture: an instruction might need a frame, its operand might need another, and maybe indirect addressing needs a third.

So even simple operations might span several pages.

Exactly.

Below that minimum, the process simply can't make progress.

Now, beyond the minimum, how do you distribute the rest of the frames?

Yeah, how do you share them out?

Well, the simplest is equal allocation.

Just divide the available frames equally among all processes.

Seems fair, but maybe not optimal.

A tiny background process gets the same as a massive database.

Right, so more common is proportional allocation.

Here, you allocate frames based on the size of the process.

Bigger processes get more frames.

Or maybe it's based on priority: more important processes get more frames.

That sounds more reasonable.
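
As a rough sketch of the two schemes: equal allocation gives every process the same share, while proportional allocation gives process i about (s_i / S) × m frames, where s_i is its size, S the total size, and m the number of available frames. The numbers below (62 free frames, processes of 10 and 127 pages) are an assumed example, the function names are illustrative, and the sketch ignores the architectural minimum.

```python
def equal_allocation(num_processes, total_frames):
    """Give every process the same number of frames; leftovers stay free."""
    return [total_frames // num_processes] * num_processes

def proportional_allocation(sizes, total_frames):
    """Give each process frames in proportion to its size, rounded down."""
    total_size = sum(sizes)
    return [size * total_frames // total_size for size in sizes]

print(equal_allocation(2, 62))                 # [31, 31]
print(proportional_allocation([10, 127], 62))  # [4, 57]
```

Under equal allocation the 10-page process would hold 31 frames it cannot use; proportional allocation gives it 4 and the larger process 57, leaving one frame over as a free buffer.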

It generally is.

Then there's another big choice,

global versus local allocation.

This relates back to page replacement.

Okay, how so?

With global replacement, when a process needs a frame and replacement is necessary, it could potentially grab a frame from any other process in the system, typically one with lower priority or one that seems less active.

So it's a free-for-all, basically.

Survival of the fittest.

Kind of.

It often leads to better overall system throughput because frames can quickly shift to where they're most needed.

But it makes the performance of any individual process less predictable because its pages might get stolen away at any time.

I see.

And local?

With local replacement, a process needing a frame can only choose a replacement candidate from its own currently allocated set of frames.

So its memory pool is self-contained.

Right.

This gives more consistent, predictable performance for that single process.

But it might be less efficient for the system overall if one process is hoarding frames it doesn't really need while another is starved.

Most general purpose systems tend to use global replacement strategies.

Got it.

And you mentioned page faults earlier.

Is every page fault that slow disk operation?

No, good point.

We need to distinguish between major and minor page faults.

A major page fault is the slow one we discussed.

The page is not in memory at all.

And we have to go read it from disk.

That involves I/O.

The performance killer if it happens too often.

Exactly.

But a minor page fault is different.

This happens when the page is actually already somewhere in physical memory.

Maybe it belongs to the kernel or it's a shared library page used by another process.

But it just hasn't been mapped into the faulting process's address space yet.

So the data is already in RAM.

Just need to update the map.

Precisely.

The OS just needs to update the process's page table to point to the existing frame.

No disk I/O needed.

It's much, much faster.

And presumably minor faults are more common.

Especially with shared libraries.

Way more common in many workloads.

What's fascinating here is how operating systems, like Linux,

optimize away the heavy work for frequently accessed shared code.

Minor faults highlight this efficiency beautifully.

If you look at tools like ps in Linux, you can often see counts for both major and minor faults per process.
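
A process can also read its own fault counters. On Unix-like systems, Python's standard resource module exposes them through getrusage; the field names ru_minflt and ru_majflt are part of that module. The actual numbers depend entirely on the machine and the moment, so none are shown here.

```python
import resource

# Counters for the current process since it started (Unix-like systems only).
usage = resource.getrusage(resource.RUSAGE_SELF)
print("minor faults (no disk I/O):", usage.ru_minflt)
print("major faults (needed disk I/O):", usage.ru_majflt)
```

On a typical run the minor count is far larger than the major count, which fits the point about shared, already-resident pages.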

Right.

This whole system feels like a delicate balancing act.

Trying to keep enough pages in memory for everyone.

Replacing the right ones.

What happens if the balance tips, if the system just gets overwhelmed?

That's when you hit a state called thrashing.

Thrashing.

Sounds bad.

Like the computer is just flailing.

That's a perfect way to put it.

A system is thrashing when it spends almost all its time moving pages between memory and disk, paging, instead of actually executing program instructions.

So it's constantly shuffling papers instead of writing.

Like your analogy.

CPU utilization must plummet.

Drastically.

The cause is usually that processes don't have enough frames allocated to hold their working set.

Their working set.

What's that?

It's the set of pages that a process is actively using right now.

Programs tend to exhibit locality of reference.

Meaning they tend to access a relatively small cluster of pages heavily for a period.

Then maybe move to another cluster.

The working set is that currently active cluster.

Okay.

If a process doesn't have enough frames to hold its entire current working set, it will constantly fault.

It loads page A, but to do that it had to kick out page B.

Then it immediately needs page B so it faults again.

Maybe kicking out page C.

Or even page A which it just loaded.

A vicious cycle of page faults.

Exactly.

High paging activity, low CPU utilization because processes are always waiting for the disk.

It's system gridlock.

The OS might even think the CPU is underutilized and try to load more processes into memory, making the thrashing even worse.

Oh wow.

How do systems try to prevent or deal with thrashing?

Well, the working set model tries to estimate the working set size, WSS, for each process and ensure it has at least that many frames.

If the sum of all WSS exceeds the total available frames, the OS might suspend some processes to free up frames for the active ones.
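
One common way to define the working set is as the set of distinct pages touched in the last Δ references. The sketch below assumes that window-based definition; the reference string, the window size of 10, and the helper names are all chosen for illustration.

```python
def working_set(refs, t, delta):
    """Distinct pages referenced in the window of `delta` references ending at index t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])

def demand_exceeds_memory(working_set_sizes, total_frames):
    """True when the combined working sets cannot all fit, a warning sign for thrashing."""
    return sum(working_set_sizes) > total_frames

refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1, 6, 2, 3, 4, 1, 2, 3, 4, 4, 4]
ws = working_set(refs, t=9, delta=10)
print(sorted(ws), len(ws))                    # [1, 2, 5, 6, 7] 5
print(demand_exceeds_memory([5, 2, 4], 10))   # True
```

When the second function returns True, the OS would pick a process to suspend so the rest can hold their working sets.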

Another approach is monitoring the page fault frequency, PFF, directly.

Just watching how often faults happen.

Yeah.

If a process's fault rate goes above a certain high threshold, it probably needs more frames so the OS gives it some.

If the rate falls below a low threshold, it might have too many frames so the OS can take some away.

Direct feedback control.
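
The feedback loop fits in a few lines. The thresholds, step size, and minimum below are arbitrary placeholder values, not figures from the text, and adjust_frames is a hypothetical helper.

```python
def adjust_frames(frames, fault_rate, low=0.01, high=0.10, step=1, minimum=1):
    """Return the new frame count for a process given its recent page-fault rate."""
    if fault_rate > high:
        return frames + step                 # faulting too often: grant more frames
    if fault_rate < low:
        return max(minimum, frames - step)   # rarely faulting: reclaim a frame
    return frames                            # within the acceptable band

print(adjust_frames(8, 0.20), adjust_frames(8, 0.05), adjust_frames(8, 0.001))  # 9 8 7
```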

Right.

But honestly, the most common approach in practice today, especially with memory being relatively cheap, is simply to have enough physical RAM in the system to hold the typical working sets of concurrently running applications.

Thrashing was a bigger nightmare when RAM was incredibly scarce and expensive.

Modern systems try to avoid it by just having sufficient resources.

Makes sense.

Throw hardware at the problem where possible.

So these concepts are foundational, but computing keeps evolving.

How do things like mobile devices or multi-core systems change the virtual memory game?

Great question.

Several modern considerations are really important.

One huge one, especially for phones and tablets where physical RAM is often limited and there might not be fast swap space, is memory compression.

Compressing pages in RAM instead of swapping them to disk.

Exactly.

When memory gets tight, instead of writing a potentially dirty page out to flash storage, which is slow and wears out the flash, the OS can try to compress the contents of that page, or maybe several pages, into a single frame still residing in RAM.

So you free up frames without hitting the disk or flash.

Right.

It uses CPU cycles to do the compression and decompression, but that's often much faster than I/O.

Systems like Android, iOS, Windows, and macOS all use forms of memory compression now.

It's a trade-off between CPU cost and avoiding slow I/O.
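
To get a feel for the idea, the sketch below compresses one 4 KB page of repetitive data using Python's zlib. This is only a stand-in: operating systems use their own in-kernel compressors, and the page contents here are invented. How well a page compresses depends heavily on what is in it.

```python
import zlib

PAGE_SIZE = 4096

# A made-up, highly repetitive page; real pages vary widely in compressibility.
page = (b"virtual memory " * 300)[:PAGE_SIZE]

compressed = zlib.compress(page)
restored = zlib.decompress(compressed)

print(len(page), "bytes ->", len(compressed), "bytes")
print("round trip intact:", restored == page)  # round trip intact: True
```

If several pages shrink like this, they can share a single frame in RAM, and getting one back costs a decompression rather than a storage read.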

Clever.

What else is different?

Allocating kernel memory is a whole different challenge.

The kernel needs memory for its own data structures.

But these aren't nice, uniform, page-sized chunks usually.

They vary wildly in size.

And sometimes the kernel needs memory that's physically contiguous, right?

For talking to hardware.

Yes.

Exactly.

Standard paging doesn't guarantee physical contiguity.

So kernels use different allocators.

One classic one is the buddy system.

Buddy system.

It manages memory in blocks whose sizes are powers of two, like 4 KB, 8 KB, 16 KB.

When a request comes in, it finds the smallest power of two block that fits and allocates it.

If it needs a smaller block, it takes a larger one and recursively splits it in half, creating buddies, until it gets the right size.

The nice thing is, when memory is freed, it can easily check if its buddy block is also free.

And if so, merge them back into a larger block.

It's fast for merging.

But I guess if you need, say, 9 KB, you have to allocate 16 KB, right?

Wasted space.

Exactly.

That's called internal fragmentation.
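
The splitting and the resulting waste can be sketched as follows. The 256 KB starting segment is an assumption (any power of two works), and buddy_allocate is an illustrative name; the sketch tracks sizes only, not addresses or the free lists a real allocator keeps.

```python
def buddy_allocate(request_kb, segment_kb=256):
    """Return (block_kb, splits): the block size granted and the halves created on the way."""
    if not 0 < request_kb <= segment_kb:
        raise ValueError("request must be positive and fit in the segment")
    block = segment_kb
    splits = []
    # Keep halving while the smaller buddy would still satisfy the request.
    while block // 2 >= request_kb:
        block //= 2
        splits.append(block)
    return block, splits

block, splits = buddy_allocate(9)
print(block, splits, block - 9)  # 16 [128, 64, 32, 16] 7
```

For a 9 KB request the allocator hands out a 16 KB block, so 7 KB inside that block is unusable by anyone else until it is freed.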

To combat that, especially for frequently allocated and deallocated small kernel objects like file handles or process descriptors, many systems use slab allocation.

Slab allocation.

Yeah.

It creates caches where each cache consists of one or more slabs.

Each slab is a chunk of contiguous physical memory, pre-divided into objects of a specific size and type.

So if the kernel needs, say, a network buffer structure, it just grabs a pre-initialized one from the network buffer slab cache.

Super fast.

And no wasted space within the object itself.

Precisely.

Almost no fragmentation.

Yeah.

Very fast allocation and deallocation.

Linux uses variations of this extensively, like SLUB or SLOB, which are optimized versions.
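
The core idea, a cache of pre-built objects that are handed out and returned without being torn down, can be illustrated with a toy class. It models only the reuse behaviour; there is no contiguous physical memory here, and SlabCache with its parameters is invented for this example rather than taken from any kernel.

```python
class SlabCache:
    """Toy object cache: builds objects a slab at a time and recycles freed ones."""

    def __init__(self, make_object, objects_per_slab=4):
        self._make_object = make_object
        self._objects_per_slab = objects_per_slab
        self._free = []     # pre-initialized objects ready to hand out
        self.slabs = 0      # how many slabs have been created so far

    def alloc(self):
        if not self._free:
            # No free object left: carve up a new slab of identical objects.
            self._free = [self._make_object() for _ in range(self._objects_per_slab)]
            self.slabs += 1
        return self._free.pop()

    def free(self, obj):
        self._free.append(obj)   # keep it for reuse instead of destroying it

cache = SlabCache(dict)
a = cache.alloc()
cache.free(a)
b = cache.alloc()
print(b is a, cache.slabs)  # True 1
```

Allocation and release are just a pop and an append on a list of same-sized objects, which is where the speed comes from.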

Okay.

What about systems with lots of CPUs?

Does that change memory management?

Definitely.

That brings in non-uniform memory access, or NUMA.

Non-uniform.

Yeah.

In a multi-CPU system, a given CPU can typically access some banks of physical RAM faster than others, depending on how they're physically wired together.

Accessing memory attached to a different CPU is possible, but slower.

So memory access time isn't uniform across the system.

Correct.

So a NUMA-aware virtual memory system tries to be clever.

When allocating a frame for a process, it tries to allocate it from the memory bank closest to the CPU the process is currently running on.

Keeping memory accesses local improves performance significantly.

That makes total sense.

Anything else changing, like page sizes?

Page size itself is a constant trade-off.

Historically, they were small, maybe 4 KB.

Smaller pages mean less internal fragmentation, less wasted space inside a page.

But they also mean you need huge page tables to map a large virtual address space.

Millions of entries.

Right.

Larger pages, like 64 KB, 2 MB, or even 1 GB, often called huge pages, dramatically reduce the page table size and improve the efficiency of the TLB, that hardware cache for translations.

But they can waste more memory if a process only uses a tiny bit of a huge page.

So systems might support multiple page sizes.

Yes.

Many modern systems do, using huge pages for things like large data buffers or application code where appropriate, and standard smaller pages elsewhere.

It's about optimizing that TLB reach: how much memory can be accessed just using the fast TLB cache, without going to the main page tables.
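
TLB reach is simply the number of TLB entries multiplied by the page size. The 64-entry TLB below is an assumed figure for illustration; real sizes vary by processor.

```python
KB = 1024
MB = 1024 * KB

def tlb_reach(entries, page_size_bytes):
    """Bytes of memory addressable through the TLB alone."""
    return entries * page_size_bytes

print(tlb_reach(64, 4 * KB) // KB, "KB")  # 256 KB
print(tlb_reach(64, 2 * MB) // MB, "MB")  # 128 MB
```

Same number of entries, 512 times the reach, which is why large buffers are good candidates for huge pages.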

And can we, as programmers, do anything to help the virtual memory system work better?

Absolutely.

Program structure matters.

Remember locality of reference.

Writing code that accesses memory in a predictable sequential pattern helps the OS predict what you'll need next.

Like accessing a 2D array row by row if it's stored row major.

Accessing it column by column could potentially cause a page fault for every single element if the array is large, which is disastrous for performance.

Good compilers can sometimes help optimize data layout, too.
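
The row-versus-column effect can be counted with a toy model. It assumes a 128 × 128 array stored row major, one row per page, and a process that has only a single frame for the array; all of these numbers are assumptions chosen to make the worst case visible.

```python
N = 128  # assumed: 128 x 128 array, and each row fills exactly one page

def faults_with_one_frame(index_order):
    """Count faults when only one array page can be resident at a time."""
    faults = 0
    resident_page = None
    for row, _col in index_order:
        page = row                    # row-major layout: the row number is the page number
        if page != resident_page:
            faults += 1
            resident_page = page
    return faults

row_by_row = ((i, j) for i in range(N) for j in range(N))
column_by_column = ((i, j) for j in range(N) for i in range(N))

print(faults_with_one_frame(row_by_row), faults_with_one_frame(column_by_column))  # 128 16384
```

Row-by-row traversal faults once per row; column-by-column traversal touches a different page on every access and faults on every element.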

One last thing.

What about devices like, say, a network card that needs to write data directly into memory?

What if that memory page gets paged out?

Ah, yes.

I/O interlock.

That's a critical issue.

If a device is performing direct memory access, DMA, to a page, and the OS decides to page out that same page, chaos ensues.

The data goes to the wrong place or overwrites something else.

So how is that prevented?

The OS needs to lock pages involved in I/O operations into physical memory.

These locked pages cannot be selected for replacement until the I/O operation is completely finished.

It's essential for system stability.

Wow.

Okay, so bringing it all together, how do the big players like Linux or Windows actually put these ideas into practice?

Are they all pretty similar?

The core principles, demand paging, copy-on-write, and trying to approximate LRU, are very common.

But the specific implementations differ.

For example?

Well, Linux uses concepts like active and inactive page lists.

Pages that are actively used stay on the active list.

If a page isn't referenced for a while, it moves to the inactive list.

The page replacement daemon, kswapd, primarily scans the inactive list for candidates to reclaim, giving recently used pages a better chance of survival.

Kind of like a two-tiered LRU approximation.

Sort of, yeah.

Windows also uses demand paging with clustering.

When a page fault occurs, it often proactively brings in several surrounding pages, too, guessing they might be needed soon.

It also carefully manages process working sets, trimming pages from processes that seem to have more memory than they currently need, especially when memory pressure is high.

Proactive and reactive.

Right.

And Solaris, historically, used a refined clock algorithm, with two hands sweeping through memory at different rates, one scanning to mark pages as unused, and a trailing one reclaiming pages that were made unused between scans.

They all have sophisticated mechanisms to balance performance and memory usage.

It's amazing, such intricate systems working behind the scenes.

So to wrap up, virtual memory is, well, it's this incredibly sophisticated layer cake of abstractions and algorithms.

It lets us run huge applications, multitask like crazy, all by cleverly juggling data between fast RAM and slower storage, making the physical limitations almost disappear.

It really is a cornerstone of modern OS design.

And thinking about it, this raises an important question.

Knowing how intricate this memory dance is, what other invisible systems are constantly at play, making our digital lives possible, and how might understanding them empower us to use our technology more effectively, or maybe even build better systems ourselves?

That's a great thought to leave with.

So next time you open, you know, your 15th browser tab, or launch that massive application without a second thought, maybe just take a moment to appreciate the silent complex ballet of virtual memory happening constantly behind the scenes, making it all feel so seamless for you.

From the entire Last Minute Lecture team, thank you for taking this deep dive with us today.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary
What this audio overview covers
Virtual memory enables processes to run even when their entire memory footprint exceeds available physical memory, creating a powerful abstraction that decouples logical address spaces from physical constraints. This mechanism fundamentally transforms how operating systems manage memory by allowing flexible program sizes, increased multiprogramming capacity, and enhanced isolation between logical and physical memory layouts. Demand paging forms the core strategy, where pages are loaded into physical memory only when accessed, triggering page faults that invoke the operating system to fetch missing pages from secondary storage. The valid-invalid bit mechanism marks which pages currently reside in physical memory, guiding the system to distinguish between accessible and inaccessible page references. When physical memory fills, page replacement algorithms determine which existing pages should be evicted to make room for newly demanded pages. The FIFO algorithm removes the oldest resident page regardless of usage patterns, while the optimal algorithm selects the page not needed for the longest time in the future—an ideal standard against which practical algorithms are measured. The least recently used algorithm approximates optimal behavior by evicting pages that have not been accessed recently, while the second chance algorithm refines FIFO by giving pages a second opportunity based on their reference bit status. The working set model reduces thrashing by maintaining an estimate of the pages a process actively uses, ensuring sufficient frame allocation to prevent excessive page faults. Frame allocation policies determine how physical memory is distributed among competing processes through global replacement strategies that dynamically reallocate frames based on demand, or local replacement approaches that restrict eviction to a process's own pages. 
Load control regulates the degree of multiprogramming to prevent system-wide degradation when total memory demands exceed available capacity. Memory-mapped files optimize I/O operations by treating file access through the paging mechanism, while copy-on-write optimizes process creation by deferring memory duplication until actual modification occurs. Effective access time calculations balance hit rates against the substantial cost of page faults, while translation lookaside buffers cache address translations to accelerate virtual-to-physical memory mapping. Actual implementations in Windows and Linux demonstrate how these theoretical principles integrate into production systems, revealing the careful engineering needed to balance flexibility, performance, and security.
