Chapter 13: File-System Interface: File Concepts, Directory Structure, and Protection

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Ever wondered where your photos, documents, and even the very apps you use live inside your computer?

Hmm.

We're talking about the digital backbone of your operating system, the file system.

It's, you know, the unsung hero that brings order to all that digital chaos.

Today we're doing a deep dive into the file system interface chapter.

It's from Operating System Concepts by Silberschatz, Galvin, and Gagne.

Our mission really is to unpack how operating systems manage, organize, and crucially protect all your digital files.

Think of it as a shortcut.

A way to understand why your computer behaves the way it does when you save, open, or delete something.

We'll break down these core operating system concepts, make them student-friendly, and use some real-world analogies and examples you'll definitely recognize.

Yeah, and what's really fascinating here is how the file system, well, it seems simple from the user's point of view, right?

But it's actually one of the most visible and critical parts of an OS.

It's this layer, this abstraction that hides away all the physical complexities of the storage devices, you know, NVMs, HDDs, even old magnetic tapes.

And that abstraction gives you that neat, organized view of your data.

So we'll explore not just what files are, but how the system handles them from, like, basic operations all the way up to sophisticated protection mechanisms.

Okay, let's unpack that then.

We use files every single day.

But what is a file, really, from the operating system's perspective?

What's its fundamental nature?

Well, at its core, a file is essentially just a named collection of related information.

And it's recorded on secondary storage.

Think of it as the smallest logical unit of storage the OS deals with.

The key thing is the OS provides this uniform logical view.

It completely hides the messy physical details of, you know, where the bits actually are on the disk or the flash memory.

And the content is persistent.

It sticks around after a reboot.

So it's much more than just a document or a photo, right?

It could be an executable program.

Oh, absolutely.

Or even really abstract system information, like, I'm thinking of the proc file system in Linux, where you can look at running processes as if they were files.

Precisely.

That's the genius of it, the generality: the meaning of the bits inside the file is entirely defined by whoever created it and whoever uses it.

So yeah, it allows for everything.

Simple text files, complex multimedia, executables, system settings, anything, really.

OK.

So beyond the name we give a file, it sounds like the OS keeps track of a whole bunch of other information about it, these attributes.

What are these essential bits of metadata?

And why does the OS even bother?

Right.

These attributes are absolutely foundational.

They enable everything from reliable backups to proper security.

They do vary a bit between systems, but typically you'll find these: the name, obviously, the human-readable label; an identifier, which is like a unique internal tag, usually a number, that the OS uses; the type, which helps the system know how to handle the file; the location, a pointer to the device and the specific spot on the device; the size, both its current size and maybe a maximum allowed size; protection information, which is crucial, who can read, write, execute it, and so on; and then timestamps and user identification.

So when it was created, last modified, last used, really useful for security and just monitoring things.
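To make those attributes concrete, here is a minimal Python sketch, assuming a Unix-like system. The describe helper and its field names are made up for illustration; os.stat is the standard-library call that exposes most of this metadata.

```python
import os
import tempfile
import time

def describe(path):
    """Return a few of the attributes the OS keeps for a file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),          # human-readable label
        "identifier": st.st_ino,                 # inode number on Unix-like systems
        "size": st.st_size,                      # current size in bytes
        "protection": oct(st.st_mode & 0o777),   # permission bits
        "owner_uid": st.st_uid,                  # user identification
        "modified": time.ctime(st.st_mtime),     # last-modified timestamp
        "accessed": time.ctime(st.st_atime),     # last-access timestamp
    }

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as f:
        f.write(b"hello")
    for key, value in describe(f.name).items():
        print(f"{key:>10}: {value}")
    os.remove(f.name)
```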

You mentioned macOS before.

If you use a Mac, you often see that Get Info window.

How does that sort of visually connect back to these attributes?

Oh, that's a perfect example.

If you right-click a file on a Mac, hit Get Info.

That window shows you a lot of this directly.

You'll see the name right at the top.

Then under General, there's Kind, that's the type attribute, and the size, of course.

Scroll down a bit, you see Created and Modified dates.

Those are the timestamps.

And really importantly, the Sharing & Permissions section.

That directly shows the protection attributes, who can read, who can write.

All this info, by the way, it's stored right there in the directory structure on the storage device itself.

It can actually get quite large.

OK, so we have these files, they've got all these attributes.

How do we actually do things with them?

What are the basic actions the OS lets us perform?

Well, files are treated as abstract data types, right?

So the OS provides a defined set of operations.

The core seven are usually, one, creating a file.

The system finds space, makes a directory entry.

Simple enough.

Two, opening.

This one's really key for efficiency.

Instead of using the full file name every single time you want to touch the file, the open system call returns a small integer.

It's called a file handle, or file descriptor.

Subsequent operations like reading or writing just use this handle.

Ah, so it avoids having to look up the file name and check permissions over and over again.

Exactly.

It avoids constant directory searches and permission checks.

OK, like checking out a library book once, getting your card stamped, and then you just show the stamp, not your ID, every time you turn a page?

Precisely that analogy.

So after opening, you have three, writing: use the handle, provide the data.

The system keeps track of where you are with a write pointer for sequential writing.

Four, reading.

Again, use the handle, tell the system where in memory to put the data.

It updates a read pointer.

Often, read and write actually share a single pointer, the current file position pointer.

Five, repositioning, or seek.

This just moves that current file position pointer to a specific spot in the file without actually reading or writing any data.

Six, deleting.

Removes the directory entry, frees up the space.

But, important note here, if you have things called hard links, the actual file content might not be deleted until the last link pointing to it is removed.

OK, that's a detail to remember.

And finally, seven, truncating.

This just erases the file's content, resets its size to zero, but keeps all the attributes.
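Here is a small Python sketch that walks through those seven operations using the low-level os calls, which mirror the system calls fairly closely. The seven_operations name and the sample text are assumptions for illustration only.

```python
import os
import tempfile

def seven_operations(path):
    """Walk through create, open, write, read, seek, truncate, and delete."""
    # 1 and 2. Create and open: os.open returns a small integer file descriptor.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    # 3. Write: advances the current file position pointer.
    os.write(fd, b"hello, file system")
    # 5. Reposition (seek) to byte 7 without reading or writing anything.
    os.lseek(fd, 7, os.SEEK_SET)
    # 4. Read: continues from the same position pointer that write used.
    data = os.read(fd, 4)
    # 7. Truncate: the content is erased, the directory entry stays.
    os.ftruncate(fd, 0)
    size_after_truncate = os.fstat(fd).st_size
    os.close(fd)
    # 6. Delete: removes the directory entry.
    os.remove(path)
    return data, size_after_truncate, os.path.exists(path)

if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    print(seven_operations(demo_path))
```

Note that every call after os.open takes the descriptor, not the file name, which is exactly the efficiency point made about opening.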

That library analogy really clicks.

So this open file table you mentioned implicitly, that's the OS's central registry for what's currently active, prevents a lot of redundant work.

That's exactly right.

The OS maintains this system-wide open file table.

It holds vital info for every open file.

An open count that tracks how many processes have this file open, and its location on disk.

The current file position pointer and the access rights granted when the file was opened are typically tracked per process, in a per-process table whose entries point into the system-wide one.

When a process closes the file, the count goes down.

When it hits zero, meaning nobody has it open anymore, the entry is removed from the table.
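The open-count bookkeeping can be sketched with a toy Python class. This is only an illustrative model, not how any real kernel lays out its tables; the OpenFileTable class and the "block 42" location are hypothetical.

```python
class OpenFileTable:
    """Toy system-wide open-file table keyed by file name."""

    def __init__(self):
        # name -> {"open_count": int, "location": str}
        self.entries = {}

    def open(self, name, location):
        """Register one more opener and return the new open count."""
        entry = self.entries.setdefault(
            name, {"open_count": 0, "location": location}
        )
        entry["open_count"] += 1
        return entry["open_count"]

    def close(self, name):
        """Drop one opener; remove the entry when nobody has the file open."""
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            del self.entries[name]
            return 0
        return entry["open_count"]

if __name__ == "__main__":
    table = OpenFileTable()
    table.open("log.txt", "block 42")
    table.open("log.txt", "block 42")
    table.close("log.txt")
    table.close("log.txt")
    print(table.entries)   # the entry is gone once the count reaches zero
```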

OK, that immediately raises a question for me.

What happens when multiple processes try to use the same file at the very same time?

Ah, yeah.

Like, think about collaborative editing or maybe a system log file where lots of different things are writing entries.

Good question.

That's where file locking comes into play.

It's actually very similar conceptually to reader-writer locks that you see in process synchronization.

You generally have two types, shared locks, where multiple processes can read the file concurrently, no problem there, and exclusive locks, where only one process can hold the lock, usually for writing, preventing anyone else from reading or writing.

What's really neat, though, is that many systems and APIs, like Java's FileChannel, for instance, let you lock not just the whole file, but specific parts of it, byte ranges.

Oh, wow.

So you can lock just a section.

Yeah.

Imagine an application needing to update, say, the first half of a big data file.

It could grab an exclusive lock just on bytes zero to halfway.

While that's happening, other processes could still get shared locks on the second half to read that data concurrently.

It gives you really fine-grained control.
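The discussion mentions Java's FileChannel; as a rough equivalent, here is a Python sketch of byte-range locking using fcntl.lockf, assuming a Unix-like system where os.fork is available. The helper names and the 100-byte file are made up for illustration.

```python
import fcntl
import os
import tempfile

def child_can_lock(path, mode, start, length):
    """Fork a child that tries a non-blocking lock on one byte range."""
    pid = os.fork()
    if pid == 0:                          # child process
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, mode | fcntl.LOCK_NB, length, start, os.SEEK_SET)
            status = 0                    # lock acquired
        except OSError:
            status = 1                    # range is held by another process
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0

def demo():
    fd, path = tempfile.mkstemp()
    os.write(fd, b"x" * 100)
    # The parent takes an exclusive lock on bytes 0..49 only.
    fcntl.lockf(fd, fcntl.LOCK_EX, 50, 0, os.SEEK_SET)
    first_half = child_can_lock(path, fcntl.LOCK_SH, 0, 50)    # overlaps
    second_half = child_can_lock(path, fcntl.LOCK_SH, 50, 50)  # free range
    fcntl.lockf(fd, fcntl.LOCK_UN, 50, 0, os.SEEK_SET)
    os.close(fd)
    os.remove(path)
    return first_half, second_half

if __name__ == "__main__":
    print(demo())
```

The child's shared-lock request on the first half is refused while the parent holds its exclusive lock, but the request on the second half goes through.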

That fine-grained control sounds incredibly useful.

Are there common real-world scenarios where locking just a portion is almost required?

Oh, definitely.

Think about database management systems.

They do this constantly.

Or large log files, where multiple processes might be appending new entries.

You lock just the end, maybe.

Or the specific record you're updating.

Or updating a large configuration file.

Maybe you only lock the specific section you're changing.

And there's a really crucial distinction here you need to know about.

Mandatory versus advisory locking.

With mandatory locking, which you often see in Windows, the OS enforces the lock.

If a process holds a lock, the OS will prevent other processes from accessing that locked part inappropriately.

Right.

Seems sensible.

But with advisory locking, which is more common in Unix systems, the OS doesn't actually prevent access.

It's more like putting up a do-not-disturb sign.

It's up to all the other programs to voluntarily check for the lock and respect it.

If they don't check, or choose to ignore it, they can still access the data.

Ah, okay.

So that puts the responsibility on the application developer, then.

Exactly.

If you're using advisory locks, you have to be careful.

A rogue program, or even just a poorly written one, could ignore the lock and potentially corrupt the data.
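To see what "advisory" means in practice, here is a hedged Python sketch, assuming Linux or a similar Unix-like system with default advisory locking. One process holds an exclusive flock while a child that never checks for the lock overwrites the file anyway. The function name is hypothetical.

```python
import fcntl
import os
import tempfile

def rogue_write_succeeds():
    """Hold an exclusive advisory lock while a non-cooperating child writes."""
    fd, path = tempfile.mkstemp()
    fcntl.flock(fd, fcntl.LOCK_EX)        # the do-not-disturb sign goes up
    pid = os.fork()
    if pid == 0:                          # child: never asks about the lock
        try:
            with open(path, "wb") as rogue:
                rogue.write(b"overwritten")
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    with open(path, "rb") as f:
        content = f.read()
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
    os.remove(path)
    return content

if __name__ == "__main__":
    print(rogue_write_succeeds())
```

A cooperating program would call flock itself before writing and would then wait or back off, which is the discipline advisory locking relies on.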

So it's a trade-off between enforcement and flexibility.

Got it.

Now, does the operating system itself actually care what kind of information is inside a file, or is it just a generic bag of bytes to the OS?

Well, some do, some don't, and there are trade-offs, as always.

If an OS does recognize a file type, it can sometimes operate on it more intelligently.

Like it could prevent you from accidentally trying to execute a JPEG image file, which wouldn't make any sense.

The most common way OSes or applications recognize types is through file extensions.

You can imagine a table, like the one in the book, Figure 13.3, listing common extensions and their typical purpose.

.obj for compiled object code, .mp3 for audio, .zip for compressed archives, .java for Java source.

How do different systems handle this, like macOS, which you mentioned earlier?

Right.

macOS often uses a special creator attribute that's set by the application that created the file.

So when you double-click something, say a .pages document, the OS looks at that attribute and knows to launch the Pages app automatically.

Unix and Linux often rely more on magic numbers.

These are specific sequences of bytes right at the beginning of binary files.

Like a secret code.

Sort of.

Like a specific pattern might say, I'm a PDF file, or I'm an executable program.

But for many other file types, the extensions are mostly just hints for users and applications.

The Unix kernel itself often doesn't strictly enforce them.
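As a rough illustration of magic numbers, here is a short Python sketch that inspects the first bytes of a file's content. The sniff function and the MAGIC table are made up for this example; the byte signatures themselves are the well-known ones for these formats.

```python
# Leading byte signatures for a few common formats.
MAGIC = [
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
]

def sniff(header: bytes) -> str:
    """Guess a file type from its first few bytes, ignoring the extension."""
    for magic, kind in MAGIC:
        if header.startswith(magic):
            return kind
    return "unknown"

if __name__ == "__main__":
    print(sniff(b"%PDF-1.7\n"))
    print(sniff(b"plain text has no signature"))
```

In practice you would read the first handful of bytes from the file and pass them in; tools in the spirit of the Unix file command work from a much larger table of such signatures.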

Does the OS impose any kind of internal structure on the contents of files themselves?

Generally, not much.

The main exception is executable files.

The OS needs a specific format there, so it knows how to load the program into memory, where the code starts, where the data is, etc.

Beyond executables, it's really a design choice for the OS creators.

The potential downside of an OS trying to understand and support many specific file structures internally is that the OS itself becomes really large and complex.

And what happens when a new file format comes along?

The OS wouldn't know how to handle it.

Imagine trying to create, say, an encrypted file on a system that only understands plain text and one specific executable format.

It wouldn't work well.

So what's the more common approach for modern OSes, like the ones we use daily?

Unix, Linux, Windows?

Most modern systems, including those, take a simpler approach.

They generally treat files as just a simple sequence of bytes.

Or sometimes a sequence of records, but often just bytes.

8-bit bytes, usually.

This gives maximum flexibility.

It leaves the interpretation of what those bytes mean entirely up to the application program.

Now, internally, on the actual disk, files are stored in fixed-size chunks called blocks, or physical records.

And this leads to a common small inefficiency called internal fragmentation.

The last block of a file might not be completely full.

Oh, like having a storage box that's mostly full, but there's still some empty space left inside that you can't use for anything else.

Exactly that.

If your file is, say, 1,000 bytes and the block size is 512 bytes, it occupies two blocks, 1,024 bytes in total.

That last block has 24 bytes of wasted space inside it.

That's internal fragmentation.
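The arithmetic behind that example is easy to check with a few lines of Python. The function name is just for illustration.

```python
def internal_fragmentation(file_size, block_size):
    """Return (blocks needed, bytes wasted in the last block)."""
    blocks = -(-file_size // block_size)      # ceiling division
    wasted = blocks * block_size - file_size  # unused tail of the last block
    return blocks, wasted

if __name__ == "__main__":
    # The 1,000-byte file with 512-byte blocks from the discussion.
    print(internal_fragmentation(1000, 512))
```

With larger blocks the same file wastes more: a 4,096-byte block would hold the whole file but leave 3,096 bytes unused.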

OK, we've talked quite a bit about individual files, but that's only half the picture, isn't it?

How does an OS stop things becoming just a complete digital mess when you have thousands, maybe millions of files?

This is where directories or folders come in, I assume.

That structure seems absolutely critical.

Absolutely critical.

The directory structure is how we organize everything.

Think of a directory as basically a symbol table.

It maps the human-friendly file names to the file's actual internal identifier and attributes.

And we need operations on these directories, too.

Searching for a file within one, creating new files, deleting files, listing the contents, renaming files, and, crucially, traversing the structure, moving between directories.

What's the absolute simplest way you could organize files, like the most basic directory structure?

The simplest possible way is a single-level directory.

Imagine just one giant directory, one big list, containing all the files on the system.

Oof.

Yeah.

Figure 13.7 in the book shows this.

It's easy to understand, easy to implement, but it gets completely unmanageable very, very quickly.

The biggest problem: naming.

If two different users both want to call their file report.txt, they can't.

Every single file in the entire system needs a unique name.

Yeah, that sounds like chaos.

Imagine a school where every single student paper goes into one giant bin, and they all need unique names.

Total chaos.

So that leads to the next step.

How do we fix that naming collision problem, then?

The next logical step is the two-level directory.

This is shown in Figure 13.8.

Here, the system has a master file directory, or MFD.

The MFD contains entries that point to separate directories for each user.

These are called user file directories, or UFDs.

So when you log in, the system looks you up in the MFD, finds your UFD, and from then on, when you refer to a file, it only searches within your UFD.

Ah, so my report.txt is totally separate from someone else's report.txt because they're in different UFDs.

Exactly.

It solves the naming conflict between users beautifully.

Each user has their own private namespace.
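A toy Python model of the two-level scheme, treating each directory as a symbol table from names to internal identifiers. The user names and identifier numbers are hypothetical.

```python
# Master file directory (MFD): one user file directory (UFD) per user.
mfd = {"alice": {}, "bob": {}}

def create(user, name, file_id):
    """Add a file to one user's UFD; names only need to be unique per user."""
    ufd = mfd[user]
    if name in ufd:
        raise FileExistsError(f"{name} already exists for {user}")
    ufd[name] = file_id

def lookup(user, name):
    """Search only the given user's UFD."""
    return mfd[user][name]

if __name__ == "__main__":
    create("alice", "report.txt", 101)
    create("bob", "report.txt", 202)      # same name, no collision
    print(lookup("alice", "report.txt"), lookup("bob", "report.txt"))
```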

The limitation, though, is that users are still pretty isolated.

Sharing files isn't straightforward.

To access a file in someone else's directory, you usually need to specify its path name, maybe something like /user-b/some-file.txt.

This starts to introduce the idea of a path, a route through the directories.

And systems often have a search path concept, too, for finding system commands.

It looks in your current directory first, then maybe in a special system directory.

OK, this is starting to sound much more like the systems we actually use today.

You can make subfolders within subfolders, right?

Creating these deeply nested structures.

Precisely.

That leads us to the most common structure used today.

The tree-structured directory.

Think of it like Figure 13.9 in the book.

It generalizes that two-level idea into a tree of, well, arbitrary height.

Users can create their own subdirectories inside their main directory, and subdirectories inside those, and so on.

This gives users enormous flexibility to organize their work.

You might have a projects directory, and inside that, separate directories for project A, project B, etc.

Inside project A, you might have source, docs, data.

Every file and every directory in the entire system now has a unique path name, starting from the root directory, usually denoted by / or \.

And you can refer to them in different ways, right?

Like the full path from the very top, or just relative to where you are now.

Exactly.

You have absolute path names that specify the full path from the root, like /home/user/documents/report.doc.

And you have relative path names that are interpreted starting from your current working directory.

If you're already in /home/user/documents, you could just refer to report.doc.
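A short Python sketch of how a relative name is interpreted against the current working directory, using pathlib's POSIX path arithmetic. The resolve helper is an illustrative name, and no real files are touched.

```python
from pathlib import PurePosixPath

def resolve(cwd, name):
    """Interpret name relative to cwd unless it is already absolute."""
    # Joining with an absolute right-hand side discards the left-hand side.
    return str(PurePosixPath(cwd) / name)

if __name__ == "__main__":
    print(resolve("/home/user/documents", "report.doc"))
    print(resolve("/home/user/documents", "/etc/hosts"))
```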

An interesting policy decision comes up with deletion, though.

What happens when you try to delete a directory that isn't empty?

Oh, yeah.

Some systems force you to empty it first: delete all the files and subdirectories inside.

That's safer, prevents accidental mass deletion.

Others, like Unix with the rm -r command, recursive remove, let you just wipe out the directory and everything inside it all the way down.

Super convenient, but also potentially super dangerous if you make a mistake.

Definitely use that one with caution.
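Both deletion policies show up in Python's standard library, which makes for a compact sketch: os.rmdir refuses a non-empty directory, while shutil.rmtree removes everything recursively. The directory and file names below are hypothetical, and everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile

def delete_demo():
    root = tempfile.mkdtemp()
    project = os.path.join(root, "project_a")
    os.makedirs(os.path.join(project, "docs"))
    with open(os.path.join(project, "docs", "notes.txt"), "w") as f:
        f.write("draft")
    try:
        os.rmdir(project)          # safe policy: fails, directory not empty
        refused = False
    except OSError:
        refused = True
    shutil.rmtree(project)         # recursive policy, in the spirit of rm -r
    gone = not os.path.exists(project)
    os.rmdir(root)                 # now empty, so plain rmdir works
    return refused, gone

if __name__ == "__main__":
    print(delete_demo())
```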

OK, so trees are great.

But what if, say, two people are working on the same project?

They might want to share a specific folder, but have it appear logically inside both of their personal directory trees.

A simple tree doesn't quite allow that, does it?

You've hit on a limitation again.

A pure tree structure, by definition, doesn't allow sharing of subdirectories or files like that.

Each file or directory has exactly one parent.

That's where the concept of an acyclic graph directory structure comes in.

This is like Figure 13.10.

It relaxes the single parent rule and allows directories or files to have multiple parent directories.

This lets you link a file or a whole subdirectory so it appears in multiple places.

So it's like having shortcuts, but built into the file system structure itself.

Kind of, yes.

Often it's implemented using links.

Unix has two main types: symbolic links, or soft links, which are basically just pointers containing the path name of the original file, and hard links, where multiple directory entries point directly to the same underlying file information, the inode in Unix.

The key benefit is sharing.

There's only one physical copy of the shared file or directory.

So if one person modifies the shared file, the changes are instantly visible to everyone else who has a link to it.

That sounds much more flexible, but also maybe more complicated to manage.

Definitely more complex.

A couple of issues arise.

One is aliasing.

The same file can now have multiple different path names.

But the really tricky part is deletion.

What happens if you delete a file that's shared via links?

Yeah, if you delete the original, do the links break?

With symbolic links, yes.

If you delete the target file, the symbolic link is left dangling, pointing to nothing.

Trying to access it will usually result in an error.

Unix typically allows this.

With hard links, it's different.

The system often keeps a reference count associated with the file's data.

Every time a hard link is created, the count goes up.

Every time a link is deleted, the count goes down.

The actual file data is only physically deleted and the space freed when that reference count drops to zero.
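Here is a Python sketch of both behaviors, assuming a Unix-like file system that supports hard and symbolic links. The file names are made up; st_nlink is the link count the file system keeps for the inode.

```python
import os
import tempfile

def link_demo():
    d = tempfile.mkdtemp()
    original = os.path.join(d, "original.txt")
    hard = os.path.join(d, "hard.txt")
    soft = os.path.join(d, "soft.txt")
    with open(original, "w") as f:
        f.write("shared data")
    os.link(original, hard)        # second directory entry for the same inode
    os.symlink(original, soft)     # separate small file holding a path name
    count_before = os.stat(original).st_nlink
    os.remove(original)            # drops one link, the data survives
    count_after = os.stat(hard).st_nlink
    with open(hard) as f:
        still_readable = f.read()
    # The symbolic link still exists but its target path is gone.
    dangling = os.path.islink(soft) and not os.path.exists(soft)
    os.remove(hard)
    os.remove(soft)
    os.rmdir(d)
    return count_before, count_after, still_readable, dangling

if __name__ == "__main__":
    print(link_demo())
```

The link count goes from 2 to 1 when the original name is removed, the data stays readable through the hard link, and the symbolic link is left dangling.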

Okay, that reference counting makes sense for hard links.

Now, if acyclic graphs, graphs without loops, are already complex, what about taking it a step further to a general graph where you can have cycles?

Like a directory could contain a link back to itself or one of its ancestors.

Well, a general graph structure, like Figure 13.11, offers the absolute maximum flexibility.

You could link anything anywhere.

But it becomes much harder to manage.

The biggest problem is cycles.

If you have a cycle, algorithms that traverse the directory tree, like for searching or calculating disk usage, can get stuck in infinite loops, just going round and round.

And that reference count trick we just talked about for hard links, it doesn't reliably work anymore if you have cycles.

A cycle could keep the reference counts of files within the cycle permanently above zero, even if there's no way to actually reach those files from the root directory anymore.

They become inaccessible garbage, but the system thinks they're still linked.

So how would you even clean that up?

You'd need a much more complex process called garbage collection.

This typically involves, one, traversing the entire file system graph starting from the root, marking every file and directory that is reachable.

Two, making a second pass through all storage blocks and reclaiming anything that wasn't marked as reachable.

This is incredibly time-consuming, especially on large disk-based systems.

Because of these complexities, most practical operating systems avoid general graph structures and stick to acyclic graphs, usually implemented with trees plus symbolic links.
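The two-pass idea can be sketched in Python on a toy in-memory graph. This is only a model of the mark-and-sweep approach under the assumption that the directory graph is a dict of links; the node names are hypothetical.

```python
def collect_garbage(graph, root):
    """graph maps each node to the list of nodes it links to."""
    marked = set()
    stack = [root]
    while stack:                   # pass 1: mark everything reachable
        node = stack.pop()
        if node in marked:
            continue               # already visited, so cycles cannot loop forever
        marked.add(node)
        stack.extend(graph.get(node, []))
    # Pass 2: anything that was never marked is unreachable garbage.
    return sorted(set(graph) - marked)

if __name__ == "__main__":
    graph = {
        "/": ["home"],
        "home": ["a.txt"],
        "a.txt": [],
        "x": ["y"],    # x and y link to each other, so each has a
        "y": ["x"],    # nonzero reference count, yet neither is reachable
    }
    print(collect_garbage(graph, "/"))
```

The x and y nodes are exactly the case where reference counting fails: each still has one incoming link, but the sweep finds them because nothing reachable from the root points to either.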

That makes a lot of sense.

Avoid the complexity if you can.

We can organize our files into these structures, but now we have this potentially vast shared structure.

How does the OS make absolutely sure that only the right people can access specific files and only in the right ways?

That brings us squarely to the topic of protection.

Exactly.

Protection is all about controlling access.

Who can do what to which files?

It's important to distinguish this from reliability or data integrity.

Reliability is about preventing physical loss or damage, things like disk failures, which you handle with backups or RAID systems.

Protection is about enforcing policies on access to the data, assuming the data itself is physically okay.

So what kinds of actions, what types of access does the OS typically need to control?

The standard types of access usually boil down to these.

Read.

View the contents.

Write.

Modify or overwrite the contents.

Execute.

Load the file into memory and run it, if it's a program.

Append.

Add data to the end of the file only, without changing the existing stuff.

Delete.

Remove the file.

List.

View the names and maybe attributes of files within a directory.

And sometimes attribute change.

Modify the file's metadata, like its permissions or timestamps.

Often, more complex operations you think about, like copying a file, are actually just sequences of these basic operations performed by a utility program.

Copying needs read access on the source and write access on the destination directory.

Right.

So how does the OS actually decide who gets to perform which of these actions on a given file?

The most common approach, by far, is based on user identity.

When you log in, the system knows who you are.

The most general way to implement this is using access control lists, or ACLs.

With an ACL, each file has an associated list.

This list specifies individual users, or sometimes groups of users, and exactly what types of access they are granted or denied for that specific file.

That sounds very flexible.

You could get really specific.

Extremely flexible.

You can grant access to user A and user B, but deny it to user C, all for the same file.

The downside is that these lists can become very long and potentially difficult to manage, especially if you have many users or complex sharing requirements.

Imagine trying to grant read access to everyone except one person using a pure ACL.

The list would be huge.

Yeah, I can see that.

So are there simpler ways?

Yes.

To deal with the complexity of long ACLs, many systems simplify the model, particularly older Unix systems.

They use a condensed approach with three main classifications.

One, owner.

The user who created the file usually gets full control.

Two, group.

A defined set of users who are collaborating or need shared access, for example, everyone working on Project X.

Three, other, or sometimes called universe or world, basically everyone else on the system who isn't the owner and isn't in the group.

Ah, the classic Unix permissions.

rwx for owner, group, and other.

Exactly.

In traditional Unix, this is represented by those nine permission bits you often see in a directory listing from ls -l.

Three bits for the owner, three for the group, and three for others.

So if you see something like -rw-rw-r--, the leading - means it's a regular file.

The owner can read and write, rw-.

Members of the file's group can also read and write, rw-.

Everyone else, other, can only read, r--.

Or if you saw drwxr-x---, the leading d means it's a directory.

The owner has full read, write, and execute permissions, rwx.

Execute permission on a directory means you can search it.

Group members can read and execute, r-x.

Others have no permissions at all, ---.
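Each group of three bits corresponds to one octal digit, with read as 4, write as 2, and execute as 1. As a small sketch, Python's stat.filemode turns a mode value into the ten-character string a long directory listing shows; the listing_string helper is an illustrative name.

```python
import stat

def listing_string(is_dir, perms):
    """Build the ten-character mode string for a file or directory."""
    kind = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(kind | perms)

if __name__ == "__main__":
    print(listing_string(False, 0o664))   # regular file: owner rw, group rw, other r
    print(listing_string(True, 0o750))    # directory: owner rwx, group r-x, other none
```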

Can you give us a quick concrete example of how these owner group other classifications might work in practice?

Sure.

Let's use the book's example.

Imagine Sarah is writing a book, maybe in a file called book.txt.

She's the owner.

She needs full rwx access to edit it, maybe compile it, execute, etc.

She has three graduate students helping her, Jim, Dawn, and Jill.

They're put into a group, let's call it text.

Sarah sets the permissions so the text group has read and write, rw, access.

They can edit the text, but maybe she doesn't give them execute permission if it's not needed.

Or perhaps write permission excludes deletion, depending on the system specifics.

Then for other users on the system, maybe she wants them to be able to read the draft for comments.

So she gives r access to others.

That makes it clear.

But what if Sarah needs something more specific?

Like she wants a visiting colleague who isn't the owner and isn't in the text group to be able to read only chapter one, not the whole book.

That owner group other model seems too coarse for that.

You're right.

The basic model struggles there.

That's where modern systems often combine the simple owner group other bits with more fine-grained ACLs.

For instance, Solaris might show the standard permissions, but add a plus sign at the end if an ACL is also attached, providing more specific rules.

Windows uses ACLs extensively and provides a graphical interface, like in Figure 13.12, where you can see a list of users and groups, and check boxes to grant or deny specific permissions, read, write, modify, full control, et cetera, for each one on that file or folder.

You could explicitly add the visitor's username and grant only read access or even explicitly deny access to someone.

And if the simple permissions and the ACL say different things, like group has write access, but the ACL specifically denies write access to one person in that group?

Good question.

Usually the more specific rule wins.

So the explicit deny in the ACL would typically override the general group permission for that specific user.

Denials often take precedence over grants as well, as a safety measure.
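That precedence rule can be sketched as a toy check in Python. This is an illustrative model only; real systems differ in how they order and evaluate ACL entries, and the function, users, and group here are assumptions based on the Sarah example.

```python
def is_allowed(user, user_groups, action, acl, group_perms):
    """Toy access check: explicit deny, then explicit allow, then group bits."""
    # An explicit deny for this user wins over everything else.
    if (user, action, "deny") in acl:
        return False
    # A user-specific grant is more specific than the group setting.
    if (user, action, "allow") in acl:
        return True
    # Otherwise fall back to the coarse group permissions.
    return any(action in group_perms.get(g, set()) for g in user_groups)

if __name__ == "__main__":
    acl = {("jim", "write", "deny"), ("visitor", "read", "allow")}
    group_perms = {"text": {"read", "write"}}
    print(is_allowed("jim", ["text"], "write", acl, group_perms))
    print(is_allowed("dawn", ["text"], "write", acl, group_perms))
    print(is_allowed("visitor", [], "read", acl, group_perms))
```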

And just briefly, there are other protection approaches too.

You could theoretically put a password on each file.

Like encrypting a zip file.

Sort of, but often implemented differently.

The trouble is, remembering tons of different passwords is hard, and people reuse them, making it insecure.

Or if you use one password for everything, once it's compromised, everything is open.

Strong file-level encryption itself is a much better modern approach for confidentiality.

And we also need to remember directory protections.

Not just about accessing the files inside, but controlling who can create or delete files within a directory, or even just list its contents.

That execute, x, permission bit on directories in Unix is crucial for this.

Okay, this has been super informative on organizing and protecting files.

But earlier you mentioned that file operations involve system calls, disk access.

Sounds like it could be slow, especially if a program is reading or writing a lot.

It definitely can be a bottleneck.

Is there a faster way?

Maybe something that uses memory more directly?

Absolutely.

And this is a really important technique used everywhere in modern systems.

Memory-mapped files.

It's a very common and highly efficient alternative to the traditional read and write system calls.

Instead of those calls, we use virtual memory techniques.

How does that work?

Basically, the OS maps a portion of a disk file, or maybe the whole file, directly into the virtual address space of a process.

When the process first tries to access a memory address within that mapped region, it causes a page fault, just like accessing any other part of virtual memory that isn't physically present.

The OS handles the page fault by reading the corresponding data from the file on disk into a physical memory page, and then maps the virtual address to that physical page.

Okay.

But here's the magic.

Once that initial page fault is handled, subsequent reads and writes to that memory region by the process are just normal memory accesses.

They happen at memory speed without any explicit system calls for each read or write.

Whoa.

So I'm just reading and writing to memory addresses like any other variable in my program, and the OS takes care of getting the data from disk initially and eventually writing my changes back.

Precisely.

The reads are satisfied directly from memory after the first fault.

Writes modify the memory page.

The OS handles writing those modified memory pages back to the actual disk file later on.

Usually this happens periodically, or when the file is unmapped or closed, or if the system is under memory pressure and needs to reuse the page frame.
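A minimal Python sketch of the idea using the standard mmap module, assuming a small temporary file. After the mapping is set up, reading and writing are just slice operations on memory; the function name and sample content are made up for illustration.

```python
import mmap
import os
import tempfile

def mmap_demo():
    fd, path = tempfile.mkstemp()
    os.write(fd, b"hello world")
    with mmap.mmap(fd, 0) as view:     # length 0 maps the whole file
        first = bytes(view[:5])        # memory read, no read() call
        view[0:5] = b"HELLO"           # memory write, no write() call
        view.flush()                   # ask the OS to write dirty pages back now
    os.close(fd)
    with open(path, "rb") as f:
        on_disk = f.read()             # the change reached the file itself
    os.remove(path)
    return first, on_disk

if __name__ == "__main__":
    print(mmap_demo())
```

The explicit flush is there only to make the write-back visible at a known point; without it, the OS would still write the modified page back on its own schedule or when the mapping goes away.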

Systems like Solaris actually take this concept even further.

They often use memory mapping internally, even when you use the traditional read and write calls.

The data is mapped into the kernel's address space, and the read and write calls just copy data between the kernel's mapping and the user's buffer.

It highlights just how fundamental and efficient this technique is.

That sounds way faster.

Does it help with sharing data, too?

Immensely.

This is one of the key benefits.

Multiple processes can map the same file into their own virtual address spaces concurrently.

The OS is smart enough to map their different virtual addresses to the same physical memory pages containing the file data.

Figure 13.13 shows this visually.

This means they are literally sharing the same physical memory.

If one process writes to its mapped region, the change is instantly visible to the other processes when they read from their mapped region, because they're looking at the same RAM.

Wow.

Okay.

It's a very efficient way to achieve shared memory between processes.

You can even combine it with techniques like copy-on-write if you want processes to initially share the data but get their own private copies if they try to modify it.

So, memory-mapped files aren't just for speeding up single-process I/O.

They're actually a powerful technique for interprocess communication, for shared memory.

Can you walk us through how that might look in a real API?

Absolutely.

It's a very common IPC mechanism because it avoids extra copying.

Let's look at the Windows API example conceptually.

Figures 13.15 and 13.16 describe this for a producer-consumer scenario.

Imagine you have a producer process that wants to send data to a consumer process.

One, the producer first calls CreateFile to get a handle to a file.

This might be a real file on disk, or it can instead use a special value, INVALID_HANDLE_VALUE, in place of a file handle to indicate it wants memory backed just by the system paging file, not a permanent disk file.

Two, then it calls CreateFileMapping.

This is the key step.

It creates a named file mapping object.

Let's say it names it SharedObject.

This name is crucial for sharing.

Three, finally, the producer calls MapViewOfFile.

This maps a view of that named mapping object into the producer's own virtual address space.

It gets back a memory pointer.

Now the producer can just write its message directly to that memory pointer as if it were writing to any local variable.

Okay, so the producer has set up this named shared space and written to it.

How does the consumer get the data?

It's quite elegant.

One, the consumer process calls OpenFileMapping using the exact same name, SharedObject, that the producer used.

This gives it access to the same underlying mapping object created by the producer.

Two, then the consumer also calls MapViewOfFile on the handle it got back.

This maps a view of that same shared object into the consumer's virtual address space.

It gets its own memory pointer.

Three, now the consumer can simply read the message directly from its memory pointer.

It's reading the data the producer wrote because both pointers ultimately refer to the same physical memory managed by the OS via the named mapping object.

After they're done, both processes would call UnmapViewOfFile to remove the mapping from their address space and CloseHandle to release the mapping object and file handle.
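The same create-by-name, attach-by-name, read, then clean-up flow can be sketched portably with Python's multiprocessing.shared_memory module (Python 3.8 or later). This is an analogy to the Windows walkthrough, not the Win32 calls themselves. The function names, the 4096-byte size, and the NUL-terminated message format are assumptions made for illustration, and for brevity the producer and consumer run in one process here, where normally they would be separate programs.

```python
import os
from multiprocessing import shared_memory

SIZE = 4096  # bytes requested; the OS may round this up to a page multiple


def producer(name, message):
    """Create a named shared region and write a NUL-terminated message into it."""
    shm = shared_memory.SharedMemory(name=name, create=True, size=SIZE)
    data = message.encode("utf-8") + b"\x00"
    shm.buf[:len(data)] = data   # plain memory write, no file I/O call
    return shm                   # caller keeps it open so the region stays alive


def consumer(name):
    """Attach to the existing region by name and read the message back."""
    shm = shared_memory.SharedMemory(name=name)  # attach only, do not create
    try:
        raw = bytes(shm.buf[:SIZE])
        return raw.split(b"\x00", 1)[0].decode("utf-8")
    finally:
        shm.close()              # drop this side's mapping


def demo():
    # The process id keeps this hypothetical name from colliding with old runs.
    name = f"SharedObject_{os.getpid()}"
    region = producer(name, "Shared memory message")
    try:
        return consumer(name)
    finally:
        region.close()           # remove the producer's mapping
        region.unlink()          # release the named region itself
```

The name plays the same role as in the walkthrough: it is the only thing the two sides need to agree on to end up looking at the same memory.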

That's a really clear walkthrough.

It perfectly shows how this seemingly abstract idea of memory mapping becomes a concrete, efficient tool for processes to communicate and share data.

It really blurs that line between memory and disk files.

Just like that, we've taken a pretty deep dive into the world of file systems.

Wow, from the basic idea of what a file is to how operating systems organize them using these clever directory structures.

Mm-hmm, the trees and graphs.

Right, and how they protect them with access controls, owner, group, and other permissions, ACLs, and even this really cool technique of mapping files directly into memory for super fast access.

We've covered a lot of ground here on a really core OS concept.

Yeah, we hope this exploration has demystified the file system interface a bit, giving you a clear understanding of how crucial it is to literally everything you do on your computer.

It really does underpin our entire digital lives, saving documents, running apps, streaming video.

It's all relying on the file system, often in ways you don't even consciously think about.

You've definitely taken a major shortcut now to being well-informed on this foundational operating system topic.

So think about it.

What stands out to you the most about how files are managed and protected?

How might understanding this change the way you think about saving your work, or maybe sharing files with colleagues?

Perhaps you'll think twice before typing rm -r in a directory someone shared with you.

Definitely keep exploring these ideas.

A warm thank you from the Last Minute Lecture team.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

The file-system interface serves as the critical abstraction layer between users and applications on one side and physical storage devices on the other, defining how data is organized, accessed, and protected in modern operating systems. A file, fundamentally, is a named collection of related information stored on secondary storage, characterized by attributes such as its name, type, physical location, size, protection settings, and various timestamps that track creation, modification, and access events. Operating systems support a range of file operations—creation, reading, writing, repositioning the read/write pointer, deletion, and truncation—each with specific implications for resource management and data integrity. Access methods determine how applications retrieve data from files, with sequential access requiring data to be read in order and direct access allowing random positioning within the file. File types communicate intended usage through naming conventions and extensions, influencing how the operating system handles execution, interpretation, and presentation. Protection and sharing mechanisms, including access control lists and file locking protocols, enforce security policies by regulating which users or processes can perform specific operations on particular files. Directory structures organize files hierarchically and logically, ranging from simple single-level directories suitable only for small systems to sophisticated tree-structured and acyclic graph directories that enable namespace organization and symbolic linking. More complex general graph structures, while powerful, introduce the risk of cycles and navigational ambiguity. File-system mounting integrates multiple storage devices or network resources into a unified namespace, while remote file systems and protocols like the Network File System enable transparent access to data across network boundaries in distributed environments. 
Real-world implementations in Windows and UNIX-like systems demonstrate how these concepts translate into concrete permission models, naming conventions, and directory hierarchies that users and administrators interact with daily. Understanding the file-system interface equips students with knowledge of how operating systems bridge the gap between logical data organization and physical storage, how access control prevents unauthorized use, and how distributed file systems extend storage capabilities across networks.
