Chapter 3: Framework for System Design Interviews

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to the Deep Dive.

So if you're preparing for a technical role, you definitely know the feeling.

You're scanning your interview schedule and then your eyes land on it.

System design interview.

And that little spike of anxiety just hits you.

Oh yeah.

It's often the most intimidating session for candidates and for good reason, right?

The questions are intentionally vague, like "design product X," and they just seem unreasonably broad.

Totally.

How are you supposed to design something like, I don't know, Google search, which took thousands of engineers years to build, in just one hour?

It feels impossible.

Exactly.

But here's the thing.

The critical insight from our sources is that no one actually expects you to do that.

The real world systems are just way too complex.

Okay, so it's not a trick question.

Not at all.

What the system design interview actually is, is a simulation.

It's simulating two coworkers collaborating on a really ambiguous problem, you know, under a tight deadline.

Your final design is actually less important than the process you took to get there.

So this is your chance to show how you work, how you think.

Precisely.

It's your opportunity to demonstrate how you scale a system, how you defend your choices, and how you respond to feedback constructively.

So our mission for this deep dive is to break that all down.

We're going to give you a simple, really effective four-step framework.

The idea is to structure your approach, manage your time, and just turn all that ambiguity into a confident, logical discussion.

And if we look at it from the interviewer's side for a second, they're tracking signals that go way beyond just your technical knowledge.

Right.

They want to see your collaboration style, how you handle pressure, and this is a big one, your ability to ask good, clarifying questions.

And I'm sure they're also looking for red flags.

Yeah.

What are the major pitfalls we should try to avoid?

Over-engineering is a huge one.

That's where you prioritize, like, some kind of design purity over the practical reality of things, ignoring trade-offs like cost or time.

Makes sense.

You also want to avoid being narrow-minded, you know, refusing to consider other ideas.

And especially avoid working in silence; you have to talk through your thought process.

Okay, let's jump right into that structure.

Step one, understand the problem and establish design scope.

Right.

This is that crucial first phase, and our sources call it, don't be like Jimmy.

Ah, yes, Jimmy.

So Jimmy is that person who hears the question and immediately jumps to a solution because they think they know the answer.

We all know Jimmy.

We do.

And in this interview, that is a massive red flag.

Speed gives you zero bonus points here.

If you answer too quickly, you risk designing this, I don't know, beautiful, elegant system that solves the completely wrong problem.

So that initial pause, that moment you take to just think, that's vital.

It is.

It's your moment to show that the most important skill you have isn't just coding, it's asking the right questions, clarifying all the constraints, and really importantly, writing down your assumptions.

So if you assume, say, that posts are text only because they didn't mention photos, you need to say that out loud.

You have to say it and you have to write it down.

So what are those must-ask clarifying questions that everyone should have ready?

You need to define three things right up front.

First, what are the specific features?

Read-only or read-write?

Second is scale.

You need hard numbers like daily active users or DAU.

And third is growth.

How fast are we scaling up?

Is it 10 million users in three months or a year?

Because that speed dramatically changes your architecture.

And I guess the final really practical question is about the environment itself.

What existing tech can we leverage?

Exactly.

Why would you design a whole new caching system if the company already has one that's robust and highly available?

Asking that shows you think practically.

Let's make this concrete.

We could apply it to that classic scenario.

Design a newsfeed system.

Perfect.

So we'd start by clarifying the platform.

Is it mobile, web, or both?

Let's assume both.

Then key features.

The ability to make a post and the ability to see a feed of your friends' posts.

Simple enough.

Then you have to pin down the sorting.

Is it just reverse chronological, newest first?

Or is it something more complex like an algorithm?

For an interview, we'll assume reverse chronological to keep it simple, but you have to mention that trade-off.

And then we lock down the constraints with actual numbers.

Concrete numbers.

So friend count, say, max 5,000; traffic volume, maybe 10 million DAU; and content type: text, images, and videos.

By the end of this first step, all of that needs to be agreed on and written on the whiteboard.

That clear agreement is what lets us move on to step two.

Propose high-level design and get buy-in.

The goal here is basically to sketch out an initial blueprint and get your interviewer who's acting as your teammate to agree with it.

This is where you start drawing the boxes on the whiteboard.

You identify the key components.

You've got clients, load balancers, web servers, some data stores or DBs, a cache, maybe a CDN, a message queue, and you have to briefly explain what each one does.

Okay, let's stop right there and just clarify one of those for a second.

You said content delivery network or CDN.

If we're dealing with photos and videos, why is that a necessity and not just, you know, a nice-to-have?

A CDN is absolutely crucial for performance, especially for a global user base.

It's just a network of servers spread out geographically.

So when a user in, say, Japan requests a photo, the CDN serves it from a server nearby in Asia, not from your main data center in Virginia.

So it drastically cuts down the travel time.

Drastically.

It speeds up load times and it also takes a huge load off your main servers.

It's a win-win.

Okay, so once that blueprint is sketched out, you immediately move to what everyone loves, the back-of-the-envelope calculations.

Why is it so important to do the math right here?

Because without validating the numbers, your beautiful blueprint might just be fundamentally broken at scale.

Before you've even started, really?

Exactly.

You use the numbers from step one, like how many posts per day and how much storage you'll need, to check if your database or cache can actually handle the load.

These calculations force you to find bottlenecks early before you waste time on a component that won't scale.
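
To make that concrete, here is a rough sketch of the arithmetic. Only the 10 million DAU figure comes from the scope agreed in step one; posts per user per day, the peak multiplier, the share of posts with media, and the average sizes are placeholder assumptions you would confirm with the interviewer.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_load(dau, posts_per_user_per_day, peak_factor,
                  media_fraction, text_bytes, media_bytes):
    """Return rough daily posts, average/peak write QPS, and daily storage in bytes."""
    posts_per_day = dau * posts_per_user_per_day
    avg_write_qps = posts_per_day / SECONDS_PER_DAY
    peak_write_qps = avg_write_qps * peak_factor
    media_posts = posts_per_day * media_fraction
    storage_per_day = posts_per_day * text_bytes + media_posts * media_bytes
    return {
        "posts_per_day": posts_per_day,
        "avg_write_qps": avg_write_qps,
        "peak_write_qps": peak_write_qps,
        "storage_bytes_per_day": storage_per_day,
    }


estimate = estimate_load(
    dau=10_000_000,            # from the agreed scope in step one
    posts_per_user_per_day=2,  # assumption
    peak_factor=2,             # assumption: peak traffic is twice the average
    media_fraction=0.1,        # assumption: 10% of posts carry media
    text_bytes=1_000,          # assumption: about 1 KB of text and metadata per post
    media_bytes=1_000_000,     # assumption: about 1 MB per media attachment
)

print(f"average write QPS: {estimate['avg_write_qps']:.0f}")
print(f"peak write QPS:    {estimate['peak_write_qps']:.0f}")
print(f"storage per day:   {estimate['storage_bytes_per_day'] / 1e12:.2f} TB")
```

With these placeholder inputs the sketch lands at roughly 231 writes per second on average and about 2 TB of new storage per day. The exact values matter less than the order of magnitude, which tells you whether a single database or cache node is even plausible.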

And the level of detail is a judgment call, I imagine.

For sure.

For a huge problem like design Google search, you're not going to detail API endpoints here.

But for something smaller, you might.

You just have to communicate with the interviewer.

Okay, so let's apply this to the newsfeed.

The high-level architecture is probably best understood if we split it into two different flows, right?

Yeah, that's the best way.

The first flow is feed publishing.

So when I make a post, what's happening on the back end?

What's the heavy lifting?

Well, the data gets written into your main database and also into a cache.

But the really critical piece here is something called the fan out service.

See, the challenge with the newsfeed is delivery.

If you have millions of users posting every day, you can't have their friends constantly checking for updates.

The system needs to be proactive.

So this fan out service is what makes it scalable.

How does that work?

Think of it like a super fast personalized mail carrier.

When you post, the fan out service gets that post, it looks up all of your friends' IDs, and then it immediately pushes that post ID into the specific personal newsfeed cache for each of those friends.

So it's called fan out on write.

Fan out on write.

It means posting is a little slower because you're doing potentially thousands of writes at once, but it makes reading the feed incredibly fast for everyone else.

And the other side of that coin is newsfeed building, which is when I open the app to scroll.

Right.

So your phone sends a request, it goes through the load balancer to the web servers, then to the newsfeed service.

And because we did all that work up front with fan out, the service just has to grab the already prepared list of post IDs from your personal cache.

That whole flow is optimized for super fast reads.
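
The two flows just described can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the dictionaries stand in for the database and the per-user newsfeed cache, and names like `publish_post`, `read_feed`, and `FEED_LIMIT` are made up for the example.

```python
from collections import defaultdict, deque

FEED_LIMIT = 100  # assumed cap on cached post IDs per user

friends = defaultdict(set)  # user_id -> set of friend IDs
posts = {}                  # post_id -> post record (stand-in for the database)
feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))  # user_id -> newest-first post IDs


def add_friendship(a, b):
    friends[a].add(b)
    friends[b].add(a)


def publish_post(author_id, post_id, text):
    """Feed publishing: store the post once, then push its ID to every friend's cache."""
    posts[post_id] = {"author": author_id, "text": text}
    for friend_id in friends[author_id]:
        feed_cache[friend_id].appendleft(post_id)  # one cache write per friend


def read_feed(user_id, limit=20):
    """Newsfeed building: the list of post IDs was already prepared at write time."""
    return list(feed_cache[user_id])[:limit]


add_friendship("alice", "bob")
add_friendship("alice", "carol")
publish_post("alice", "p1", "hello")
publish_post("bob", "p2", "hi alice")

print(read_feed("bob"))    # ['p1']
print(read_feed("alice"))  # ['p2']
```

Notice where the cost sits: `publish_post` loops over every friend, while `read_feed` is a single cache lookup. That is the trade the fan out on write approach makes.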

Got it.

So with that blueprint agreed on, we move to step three, design deep dive.

This is where we focus on the trickiest parts.

By now, you've got agreed upon goals, a blueprint, and some initial feedback.

Time management is everything here.

You have to avoid getting lost in the weeds on some minor detail that doesn't really show your core skills.

Like spending 20 minutes on some obscure image compression algorithm probably isn't the best use of your time.

Probably not.

You want to focus on the universally challenging parts.

If you're designing a URL shortener, the deep dive is the hash function.

For a chat system, it's all about reducing latency and managing online status.

For a more senior candidate, it's often about identifying and mitigating performance bottlenecks.

So let's detail that feed publishing flow again, but this time focusing on where you do a deep dive.

So the request first has to go through rate limiting and authentication.

Rate limiting is just a basic protective layer.

It stops one person from spamming the system and taking it down.
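
The discussion doesn't name a specific rate limiting algorithm, so the sketch below picks one common option, a token bucket, purely as an assumption. The class name, the capacity, and the refill rate are all illustrative, and the injectable clock is only there so the behavior can be demonstrated without waiting.

```python
import time


class TokenBucket:
    """Per-user token bucket: allows a short burst, then a steady refill rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.buckets = {}  # user_id -> (tokens, last_refill_time)

    def allow(self, user_id):
        now = self.clock()
        tokens, last = self.buckets.get(user_id, (self.capacity, now))
        # Refill based on elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_second)
        if tokens >= 1:
            self.buckets[user_id] = (tokens - 1, now)
            return True
        self.buckets[user_id] = (tokens, now)
        return False


# Demo with a fake clock so the example is deterministic.
fake_now = [0.0]
limiter = TokenBucket(capacity=3, refill_per_second=1, clock=lambda: fake_now[0])

burst = [limiter.allow("spammer") for _ in range(5)]
print(burst)  # [True, True, True, False, False]

fake_now[0] = 2.0  # two seconds later, two tokens have been refilled
after_wait = limiter.allow("spammer")
print(after_wait)  # True
```

Each user gets an independent bucket, so one person hammering the post endpoint gets rejected without affecting anyone else.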

Must have.

Absolutely.

Then the post service creates the post.

But the real scaling headache, the part that needs a deep dive, is that fan out service when it deals with celebrities, the heavy hitters with millions of followers.

Ah, right.

If a celebrity posts and we try to do a fan out on write to 5 million followers at once, that's not going to end well.

It will crash the system.

So we implement a trade off.

For most users, we stick with fan out on write because it's fast for the reader.

But for those heavy hitters, we switch to a different strategy, fan out on read.

What does that mean?

We only write their post once to a special cache.

Then when their millions of followers check their feeds, our newsfeed service actually merges two different feeds on the fly.

The feed from their personal cache and the feed from the celebrity cache.

It's a hybrid model.
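
Here is one way the hybrid model could look, as a small in-memory sketch. The follower threshold is an invented number (kept tiny so the demo is easy to follow), the function names are hypothetical, and the code assumes posts arrive in increasing timestamp order so each list stays newest first.

```python
import heapq
from collections import defaultdict
from itertools import islice

CELEBRITY_THRESHOLD = 3  # assumed follower cutoff; a real system would use a far larger number

followers = defaultdict(set)         # author_id -> follower IDs
following = defaultdict(set)         # user_id -> authors they follow
personal_feed = defaultdict(list)    # user_id -> [(timestamp, post_id)], newest first
celebrity_posts = defaultdict(list)  # celebrity_id -> [(timestamp, post_id)], newest first


def follow(user_id, author_id):
    followers[author_id].add(user_id)
    following[user_id].add(author_id)


def is_celebrity(author_id):
    return len(followers[author_id]) >= CELEBRITY_THRESHOLD


def publish(author_id, post_id, timestamp):
    entry = (timestamp, post_id)
    if is_celebrity(author_id):
        # Fan out on read: a single write, no matter how many followers.
        celebrity_posts[author_id].insert(0, entry)
    else:
        # Fan out on write: one write per follower.
        for user_id in followers[author_id]:
            personal_feed[user_id].insert(0, entry)


def read_feed(user_id, limit=20):
    """Merge the precomputed feed with followed celebrities' posts, newest first."""
    sources = [personal_feed[user_id]]
    sources += [celebrity_posts[a] for a in following[user_id] if is_celebrity(a)]
    merged = heapq.merge(*sources, reverse=True)  # inputs are already newest first
    return [post_id for _, post_id in islice(merged, limit)]


for fan in ("u1", "u2", "u3"):
    follow(fan, "star")  # "star" now meets the celebrity threshold
follow("u1", "bob")

publish("bob", "b1", timestamp=1)
publish("star", "s1", timestamp=2)
publish("bob", "b2", timestamp=3)

print(read_feed("u1"))  # ['b2', 's1', 'b1']
print(read_feed("u2"))  # ['s1']
```

The celebrity's post costs one write instead of one per follower, and the price is paid in `read_feed`, which now has to merge several sorted lists on every request.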

That's a clever solution.

It shows you're thinking about real world constraints.

That's the goal.

Okay.

So what about the deep dive on the other side?

The newsfeed retrieval flow, where's the bottleneck there?

So even after we get the post IDs from the cache, the newsfeed service still has to go and fetch all the supporting data: usernames, profile pictures, the actual post text or photo from various databases and caches.

If that takes a bunch of separate lookups, your latency just explodes.

So the deep dive is about efficiency.

It's all about data aggregation efficiency, making sure you can grab all that data in as few network calls as possible, probably with some kind of batch retrieval.
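
A small sketch of what batch retrieval might look like. `CountingStore` is a made-up in-memory stand-in for a remote cache or database; it counts round trips so the effect of batching is visible. The field names are assumptions for the example.

```python
class CountingStore:
    """In-memory stand-in for a remote cache or database that counts round trips."""

    def __init__(self, data):
        self.data = data
        self.calls = 0

    def get_many(self, keys):
        self.calls += 1  # one round trip regardless of how many keys are requested
        return {k: self.data[k] for k in keys if k in self.data}


def hydrate_feed(post_ids, post_store, user_store):
    """Turn a list of post IDs into full feed items using two batched lookups."""
    posts = post_store.get_many(post_ids)
    author_ids = {p["author_id"] for p in posts.values()}
    users = user_store.get_many(author_ids)

    feed = []
    for post_id in post_ids:  # preserve the order the feed cache gave us
        post = posts.get(post_id)
        if post is None:
            continue  # post was deleted or is missing from the store
        author = users.get(post["author_id"], {})
        feed.append({
            "post_id": post_id,
            "text": post["text"],
            "author_name": author.get("name"),
        })
    return feed


post_store = CountingStore({
    "p1": {"author_id": "u1", "text": "first"},
    "p2": {"author_id": "u2", "text": "second"},
    "p3": {"author_id": "u1", "text": "third"},
})
user_store = CountingStore({"u1": {"name": "Alice"}, "u2": {"name": "Bob"}})

feed = hydrate_feed(["p3", "p2", "p1"], post_store, user_store)
print([item["author_name"] for item in feed])  # ['Alice', 'Bob', 'Alice']
print(post_store.calls, user_store.calls)      # 1 1
```

A naive version would make one call per post plus one per author. Here the whole page costs two round trips however long the feed is.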

Okay.

We've used up most of our time now, which brings us to the final part.

Step four, wrap up.

This is just the last three to five minutes, but it sounds like it's incredibly important for making a good final impression.

It really is.

And the first rule is you never ever say, my design is perfect.

I can imagine that doesn't go over well.

No, you want to identify the main bottlenecks you didn't fully solve, maybe the complexity of that fan out on read merge process.

And then you propose potential improvements.

It shows critical self-assessment.

What else is essential to cover in those last few minutes?

You should recap your design choices, especially if you discussed a few different options.

You need to address error cases.

What happens if a server fails?

That's where things like message queues come in to provide reliability for those fan out tasks.
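
To illustrate the reliability point, the sketch below uses Python's in-process `queue.Queue` as a stand-in for a real message broker. The retry budget, the task shape, and the dead-letter list are assumptions made for the example rather than anything specified in the discussion.

```python
import queue

MAX_ATTEMPTS = 3  # assumed retry budget per fan out task


def process_fanout_tasks(tasks, deliver):
    """Drain the queue; put failed tasks back until they run out of attempts."""
    dead_letters = []
    while not tasks.empty():
        task = tasks.get()
        try:
            deliver(task["post_id"], task["user_id"])
        except ConnectionError:
            task["attempts"] += 1
            if task["attempts"] < MAX_ATTEMPTS:
                tasks.put(task)            # retry later instead of losing the write
            else:
                dead_letters.append(task)  # surface for monitoring or manual replay
    return dead_letters


# Demo: bob's cache node recovers after one failure, dave's never does.
delivered = []
failures_left = {"bob": 1, "dave": 5}


def flaky_deliver(post_id, user_id):
    if failures_left.get(user_id, 0) > 0:
        failures_left[user_id] -= 1
        raise ConnectionError(f"cache node for {user_id} unavailable")
    delivered.append((post_id, user_id))


tasks = queue.Queue()
for user_id in ["alice", "bob", "dave"]:
    tasks.put({"post_id": "p1", "user_id": user_id, "attempts": 0})

dead = process_fanout_tasks(tasks, flaky_deliver)
print(delivered)                      # [('p1', 'alice'), ('p1', 'bob')]
print([t["user_id"] for t in dead])   # ['dave']
```

Because the task stays in the queue until it succeeds or exhausts its attempts, a temporary server failure delays a feed update rather than dropping it, and the dead-letter list gives monitoring something to alert on.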

And then you discuss operational stuff like monitoring and system rollout.

And finally, looking ahead.

Always show your thinking about the future.

What breaks first when we go from 10 million to a hundred million users?

What architectural changes like sharding the database would we need?
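
As a hedged illustration of the sharding idea, here is a simple hash-mod-N router. The discussion doesn't say how shards would be chosen, so the scheme, the `shard_for` name, and the shard counts are all assumptions.

```python
import hashlib


def shard_for(user_id, num_shards):
    """Map a user ID to a shard using a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used here to keep the mapping stable across runs.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


users = [f"user-{i}" for i in range(1000)]

# Every user lands on exactly one of the four shards.
counts = [0, 0, 0, 0]
for u in users:
    counts[shard_for(u, 4)] += 1
print(counts)

# Growing from 4 to 5 shards remaps most users under plain modulo hashing.
moved = sum(1 for u in users if shard_for(u, 4) != shard_for(u, 5))
print(f"{moved} of {len(users)} users change shards when going from 4 to 5")
```

The second print is the useful talking point for a wrap-up: plain modulo hashing is easy to explain, but adding a shard moves most of the data, which is the kind of limitation worth naming when you discuss the jump to a hundred million users.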

Answering that is what really leaves a strong impression.

Let's quickly pull all those takeaways together.

What are the essential dos and don'ts?

Do.

Always ask for clarification.

Talk through your thinking.

Don't just sit there in silence.

Suggest multiple approaches and treat the interviewer like a collaborator.

And focus your time on the most critical components.

And the big don'ts.

Don't jump into a solution before you have the requirements.

Don't get lost in the details on one component too early.

Start high level, then drill down.

And never, ever assume the interview is over until they say it is.

And for a 45 minute session, that rough time allocation is key.

Step one is about three to 10 minutes.

Step two, the high-level design, is 10 to 15.

Step three, the deep dive, gets the biggest chunk, 10 to 25 minutes.

And then step four, the wrap up, is a crisp three to five minutes.

The whole value of this framework is that it helps you prove your critical thinking and your collaboration skills.

If you follow the structure, a good design will naturally come out of it.

The design process itself really is the deliverable here.

It's all about structure, communication, and clarification.

So building on that newsfeed architecture we just discussed, here's a thought for you to consider.

If you had to remove one non-essential component from that detailed design, maybe the CDN or a specific cache layer, to save a lot of money, but you could only accept a moderate increase in latency, which one would you choose and why would that trade-off be worth it for a young startup?

That is exactly the kind of practical, cost-aware thinking that proves your value in the real world.

Thank you for diving deep with us today.

We'll catch you next time.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers
System design interviews evaluate candidates through open-ended, ambiguous problems that mirror real-world engineering collaboration rather than testing the ability to build a complete product within strict time limits. The evaluation targets critical professional competencies: navigating complex design decisions, managing uncertainty, communicating technical reasoning, justifying architectural choices, and performing effectively under constraints. Interviewers assess candidates for positive signals such as thoughtful questioning and collaborative problem-solving while watching for concerning patterns like inflexible thinking, defensive behavior, or unnecessary complexity. A structured four-step methodology provides the foundation for navigating these interviews successfully. The first step requires slowing down to truly understand the problem and define scope rather than rushing into solutions prematurely. This phase demands asking precise clarifying questions about concrete metrics like daily active users, feature requirements, existing infrastructure, and performance targets. The second step involves presenting a high-level architectural blueprint with key system components such as clients, load balancers, application servers, caching layers, and storage systems. Proposing rough capacity calculations demonstrates that the proposed design handles the stated constraints and can support anticipated growth. The third step focuses on detailed examination of critical components once the overall structure receives agreement. Rather than exploring every aspect equally, this phase should concentrate on areas most relevant to the system's bottlenecks—such as database design for a URL shortener or message delivery optimization for a chat application. Careful time management prevents the discussion from becoming bogged down in marginal details with minimal impact on scalability. 
The final step involves recapping the complete solution, acknowledging architectural limitations, proposing further enhancements as systems evolve, and addressing production concerns like monitoring, failure scenarios, and preparation for subsequent scaling challenges. Time allocation matters significantly; dedicating ten to twenty-five minutes to the deep dive phase while reserving only three to five minutes for closing remarks allows sufficient depth without rushing critical analysis.
