Chapter 4: Cognitive Aspects of Interaction

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Imagine this.

You are absolutely slammed, trying to finish a massive report, deadline is crushing you, you need to focus.

Right.

But your phone is just constantly going, vibrating, buzzing, lighting up.

Your friends are planning dinner, your boss sends a late-night email, and social media is just serving you endless shiny distractions.

So you respond, you check, you scroll, and 30 minutes later, you snap back to your report.

Where was I?

You've totally lost your train of thought.

And now you have to backtrack.

Okay, let's unpack this.

That scenario, that feeling of digital fragmentation, it isn't a moral failing.

It's a direct result of designing technology that constantly works against our biological limitations.

That is precisely right.

Our goal in interaction design is to solve this conflict.

It is.

And our mission for this deep dive is to really dissect the crucial role of what we call cognitive aspects in interaction design.

This is all about understanding at a really fundamental level how we as humans perceive, process, and remember information.

So the rules of the human mind.

Exactly.

And when we know those rules, we can design technology that either brilliantly extends our capabilities or, you know, more often compensates for our inherent weaknesses.

And how are we going to structure this?

We'll do it in four parts.

First we'll define cognition.

Then we'll get into the costs of distraction and multitasking.

After that, how technology can be a memory aid.

And finally, we'll review the core cognitive frameworks that underpin all of HCI.

So we're essentially turning the spotlight back on the user's brain, making sure every interface decision is, well, neuro-friendly.

Let's start with the basics, then.

Cognition.

What is it, exactly?

It's this massive umbrella term for, you know, all mental activities.

Everything from abstract thinking and problem solving to basic perception, learning and making decisions.

That's a lot.

It is.

So to make it actionable for design, we tend to look at two established modes of operation.

And these are experiential and reflective cognition.

So experiential is the intuitive mode.

It's the effortless, often automatic stuff.

Uh-huh.

Like an experienced driver navigating traffic.

They're not thinking about every single step.

Or when you're just lost in a great novel, just reacting.

Exactly.

And then you have reflective cognition.

The hard stuff.

This is the demanding process.

It requires real mental effort, focus, judgment, careful attention.

This is where you're writing that challenging report or designing a new system.

Or trying to solve a complex math puzzle.

This is where innovation happens.

And this sounds a lot like Daniel Kahneman's work on fast and slow thinking.

It's a direct parallel.

Experiential is fast thinking.

It's instinctive, effortless, like knowing two plus two is four.

Instantly.

And reflective is slow thinking.

It's logical.

It takes energy.

And when you have to solve something like, say, 21 times 19, you reach for a pen.

You almost always have to externalize it.

Yeah.

You write it down or grab a calculator.

So while those two modes define the how of thinking, modern cognitive science also describes it based on context.

Like is it distributed across people and tools?

Or situated in a specific environment.

Right.

Or extended by our devices.

Or even embodied through our physical interactions.

But regardless of the mode or the context.

Designers really need to optimize for six specific crucial cognitive processes.

Which are?

Attention; perception; memory; learning; reading, speaking, and listening.

And then that final big group: problem solving, planning, reasoning, and decision making.

Okay.

Let's start with the first one.

Attention.

Attention is our first gatekeeper.

It's the process of selecting what you're going to concentrate on.

And how easy that is seems to depend on a couple of things.

Two main factors, yeah.

First, how clear are your goals?

And second, how is the information presented?

So if your goal is crystal clear, say, you need the address of a specific restaurant, your attention is like a laser.

Exactly.

But if you're just browsing Netflix.

Your attention is just wandering, waiting for something to grab.

Which brings us to presentation.

There's a classic study by Tullis on this.

Right.

The one comparing two screens with hotel rate information. The info was identical.

But the layout was totally different.

Screen A grouped the data vertically (city, hotel, phone, rates) and used a lot of white space as separators.

Screen B just jammed it all together horizontally in these dense little clusters.

And the difference was profound.

How profound?

Users trying to find a rate on that clustered screen, screen B, took an average of 5.5 seconds.

And the organized one?

Only 3.2 seconds.

Wow.

And that's not a test of the computer's speed.

It's a demonstration that simple design choices, grouping, hierarchy, white space, they fundamentally dictate how efficiently we can use our attention.

Which brings us to the true cost of modern life, multitasking.

We are all heavy media multitaskers now.

Constantly switching.

You see people with multiple chats, documents, games, all going at once.

And the general finding is that heavy multitaskers are surprisingly more easily distracted than light multitaskers.

But there's some nuance there, right?

There is.

Some studies suggest that if the sources of distraction are actually relevant to your main task.

Like doing background research for that report you're writing.

Exactly.

Then the distraction is less harmful.

But even with that, the core finding is pretty devastating.

That when it's irrelevant tasks, it just overloads our cognitive capacity.

Totally.

That switching requires effort to get back on track, and it dramatically increases the time it takes to finish anything.

You see this in that instant messaging study.

When students were interrupted by messages while reading a textbook.

Oh, this one's a classic.

Their completion time went up by a staggering 50%.

50% compared to the uninterrupted readers.

And this leads us right to that critical dilemma of using a cell phone while driving.

The data here is, it's conclusive and it's frightening.

Drivers talking on the phone have significantly longer reaction times.

They struggle to stay in their lane.

And here's the kicker.

Hands-free is not safer.

Not at all.

And this is the key cognitive insight.

The mental load of the conversation itself, visualizing what you're talking about, trying to remember where you left your keys.

It competes for the same resources you need to process the visual reality of the road.

Which is why talking to a passenger is so different.

Completely different.

A passenger sees the road, they see the hazard, and they instinctively shut up or slow down the conversation.

The person on the other end of the phone has zero situational awareness.

And this finding has directly led to designs like the driver mode on phones, which lock down features to prevent these resource conflicts.

So the design implications for attention are pretty clear.

Make important information stand out, but use things like color and animation sparingly.

Avoid visual clutter.

And if you have to have task switching, design good support for it.

Like a pulsing icon or a subtle audio alert.

Exactly.

Something to help the user reorient quickly.

Okay, shifting gears to perception.

This is how we acquire information through our senses.

Mostly vision.

Right.

And again, that hotel screen study applies perfectly here.

Organization helps perception.

What else?

Another key study found that physically grouping information on a web page using a visible border was actually more effective for locating things than just using color contrast.

So a border creates a stronger perceptual separation.

A much stronger one.

So the takeaways for perception are about presentation quality.

Use white space, use separators, design icons that are distinct from one another.

And check your color contrast.

Yellow text on a blue background is fine.

But yellow on white or light green is a nightmare.

We should also use haptic feedback wisely.

Oh, definitely.

Reserve that physical vibration for when a user needs confirmation that they've completed an action.

Okay, on to memory.

This is a filtration process: encoding, storing, retrieving knowledge.

And what we recall is incredibly context dependent.

It's why you don't recognize your neighbor when you see them on a train.

They're out of context.

A huge difference for designers is the gap between recognition and recall.

A huge chasm.

Recognition is easy, like recognizing a familiar icon.

Recall is hard, like remembering the exact file path to a document you saved three months ago.

And there's a modern irony here.

Studies show people remember less about objects they photograph than objects they just look at.

Because you're so focused on framing the shot, on the act of taking the picture, that you're not encoding the details of the object itself.

Which feeds right into how we use our devices as, what do you call them?

Cognitive prostheses.

Research shows that when we know we can just Google something, we remember less of the actual information, but our memory for where to find it gets better.

So we remember the search term, or the app we need to open.

That's the essence of personal information management, or PIM.

We still struggle to find our digital files.

Naming things relies on recall, which is hard.

And even though we have powerful search tools like Spotlight on a Mac.

Users still overwhelmingly stick to the primitive metaphor of folders.

Why is that?

Because folders offer instant recognition and a satisfying sense of place.

Even if they often just become messy stuff folders.

And what about the cognitive load of authentication?

It's immense, having to remember ZIP codes, security questions, and, you know, the fifth and seventh characters of a complex password.

It forces us to externalize.

We write things down, or use a password manager.

Because mentally counting alphanumeric characters is just too difficult.

The evolution in interface design is to shift that entire burden.

With things like biometrics.

Exactly.

Face ID, Touch ID.

They solve this completely by moving the responsibility from the user's memory to the device's recognition.

And technology can also be a memory amplifier.

Take the SenseCam from Microsoft.

A wearable camera that just takes photos intermittently.

Right, the case study with Mrs. B, who had amnesia.

Reviewing the images taken by the camera with her husband nearly tripled her ability to recall events.

Wow.

And we see similar things with apps for people with dementia using old photos and music to trigger long-term memories.

Before we move on from memory, we have to talk about the magical number seven.

We must.

A public service announcement.

George Miller found that humans can hold seven, plus or minus two, chunks of information in short-term working memory.

Right.

And early designers just ran with this in the wrong direction.

Terribly misapplied it.

They thought, aha, our menus can only have seven items.

But that's not how it works.

Not at all.

Visually scanned interfaces like menus or lists are testing recognition, not short-term recall.

You don't have to memorize the list.

You just have to recognize the option you want.

So limiting a list to seven items based on this myth is just arbitrary.

And it can actually hurt usability if more options are needed and could be organized clearly.

So the design mandate for memory is reduce cognitive load.

Always.

And always, always prioritize recognition over recall.

Use familiar icons, consistent menus, and good labeling.

Next up is learning.

We can split this into incidental, which is unintentional, like learning to recognize a voice, and intentional, which is goal-directed, like studying for an exam.

And users consistently prefer learning by doing rather than reading a manual.

Which is why graphical user interfaces and direct manipulation are so powerful.

They support exploration and, crucially, the ability to undo mistakes.

Modern tech like VR helps by showing multiple representations of the same abstract idea at the same time.

So the design implication is to encourage exploration, but make sure the interface provides some constraints or guidance for new users.

OK.

What about reading, speaking, and listening?

The underlying meaning is the same, but the mode affects how we process it.

Written text is permanent and scannable, but it takes more effort.

And listening is transient, but requires less cognitive effort.

Modern apps leverage all these modes, speech recognition, interactive books, chatbots.

So for design, keep spoken menus really short.

No more than three or four options, max.

People lose track.

And for generated speech, make sure the intonation sounds natural.

And always provide options for large text without breaking the layout.

Definitely.

And finally, that big group.

Problem-solving, planning, reasoning, and decision-making.

This is pure reflective cognition.

It's weighing options, considering consequences, figuring out how to plan a family vacation.

But we're dealing with so much information overload now.

And classical theories suggest we should analyze every single variable.

But that's not what we do.

We use what cognitive psychology calls fast and frugal heuristics.

Exactly.

Shoppers rely on a few simple cues, like brand, price, and packaging, to make quick decisions.

They ignore almost all of the available data.

Which means designers need to provide just enough salient information.

Think about AR, or wearable tech.

Instead of overwhelming you, a glanceable display lets you turn on filters.

Show me only the organic items, or the nut-free ones, to help you make those quick, heuristic-based choices.

But this reliance on cues can lead to a dilemma.

The app generation dilemma.

Young people, faced with huge choices like what college to attend, rely heavily on curated apps and reviews.

And this can lead to risk aversion.

And an inability to make the final decision independently.

Often they just end up reverting to their initial preference anyway.

So the design mandate here is high cognitive support.

Provide accessible help, use simple, memorable functions, and let users save and compare their preferences easily.

Okay, so beyond those individual processes, there are broader conceptual frameworks that help explain user behavior.

And the most foundational of these is mental models.

What are those exactly?

They're the internal constructions we build about how a technology works.

We use them to reason about the system, especially when something goes wrong.

So an engineer has a deep mental model of an engine, but the average driver?

Has a much shallower one.

And crucially, these models are often incomplete or based on totally inappropriate analogies.

The classic example is the thermostat.

Yes.

Most people operate on an incorrect valve theory mental model.

Meaning they think if they want the room to heat up faster, they should crank the thermostat way up.

More equals more, right?

But a thermostat is just an on-off switch.

Turning it to 90 degrees doesn't make the heat come out any faster than setting it to 75.

So people are applying a water faucet analogy to a temperature switch.

Exactly.

We bring physical world assumptions to digital interactions.

Designers have to solve this by promoting transparency with clear instructions and feedback.

Next up, the gulfs of execution and evaluation.

These define the two crucial gaps in human-computer interaction.

The gulf of execution is the gap between what the user wants to do and how they actually do it on the system.

So how do I get this thing to work?

Right.

And the gulf of evaluation is the gap between the system's state and the user's understanding of it.

What did it just do and did it work?

Precisely.

And the Bluetooth headset example is a perfect illustration of failing to bridge these gulfs.

A total failure.

The user was trying to connect the headset, but was completely confused by the slider control.

Because the labeling was inconsistent.

So the user couldn't accurately evaluate the system's state.

Was it on, off, pairing?

That confusion just widens both gulfs and leads to frustration.

Good design closes those gaps.

What about the information processing framework?

This one takes the most direct view, seeing the mind basically as a computer.

It breaks down any activity into a series of four ordered stages.

Which are?

First, encoding, which is receiving the input, then comparison, then response selection, deciding what to do, and finally response execution, the actual action.

And this was foundational for early predictive models in HCI.

Very much so.

It helped predict how long a user would take to complete a task based on the assumed speed of those internal stages.

Now let's shift focus a bit to distributed cognition.

This moves beyond the mind in the head and studies cognitive phenomena in the wild.

It treats the entire system, the people, the artifacts, the environment, as the unit of analysis.

So like an airline cockpit.

A perfect example.

The cognitive system there involves the pilot, the captain, air traffic control, all the instruments.

And distributed cognition analyzes how information, like an altitude change, is propagated through that whole system.

Right.

Through all the different media, radio, instruments, physical actions, via changes in representational state.

It's all about shared awareness.

And closely related to that is external cognition.

Which focuses specifically on how we interact with external representations, maps, notes, software, to reduce our internal cognitive load.

There are three key types.

First, externalizing to reduce memory load.

Using diaries, sticky notes, calendars.

Second is computational offloading.

Using a tool, like a calculator, to do a demanding computation.

And this shows how representation matters.

Multiplying 21 by 19 is easy.

Multiplying the Roman numerals, XXI by XIX, is ridiculously hard.

And the third one?

Annotating and cognitive tracing.

Annotating is modifying a representation, like crossing items off a to-do list.

And cognitive tracing is physically manipulating things to help you think.

Like a Scrabble player shuffling tiles on their rack to find a word.

The physical manipulation offloads the mental work.

And finally, we have embodied interaction.

This framework argues that our physical bodies and our active experiences with the environment fundamentally shape how we think and perceive.

The example of the choreographers is fascinating.

It is.

They don't always rehearse the full, exhausting routine.

Instead, they use marking: small, abbreviated, light gestures.

So this embodied marking lets them explore complex moves without the huge mental cost of a full simulation.

Exactly.

And the design implication is powerful.

If you're using technology to teach a new physical skill, like a golf swing, you should encourage these marking motions for low-complexity practice.

So we've taken a really deep dive through the human mind here.

We covered the reflective and experiential modes, the six essential cognitive processes.

From attention and memory, all the way to planning and decision making.

And the critical frameworks, like mental models, the gulfs, and the external approaches.

The core takeaway, it seems, is that interaction design is the art of honoring our cognitive strengths, while meticulously compensating for our very real limitations.

It is.

And that leads us to a final, provocative question for you to consider.

As our reliance on digital tools, our biometrics, our search engines, our personal devices, as all this grows and they function more and more as these reliable cognitive prostheses,

where exactly does the boundary between our own mind and the machine truly lie?

Something to mull over.

Thank you for joining us for this deep dive into the cognitive aspects of interface design.

We'll catch you next time.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

Understanding how human cognition shapes interaction design requires examining both the nature of mental processing and the practical constraints that govern how people think and act. Cognition operates through two distinct modes: experiential cognition engages automatic, intuitive mental processes that require minimal effort, while reflective cognition demands conscious attention and deliberate reasoning. These dual systems have profound implications for design because users frequently rely on fast mental shortcuts rather than exhaustive analysis of available options. Attention is a limited resource easily fractured by competing demands, making it essential for designers to highlight critical information and minimize visual clutter so users can focus on their primary tasks. Visual perception depends on how information is organized spatially through principles like grouping, white space, and visual boundaries, which prove more effective than color coding alone for helping users understand relationships between elements. Human memory exhibits a critical asymmetry: recognition—identifying something when presented with it—requires far less cognitive effort than recall—generating information from memory without external cues—a principle that should guide decisions about interface controls, menus, and navigation patterns. The common misinterpretation of the magical number seven suggests designers should limit menu items to seven choices, but the principle actually refers only to the capacity of immediate recall; people can quickly scan many more visible options without cognitive burden. Complex security systems like multi-factor authentication demand substantial memory resources, creating friction that external aids such as biometric authentication can alleviate. Decision-making relies on fast and frugal heuristics that simplify information processing, indicating that overloading users with complete data sets contradicts how people naturally think. 
Mental models represent the internal frameworks users construct to explain system behavior; misalignment between actual and perceived functionality creates frustration and errors. The gulfs of execution and evaluation describe two gaps: between what users want to do and how the system lets them do it, and between the system's actual state and the user's understanding of it. Modern design thinking extends beyond simple information processing to encompass distributed cognition, which tracks how knowledge and understanding flow across teams and tools, and external cognition, which recognizes that notes, sketches, and physical artifacts reduce mental burden through cognitive offloading. Embodied interaction emphasizes that physical movement and sensory engagement fundamentally influence learning and understanding, making hands-on practice more powerful than passive observation.
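
The computational offloading example from the audio (21 times 19 versus the same product written in Roman numerals) can be made concrete with a small sketch. This is only an illustration: the function names `roman_to_int`, `int_to_roman`, and `multiply_roman` are hypothetical and do not come from the chapter. The point it shows is that even a program does not multiply Roman numerals directly; it first converts them into a representation where multiplication is easy, which is the same move a person makes when reaching for pen, paper, or a calculator.

```python
# Hypothetical helper names, for illustration only (not from the chapter).
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Value/symbol pairs in descending order, including subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller symbol before a larger one is subtracted (the I in IX).
        total += -value if value < next_value else value
    return total


def int_to_roman(number: int) -> str:
    """Convert a positive integer to a Roman numeral."""
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def multiply_roman(a: str, b: str) -> str:
    """Multiply two Roman numerals by switching to a friendlier representation."""
    return int_to_roman(roman_to_int(a) * roman_to_int(b))


print(21 * 19)                        # 399
print(multiply_roman("XXI", "XIX"))   # CCCXCIX
```

The arithmetic itself is a single `*` once the numbers are in positional form; all the extra code exists only to move in and out of the awkward representation.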
