Chapter 7: The Predictive Processing Hypothesis

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to the Deep Dive, where we take complex, challenging concepts, throw them into the theoretical deep end, and hopefully pull out some golden nuggets of insight.

Hopefully.

Our mission today is a really big one.

We're doing a deep dive into the intersection of modern cognitive science and philosophy.

It's a look at a pretty fierce, high-stakes debate over how we experience, know, and act within the world.

That's right.

We are guiding you through a complex but absolutely critical argument: the relationship between the Predictive Processing Hypothesis, let's call it PEM, and this sprawling theoretical landscape of 4E cognition.

Right, the Embodied, Embedded, Enacted, and Extended Mind.

And the reason this is a deep dive is because these two frameworks are, I mean, they're the reigning heavyweights of cognitive science right now, but on the surface they seem fundamentally incompatible.

They really do.

PEM basically says your brain is this spectacular internal inference machine constantly locked in your skull, building models of a hidden world.

While the 4E movement, especially the more radical side of it, says, hold up, that's too old-fashioned.

Right.

Cognition happens in the world, with your body, through your active engagement.

Let's get rid of those internal representations.

So the central theoretical tension this analysis addresses is, well, it's monumental.

Can an internal, neurocentric, inferential theory like PEM genuinely account for the active, situated, body-based reality that the 4E thinkers correctly emphasize?

And here's the chapter's thesis, which is really the core of our entire discussion today.

The argument is that the answer is yes.

But it's a qualified yes.

It's a qualified yes.

It argues that PEM is resourceful enough to absorb and explain the core ideas of 4E cognition without compromising its own foundation, which is, you know, inference and representation.

It's sort of a philosophical Trojan horse.

That's a great way to put it.

It suggests that once you truly understand the mathematical and computational necessities of PEM, you realize that 4E cognition, rightly understood, is nothing but representation and inference.

It's just implemented in this very dynamic way that has to incorporate the body.

OK, so before we jump into the 21st century battle of Bayesian models, we have to understand the philosophical history here.

I mean, this idea that perception is inference, that the brain is constantly guessing what's out there, where did that actually come from?

It's a really old idea, with roots going way back to the 11th-century polymath Ibn al-Haytham, or Alhazen, who was working around 1030.

Wow.

Yeah, he was focused on optics, and he realized something profound.

The raw image hitting the eye, the sensory input itself, is inherently flawed.

Flawed how?

I mean, we think of vision as being pretty perfect, especially if our eyes are healthy.

Well, it's flawed because it's full of distortions, ambiguities, and critical omissions.

Think about the blind spot in your eye, or how light refracts and bends, or how we see depth from a flat 2D retinal image.

If perception relied only on that imperfect raw input, we would experience a chaotic, fragmented world.

It wouldn't make any sense.

So he concluded that to get the stable, coherent experience we have, perception requires something more, what he called judgment and inference.

Exactly.

The system has to fill in the blanks and actively correct the distortions.

It has to make a judgment call.

And the detail he highlighted is surprisingly modern.

He said this process happens extremely quickly.

And because perception feels so instant and direct, we are usually completely unaware that we're constantly performing these complex, corrective inferences.

This is sort of the first historical realization that perception is vicarious, not direct.

That concept then gets formalized centuries later.

We jump forward to 1867 with the German scientist-philosopher Hermann von Helmholtz.

Right.

And he gave this process its catchy name, unconscious perceptual inference.

And Helmholtz was crucial because he really defined the nature of this inference, didn't he?

He did.

He described these underlying psychical activities as being like inferences because the brain takes the observed effect, which is just the messy internal data flowing along your nerves, and arrives at an idea of the hidden cause of that effect.

So we never get to touch the world directly.

We just feel the pressure on our nerves, the photons hitting the retina.

Exactly.

You only ever sense the effects on your sensory surfaces.

You never have direct, unfettered access to the true external causes, the objects themselves.

Unconscious inference is the cognitive bridge that spans that gap.

That really grounds the central foundational question that drives this entire field, what the chapter calls the perceptual challenge.

How does the brain construct our rich, familiar perceptual experience based only on imperfect sensory data without ever having direct, total access to the true external causes?

And when you look at it that way, you realize, this isn't just some philosophical curiosity.

This problem underpins, well, classical psychology, the most advanced theories in neurobiology, and every single attempt in modern AI to make a machine understand its environment.

You have to build a model of the hidden world from messy proxy data.

You have to.

So we take this 1,000-year-old problem, unconscious inference, and in the last few decades, we've put mathematical armor around it.

This is the probabilistic revolution.

Correct.

We formalize the intuition: the unconscious perceptual inference we just discussed is now mathematically understood as Bayesian inference.

So the brain, in a computationally efficient way, must be following a rational mathematical recipe known as Bayes' rule.

That's the idea.

It's how it overcomes the problem of perception.

So instead of just saying the brain is filling in the blanks, we say it's making probabilistically optimal updates.

And the single most ambitious and comprehensive theory to emerge from this movement is the one built on prediction error minimization, or PEM.

Yes, PEM postulates a system, the brain, that is constantly predicting the sensory input it expects.

Then it measures the difference between that prediction and reality.

Right, just the prediction error.

Exactly.

And it uses that error signal to refine its internal models of the world's causes.

And this is where the conflict with 4E cognition just explodes.

Because if cognition is fundamentally about prediction error minimization, it sounds incredibly passive, intellectualist, and neurocentric.

It absolutely does, on paper.

If perception is just calculating internal beliefs and updating internal representations, that whole process seems completely divorced from action.

Right, it leaves no foundational role for the body, no importance for the environment, and certainly no possibility of the mind extending outside the brain, which is the whole point of 4E.

The 4E thinkers, people like Varela, Noë, Gallagher, they stress radical enaction and situatedness, and often explicitly reject the need for these heavy, classical internal representations. They want to dissolve that brain-world barrier.

The problem for cognitive science is that both models have compelling explanations.

We see neurobiological evidence for PEM, we see behavioral evidence for 4E.

So the field has three broad options for resolving this.

Option one is the hammer, incompatibility.

One of them must be wrong, so we just discard it.

Which is unsatisfying.

Very.

Option two, often favored by people like Andy Clark, is to try deflating PEM.

They argue that PEM, if you understand it properly, is actually non-representational and naturally leads to 4E processes anyway, so the conflict just disappears.

And then there's option three, which is the position this chapter champions: deflating 4E.

Right, the argument here is that 4E cognition, rightly understood, when you look at what's necessary for action and situatedness, is actually nothing but representation and inference.

It's just applied to the body and the environment.

Exactly.

So the central thesis we're examining is that PEM is so tremendously resourceful that it can absorb and explain these embodied and enacted elements of 4E without compromising its core identity as an inferential representational framework.

It's an attempt to unify the field under the PEM banner.

Okay, so let's assume the chapter's thesis is correct for a moment.

To prove it, we need to get anchored in the exact mechanics.

How does PEM rigorously define inference and representation?

Well, in PEM, inference is the process of drawing conclusions about hidden causes, the actual objects and events in the world based only on incomplete, noisy sensory input.

It's the constant attempt to figure out what external reality generated the internal signals we're getting.

Exactly.

And Bayes' rule provides the recipe for how a rational agent should update its beliefs.

It tells the system how to weigh its prior expectations.

What it already thinks is true, against the likelihood of the new incoming evidence.

Precisely.

If you follow that rule, you get the optimal probabilistic update.

Let's make this vivid with the sound source thought experiment from the chapter.

Okay, so suppose there's a sound source that's truly located at 80 degrees.

Your current model, your prior expectation, is maybe 90 degrees.

Right.

A new sensory sample comes in at 77 degrees.

This generates a 13 degree prediction error.

Okay, so the system is now in conflict.

The prior says 90, the new data says 77.

If it updates all the way to 77, it's just overreacting to noise.

But if it ignores the 77, it fails to learn the 80 degree reality.

The Bayesian framework solves this by assigning weights.

But how much weight?

And this is where it gets really interesting because the answer relies on the system modeling something called precision.

Okay, let's unpack precision.

This is a technical term, but it seems critical to PEM's entire explanatory power.

It is.

Think of precision not as perfect accuracy, but as reliability or certainty.

Mathematically, it's the inverse of variance.

How wide or narrow a probability distribution is.

Give me an example.

Okay.

Your vision in a well-lit room has very high precision, low variance.

You trust it completely.

Your hearing during a roaring thunderstorm has very low precision, high variance.

You don't really trust the specific location a sound seems to come from.

So when that 13 degree error comes in, the system doesn't just look at the size of the error, it looks at the expected precision of the input.

Absolutely.

If that 13 degree error came from a high precision modality like sharp vision on a clear day, the system gives it a huge weight.

The internal model updates quickly towards 77 degrees.

But if the error came from a low precision modality like a muffled sound from behind a wall, it gives it very low weight.

And the internal model barely moves.

So the learning rate, how quickly the model actually changes, depends entirely on the precisions of both the priors and the incoming prediction errors.

This metric of reliability, or precision, becomes the central control knob for all of PEM.
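
As a rough sketch of that weighting, assuming the prior belief and the sensory sample are both Gaussian: the function name and the specific precision values below are illustrative assumptions, not figures from the chapter. Only the 90-degree prior and the 77-degree sample come from the example above.

```python
def precision_weighted_update(prior_mean, prior_precision, sample, sensory_precision):
    """Gaussian Bayesian update of a belief about one hidden cause.

    Precision is the inverse of variance. All names here are hypothetical.
    """
    prediction_error = sample - prior_mean
    # Learning rate: the share of total precision carried by the new sample.
    learning_rate = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + learning_rate * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision, learning_rate


# Prior of 90 degrees, sample of 77 degrees: a 13-degree prediction error.
# Assumed precisions: the prior has precision 1.0 in both cases.
precise_mean, _, precise_rate = precision_weighted_update(90.0, 1.0, 77.0, 9.0)
noisy_mean, _, noisy_rate = precision_weighted_update(90.0, 1.0, 77.0, 1.0 / 9.0)

print(round(precise_mean, 1), round(precise_rate, 2))  # high-precision input
print(round(noisy_mean, 1), round(noisy_rate, 2))      # low-precision input
```

With these assumed values, the high-precision sample pulls the estimate most of the way toward 77, while the low-precision sample barely moves it off 90. The same 13-degree error produces very different updates depending only on the precision ratio.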

And here's the core realization.

The system doesn't need to be a conscious mathematician doing these calculations.

No, a physical system, a brain that simply minimizes its average prediction error over the long run will automatically approximate Bayesian inference.
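
One hedged way to see this, under the simplifying assumptions of Gaussian noise, a fixed hidden cause, and a flat prior: a system that only ever nudges its estimate by a shrinking fraction of each prediction error ends up at the sample mean, which is exactly the Bayesian posterior mean in that setting. The function name, noise level, and sample count are illustrative assumptions.

```python
import random


def track_by_error_minimization(samples, initial_guess=0.0):
    """Update an estimate using nothing but successive prediction errors."""
    estimate = initial_guess
    for n, sample in enumerate(samples, start=1):
        prediction_error = sample - estimate
        # The gain shrinks as evidence accumulates (1, 1/2, 1/3, ...).
        estimate += prediction_error / n
    return estimate


random.seed(0)
# Hypothetical noisy samples from a sound source truly located at 80 degrees.
samples = [random.gauss(80.0, 5.0) for _ in range(1000)]

estimate = track_by_error_minimization(samples, initial_guess=90.0)
sample_mean = sum(samples) / len(samples)
print(abs(estimate - sample_mean) < 1e-9)
```

No explicit application of Bayes' rule appears anywhere in the loop; the match with the Bayesian answer falls out of the error-correcting dynamics.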

So inference in this context isn't some intellectual act or propositional logic.

It's just the constant subtle refinement of internal models guided by this optimal weighting.

That's it.

And if we remove that inferential label, PEM becomes an unconstrained notion.

It loses all its theoretical power.

So anyone who endorses predictive processing has to accept its fundamentally inferential nature.

Okay, but that model, a single system minimizing error about one sound source is way too simple.

It assumes a static world.

Our reality is complex.

How does PEM scale up?

It has to become hierarchical.

The system can't just assess the current precision.

It needs to know, for example, that a visual prediction error and an auditory one of the exact same size should be treated differently.

Because the expected precisions of those two senses are just inherently different over time.

Exactly.

So a robust PEM system doesn't just model the means, the location, the identity of something.

It has to model expectations about precisions as well.

It has to model its own level of uncertainty.

It needs priors for its own precision.

And this whole process has to be Bayes-optimal too.

And this complexity is necessary because causes in the world interact constantly.

Let's go back to our sound source.

If a screen moves intermittently to obscure the sound, that causes unpredictable variability.

Right, so the system has to figure out: is the sound itself just getting noisier, which would lower precision, or is a second, interfering cause, the screen, now present?

To distinguish those, you need a hierarchical structure.

Models operate at different time scales and constantly pass messages.

Predictions down the hierarchy and prediction errors back up.

Can you give us a clear example of how these time scales would interact?

Sure, think about a long time scale regularity, like knowing a train passes by your office every hour.

That slow regularity, the train schedule, has to influence predictions of much faster regularities like the words you hear in a conversation.

So if you hear a distant noise during a conversation, the hierarchy might decide.

Low level sound error is high, but the high level model expects a train.

So it'll down weight the error at the conversation level until the sound confirms the train pattern.

It sounds like the hierarchy is effectively building this rich layered internal simulacrum of the world.

And as the system minimizes error, the model parameters get shaped to mirror the causes of the sensory input.

Yes, and the key outcome is that minimizing prediction error demands that the internal models represent the world.

This isn't optional.

And it's not just simple correlation, which some 4E thinkers might grudgingly accept.

No, it involves rich structured operations, like the ones mentioned in the source, model selection and convolution of expected signals.

Right, model selection is constantly asking, is the sensory change due to a new cause or is it just the original cause acting noisily?

And convolution is even richer.

If the system models a cat and a fence, it has to blend or convolve the predicted sensory signals from both hidden causes to account for how they interact.

For example, how the cat is partially blocked by the fence, which changes its visual signature.

Exactly, that is far more than just receptor covariance, a neuron firing when a stimulus is present.

This is structured active internal modeling.

So the PEM framework is established as inherently inferential and richly representational.

Which sets up the ultimate showdown with those who claim cognition is anti -representational.

We've established that PEM, in its basic structure, is inferential and representational.

So now we face the 4E challenge head on.

How can a neurocentric theory account for the body, action, and situatedness?

Right, because the intuitive claim of enactive models is that embodied action is fundamentally not inference.

Action is our way of achieving direct coupling with the environment, allowing us to circumvent the slow heavy burden of internal computation.

But PEM has its philosophical sleight of hand.

Active inference. Remember the core definition.

Any system that minimizes prediction error is approximating Bayesian inference.

In our sound source example, the system minimized error by revising its internal model. That was perception.

But the body provides a second equally valid path to achieving zero prediction error.

Instead of revising the internal belief, moving the prior from 90 degrees to 80, the agent could simply...

Change the sensory input itself via action.

If the prediction is 90 degrees, the agent turns its head 10 degrees to the left.

The sensory input now shifts from 77 to 87 degrees, moving much closer to the 90-degree prediction.

So the action successfully minimized the prediction error.

And since minimizing prediction error is inference.

Then PEM successfully casts action as inference.

The agent acts to make its predictions come true.
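
The two routes can be put side by side in a small sketch. It reuses the numbers from the example (a 90-degree prediction, a 77-degree sample, a 10-degree head turn); the function names, the sign convention for the head turn, and the 0.5 learning rate are assumptions made purely for illustration.

```python
def prediction_error(predicted_angle, sensed_angle):
    """Signed mismatch between what was sensed and what was predicted."""
    return sensed_angle - predicted_angle


def perceive(predicted_angle, sensed_angle, learning_rate):
    """Perceptual inference: keep the input, revise the prediction."""
    return predicted_angle + learning_rate * prediction_error(predicted_angle, sensed_angle)


def act(sensed_angle, head_turn):
    """Active inference: keep the prediction, change the input.

    Assumed convention: turning the head shifts the sensed angle by head_turn.
    """
    return sensed_angle + head_turn


predicted, sensed = 90.0, 77.0
error_before = abs(prediction_error(predicted, sensed))

# Route 1: perception moves the model toward the data.
revised = perceive(predicted, sensed, learning_rate=0.5)
error_after_perception = abs(prediction_error(revised, sensed))

# Route 2: action moves the data toward the model.
sensed_after_turn = act(sensed, head_turn=10.0)
error_after_action = abs(prediction_error(predicted, sensed_after_turn))

print(error_before, error_after_perception, error_after_action)  # 13.0 6.5 3.0
```

Both routes shrink the same quantity, which is the sense in which the framework treats perception and action as two ways of doing one job.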

There's no conceptual hurdle here for the inferentialist.

Action isn't separate from cognition.

It's a means of optimizing it.

But action requires intentionality, a goal.

Does active inference still need those same internal representations that 4E thinkers reject?

It absolutely does.

Active inference relies centrally on internal representations of desired future states.

Action only occurs when a specific hypothesis, a representation of a future state that hasn't happened yet, accumulates sufficient evidence.

And crucially, because these are representations of the future, they're inherently detached from the immediate sensory input.

They stand for something that isn't currently happening.

Which is a classic definition of representation.

Furthermore, action demands robust modeling of expected precisions across various time scales.

Without a nuanced model of the consequences of action, the agent would immediately fall prey to the dark room problem.

Let's expand on this because it's a great thought experiment.

If the sole drive is to minimize prediction error right now, why not just walk into a dark, quiet room?

Sensory input is minimal, so prediction error is inherently low.

The problem is that staying in a dark room minimizes local immediate error.

But it's like catastrophic overfitting in machine learning.

It comes at the cost of massive prediction error in the long term.

It's biologically and evolutionarily detrimental.

Massively.

In the dark room, the agent fails to maintain its model of the outside world.

Its internal expectations about its own homeostasis fail, and you get eventual system collapse.

So, to survive, the agent needs to model self-involving regularities.

It needs to predict the sensory consequences of its own actions in the world.

Yes.

It needs a high-level self-model that predicts, for instance, that walking into a bright, complicated kitchen will cause an immediate spike in prediction error.

But that error will be quickly resolved and result in lower error over the long run.

Because it allows the agent to gather better, more precise evidence like finding food.

That makes active inference an evolutionarily necessary complexity.

So action is a necessity for maintaining a valid internal model.

But how does the agent decide, moment to moment, whether to act or just update its model?

The switch between perceptual inference and active inference.

This crucial switching mechanism is explained by precision optimization.

It's essentially the focusing and weighting of information, which functionally we call attention.

So attention isn't some spotlight of consciousness.

It's a probabilistic control mechanism.

Precisely.

Expected precisions determine the weights given to sensory input.

If the expected precision of a sensory stream is high, the system gates that input, prioritizing it.

If it's low, it's inhibited.

This requires a neuronal gating mechanism that functionally serves as attention.

Let's use the classic coffee cup example to see this in action.

Okay, suppose you have the goal, the high-level prior, to pick up the coffee cup.

You have a hypothesis that your hand will soon be at position X, where the cup is.

And your system expects higher sensory precision, better tactile and visual feedback, when the hand is at position X than at position Y, where it's currently typing on the laptop.

Exactly, so the system starts gating the current low-precision input, the feeling of the laptop keys.

This action shifts the balance.

The hand at laptop hypothesis loses relative weight to the hand at coffee hypothesis.

Which generates a prediction error between the predicted state hand moving to X and the current state, hand at Y.

Because the system has optimized precision around the coffee cup, that prediction error gets high gain and the action moving the hand quickly minimizes it.

That dynamic exchange between perception and action is therefore just a consequence of optimizing expected precision.

This smoothly integrates the enacted aspect of 4E.

So now we address the core E's, embodiment and embeddedness.

We have a PEM agent that acts and is situated, but can it truly capture that sense of foundational embodiment that 4E thinkers claim is prior to inference?

Well, in this inferential view, purposeful behavior is entirely unified.

Desires are just high-level priors, beliefs about what future states should look like.

And perhaps most surprisingly, reward is simply the absence of prediction error.

Success is achieved when predictions match reality.

Right, but this requires the model to define the agent's desires.

What determines the content of that internal model?

What anchors the agent's deepest beliefs about what it should be doing?

Within the wider PEM framework, particularly the free energy principle, the expected states that anchor all active inference relate fundamentally to the set points in the organism's homeostasis.

The desired range of temperature, glucose levels, oxygen intake, the things needed for biological survival.

Exactly.

This is a powerful move.

It grounds all perception and cognition in fundamental bodily needs, providing a robust foundational embodiment perspective.

All cognitive activity starts with the imperative to maintain biological existence.

And this embodiment is conceived probabilistically?

Correct.

The model dictates the probability of the organism finding itself in a particular range of states that are conducive to life.

This links the expected internal states monitored by interoception.

The sense of the body's internal state.

Tightly with the expected environmental states monitored by exteroception, the sense of the outside world.

So using the fish example from the source, the interoceptive needs of the fish, staying wet, maintaining oxygen, are inextricably linked with its exteroceptive expectations, like predicting watery sensory inputs.

And if the fish senses dry air, the resulting prediction error isn't just an informational anomaly.

It's a direct signal of homeostatic failure.

That tight coupling between the interoceptive and exteroceptive prediction error landscapes is what defines a living, embodied PEM system.

And since perception and cognition are defined by this inseparable link, they can't be separated from the bodily or environmental aspects of the system.

This speaks directly to embedded cognition.

The agent is inherently situated.

But here is the critical philosophical pivot that refutes the radical 4E claim.

This deep reading of embodiment leads directly back to the necessity of inferential processing.

It doesn't bypass it.

Why can't the system just be deeply coupled and avoid the internal guessing game?

Because the organism is a finite system.

It can't possibly learn its model of survivable states by simply averaging all possible states in the world.

That would be an infinite task.

So instead it has to guess.

It must guess, form a belief or a model about what states it should be in and then strive to minimize the error between that belief and reality through inference and action.

In simple terms, the finite organism must avoid being surprised.

Okay, surprise in a probabilistic sense is the low probability of finding yourself in a state given your model.

A fish on a beach is massive surprise.

And this leads us to the free energy principle.

Right, the FEP.

It provides the mathematical bridge here.

The sum of prediction error across all levels of the hierarchy is a mathematically defined upper bound on this quantity, surprise.

And this bound is called free energy.

So minimizing prediction error implicitly minimizes free energy.
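
As a hedged sketch of that bound in standard variational notation (the symbols are assumptions for illustration, since the discussion itself names none): let s be the sensory input, ϑ the hidden causes, m the agent's model, and q(ϑ) the agent's current approximate belief about the causes.

```latex
F \;=\; \underbrace{-\ln p(s \mid m)}_{\text{surprise}}
  \;+\; \underbrace{D_{\mathrm{KL}}\!\left[\, q(\vartheta) \,\|\, p(\vartheta \mid s, m) \,\right]}_{\ge\, 0}
  \;\;\ge\;\; -\ln p(s \mid m)
```

Because the divergence term can never be negative, free energy F can never fall below surprise, so pushing F down squeezes surprise from above. Under Gaussian assumptions, F reduces roughly to a sum of precision-weighted squared prediction errors plus terms that depend only on the precisions, which is the sense in which minimizing prediction error minimizes free energy.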

But why is that so foundational?

It's about more than just data processing, isn't it?

Oh, much more.

Systems that minimize free energy are systems that maintain structural integrity.

They resist the natural entropic tendency to dissolve into the environment.

So a biological organism is this improbable, highly ordered system in a high entropy world.

And to remain ordered to maintain its existence, it must actively minimize surprise.

If free energy is high for too long, the system fails to maintain its boundaries, its homeostatic set points, and that leads to death.

Wow.

So the imperative to minimize prediction error isn't just a clever cognitive strategy, it's the universal imperative for all biological systems to maintain their existence and avoid thermodynamic dissolution.

Correct.

And that makes the inferential conception based on PEM inescapable for any system that is embodied and embedded in the way 4E suggests.

It has to infer the world to survive in it.

Okay, so many anti-representational 4E accounts prioritize this quick and dirty processing, often linked to the idea of affordances, to overcome the perceived computational bottleneck of traditional cognition.

How does PEM counter this speed critique?

Well, PEM fundamentally rejects the existence of that bottleneck, because the system never starts from scratch.

It's a fully formed hierarchy built on extensive prior learning.

Sensory input isn't something the brain has to painstakingly encode.

It's functionally the feedback signal to the forward prediction that's already been generated by the internal model.

This means the system isn't waiting for the input to arrive before it starts processing.

It's processing before the input even gets there.

Precisely.

The system relies on slow and clean learning, the meticulous accumulation of evidence over years, and even evolutionary time to facilitate swift and fluid perception and interaction in the here and now.

So the speed is achieved because the prediction is almost always right and the error is minimal, requiring only tiny fast corrections.

And this model easily accommodates the concept of affordances, the action-guiding environmental elements that 4E loves.

Right, like a doorknob affords grasping.

In PEM, affordances are just defined as causes of sensory input that are strongly expected, based on prior learning, to give rise to high-precision prediction error.

What does that mean in practice?

For the doorknob, the system has a high-level expectation that when it moves its hand toward it, it will receive specific high-precision tactile and visual feedback.

Because the expected precision is so high, the system focuses attention or gating on it.

Which makes the error minimization happen rapidly and fluidly.

The agent knows exactly how to move to make the doorknob feel the way it expects it to feel.

Exactly.

But even in this swift processing, the rich internal representations are still necessary.

Right, because the PEM account counters the radical 4E idea that the world's affordances can be their own representation.

The brain has to have an internal model of the potential affordances to select which ones to focus on.

And this brings us to a crucial advantage of PEM: flexibility.

The core critique of pure affordance-based accounts is that the agent seems too tightly coupled or tethered to its environment.

Right.

If the visual stimulus disappears, they struggle to explain how the agent can disengage, step back, and reconsider.

They lack genuine cognitive flexibility.

In contrast, flexible cognition is central to the hierarchical Bayesian framework.

The PEM system has to model the world's dynamism.

The hierarchical model builds expectations not just for the causes of input, but for the typical evolution of prediction error precision.

The system knows that any precise current hypothesis like, I am standing on solid ground, will have a limited lifespan.

It anticipates the inevitable failure of its current model because the world is always changing.

Yes.

Take the baseball player example.

The player is accurately tracking the ball now with high-precision visual input.

But the high-level model knows the sun will soon set, making visual input inherently less reliable.

So the system anticipates the failure of its current visual hypothesis.

And begins to accumulate evidence for the next precise hypothesis like, I am moving toward the locker room, or I am eating dinner, before the current hypothesis fails catastrophically.

This means the agent has to constantly balance holding a hypothesis stable with adjusting the internal model.

Exactly.

It must intersperse active inference, which is acting to confirm a stable hypothesis with perceptual inference, which is checking and adjusting the model.

If it only does active inference for too long, the world might change drastically behind the scenes.

Leading to a massive error spike when it finally pauses to look around.

So the PEM framework's complexity, the hierarchy, the modeling of precision, is necessary precisely because we live in a changing world.

And this is the ultimate synthesis with 4E.

The structure of the cognitive system is the way it is, because the agent's world and body are the way they are.

But the solution isn't to tie cognition closer to raw input, like radical enactivists suggest.

No, the solution is to retract farther away from the immediate world, building a vast, rich internal model of expectations that anticipates and manages that complexity.

Cognition remains a matter of richly represented expectations seeking confirming feedback.

We've established that PEM gives us an agent that acts, is embodied and embedded, all while staying inferential.

But this raises a fundamental philosophical question.

Where does the mind actually end and the world begin in this model?

The process of minimizing prediction error leads to a profound result.

The more the system successfully minimizes error, the more it accumulates evidence for its own internal model.

The PEM system is inherently self-evidencing.

It confirms its own existence and its hypotheses through its own activity.

And this self-evidencing system necessarily creates a boundary.

It does.

There's a distinction between what gathers the evidence, the mind or the model, and what the evidence is evidence of, the world or the hidden causes.

The external world is known only vicariously through inference.

This is the epistemic boundary.

But PEM also provides a strict causal definition of the boundary, drawn from probability theory, known as the Markov blanket.

The Markov blanket is a concept from causal network theory.

Imagine a system where the internal states, the brain states, are trying to infer external causes.

The Markov blanket is the set of states that shield the internal states from the external environment.

Okay, I need an analogy here because Markov blanket is pretty abstract.

Okay, think of a space station floating in space.

The internal environment, the astronauts, the computers, is shielded from the vast complex external environment.

The Markov blanket is the hull of the space station.

Which includes the sensory portals, like windows and antennas, and the active components, like thrusters and airlocks.

Exactly.

So to know everything happening inside the space station, you only need to monitor the activity on the hull, the sensory input and the thruster output.

You don't need to model the sun and Jupiter in perfect detail.

Got it.

The implication is that the activity of the states within the Markov blanket, the mind, is wholly determined by the states on the blanket.

And this provides a principled causal way of defining the boundary of the cognitive agent.

If a state is necessary for performing the inference, it has to be within that blanket.
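The shielding property can be checked numerically in a toy network. The sketch below is an illustration under stated assumptions, not the chapter's own model: it uses three binary variables (an external cause `e`, a sensory state `s`, an internal state `i`) linked in a chain, with made-up probabilities, and it leaves out the active states (the "thrusters") to stay small. In this chain the sensory state is the whole blanket, so once it is known, the external cause adds nothing to what can be said about the internal state.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_E = {0: 0.7, 1: 0.3}                     # external hidden cause
P_S_GIVEN_E = {0: {0: 0.9, 1: 0.1},        # sensory state given external cause
               1: {0: 0.2, 1: 0.8}}
P_I_GIVEN_S = {0: {0: 0.85, 1: 0.15},      # internal state given sensory state
               1: {0: 0.25, 1: 0.75}}

def joint(e, s, i):
    """Joint probability for the chain external -> sensory -> internal."""
    return P_E[e] * P_S_GIVEN_E[e][s] * P_I_GIVEN_S[s][i]

def prob(**fixed):
    """Marginal probability of the given assignments, e.g. prob(s=1, e=0)."""
    total = 0.0
    for e, s, i in product((0, 1), repeat=3):
        values = {"e": e, "s": s, "i": i}
        if all(values[name] == value for name, value in fixed.items()):
            total += joint(e, s, i)
    return total

def internal_given(i, **evidence):
    """Conditional probability of internal state `i` given the evidence."""
    return prob(i=i, **evidence) / prob(**evidence)

for s, e in product((0, 1), repeat=2):
    with_external = internal_given(1, s=s, e=e)
    blanket_only = internal_given(1, s=s)
    print(f"s={s}, e={e}: {with_external:.4f} vs blanket only {blanket_only:.4f}")
```

Without the blanket state the external cause does matter: `internal_given(1, e=0)` and `internal_given(1, e=1)` differ, which is why the blanket, and not the internal states alone, carries the information about the world.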

This brings us to the final E, extended cognition.

The idea that external objects, a notebook, a smartphone, can become so integral to our cognitive process that they should be considered part of our mental states.

PEM immediately runs into a genuine dilemma with this hypothesis.

Let's take a smartphone.

On the one hand, the smartphone is a hidden cause being inferred.

As a cause, it is outside the mind, beyond the Markov blanket.

But if it's an extension of the mind, it must be within the sensory boundary, part of the internal states doing the inferring.

That's the philosophical contradiction.

How can something be simultaneously the cause being inferred, so it's external, and part of the system doing the inferring, which is internal?

It can't be both.

The contradiction suggests either that the extended cognition hypothesis is false or that there might be multiple coexisting sensory boundaries: nested Markov blankets.

Which is probably true for complex systems, but what's the explanatory cost?

The problem arises when we try to define the agent.

If the model inside the blanket is the agent and we have multiple nested blankets (my brain model, my brain-plus-smartphone model), then we have multiple agents coexisting.

Which gets very messy, very fast.

It does.

If we ask which agent is truly acting when I write a reminder on my phone, there are multiple technically correct, but philosophically confusing answers.

So to solve this, the chapter suggests a pragmatic solution.

Identifying the agent using inference to the best explanation.

Right.

We define the agent as the system whose relatively invariant sustained involvement accounts for most of the observed behavior over the longest time scale.

And it seems highly probable that this pragmatically identified agent would be the one specified by the model harbored just in the nervous system.

Yes.

This is the agent relative to which prediction error is minimized over the most crucial long time scale of survival.

And if we adopt that view, it strongly suggests there is no extended cognition.

The notebooks and phones remain beyond the one dominant Markov blanket.

This leads us to the crucial concluding insight.

The sensory boundary between mind and world, as illuminated by PEM, has this intriguing duality.

The first boundary is the epistemic boundary.

The world's causes are known only vicariously through inference.

The mind is locked in, guessing.

And the second boundary is a causal boundary defined by the Markov blanket.

There is a constant dynamic coupling between the internal model and the external causes enabled by perception and action.

That is a profound synthesis.

The conclusion is that PEM throws light on embodied agents dynamically interacting with their embedded environment.

And this good fit with 4E is made possible precisely because PEM is fundamentally inferential and representational.

It uses these complex internal models to anticipate and manage external reality, effectively turning the action and embodiment favored by 4E into necessary consequences of rational inference.

Ultimately, PEM provides a unified probabilistic framework.

Inference and representation are not features we can discard.

They are necessary requirements for any system that must maintain itself in a complex dynamic world.

Action, embodiment, and embeddedness are all necessary consequences of the foundational imperative to minimize long-term prediction error.

Which leaves us with a final provocative thought for you, the listener.

If minimizing surprise or free energy is truly the foundational drive of all cognitive and biological systems, the universal imperative to maintain existence, does that mean our entire subjective experience is simply a constant lifelong effort to confirm our own best hypotheses about the world?

And if so, how much can we ever truly break free of our own priors?

Thank you for joining us for this deep dive into the philosophical core of cognitive science.

We'll see you next time.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary
What this audio overview covers
Predictive processing provides a theoretical framework that bridges representational accounts of cognition with the embodied, embedded, enacted, and extended perspectives central to 4E cognitive science. Drawing on Hermann von Helmholtz's foundational work on unconscious inference, the hypothesis conceptualizes the brain as a Bayesian inference engine that continuously generates internal models to explain the origins of the sensory signals reaching the organism. The fundamental mechanism driving this process is prediction error minimization, in which the cognitive system perpetually compares descending predictions against ascending sensory information and refines its internal representations according to the precision, or trustworthiness, of the incoming data. Rather than treating 4E phenomena and representationalism as fundamentally incompatible positions, this framework shows how embodied and enacted behaviors emerge naturally from sophisticated inferential processes. Active inference extends the predictive model beyond passive perceptual updating by proposing that agents reduce prediction error through bodily movement, thereby turning action itself into an inferential process that alters sensory input to conform to expectations. The free energy principle expands this account to encompass embodiment and homeostatic processes, characterizing organisms as systems driven to minimize surprise at both neuronal and physiological levels in order to remain within viable states. Affordances, and the rapid, context-sensitive interactions organisms exhibit with their environments, are reinterpreted not as non-representational reflexes but as high-precision predictions operating within hierarchical model structures that enable flexible, adaptive responses to dynamic environments. The Markov blanket provides a mathematical tool for delineating cognitive boundaries, showing how self-evidencing systems naturally establish a distinction between internal states and external conditions, a proposal that accommodates situated cognition while offering principled constraints against unlimited extension of mental processes into the environment.
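The role of precision in the summary above can be illustrated with the standard Gaussian update, in which the prediction error is scaled by how much the incoming signal is trusted relative to the prior. This is a minimal sketch under assumed values; the function name `precision_weighted_update` and all of the numbers are hypothetical and do not come from the chapter.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, sensory_precision):
    """Combine a Gaussian prior with one Gaussian observation.

    Precision is the inverse of variance: higher means more trustworthy.
    """
    prediction_error = observation - prior_mean
    # Share of the error that is allowed to revise the prediction.
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision

# Same prior and same observation, but different trust in the senses.
trusted = precision_weighted_update(0.0, 1.0, 10.0, 9.0)
noisy = precision_weighted_update(0.0, 1.0, 10.0, 0.25)

print("reliable signal:  ", trusted)
print("unreliable signal:", noisy)
```

With a reliable signal the estimate moves most of the way toward the observation; with an unreliable one it stays close to the prior, which is the sense in which precision decides whose word counts.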
