Chapter 45: Robots for the Study of Embodied Cognition

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Forget the brain scan.

I mean, really, forget the algorithms.

The most radical argument in cognitive science right now is that your brain is

And we're proving it by building robots.

Exactly.

Welcome back to The Deep Dive, where we're taking this really complex, cutting edge research and breaking it down into the insights you need.

And that really is the core conflict, isn't it?

For decades, the study of intelligence was just dominated by this computational idea.

The brain as a computer.

Right.

It was the era of what they called good old-fashioned artificial intelligence, or GOFAI.

Cognition was just information processing.

You're just running algorithms over these abstract symbols that represent the world.

The clean metaphor was always the brain is the hardware.

Yeah.

And your thoughts?

That's the software.

Yeah.

But that metaphor has just run into a brick wall of evidence.

It's incomplete at best.

Severely incomplete.

The pure computational view is so limited because it just leaves out the two things that define all biological intelligence,

embodiment and embeddedness.

OK, let's define those terms right up front.

Sure.

So embodiment is the agent's physical setup, the specific design of its sensors, its muscles, its bones.

And then embeddedness is that constant, messy, necessary interaction with the real world.

So you really can't separate thought from the constraints of having a body that's stuck in a world.

You can't.

And that's why pure software simulation in some sterile computer environment just isn't enough to understand what intelligence actually is.

And this brings us to the central thesis of our deep dive today,

robots.

Robots are the essential tool for studying cognition from the bottom up.

We use them because they're real physical systems.

They have to obey the laws of physics, which forces our theories to be explicit and concrete and complete in a way that just thinking about it in the abstract can never be.

So our mission for you today is to walk through this progression.

We're going to start with behaviors that require, and this is amazing,

zero computation.

Literally zero.

Then we'll move into how to quantify the link between action and perception using some pretty cutting edge information theory.

And finally, we'll see how these state of the art humanoid robots are helping us tackle the biggest questions about how humans develop and, you know, become aware of their own bodies.

We're really going to see how the physical shape of a creature fundamentally dictates what it can know and what it can learn.

Let's get into it.

OK, so let's unpack this journey, starting at the lowest possible level.

Zero cognition.

Right.

This is one of the most surprising things in the source material for me, that these complex coordinated behaviors we automatically think of as intelligent, like walking or, you know, avoiding something.

Yeah, things that seem to require a brain.

Exactly.

They can often be done with no computation at all or minimal involvement.

The intelligence is just baked right into the mechanics.

That's the starting point.

We have to acknowledge how powerful physical structure is.

It can simplify the job of the brain, the controller so much.

And the classic example here is the passive dynamic walker developed by Tad McGeer back in 1990.

That's the one.

And when we say this thing is minimal, we mean truly minimal.

Describe it for us, because this is not like a Boston Dynamics robot we're picturing.

Oh, not at all.

Far from it.

This thing is basically just two legs connected at a hip joint.

That's it.

That's pretty much it.

I mean, it looks sort of like a person's lower half.

But and this is critical.

It has no torso, no sensors, no motors and absolutely no control electronics.

There's no computer chip anywhere.

And yet it walks.

How can something with no power, no brain achieve a stable rhythmic walk?

It's all about the physical setup and gravity.

You just place it on a slight downward slope, and its own forward momentum and the natural swing of its joints just interact harmonically.

It's like it's perpetually falling forward in a controlled way.

That's a great way to put it.

The resulting movement, its stable gait, is purely a consequence of these carefully tuned mechanical things: the leg lengths, how the mass is distributed, the shape of the feet.

Gravity and the system's own momentum do all the work.

So the intelligence isn't in a nervous system calculating every step.

It's an emergent property of the physical thing itself.

Exactly.

It's architecture over algorithm.

This shows us the concept of mechanical feedback loops and self-stabilization.

So think of it in terms of dynamical systems: that walking trajectory, that stable gait, acts as a powerful attractor.

An attractor, meaning that's the state the system just naturally wants to be in.

Precisely.

If there's a small perturbation, like say the foot lands a little bit off balance, the mechanical properties of the legs and the joints just automatically correct for it.

Without any brain telling it to.

Without any input at all.

It just brings the system back to that stable, efficient gait.

The control is inherent in the design.

The body itself is a kind of analog computer, solving the problem of walking in real time.
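The attractor idea can be illustrated with a toy step-to-step map. This is only a sketch under loud assumptions: `stride_map` and its `contraction` value of 0.6 are hypothetical and are not measured from McGeer's walker. The point is just that when each stride shrinks a deviation from the nominal gait, a perturbation dies out with no controller involved.

```python
def stride_map(deviation, contraction=0.6):
    """Hypothetical step-to-step map (an assumption, not McGeer's model).

    Takes the deviation from the nominal gait at one footfall and returns
    the deviation at the next. Any contraction factor with magnitude
    below 1 makes the nominal gait (deviation 0) an attractor.
    """
    return contraction * deviation


def recover(initial_deviation, strides=10):
    """Return the deviation after each stride, starting from a perturbation."""
    history = [initial_deviation]
    for _ in range(strides):
        history.append(stride_map(history[-1]))
    return history


if __name__ == "__main__":
    for stride, dev in enumerate(recover(0.2)):
        print(stride, round(dev, 5))
```

A contraction factor above 1 in magnitude would make the same perturbation grow instead, which is the toy equivalent of the walker falling over.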

That feels almost counterintuitive, though.

If the control is just physics, how do we study it?

Where does the physics end and the cognition actually begin?

That is the million dollar question this research forces us to ask.

The physics sets this low-energy baseline.

Nature finds the simplest solution first and then builds on it.

We can see this when we want the walker to do more than just go down a gentle slope.

You're talking about the adaptations people made to the passive walker.

Yes, exactly.

Collins and his colleagues in 2005, they adapted the passive walker by adding just a couple of very simple actuators and a controller based on reflexes.

So not a central computer running complex optimization.

No, no, just simple reflexes.

And this little addition allowed the walker to work on level ground.

It hugely broadened its niche while still keeping the control effort and the energy use at an absolute minimum.

So the body is still doing most of the work.

The body is the primary worker.

The minimal control system just gives it a little nudge when it needs it.

OK, so let's move up that complexity ladder a bit.

Let's introduce the bare minimum of neural processing.

We're now talking about sensorimotor loops.

Direct, reflex-like loops.

This is what creates what we often call reactive intelligence.

And this takes us back to some of the early pioneers of this stuff.

It does.

It takes us back to people like Grey Walter, who built these little electronic machines back in 1953 that showed phototaxis.

They were attracted to light, but with really minimal internal parts.

And this idea that complex behavior can emerge from extreme simplicity was really elegantly laid out by Valentino Braitenberg.

Oh, yes.

In his amazing 1986 book Vehicles.

What was the idea there?

Braitenberg just imagined a series of simple two-wheeled machines.

In the most basic ones, the sensors, say for light, were wired directly to the motors.

So like the left light sensor stimulates the right motor.

Something like that.

Yeah.

So if the left sensor got stimulated, it might speed up the right wheel, causing the vehicle to turn toward the light source.

So there's no internal map of the world.

No planning. It's just a direct link.

Sensation leads to action.

That's it.
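The direct sensor-to-motor wiring can be sketched in a few lines. This is a minimal simulation, not Braitenberg's own formulation: the inverse-square light falloff, the wheel geometry, and every constant (`base`, `gain`, `axle`, the sensor offset) are assumptions chosen for illustration. With crossed wiring the left sensor drives the right wheel, so the vehicle turns toward the light; with uncrossed wiring it turns away.

```python
import math


def sensor_reading(sx, sy, light):
    """Assumed light model: intensity falls off with squared distance."""
    d2 = (sx - light[0]) ** 2 + (sy - light[1]) ** 2
    return 1.0 / (1.0 + d2)


def step(x, y, heading, light, crossed, dt=0.1, base=0.2, gain=4.0, axle=0.5):
    """Advance a two-wheeled vehicle one time step. No map, no planning."""
    offset = math.pi / 4  # sensors sit front-left and front-right of the body
    left_s = sensor_reading(x + 0.5 * math.cos(heading + offset),
                            y + 0.5 * math.sin(heading + offset), light)
    right_s = sensor_reading(x + 0.5 * math.cos(heading - offset),
                             y + 0.5 * math.sin(heading - offset), light)
    if crossed:   # left sensor excites right wheel: turns toward the light
        left_wheel, right_wheel = base + gain * right_s, base + gain * left_s
    else:         # left sensor excites left wheel: turns away from the light
        left_wheel, right_wheel = base + gain * left_s, base + gain * right_s
    speed = 0.5 * (left_wheel + right_wheel)
    turn_rate = (right_wheel - left_wheel) / axle  # positive means left turn
    heading += turn_rate * dt
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return x, y, heading


def closest_approach(crossed, steps=200, light=(5.0, 5.0)):
    """Run the vehicle from the origin and report how near it got to the light."""
    x, y, heading = 0.0, 0.0, 0.0
    best = math.hypot(light[0] - x, light[1] - y)
    for _ in range(steps):
        x, y, heading = step(x, y, heading, light, crossed)
        best = min(best, math.hypot(light[0] - x, light[1] - y))
    return best


if __name__ == "__main__":
    print("crossed wiring:  ", closest_approach(crossed=True))
    print("uncrossed wiring:", closest_approach(crossed=False))
```

The only difference between the two runs is which wire goes to which wheel, yet an observer would describe one vehicle as drawn to the light and the other as avoiding it.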

And what's so fascinating is that even these primitive deterministically wired little vehicles showed behavior that looked surprisingly complex, unpredictable, even like they were shy or aggressive toward the light.

Exactly.

And why?

Because the continuous nonlinear interaction with the noisy, messy real world introduces all this variance and emergent behavior that the designer never explicitly programmed in.

This is where Rodney Brooks's famous line comes from, isn't it?

It is.

The world is its own model.

And that realization that a complex environment means you don't need a complex internal model.

That became the foundation for a huge pivot in AI, led by Brooks.

He mounted an explicit anti-representationalist attack on the whole GOFAI orthodoxy, which, just to remind everyone, was this idea from thinkers like Jerry Fodor in the 70s that intelligence had to have abstract models and symbolic logic.

Right.

Manipulating formal knowledge.

But when Brooks started building these real physical insect like robots, he found that that whole abstract approach just failed miserably.

The real world was too messy.

It's too complex, too variable.

The embeddedness.

Any internal model you build has to be constantly updated and it just becomes brittle and out of date almost instantly.

So his famous conclusion was?

That when we examine very simple level intelligence, we find that explicit representations and models of the world simply get in the way.

It turns out to be better to use the world as its own model.

So instead of building a perfect internal map of a changing world, just sense and react fast enough.

That's the core idea.

And this led him to create the subsumption architecture in 1986.

It was a completely decentralized approach, rejecting that central control system that GOFAI loved.

How did he actually build that?

Can you give us the specs on the subsumption architecture?

Sure.

It's structured as this vertical stack of independent layers, and they all operate in parallel.

The lowest layers, they handle the most basic survival behaviors like avoiding obstacles or just moving around aimlessly.

And each layer is just a simple sensor to motor connection.

A simple finite state machine.

Yeah.

And the higher layers handle more complex, longer term goals, like finding a charging station or exploring a room.

And the key is this idea of subsumption.

Right.

Higher layers can override or subsume the outputs of the lower layers.

So you get this complex behavior emerging without needing one single master model of the world that's coordinating everything.

So a higher layer might be saying go forward.

But if a lower layer suddenly detects a cliff edge, the lower layer's stop-and-back-up command instantly overrides the go forward command.

And the central planning system, if there even was one, would never even be aware of the threat.

It's all handled locally and immediately.
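The cliff-edge example can be sketched as a stack of independent behaviors with a fixed arbitration order. This is a deliberately simplified illustration, not Brooks's actual wiring, which connects networks of finite state machines through suppression and inhibition links. The layer names, the sensor keys, and the priority order are all assumptions made for this sketch.

```python
def avoid_cliff(sensors):
    """Safety reflex: fires only when a cliff is sensed."""
    if sensors.get("cliff_ahead"):
        return "stop_and_back_up"
    return None


def seek_charger(sensors):
    """Longer-term goal: head for the charger when the battery is low."""
    if sensors.get("battery_low") and sensors.get("charger_bearing") is not None:
        return "turn_toward_charger"
    return None


def wander(sensors):
    """Default behavior: always has something to say."""
    return "go_forward"


# Assumed arbitration order for this sketch: the first layer that produces
# a command wins and overrides everything after it in the list.
LAYERS = [avoid_cliff, seek_charger, wander]


def arbitrate(sensors):
    """Each layer reads the sensors directly; no shared world model exists."""
    for layer in LAYERS:
        command = layer(sensors)
        if command is not None:
            return command
    return "idle"


if __name__ == "__main__":
    print(arbitrate({}))
    print(arbitrate({"battery_low": True, "charger_bearing": 0.3}))
    print(arbitrate({"cliff_ahead": True, "battery_low": True,
                     "charger_bearing": 0.3}))
```

Note that no layer knows what the others are doing. The coordinated behavior comes entirely from the arbitration order and from each layer reacting to the world as it currently is.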

That architecture was just a foundational step then.

It showed intelligence could be situated and interactive, not just rigid and planned.

It really marked the beginning of the embodied approach in practical, real world robotics.

OK, so far we've covered agents that are either purely physical, like the passive walkers, or purely reactive, like the subsumption agents.

Right.

They're all constrained to the here and now, like a simple reflex arc.

But wait.

If the world is the model,

how do these robots ever plan for the future?

How do they learn a long term goal?

That takes more than just a reflex.

And that is the pivot into the next stage, what we call minimal cognition.

To get to what we'd call genuine cognition, the agent needs to be able to move beyond that immediate moment.

It needs to be able to think offline.

Exactly.

It needs some kind of internal simulation or emulation.

The ability to separate itself, even just a little, from the immediacy of the world.

And what's the most primitive mechanism that allows for this leap into foresight?

It's believed to be the forward model.

This is a simple but incredibly powerful neurological trick.

What is it?

It's a mechanism that lets the agent predict a future sensory state, given its current state and a motor command it's about to issue.

How does that prediction actually work?

It relies on something called the efference copy.

When your brain sends a command to, say, move your arm, a copy of that command gets sent not just to the muscles, but also to this specialized area, the forward model.

So it's like a CC on an email.

A CC, exactly.

The forward model takes that command, combines it with its current sense of the body's position, and it predicts what sensation should be generated as a result of that action.

So if I decide to move my eyes, the forward model predicts the visual scene is going to shift in a very specific way.

Precisely.

And if the prediction matches the reality, the agent knows the movement was successful and that everything is internally consistent.

If there's a mismatch, that's a surprise.

And that's what drives learning.
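A forward model of this kind can be sketched as a small predictor trained on its own prediction errors. The sketch below assumes a linear model and a hypothetical one-dimensional body (`true_body`, with made-up coefficients 0.9 and 0.5); nothing here comes from a specific robot or from the neuroscience literature. The efference copy is simply the `command` argument handed to the model alongside the current state.

```python
import random


class ForwardModel:
    """Linear predictor of the next sensory state (an assumed model form)."""

    def __init__(self, learning_rate=0.1):
        self.w_state = 0.0
        self.w_command = 0.0
        self.learning_rate = learning_rate

    def predict(self, state, command):
        # `command` plays the role of the efference copy.
        return self.w_state * state + self.w_command * command

    def update(self, state, command, observed):
        """Delta-rule update: the mismatch ("surprise") drives learning."""
        error = observed - self.predict(state, command)
        self.w_state += self.learning_rate * error * state
        self.w_command += self.learning_rate * error * command
        return error


def true_body(state, command):
    """Hypothetical body and world: what the sensors actually report next."""
    return 0.9 * state + 0.5 * command


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    model = ForwardModel()
    for _ in range(steps):
        state = rng.uniform(-1.0, 1.0)
        command = rng.uniform(-1.0, 1.0)
        model.update(state, command, true_body(state, command))
    return model


if __name__ == "__main__":
    m = train()
    print(m.w_state, m.w_command)
```

Once the prediction error has shrunk, the model's weights mirror the body's own dynamics, which is the sense in which a matched prediction signals that everything is internally consistent.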

This capacity for internal prediction is seen as, what, the evolutionary origin of simulation.

That's a theory, yeah.

It allows a creature to internally test out movements before committing to a costly or maybe dangerous action in the real world.

This sounds a lot like the concept of the Popperian creature.

It is, yeah.

Daniel Dennett coined that term, and it's from the philosopher Karl Popper, who said that rational agents are the ones who can let their hypotheses die in their stead.

So instead of being killed by a bad decision, the idea of the bad decision dies in your internal simulation.

A purely reactive agent might just walk toward a cliff edge and fall off.

A Popperian creature mentally simulates walking to the edge, predicts the sensory outcome, you know, no ground, falling, and then chooses a different action, all without risking its physical body.
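Letting a hypothesis die in the agent's stead can be sketched as action selection over simulated outcomes. Everything below is a toy assumption: a one-dimensional world, a cliff edge at position 3.0, a goal at 5.0, and a stand-in `predict_ground` function playing the role of a learned forward model. Candidate steps are tried internally, and any step predicted to end without ground contact is discarded before anything is executed.

```python
def predict_ground(position, step, cliff_edge=3.0):
    """Stand-in for a learned forward model (hypothetical).

    Returns True if foot pressure is predicted after taking the step.
    """
    return position + step < cliff_edge


def choose_action(position, candidates, goal=5.0):
    """Simulate each candidate step and keep only those predicted to be safe."""
    safe = [step for step in candidates if predict_ground(position, step)]
    if not safe:
        return 0.0  # no safe hypothesis survives: stay put
    # Among the surviving candidates, pick the one ending nearest the goal.
    return min(safe, key=lambda step: abs(goal - (position + step)))


if __name__ == "__main__":
    print(choose_action(2.5, [1.0, 0.4, -0.5]))
```

The largest step would bring the agent closest to its goal, but it is rejected in simulation, so the agent never has to find out the hard way.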

That's foresight.

But it's really important to stress this minimal cognition.

This isn't abstract symbolic reasoning yet.

Not at all.

It's all happening directly in the sensorimotor space.

The agent is just extracting regularities from the tight loop between what it does and what it senses.

And that's the key distinction.

It is.

The agent doesn't have an abstract concept labeled ledge.

It just learns that this specific sequence of motor commands like move my legs in pattern X reliably leads to this specific pattern of sensory input, like a sudden drop in pressure on my foot sensor and a big visual discontinuity.

That learned regularity is its knowledge.

Which brings us powerfully back to the embodiment constraint.

That sensorimotor space is totally and fundamentally shaped by the agent's specific body.

The body is the critical filter.

It's the generator of what's possible.

I mean, think about the motor signal you need to grasp a delicate egg versus a heavy baseball.

Completely different.

Totally.

And the success of that action depends entirely on the physical geometry, the muscle tension, the joint limits of your specific arm and hand.

If you change the body, say, replace a human hand with a tentacle, the exact same motor command for grasping the egg would fail.

So the body dictates the rules of the game.

It dictates the rules of engagement.

And those rules, the sensorimotor contingencies, are what the agent has to learn.

So if the sensorimotor space is the raw material for learning, we need some rigorous ways to measure and quantify its quality and its structure.

Because what an agent can learn is fundamentally limited by the information its body even lets into the system.

Right.

And that information is being filtered by the morphology, by the physical shape of the body.

So the body isn't just this passive container for the brain.

It's an active processor.

It's doing these crucial transformations of information before that info ever hits a neuron.

A really great biological example of this is the housefly eye.

So insects have these compound eyes with facets, right?

And researchers found that these facets are arranged in a non-homogeneous way.

Meaning they're not evenly spread out.

Exactly.

They're much denser in the front of the eye than on the sides.

And this specific arrangement performs this incredibly advantageous physical transformation.

What does it do?

It automatically compensates for motion parallax.

Wait, hold on.

Motion parallax is the way things in the distance seem to move slower than things up close when you're moving.

That's it.

And you're telling me the shape of the eyeball itself is physically doing the math to correct for that?

That's what the research suggests.

It's an incredible piece of evolutionary engineering.

The morphology, the physical structure does the complex math that a computer chip would otherwise have to calculate.

And because that pre-processing happens physically?

The fly can use much simpler uniform motion detection circuits across its whole eye.

It makes avoiding obstacles super cheap and super fast computationally.

The body makes the computation easy.

That's the perfect way to put it.

OK, that makes the case for embodiment incredibly clear.

But what if the interaction is too complex or the specific physical transformation isn't obvious?

We need a general tool for this.

And that is where information theory comes in, specifically Shannon entropy.

It lets us quantify the statistical patterns and structure that are induced by the body's shape and its interaction with the world.

So instead of looking for one specific calculation, we're just measuring the amount of structure or regularity in the system.

Right.
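As a concrete reference for the measure itself, here is a minimal sketch of Shannon entropy over a stream of symbols. It assumes the sensor or motor signal has already been discretized into a finite set of values, which is a preprocessing choice not discussed here. Highly regular streams score low and unpredictable ones score high.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Entropy in bits of the empirical distribution over `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy([0, 0, 0, 0]))   # a perfectly regular stream
    print(shannon_entropy([0, 1, 0, 1]))   # two equally likely values
    print(shannon_entropy([0, 1, 2, 3]))   # four equally likely values
```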

But we can't just look at the sensory data alone.

That's a passive view of perception.

Organisms are in this closed circular loop: action affects sensation, which affects the next action.

So we have to study the relationships between sensors and motors over time.

And this need to study the structured relationship between action and sensation brings us to a core concept in 4E cognition.

Sensorimotor contingencies, or SMCs.

Proposed by O'Regan and Noë back in 2001.

And what are SMCs exactly?

Formally, they're the structure of the rules governing sensory changes produced by motor actions.

OK, in simpler terms.

They're the rules of the game.

If a creature masters its SMCs, it knows what its body can do and how the world will respond to its actions.

And the theory says there's a hierarchy to these rules.

Yes.

Starting with the most basic level, modality-related SMCs.

These capture the immediate reflexive effect of an action on your senses, and they depend heavily on the physical shape of your sensor.

Can you give us an example?

Sure.

Think about turning your head.

When you turn your head, your visual field changes drastically and quickly.

That's a very high contingency for vision.

Right.

But turning your head barely changes what you're hearing.

So that's a low contingency for acoustics.

The sensory apparatus, the eye versus the ear, dictates the nature of that rule.

These are the rules that let a baby figure out: when I turn my head, the world slides.

That's it.

And the next level builds on that.

It moves toward identifying objects.

This is the level of object-related SMCs.

Right.

These emerge from longer, more complex interactions.

These are the rules that define the object itself.

For example, the sensorimotor pattern you need to squeeze a rigid block is totally different from the one you need to squeeze a soft sponge.

And that persistent difference in interaction is what defines the object's identity for the agent.

Exactly.

OK, so the SMC concept is really powerful in theory.

But to actually use it in robotics, it had to be made mathematically concrete.

It did.

And one attempt was the dynamical systems account.

Researchers like Buhrmann tried to formalize these rules by modeling the whole closed loop of the agent and the environment.

And they made some key distinctions.

Let's break those down.

What's the most basic level?

That's the SM environment.

This is just the pure physical relationship between a motor action and a sensory change.

It's completely independent of the agent's brain.

Like a physics experiment.

Exactly.

If I apply 10 Newtons of force to a spring, it compresses by X length.

Pure physics.

But once you add the agent's own control system, its brain, that changes what actions are even possible.

And that moves us to the SM habitat.

This is the subset of all possible movements that the agent's internal dynamics, its neural processes, actually allow.

A robot might be physically capable of flailing its arms wildly.

But its programming limits it to smooth, stable walking.

Right.

So its habitat is much smaller than its environment.

And the next level is about using those movements for a task.

That's SM coordination.

This narrows it down even further to only those movements that are functionally relevant to a goal.

If a robot's task is to figure out how hard an object is, it's not going to use random movements.

It'll use specific squeezing and probing patterns.

The functional patterns.

Exactly.

And then the highest level is SM strategy.

And this is where reward comes in.

Right.

This is about adding value.

The robot chooses a specific coordinated action because it leads to a desirable outcome, like using a fast gait to get to a charging station before its battery dies.

OK, that dynamical systems framework is really thorough.

But you mentioned that information theory has actually been more successful when applied to real robots.

Why is that?

It's mostly a practical thing.

The dynamical systems approach is great for theory.

But information theory, and specifically a measure called transfer entropy, is just better at analyzing high-dimensional, noisy, real-world data.

And critically, transfer entropy lets you quantify causality, right?

Directed information flow.

That's the key advantage.

Standard correlation can't do that.

Correlation might tell you that my leg motor moving and my knee sensor changing are related.

They happen at the same time.

But it doesn't tell you if the motor caused the knee to change or if the knee changing caused the motor to adjust through a reflex.

Exactly.

Transfer entropy, because it looks at predictability over time, can tell us the direction of that flow.

It can tell us how much knowing the state of A helps predict the future state of B.

And that's vital for mapping out these closed causal loops of embodiment.
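The directionality argument can be made concrete with a small estimator. This is a minimal sketch, assuming discrete symbols and a history length of one time step; it is not the estimator used in the robot experiments, and real sensor data would need discretization and more careful handling of estimation bias. The toy data, a `sensor` stream that copies a random `motor` stream one step later, is invented for illustration. Strictly speaking, the quantity measures how much the source improves prediction of the target beyond the target's own past, which is weaker than causality in the interventional sense.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Transfer entropy from source to target, in bits (history length 1).

    Sums p(y_next, y_now, x_now) * log2(p(y_next | y_now, x_now) / p(y_next | y_now))
    using plain frequency counts over the two symbol sequences.
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    next_now = Counter(zip(target[1:], target[:-1]))    # (y_next, y_now)
    now_src = Counter(zip(target[:-1], source[:-1]))    # (y_now, x_now)
    now = Counter(target[:-1])                          # y_now
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / now_src[(y_now, x_now)]
        p_given_self = next_now[(y_next, y_now)] / now[y_now]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


def demo(n=5000, seed=1):
    """Toy loop: the sensor echoes the motor signal one step later."""
    rng = random.Random(seed)
    motor = [rng.randint(0, 1) for _ in range(n)]
    sensor = [0] + motor[:-1]
    return transfer_entropy(motor, sensor), transfer_entropy(sensor, motor)


if __name__ == "__main__":
    motor_to_sensor, sensor_to_motor = demo()
    print("motor -> sensor:", motor_to_sensor)
    print("sensor -> motor:", sensor_to_motor)
```

A plain correlation between the two lagged streams would be symmetric, whereas the two transfer entropy values differ sharply, which is the property that makes the measure useful for drawing arrows in an information flow network.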

OK, this is where it gets really interesting.

We're taking this abstract math and actually applying it to a physical system.

The researchers used a real quadruped robot they call Puppy to quantify all this stuff using transfer entropy.

Right.

Let's talk about Puppy's embodiment.

It's a four legged robot, so it's a complex but manageable system to study.

Third parts.

It's got four hip servo motors, which are the main power source, and they report their joint angle.

It also has four passive, compliant knees with encoders.

And then, crucially for sensing the ground, four pressure sensors on its feet.

So that's, what, 12 different data streams all interacting with each other continuously.

It is.

And the tool, again, is transfer entropy.

The idea is to visualize the information flows as a network.

The nodes are the sensors and motors and the arrows show the flow of information.

So a thick arrow from A to B means A is a really good predictor of B's future state.

A strong causal link.

It lets you literally see the robot's functional topology.

OK, so experiment one was about extracting the robot's intrinsic physical structure, the SM environment.

How did they do that?

The method was actually pretty clever.

They just applied random motor commands to the robot.

So just making it twitch its legs randomly while it's standing still.

Pretty much.

And then they observed the statistical relationships that popped out.

Since the commands are random, any structure you see has to be inherent to the robot's physical body.

And what did that information flow map look like?

It was really clear and logical.

The strongest flows, the thickest arrows were from the motor signals to their own respective hip joint angles.

Well, yeah, because the motor directly drives that joint.

Of course.

But more subtly, the analysis successfully pulled out the robot's fixed body topology.

It showed these smaller but still significant flows from the motors to the knees and the foot sensors on the same leg.

So even with random noise going in, the physics of the system guarantees a specific predictable output pattern.

Exactly.

And this map of physical cause and effect, that's analogous to the modality-related SMCs.

The robot just through this analysis automatically defines the immediate effects of its own body movements on its own sensors.

OK, but the real world doesn't run on random twitches.

What happens when the robot actually does something functional, like a coordinated gait?

This is the next experiment.

And it moves us into SM coordination or strategy.

And this revealed something fundamental about action itself.

It really did.

When they compared the random twitching to specific optimized gaits like turn left or bound right, two huge things jumped out.

First, the overall amount of information flow went way up.

Dramatically.

The action itself amplifies the structure in the loop.

It shows you how much structure and predictability the neural pattern generator, you know, the gait program, is actively imposing on the interaction.

The brain doesn't just react, it actively structures the information it gets.

And second, the patterns themselves changed to reflect the function of the gait.

Oh, completely.

The map for the turn left gait looks nothing like the random map.

A turn requires you to push asymmetrically, right?

Mm hmm.

So the analysis showed that the motor to sensor flows were now dominated by the specific joint that was critical for that maneuver.

Like it might highlight the huge role of the right knee joint in predicting the whole movement.

The functional goal, turning, selectively amplifies certain pathways and quiets others.

Exactly.

The robot's goals are actively shaping its perception.

OK, so now let's bring in the last piece of the puzzle.

The environment.

Embeddedness.

How did the SMCs change when the robot ran that same bound right gait, but on different surfaces?

This is where we get the full brain-body-environment loop.

And the environment provided this critical modulation of the patterns.

What were the surfaces?

They used three: plastic foil, which is very low friction, styrofoam, and then rubber, which is very high friction.

And what specific changes did they see in the information flows?

Well, look at the difference between running on the slippery foil versus the grippy rubber.

As friction went up, the motor flows became much more dedicated to stability and adjusting propulsion.

And so specifically, some flows leading to the foot pressure sensor actually disappeared, while flows leading to the knee joint encoder became way more dominant.

And why that specific shift?

Because the robot's whole posture changed to cope with the friction on the rubber.

It had to push off much harder, which resulted in more knee bending.

So the knee joint information just became much more important in the overall flow.

So this leads to a really crucial insight.

The action the robot used, the gait, its internal dynamics, that was the main thing shaping the basic structure of the sensorimotor space.

Right.

The environment, the ground, it only modulated that structure.

And this quantitative understanding allowed them to test a prediction from SMC theory about perception.

They did.

They tested the robot's ability to do a terrain classification task, you know, to identify if it was running on foil, styrofoam or rubber.

And what did the results show about the need for action?

The robot's accuracy shot up when the action context, the specific gait it was using, was taken into account compared to just relying on the sensory data alone.

So knowing what you're doing helps you interpret what you're sensing.

Absolutely.

And they also confirmed the prediction about object-related SMCs needing sustained interaction.

Right.

O'Regan and Noë said you need longer sequences.

Exactly.

The researchers found that if the robot only used these little split-second sensorimotor snippets, classification was poor because the modality-related SMCs, just the effects of its own body moving, dominated.

But when they used longer sequences?

Up to six seconds of running, the accuracy improved significantly.

It confirmed that to perceive an object, you need the sustained functional interaction to let the object's own properties, its specific SMCs, stand out from the noise of your own movement.

So we've established the grounding.

Robotics forces our theories to confront physics.

But the ultimate goal here is understanding human cognition.

And that means our robots have to start approximating human morphology, our shape, our sensor distribution, our mechanics.

And this is where we run into the challenge of abstraction.

When you look at even the most advanced humanoids today, like the speech robot Erica, they're fundamentally abstracted.

They lack true biological complexity.

Totally.

They have very few muscles compared to us.

No pain receptors,

no inner organs, no metabolism.

Erica, for all her realistic look, she can't feel thirst or get out of breath.

Her internal constraints are so impoverished compared to ours.

But there's a scientific justification for this abstraction.

There is.

Abstraction is necessary.

It forces us to distill the problem down to the essentials.

We're looking for the necessary and sufficient conditions.

So even if a robot's specific sensorimotor idea of a glass is different from a human's, if it can still successfully recognize a glass and fill it and hand it to you based on these embodied principles, then we can be pretty confident that the underlying principles like discovering SMCs are shared.

Let's look at the different kinds of robots researchers use for specific cognitive goals.

Where do they start?

For motor control, for complex motor learning, they use musculoskeletal robots like Kenshiro and Roboy.

Kenshiro is astounding.

It has 160 muscles, 160.

That's insane.

That's 50 in the legs, 76 in the trunk.

It's the closest we've gotten to a human in terms of musculature.

With that many degrees of freedom, you can't possibly program a single smooth action.

You can't.

It's chaos.

So learning is an absolute necessity.

These robots have to rely on motor babbling, just randomly generating movements and feeling the result to build up an internal body schema.

And what about studying how cognition develops from the very beginning?

For that, you have the baby robots like the fetus simulator or the iCub.

iCub is the child size one, right?

That's it.

It has 53 degrees of freedom, but more importantly, it has over 4000 tactile sensors covering its whole body.

That rich sense of touch makes it the perfect platform for studying development and the role of the tactile sense, which is the first one to develop in humans.

And finally, you have the specialized social robots.

Right.

Social interaction robots like Erica or Pepper.

For them, haptics and touch are secondary to communication.

They're all about sophisticated speech, facial expression, emotion recognition.

So they're the primary tools for studying things like social dynamics or the uncanny valley hypothesis.

Exactly.

That feeling of unease you get when a robot looks almost but not quite human.

The developmental approach really argues that a lot of critical constraints on our cognition get locked in very, very early, even before birth.

The work on the musculoskeletal fetus simulator is powerful evidence for this.

They modeled a fetus at 32 weeks capable of kicking and jerking, which are these essential early movements.

And then they messed with its embodiment.

They did.

They manipulated one key thing,

the distribution of its tactile receptors.

So, like running A/B tests on sensory development.

Essentially, yeah.

And the finding was profound.

They found that a natural, non-homogeneous distribution of tactile receptors, denser in some places and sparser in others, was necessary for the fetus to develop normal kicking and jerking movements.

And a uniform, evenly spread distribution failed.

It failed to develop those crucial, structured early behaviors.

So the shape of the body's sensor array fundamentally dictates the emergence of basic motor behavior.

Embodiment is a constraint from day one.

And that somatosensory input, that touch and proprioception, it remains key in early infancy.

This is what the iCub robot is engineered to study.

Because of its huge, rich, whole-body tactile array,

researchers can study how an infant learns its own body through self-touch.

How does that self-touch lead to learning?

Researchers showed how iCub can use self-contact to automatically calibrate its own body model.

So by bumping its hand against its torso, for instance, the robot generates this distinct, coupled sensorimotor information.

The motor command for the arm and the resulting touch sensation on the torso.

Exactly.

By repeating those kinds of self-contact sequences, the robot develops an internal, self-consistent map of its own physical limits and its own structure.

It's a form of embodied self-calibration.
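
The self-calibration idea can be illustrated with a toy Python sketch. This is not the iCub procedure. It assumes a planar two-link arm, assumes the skin reports where on the torso the contact happened, and fits both link lengths by ordinary least squares. All names and numbers are hypothetical.

```python
import math
import random

TRUE_UPPER_ARM, TRUE_FOREARM = 0.30, 0.25  # hypothetical "real" body, unknown to the robot


def touch_event(rng, noise=0.005):
    """Simulate one self-touch: joint angles plus the skin location that felt contact."""
    q1, q2 = rng.uniform(0.5, 2.5), rng.uniform(0.5, 2.5)
    x = TRUE_UPPER_ARM * math.cos(q1) + TRUE_FOREARM * math.cos(q1 + q2)
    y = TRUE_UPPER_ARM * math.sin(q1) + TRUE_FOREARM * math.sin(q1 + q2)
    return (q1, q2), (x + rng.gauss(0, noise), y + rng.gauss(0, noise))


def calibrate(events):
    """Least-squares fit of both link lengths to the paired joint and touch data."""
    n = len(events)
    c = b1 = b2 = 0.0
    for (q1, q2), (x, y) in events:
        c += math.cos(q2)                                     # upper-arm and forearm directions
        b1 += x * math.cos(q1) + y * math.sin(q1)             # touch point along the upper arm
        b2 += x * math.cos(q1 + q2) + y * math.sin(q1 + q2)   # touch point along the forearm
    det = n * n - c * c                                       # 2x2 normal equations, Cramer's rule
    return (n * b1 - c * b2) / det, (n * b2 - c * b1) / det


rng = random.Random(1)
events = [touch_event(rng) for _ in range(200)]
upper_arm_estimate, forearm_estimate = calibrate(events)
print(f"estimated lengths: {upper_arm_estimate:.3f} m, {forearm_estimate:.3f} m")
```

The motor side (joint angles) and the tactile side (where contact was felt) each constrain the other, so repeated touches are enough to recover the body's dimensions without any external measurement.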

Let's talk about the practical applications of this, especially for safety.

As robots move into our homes and streets, they have to share space with us safely.

Absolutely.

And that means they need to ditch the traditional engineering model of the body and adopt something more brain-like, what we call the body schema.

OK, let's contrast the two.

What's the traditional robot model?

That's the one you'd find in a factory arm.

It's explicit, a one-to-one mathematical map from joint angles to the hand's position.

It's unimodal, so it only relies on joint angles.

It's centralized, one model for the whole robot.

And critically, it's fixed.

The manufacturer sets it and it never, ever changes.
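
As a rough illustration of that traditional model, the Python sketch below hard-codes the forward kinematics of a planar two-link arm. The class name and link lengths are hypothetical, and a real factory arm has more joints, but the properties are the same: one explicit formula, joint angles as the only input, and parameters that cannot be revised.

```python
import math
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen: the parameters cannot be changed after construction
class FactoryArmModel:
    upper_arm: float = 0.30  # hypothetical link lengths in meters
    forearm: float = 0.25

    def hand_position(self, shoulder, elbow):
        """Explicit map: joint angles (radians) in, one hand position out."""
        x = self.upper_arm * math.cos(shoulder) + self.forearm * math.cos(shoulder + elbow)
        y = self.upper_arm * math.sin(shoulder) + self.forearm * math.sin(shoulder + elbow)
        return x, y


model = FactoryArmModel()
predicted = model.hand_position(math.pi / 2, 0.0)  # arm pointing straight up

# Stand-in for a body whose forearm has changed (bent, worn, or damaged).
actual = FactoryArmModel(forearm=0.20).hand_position(math.pi / 2, 0.0)
mismatch = math.dist(predicted, actual)  # the model is now wrong by this much

try:
    model.forearm = 0.20  # the fixed model cannot be revised
    updated = True
except FrozenInstanceError:
    updated = False

print(f"prediction error after damage: {mismatch:.2f} m, model updated: {updated}")
```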

That sounds efficient, but completely brittle in the real world.

It's the total opposite of biological resilience.

The human or animal body schema is radically different.

It's implicit, not explicit.

Information about your arm length isn't stored in one place.

It's available through the dynamic relationship between different variables.

And it's highly multimodal.

Incredibly so.

It's drawing from touch, muscle stretch, vision, balance, all coupled with motor commands.

And it's distributed across many overlapping brain areas.

But most importantly, it's plastic.

It changes.

It is incredibly plastic.

If a factory robot has a joint that locks up, it just stops working because its fixed model is now wrong.

But if you sprain your ankle, your brain instantly updates its internal map of what your body can do.

It even incorporates tools.

A monkey using a rake will temporarily extend its body schema to include the rake.

So the practical goal is to get robots to have that kind of plastic, implicit schema.

And the payoff is what?

Autonomy, robustness and safety.

Brain-like models allow for what's called continuous self-modeling.

They can dynamically adapt to things like muscle fatigue or wear and tear or a sudden joint failure.

That's mandatory for true autonomy.
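
A minimal Python sketch of the contrast between a fixed model and continuous self-modeling, under heavy simplification: the "body" is a single actuator whose gain decays with wear, and the self-model is one number corrected after every movement with a normalized least-mean-squares step. The wear rate, learning rate, and class name are assumptions for illustration only.

```python
class AdaptiveGainModel:
    """Predicts displacement as gain * command and corrects the gain from every error."""

    def __init__(self, gain=1.0, learning_rate=0.5):
        self.gain = gain
        self.learning_rate = learning_rate

    def predict(self, command):
        return self.gain * command

    def update(self, command, observed):
        error = observed - self.predict(command)
        # Normalized LMS step: move the gain toward the value that explains this observation.
        self.gain += self.learning_rate * error * command / (command * command + 1e-9)
        return abs(error)


def run(steps=200):
    fixed_gain = 1.0  # factory-set value, never revised
    adaptive = AdaptiveGainModel()
    fixed_err = adaptive_err = 0.0
    for t in range(steps):
        true_gain = 1.0 - 0.004 * t          # hypothetical wear: the actuator weakens over time
        command = 0.5 + 0.5 * ((t % 5) / 4)  # cycle through a few command amplitudes
        observed = true_gain * command
        fixed_err += abs(observed - fixed_gain * command)
        adaptive_err += adaptive.update(command, observed)
    return fixed_err / steps, adaptive_err / steps, adaptive.gain


fixed_err, adaptive_err, learned_gain = run()
print(f"mean error, fixed model: {fixed_err:.3f}; self-updating model: {adaptive_err:.3f}")
```

The fixed model's error grows as the actuator drifts away from its factory value, while the self-updating model keeps tracking the body it actually has.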

And the safety aspect is paramount.

Absolutely.

Humans use this multimodal body awareness to construct a peripersonal space, or PPS.

It's this dynamic virtual margin of safety right around our bodies.

So it integrates vision and touch to prepare you to react to something looming towards you.

Exactly.

And a robot needs to develop its own PPS to go from being a dangerous, rigid machine to a safe collaborator.

And we're starting to see the first steps of that.

We are.

Early iCub studies showed it learning the boundary between objects that are far away and those close enough to require a preemptive action.

That understanding of my body and my immediate space is just non-negotiable for safe human-robot interaction.
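
One way to picture learning that near/far boundary is a visuo-tactile histogram, sketched below in Python. This is an illustrative assumption, not a description of the iCub studies themselves: a patch of skin counts how often an object seen at a given distance went on to touch it, and the robot acts preemptively wherever the learned contact probability is high. The simulated world, bin sizes, and threshold are all hypothetical.

```python
import random

BIN_WIDTH = 0.1  # meters; hypothetical resolution of the distance histogram
N_BINS = 10      # covers 0.0 to 1.0 m from the skin


def bin_index(distance):
    return min(int(distance / BIN_WIDTH), N_BINS - 1)


def learn_contact_histogram(n_events, rng, true_reach=0.3):
    """Count, per distance bin, how often a seen object went on to touch the skin."""
    seen = [0] * N_BINS
    touched = [0] * N_BINS
    for _ in range(n_events):
        distance = rng.uniform(0.0, 1.0)
        # Hypothetical world: nearby objects usually end in contact, distant ones rarely do.
        contact = rng.random() < (0.9 if distance < true_reach else 0.05)
        seen[bin_index(distance)] += 1
        touched[bin_index(distance)] += int(contact)
    return [t / s if s else 0.0 for t, s in zip(touched, seen)]


def needs_preemptive_action(contact_probability, distance, threshold=0.5):
    """React early only where experience says contact is likely."""
    return contact_probability[bin_index(distance)] > threshold


rng = random.Random(2)
contact_probability = learn_contact_histogram(5000, rng)
print(needs_preemptive_action(contact_probability, 0.15))  # object close to the skin
print(needs_preemptive_action(contact_probability, 0.75))  # object far away
```

The boundary is never programmed in. It emerges from pairing what the robot saw with what it then felt.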

Wow.

What a deep dive into this brain-body-environment loop.

We started with purely passive mechanics and we've gone all the way up to these complex humanoids that can do continuous self-modeling.

The main contribution of robotics to cognitive science just seems so clear.

It provides the essential physical grounding for any model of the brain.

The synthetic methodology, this idea of understanding by building, it forces our theories to be explicit and complete because the robots are real physical things.

They avoid the reality gap you get with pure simulation.

You have real physics, real complex sensory input.

It makes the study of embodiment both feasible and rigorous.

And robotics just naturally fits three of the four E's of 4E cognition.

It does.

They're embodied with specific physical structures.

They're embedded, interacting with a real environment.

And their cognitive system can be extended out into the world like we saw with the SMC framework.

And they're just uniquely suited for these bottom-up developmental investigations.

You can simulate conditions like messing with tactile receptors in a fetal simulator that you could never study otherwise.

But now we have to turn to the ultimate philosophical and engineering hurdle, the fourth E, the challenge of enaction.

OK.

While most of robotics is happy with this extended functionalism view, enactive cognitive science argues that simple sensorimotor loops might not be enough to induce truly lifelike properties, things like intentional agency or meaning.

So what does enaction require that our current robots are missing?

It connects life and cognition.

It argues that intelligence needs characteristics you find in living systems, like autopoiesis, the ability to self-produce and maintain yourself, or metabolism.

And our robots can't metabolize.

Not yet.

So instead, they must be subject to precarious conditions.

Precarious conditions, meaning the robot has to have something essential it needs to regulate to survive.

Precisely.

Its sensorimotor organization has to regulate its interactions in relation to a viability criterion.

Like maintaining its battery charge or not overheating.

Exactly.

And if the agent fails to maintain that viability, it dies.
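
The viability idea can be sketched as a toy Python loop. It is only a caricature, since gating behavior on a battery variable is far from autopoiesis, and the energy costs and thresholds are invented for the example. An agent that regulates its activity against its energy level keeps going, while one that ignores the variable runs flat and stops for good.

```python
def live(steps, recharge_below, start_energy=1.0):
    """Run a toy agent; return (steps survived, task units completed)."""
    energy, work_done, charging = start_energy, 0, False
    for step in range(steps):
        if energy <= 0.0:
            return step, work_done  # viability lost: the agent "dies"
        if energy < recharge_below:
            charging = True         # regulate behavior relative to the energy variable
        elif energy >= 0.95:
            charging = False
        if charging:
            energy = min(1.0, energy + 0.10)
        else:
            energy -= 0.05          # doing the task costs energy
            work_done += 1
    return steps, work_done


careful = live(steps=200, recharge_below=0.3)
careless = live(steps=200, recharge_below=0.0)  # never heads for the charger
print(f"careful agent: survived {careful[0]} steps, did {careful[1]} units of work")
print(f"careless agent: survived {careless[0]} steps, did {careless[1]} units of work")
```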

That raises the stakes dramatically.

We're not just designing a robot to do a task anymore.

We're designing a robot that cares about its own existence.

It is the ultimate engineering constraint.

This approach, while it's incredibly difficult and it's not going to produce useful commercial robots right away, it forces the robot to generate its own systemic identity, to self-regulate its own existence.

That requirement, the threat of death by a drained battery.

That might be the only path to truly autonomous, self-sustaining, and maybe even meaning-generating robots.

It really asks the question,

can we build systems whose cognition is inseparable from their will to exist?

An incredibly challenging and thought-provoking place to leave it.

Thank you for diving deep with us into how the humble physical body, recreated in steel and plastic, is rewriting the rules of cognitive science.

My pleasure.

We really appreciate you joining us on The Deep Dive.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary
What this audio overview covers
Robotics provides a powerful experimental framework for investigating how cognition emerges from the dynamic coupling between an agent's physical structure, its sensory and motor systems, and its environment. Traditional computational approaches treat cognition as abstract information processing independent of bodily form, but embodied perspectives demonstrate that intelligence arises fundamentally from how organisms interact with their surroundings through their particular physical instantiation. The simplest demonstrations of this principle appear in mechanical systems like the passive dynamic walker, which achieves complex locomotion through structural mechanics and gravity alone, with no sensors, motors, or controllers. Building from such basic examples, researchers have developed progressively more sophisticated robotic systems to understand how low-level behaviors emerge from sensor-motor coupling. Sensorimotor intelligence, exemplified by Walter's and Braitenberg's vehicles, shows how deterministic connections between sensors and motors generate apparently intelligent behavior without internal representation. Rodney Brooks extended this work through behavior-based robotics and subsumption architecture, rejecting the traditional representational model in favor of letting the world serve as its own model. As cognition becomes more elaborate, internal simulation enters the picture, allowing agents to model future sensory consequences of their actions and decouple thought from immediate circumstances. Sensorimotor contingencies represent the core structural relationship between action and perception, formalizing the rules governing how motor commands produce sensory changes. These contingencies vary according to embodiment, as demonstrated through information-theoretic analysis of robots like Puppy, where transfer entropy reveals how an agent's morphology and motor strategies shape information flow from environment to sensors. 
Advanced experimental platforms including humanoid robots such as Roboy and Kenshiro, along with developmental robots like the iCub, permit investigation of human-like cognition while necessarily abstracting from actual human physiology. Biological systems construct body representations that differ fundamentally from engineered models, displaying distributed, multimodal, plastic, and implicit organization rather than explicit, centralized, and unimodal control. Incorporating brain-inspired design principles into robotic systems generates desirable properties including autonomy, robustness, and dynamic adaptation. The ultimate aspiration within this field is enactive robotics, wherein artificial systems must regulate their own sensorimotor engagement relative to intrinsic viability criteria such as energy conservation or thermal stability, thereby generating their own systemic meaning and adaptive identity.
