Unit 11: Testing and Individual Differences

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

So if you were suddenly dropped right into the middle of the dense, uncharted Amazon rainforest right now, how long do you honestly think you would survive?

Me personally.

Maybe a day.

If I'm lucky.

I mean, a week is pushing it for any of us.

Let's say you have a 4.0 high school GPA, you can solve, like, a complex quadratic equation in your head, and your grammar is absolutely flawless.

None of that is going to save you from eating a toxic berry.

Exactly.

Or tracking a venomous snake or just getting hopelessly lost without a compass.

So in that specific scenario, what good is your intelligence?

That is actually the perfect question to kick things off.

Right.

Because we walk around assuming intelligence is this, you know, this universal fixed tool that works identically in every single environment.

But the moment you change the context, it changes.

Right.

The whole definition of what it means to be smart completely fractures.

Well, welcome to the study table.

I'm your host.

And today we are turning our attention directly to you, the listener, as we dive deep into Unit 11 of Myers' Psychology for AP.

Which is all about testing and individual differences.

And it is one of the most fiercely debated topics in all of psychology.

I mean, we've talked about the memory war, you know, repressed versus recovered trauma.

We've done the gender war, the nature versus nurture debate.

But today our mission is to wade directly into what the textbook calls the intelligence war.

It really is a massive intellectual battlefield.

The whole controversy basically boils down to this one question.

Do each of us possess an inborn general mental capacity?

Like a single physical reservoir of brain power.

Yeah.

And if there is one, can we realistically slap a meaningful number on it to rank human beings?

And how did we even get to a point where a society decided a single test score could just define a person's entire potential?

OK, let's unpack this, because our goal here isn't just to rattle off vocab words for an exam.

Memorizing flashcards is a terrible way to learn anyway.

You forget the definition the second you walk out.

Right.

Our goal is to dissect the core concepts, look at the historical shifts and really understand why the IQ test was actually invented.

What social problems were they trying to solve?

Which gives you a completely different lens for viewing the world today.

But before we even debate what intelligence is, we have to talk about a cognitive trap.

Yes, the reification error.

We absolutely have to avoid this.

It's a pitfall that almost everyone,

scientists and journalists included, trips into constantly.

Reification.

It sounds like something out of a sci-fi movie.

It does, but it's just when we view an abstract, immaterial concept as if it were a concrete physical thing. We invent a concept to describe a behavior, give it a name, and then gradually convince ourselves that this concept objectively exists out there in the real world, like sitting on a shelf.

So how are we doing that with intelligence?

Well, think about how people casually talk about IQ.

You'll hear someone say, she has an IQ of 120.

Like saying she has blue eyes or she has a bicycle.

Precisely.

By phrasing it that way, we are reifying IQ.

We're imagining that IQ is a physical trait housed inside her brain.

When it's definitely not.

Scientifically, no, it's completely inaccurate.

A much better way to phrase it would be, she scored 120 on an intelligence test.

Because IQ isn't a thing you possess.

It's just a score you got on a highly specific set of tasks on, like a random Tuesday.

Exactly.

That distinction completely reframes the conversation.

Intelligence is socially constructed.

Which brings us right back to the Amazon rainforest.

Right.

In a hunter-gatherer society,

intelligence might be your spatial memory for tracking weather patterns.

But universally, intelligence is broadly defined as the ability to learn from experience, solve problems, and adapt.

But if you dropped a straight-A student from an Ontario high school into that jungle, they'd look incredibly unintelligent.

They wouldn't survive the week.

Yet in that high school, intelligence is defined by algebra and essay writing.

Though in psychological research, it's defined much more narrowly.

For researchers, intelligence is literally just whatever intelligence tests measure,

which has historically been academic aptitude.

School smarts.

Which brings us to Charles Spearman.

Yes, early to mid 20th century.

Spearman proposed we have one general intelligence, which he shortened to the little italicized letter G.

He acknowledged people have special abilities.

Sure, you might be better at math than writing.

But he developed a statistical procedure called factor analysis.

Now, factor analysis sounds incredibly dry.

Is it basically just a mathematical way of finding hidden patterns in data?

That is exactly what it does.

It identifies clusters of related items.

Imagine you give a massive test with hundreds of questions to a large group.

OK, factor analysis looks at the raw data and highlights the clusters.

It might show that people who scored well on vocabulary also scored highly on paragraph comprehension.

So it groups those verbal skills into a single cluster.

And what did Spearman actually find when he ran this?

He noticed a persistent trend.

People who scored high in one cluster typically scored higher than average in completely different clusters like spatial ability.

So because of that correlation across the board, he thought, OK, there must be a common skill set.

Right.

This G factor underlying all intelligent behavior.

Let me try to wrap my head around this with an analogy.

Let's look at professional kitchens.

OK, I love food analogies.

Being an incredible pastry chef takes meticulous measuring and patience.

But being a short order grill cook in a diner requires just speed and pure chaos tolerance.

A pastry chef might completely melt down on the grill.

Exactly.

But if I test a thousand random people, the ones who are naturally gifted at baking probably pick up grilling faster than the average person.

Because there's an underlying culinary intuition.

Right.

A baseline understanding of flavor profiles and timing.

Good things tend to come packaged together.

Is that Spearman's argument about the brain?

That captures the essence perfectly.

Distinct mental abilities tend to cluster together to define a general intelligence factor.
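
The pattern Spearman noticed can be sketched with a small simulation. This is not factor analysis itself, just a minimal illustration, assuming each simulated test score mixes one shared factor with test-specific noise. The test names, the loading of 0.7, and the sample size are all made-up values for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_scores(n_people=2000, loading=0.7, seed=0):
    """Simulate three test scores that each mix a shared factor with noise.

    ASSUMPTION: every test loads 0.7 on a single shared factor. The value
    is hypothetical and chosen only to make the pattern visible.
    """
    rng = random.Random(seed)
    noise_weight = math.sqrt(1 - loading ** 2)  # keeps each score at unit variance
    tests = {"vocabulary": [], "spatial": [], "numerical": []}
    for _ in range(n_people):
        g = rng.gauss(0, 1)  # the shared factor for this simulated person
        for scores in tests.values():
            scores.append(loading * g + noise_weight * rng.gauss(0, 1))
    return tests


tests = simulate_scores()
names = list(tests)
# Correlate every pair of different tests.
correlations = {
    (a, b): pearson(tests[a], tests[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
for pair, r in correlations.items():
    print(pair, round(r, 2))
```

Under these assumptions every pair of tests comes out positively correlated, even though the tests share nothing except the common factor. That across-the-board correlation is the kind of evidence Spearman read as a sign of G.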

But of course, someone immediately stepped up to tear his theory down.

As they always do in psychology.

And his loudest critic was L. L. Thurstone.

Yes, Thurstone fundamentally rejected the idea of ranking human mental capacity on a single scale.

He gave his subjects 56 completely different tests instead.

Fifty-six.

That's intense.

And using math, he identified seven distinct clusters of primary mental abilities, not just broad categories, but highly specific domains.

Like what?

Word fluency, verbal comprehension, spatial ability, perceptual speed, numerical ability, inductive reasoning and memory.

So he created this dynamic profile instead of a single flat score.

He completely refused to rank people on a single G scale.

But there's a massive irony here, right?

Didn't Thurstone's own data kind of backfire on him?

It absolutely did.

Other investigators took Thurstone's raw data and ran further analyses.

And they found the very trend he had tried to minimize.

That the people who excelled in one cluster generally scored well on the others.

Exactly.

So Thurstone's attempt to disprove the G factor actually provided the mathematical evidence for it.

Wow.

So G survived the attack.

It did.

But evolutionary psychology offers a fascinating perspective on what G actually is.

Satoshi Kanazawa argued it evolved for a specific survival reason.

How does evolution explain our ability to take an SAT test?

Well, Kanazawa asserts that general intelligence evolved to help our ancestors solve novel problems.

OK, so not everyday things.

Right.

Routine problems like reading a stranger's facial expression or navigating back to camp.

Those are evolutionarily familiar.

Over millions of years, our brains evolved dedicated modules for those.

But then a massive wildfire suddenly sweeps through the savanna, cutting off your escape route.

Or a multi-year drought hits.

Those are novel problems.

No evolutionary script for them.

Kanazawa argues that G is essentially the brain's novel problem solving engine.

So it kicks in when the familiar scripts fail.

Exactly.

And modern research backs this up.

General intelligence correlates highly with solving complex academic problems, but not with evolutionarily familiar areas like forming friendships or parenting.

Which perfectly explains the classic trope of the brilliant physicist who can solve mind bending equations, but like cannot gracefully exit an awkward dinner party conversation.

We see that disconnect constantly in the real world.

Their G engine is massive, but their social modules might be average.

So if G is mostly for academic and complex cognitive tasks, what about people whose brains work in non-academic ways?

In the 1980s, the definition of smart really expanded.

And Howard Gardner was the primary architect of that.

He proposed the theory of multiple intelligences.

Eight of them, right?

Yes.

Eight distinct abilities that come in independent packages.

Linguistic, logical-mathematical, musical, spatial, bodily-kinesthetic, intrapersonal, interpersonal, and naturalist.

So bodily kinesthetic is like the elite dancer or athlete, intrapersonal is understanding your own emotions and naturalist is like a master botanist.

Exactly.

Gardner found evidence for this by looking at patients with brain damage, observing how a stroke might destroy spatial ability while leaving linguistic ability entirely untouched.

Because if G was a single central engine, a broken engine would affect everything.

That was his argument, but he also looked heavily at savant syndrome.

Right.

Savant syndrome is where someone scores quite low on standardized intelligence tests.

Maybe they have severe developmental disabilities, but they have this island of brilliance.

Usually an extraordinary skill in computation, memory, or music.

About four in five people with savant syndrome are male, and many have autism, though not all.

Like Kim Peek, the real life inspiration for Rain Man.

He didn't have autism, but his abilities were staggering.

The textbook says he could read and memorize a page in eight to 10 seconds.

He learned roughly 9,000 books completely by heart.

The Bible, Shakespeare, phone books.

And he could give exact driving directions for any U.S. city from memory.

But then he struggled with basic daily mechanics.

Right.

He could not physically button his own clothes and he had almost zero capacity for abstract thought.

I read this incredible anecdote where his dad told him, Kim, lower your voice.

And instead of speaking more quietly, Kim physically slid his body lower down into his chair because he processed language with absolute rigid literalism.

Abstract idioms escaped him.

When asked to recount the Gettysburg Address, he gave the literal physical street address of the boarding house Lincoln stayed at.

That is wild.

So Gardner points to someone like Kim Peek and says intelligence cannot possibly be one single number.

It clearly comes in isolated packages.

It's a very appealing democratic idea, right?

Everyone gets to be smart in their own unique way, but this got significant pushback from researchers like Sandra Scarr.

Oh yeah.

Scarr.

She argued it's a romanticized view of human capability.

She famously asked, wouldn't it be wonderful if the world were so just?

Like if you're terrible at math, the universe magically compensates you by making you a musical prodigy.

But looking at the hard statistical data, the world simply isn't just in that way.

General intelligence scores do heavily predict performance across various complex tasks and jobs.

G matters.

But wait, does being a world-class dancer really mean you have bodily-kinesthetic intelligence, or is that just stretching the word intelligence to mean talent or athleticism?

That is precisely the critique.

If we call everything we value intelligence, the word loses its meaning.

High intelligence might get you in the door, but it won't guarantee success once you're inside.

For real-world success, you need the grit factor, as in Anders Ericsson's research.

Yes.

The 10-year rule.

Ericsson studied chess grandmasters, Olympic athletes, surgeons.

He found that expert performance requires about 10 years of intense daily deliberate practice.

So you can have an off the charts IQ, but if you don't practice a craft for a decade, someone with an average IQ who puts in the relentless work will surpass you.

Exactly.

Robert Sternberg agreed there's more to success than academic intelligence, but he thought Gardner's eight intelligences were too fractured.

So he proposed a triarchic theory.

Three types.

First is analytical intelligence, which is textbook academic problem solving.

One right answer.

That's the SAT.

Second is creative intelligence, how you react adaptively to novel situations and generate original ideas.

And third is practical intelligence, street smarts, everyday tasks where problems are ill-defined with multiple solutions.

Practical intelligence is absolutely crucial for managers: writing a persuasive memo, motivating a resentful team, reading a room.

And Sternberg worked with the College Board on this, right?

Asking students to write creative captions for cartoons or figure out how to move a heavy bed up a narrow staircase.

Yes.

And initial data shows assessing these practical and creative skills actually improves the prediction of college success.

And it reduces ethnic group differences we see in traditional tests, which suggests our traditional metrics are way too narrow.

And speaking of narrow, we have to talk about social and emotional intelligence because you can be analytically brilliant and completely inept socially.

Edward Thorndike noted this back in 1920: the absolute best mechanic on the factory floor might be promoted to foreman and fail miserably because they lack social intelligence.

They understand the machines perfectly, but not the human beings operating them.

That laid the groundwork for emotional intelligence, defined later by Mayer, Salovey, and Caruso.

They break it into four components.

Okay, let's hear them.

Perceiving emotions, so recognizing them in faces.

Understanding emotions, predicting how they change.

Managing emotions, knowing how to express them appropriately.

And using emotions to enable adaptive thinking.

And the data is really compelling.

People who manage emotions well avoid being hijacked by depression or anger.

They can de-escalate workplace conflicts.

EQ correlates heavily with better job performance and the ability to delay gratification.

To illustrate how critical this is to survival, the textbook has that tragic case of Elliot, studied by Antonio Damasio.

This completely shattered my understanding of decision-making.

It's a profound case.

Elliot had a brain tumor removed.

The surgery was physically successful and his IQ remained completely intact.

He was just as analytically brilliant,

but the surgery inadvertently damaged a highly specific area responsible for processing emotion.

Damasio said he never saw a single tinge of emotion in Elliot.

No sadness, frustration, or joy.

Damasio showed him disturbing pictures of injured people and Elliot felt absolutely nothing.

He cognitively knew the images were sad, but he couldn't physically feel the sadness.

And the real world consequences were devastating.

Emotions are the vital invisible weights that help us make decisions.

Without them, Elliot couldn't prioritize tasks.

He lost his job.

He went bankrupt.

His marriage collapsed.

He remarried and divorced again.

Despite his high IQ, he ended up entirely dependent on a disability check and his sibling's care.

It proves we aren't just cold logic machines.

Emotion is functional.

Okay.

So if we have all these theories about intelligence, can we actually see physical evidence of it in the brain?

Historically scientists thought it was simply brain size.

Bigger brain equals bigger smarts.

Lord Byron's brain weighed a massive five pounds compared to the normal three pounds.

So they assumed genius meant having a larger biological hard drive, but then they measured more brains and the theory fell apart.

Some brilliant geniuses had tiny brains.

Some criminals had massive five pound brains, but with modern MRI technology, we do see a modest correlation, right?

We do.

Adjusting for body size, there is a positive correlation of about plus 0.33 between brain size and intelligence scores.

It's statistically significant.

What about Albert Einstein's brain?

Sandra Witelson and her colleagues studied Einstein's brain postmortem, comparing it to 91 Canadian control brains.

Was his brain just massively heavier?

Not at all.

It wasn't heavier or larger overall, but one specific area, the lower region of the parietal lobe, was 15% larger.

And that's the processing center for mathematical and spatial information.

Exactly.

The machinery theoretical physics requires.

And because the skull is rigid,

different brain areas compete for real estate.

So because his math center was 15% larger, it crowded out the verbal center, which might explain why Einstein was notoriously slow in learning how to talk as a child.

It's a compelling hypothesis, but modern neuroscience looks at the microscopic level too.

Synapses and gray matter.

Highly educated people die with up to 17% more synapses than less educated counterparts.

Now classic chicken and egg problem.

Does reading complex books build more synapses or do people born with more synapses crave complex books?

It's likely a feedback loop of both.

Experience alters the brain.

Rats in stimulating environments develop thicker brain cortexes. But there's also neuroplasticity, which leads to a surprising study by Shaw and colleagues.

The longitudinal scan of the 307 kids.

Yes.

They scanned these kids' brains from age five to 19, looking specifically at the thickness of the cortex.

Intuitively you'd think the smartest kids would have the thickest cortexes right from birth.

The data showed the exact opposite.

The most intelligent seven-year-olds actually had a thinner brain cortex than average kids.

Wait, really?

Yeah.

But as they grew, their cortex progressively thickened at a rapid rate, peaking much later, around 11 to 13, before thinning back down as it naturally pruned unused connections.

So it's not about starting with more hardware.

It's having a prolonged dynamic developmental window.

Agile minds are powered by physically agile brains.

We also see differences in processing speed.

The frontal lobe acts as a global workspace for organizing information.

Like a computer CPU and the speed matters immensely.

There's a tight correlation between intelligence scores and perceptual speed.

Let's describe the masking image test from the text.

Okay.

Imagine staring at a blank computer screen. For a tiny fraction of a second, an image flashes, maybe a shape like a wide U where one side is longer than the other.

It flashes for literally like 0.01 seconds, a blink.

And immediately it's violently replaced by a masking image, a chaotic jumble of lines that wipes your visual memory clean.

And they ask, was the long side on the left or the right?

To get that right consistently, you need incredibly rapid visual processing.

The correlation is about plus 0.3 to plus 0.5.

People who score higher on intelligence tests are literally taking in information milliseconds faster.

It's like having a faster internet download speed.

Okay.

So we have MRIs and microsecond tests today, but how did psychologists measure this before all this tech?

The history is wild.

It begins with English scientist Francis Galton in the late 1800s, Charles Darwin's cousin.

He read Darwin's work and wondered if he could measure natural ability to encourage those of high ability to mate with each other, which is the intellectual seed of the eugenics movement.

At the 1884 London Exposition, Galton assessed over 10,000 visitors' intellectual strengths, but he didn't use a math test.

He measured physical reaction time and muscular power because he assumed a strong body meant a strong mind.

It was a complete scientific failure.

High achieving scholars didn't outscore working class folks on physical measures, but he did pioneer statistical techniques we still use and coined nature and nurture.

A stark reminder that objective science can be heavily influenced by social biases.

So Galton strikes out and the true birth of modern testing shifts to France with Alfred Binet.

In 1904, France mandated schooling for all children.

Suddenly classrooms were flooded with kids who had never received formal education.

And they needed a way to identify which kids needed special education classes without relying on teachers' subjective biases.

So Binet and Theodore Simon created a test based on the concept of mental age.

A bright child would perform like an older child and a struggling child, like a younger one.

If you're nine and performing at a seven-year-old level, your mental age is seven. But Binet didn't believe he was measuring a genetic limit, right?

No, he leaned heavily toward an environmental explanation.

He prescribed mental orthopedics, targeted training to develop attention.

He explicitly feared his tests would be used to label and limit children.

Unfortunately, that's exactly what happened when the test was brought to the U.S. by Lewis Terman at Stanford.

Terman heavily adapted it for California kids, creating the Stanford-Binet.

And Terman's philosophy was the polar opposite.

He believed intelligence was a fixed inherited trait.

He was sympathetic to eugenics and literally wanted to use tests to curtail the reproduction of feeble-mindedness.

It's a chilling historical pivot.

With Terman's help, the U.S. government used intelligence tests to evaluate immigrants at Ellis Island and World War I Army recruits, the first mass administration of an intelligence test.

But the tests were heavily culturally biased requiring specific American knowledge.

So immigrants from Southern and Eastern Europe scored poorly.

And those poor results were interpreted as scientific proof of genetic inferiority directly leading to the xenophobic 1924 immigration law that drastically reduced quotas.

While Terman popularized the test, William Stern came up with the IQ formula: intelligence quotient equals mental age divided by chronological age, multiplied by a hundred.

Let's run the math to show why it breaks down for adults.

An eight-year-old performing like a 10-year-old: 10 divided by eight is 1.25, times 100 is an IQ of 125. Gifted.

It works beautifully during childhood.

But if I'm 40 and I perform like a 20-year-old: mental age 20 divided by chronological age 40 is 0.5, times a hundred.

My IQ is 50.

I suddenly have a severe intellectual disability just because I didn't get cognitively smarter than a 20 year old.

It makes no mathematical sense.

Human cognitive growth plateaus.

So modern psychologists no longer use Stern's division formula.

We just compare scores relative to the average performance of other people your exact same age.
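
The arithmetic above can be written out as a short sketch. `ratio_iq` follows Stern's quotient as described. `deviation_iq` is a hypothetical helper showing the same-age comparison, assuming scores are rescaled to a mean of 100 with a standard deviation of 15. The raw score, group mean, and group spread passed to it are made-up numbers.

```python
def ratio_iq(mental_age, chronological_age):
    """Stern's quotient: mental age divided by chronological age, times 100."""
    return mental_age / chronological_age * 100


def deviation_iq(raw_score, age_group_mean, age_group_sd):
    """Score relative to same-age peers, rescaled to mean 100 and SD 15.

    ASSUMPTION: the 100/15 scale and the inputs used below are illustrative;
    real tests derive these norms from a standardization sample.
    """
    z = (raw_score - age_group_mean) / age_group_sd
    return 100 + 15 * z


print(ratio_iq(10, 8))   # 125.0, the eight-year-old performing like a 10-year-old
print(ratio_iq(20, 40))  # 50.0, the 40-year-old performing like a 20-year-old
print(deviation_iq(raw_score=60, age_group_mean=50, age_group_sd=10))  # 115.0
```

The deviation approach never divides by age, so an adult is only ever compared with other adults of the same age.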

So what are we actually taking when we take a test today?

We have to separate aptitude versus achievement.

An achievement test strictly measures what you've already learned.

Think the AP Psychology exam or a history final. But an aptitude test, like the SAT, predicts your future ability to learn a new skill.

It's testing raw capacity to handle college-level reasoning, though the line is incredibly blurry.

Your achieved vocabulary will obviously boost your aptitude score.

The gold standard today is David Wechsler's creation, the WAIS, the Wechsler Adult Intelligence Scale.

Wechsler realized a single score didn't provide enough clinical info.

So the WAIS consists of 11 distinct subtests broken into verbal and performance areas.

Verbal tests are like defining vocabulary or repeating numbers backward, but performance subtests are totally different.

Picture arrangement, ordering scrambled comic panels of a snowman or block design, recreating a geometric design with red and white blocks.

Yielding separate scores is incredibly useful.

If a child is amazing at spatial reasoning, but has very low verbal comprehension,

that discrepancy is a massive red flag for a reading disability like dyslexia.

Or for stroke patients pinpointing damaged domains for rehab.

But for any test to be accepted, it needs three criteria,

standardization, reliability, and validity.

What is standardization?

If you answer 42 questions correctly on a new test, that tells us nothing.

Standardization is defining meaningful scores relative to a pretested group.

You give it to a massive representative sample and their baseline scores form the normal curve, the bell curve.

Most people score in the middle.

The midpoint is arbitrarily assigned the number 100, and roughly 68% of people score within 15 points of it, between 85 and 115.
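
That 68 percent figure can be checked with a minimal sketch, assuming scores follow a normal curve with a mean of 100 and a standard deviation of 15. The function names are hypothetical helpers, not part of any testing library.

```python
import math


def normal_cdf(x, mean=100.0, sd=15.0):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def share_between(low, high, mean=100.0, sd=15.0):
    """Fraction of scores expected to fall between low and high."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)


print(round(share_between(85, 115) * 100, 1))  # 68.3
print(round(share_between(70, 130) * 100, 1))  # 95.4
```

So about 68 percent of scores land within one standard deviation of the midpoint, and about 95 percent land within two.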

But the wildest part is tests have to be periodically restandardized to keep the average exactly at 100.

And if you compare us to people who took it in the 1930s, we run into the Flynn effect.

The Flynn effect is the phenomenon of global steadily rising intelligence test performance over the last century.

The text points out that if you scored an average 1930s person against today's harder criteria, their IQ would be a 76.

Performance is marching upward.

It's likely a combo of improved childhood nutrition, better prenatal care, smaller families, and vastly more stimulating environments.

Think about the density of information kids process today.

Some even suggest hybrid vigor from global mixing.

Whatever the cause, humanity is scoring higher.
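
The 76 figure can be reproduced with a back-of-the-envelope sketch. It assumes a steady drift of 3 points per decade over roughly 80 years. Both numbers are assumptions chosen because they match the figure quoted above, not values taken from the text, and real gains have not been perfectly linear.

```python
def score_on_later_norms(score_on_old_norms, years_between, points_per_decade=3.0):
    """Re-express a score from older norms against later, harder norms.

    ASSUMPTION: a constant linear drift of 3 points per decade. This is a
    simplification used only to illustrate the direction and rough size
    of the Flynn effect.
    """
    return score_on_old_norms - points_per_decade * years_between / 10


# An average scorer (100) on 1930s norms, judged about 80 years later.
print(score_on_later_norms(100, years_between=80))  # 76.0
```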

Okay.

Second criterion is reliability.

Consistency.

If you step on a scale today and it says 150 and tomorrow it says 200, that scale has terrible reliability.

The WAIS has a reliability correlation of about plus 0.9, highly reliable.

But high reliability doesn't guarantee the third criterion.

Validity.

Does this test actually measure what it promises to?

Exactly.

Content validity is whether a test samples the pertinent behavior.

A driving test makes you drive.

But for the SAT, we care about predictive validity.

Does it predict future performance?

Like your college GPA.

But the text says predictive power fades as we age.

It correlates plus 0.6 in early school, drops below plus 0.5 for the SAT, and is a modest plus 0.4 for the GRE.

Why?

The restricted range phenomenon.

Like in professional video gamers, broadly, fast reaction time predicts who wins.

But among the top 100 players in the world, they all have peak reaction time.

The range is restricted.

So it no longer predicts the winner.

Strategy does.

The exact same thing happens with the SAT and elite universities.

If Harvard only accepts scores between 1500 and 1600, the score can't meaningfully predict who gets a 4.0 versus a 3.0.

The text uses NFL linemen to illustrate this.

Once everyone is 300 pounds, weight stops being the deciding factor.
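
The restricted range effect can be seen in a small simulation. This is a sketch under stated assumptions: test scores and later grades are simulated as standardized values with a true correlation of 0.5, and the selective group is the top 10 percent of test scorers. Those numbers, and the variable names, are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=20000, rho=0.5, seed=1):
    """Simulate (test score, later grade) pairs with true correlation rho.

    ASSUMPTION: both values are standard normal; rho = 0.5 is illustrative.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        test = rng.gauss(0, 1)
        grade = rho * test + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((test, grade))
    return pairs


pairs = simulate()
full_r = pearson([t for t, _ in pairs], [g for _, g in pairs])

# Keep only the top 10% of test scorers, like a highly selective admissions pool.
top = sorted(pairs, key=lambda p: p[0], reverse=True)[: len(pairs) // 10]
restricted_r = pearson([t for t, _ in top], [g for _, g in top])

print("everyone:", round(full_r, 2))
print("top 10% only:", round(restricted_r, 2))
```

Nothing about the test changes between the two printouts. The correlation shrinks only because the selected group barely varies on the test score anymore.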

So if tests are generally reliable, do our scores stay the same from birth to death?

Formal testing before age three is highly unpredictable.

The only crude infant predictor is habituation.

Babies who get bored quickly with a familiar picture and prefer novel stimuli tend to score slightly higher later.

But by age four, it starts predicting adult scores.

And by age seven, scores stabilize.

And the Scottish study proves just how stable they are over a lifetime.

This comes from Ian Deary's research. In 1932, Scotland administered a test to essentially every 11-year-old child in the country, 87,498 children.

The data was forgotten until 1997 when they found the dusty results in Edinburgh.

They tracked down hundreds of those individuals, now 80 years old, and readministered the exact same test.

After 70 years, surviving World War II, careers, everything, the correlation between age 11 and age 80 was striking.

Higher-scoring 11-year-olds were more likely to be living independently at 77 and less likely to suffer from Alzheimer's.

It's just like the nun study, where early verbal ability in essays protected against Alzheimer's decades later.

An active brain early on provides a buffer.

Now let's look at the extremes of the normal curve.

At the low extreme, an intellectual disability diagnosis requires two criteria.

First, an intelligence score of 70 or below.

The bottom 2%.

And second, a comparable difficulty in adapting to the normal demands of independent living, managing money, social cues.

Sometimes there's a known physical cause like Down syndrome.

Historically, people were warehoused in massive institutions.

Thankfully, we've swung back toward normalization and mainstreaming into regular classrooms and group homes.

But the Flynn effect has a wild real world implication here.

Because of restandardization.

Right.

The test gets harder to keep the average at 100.

So someone scores a 72 in 1990, no legal disability.

But taking the restandardized test in 1996, their score might drop six points to a 66.

Without their brain changing at all, they cross the 70 point line.

Which suddenly makes them eligible for special education and Social Security benefits, and ineligible for the death penalty.

Since executing people with intellectual disabilities is cruel and unusual punishment.

A six point statistical drop can be a matter of life and death.

It's sobering.

At the high extreme, we revisit Lewis Terman and his Termites.

He tracked 1500 kids with IQs over 135.

Dispelling the myth that gifted kids are maladjusted nerds.

They became successful doctors and lawyers.

But this brings up the modern debate over tracking,

segregating the top 3 to 5% into gifted classes.

Critics argue it creates self-fulfilling prophecies.

Labeling kids as ungifted lowers expectations and widens achievement gaps.

Proponents argue forcing a brilliant math student to sit through lessons they mastered three years ago is detrimental.

They want appropriate developmental placement for all to promote equity and excellence.

Which segues into the deepest question, nature versus nurture.

Let's look purely at the science.

Twin studies are our most powerful tool.

Identical twins reared together have virtually identical scores.

But the kicker is identical twins reared apart in entirely different environments.

Thomas Bouchard found their scores are still remarkably similar.

Behavior geneticists estimate the heritability of intelligence is roughly 50 to 70%.

It's polygenic, involving many genes.

But environment has a heavy impact.

Mistreated children adopted into enriched homes see significant intelligence score enhancement.

But here's the massive twist.

As adopted children grow into adulthood,

their intelligence scores correlate less with their adoptive parents and more with their biological parents.

The correlation with their adoptive families approaches zero.

Zero. Genetic influences become more apparent as we gain the freedom to choose our own environments.

To understand this, we have to define heritability correctly.

Heritability is the proportion of variation among people in a group that is attributable to genetics.

It does not mean 50% of your individual intelligence comes from your genes.

It only explains why a group differs.

Let me use a greenhouse analogy.

If I have a greenhouse with 100% identical sunlight and water, the environment is controlled perfectly.

If tomato plants grow to different sizes, 100% of the variation is genetic.

The heritability is 100%.

Flawless.

The textbook uses Mark Twain's barrel boys analogy.

If you raise boys in identical barrels, heritability is 100%.

But if environments vary wildly, heritability drops.
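The greenhouse and barrel examples can be written as a ratio of variances. This is a simplified sketch that assumes genetic and environmental contributions are independent and simply add up, which the interaction point right after this complicates. The function name and the variance numbers are hypothetical illustrations.

```python
def heritability(genetic_variance, environmental_variance):
    """Share of a group's total variation attributable to genetic variation.

    Simplifying assumption: the two sources are independent and additive.
    """
    return genetic_variance / (genetic_variance + environmental_variance)

# Identical environments (greenhouse or barrels): all variation is genetic.
identical_environments = heritability(genetic_variance=4.0, environmental_variance=0.0)

# Same genetic variation, but wildly varying environments: heritability drops.
varied_environments = heritability(genetic_variance=4.0, environmental_variance=12.0)

print(identical_environments, varied_environments)
```

Notice that the genetic variance is the same in both calls. Heritability changed only because the environments changed, which is why it describes a group in a setting and not an individual.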

And genes and environments interact.

James Flynn's basketball analogy.

A kid with a slight genetic height advantage makes the team, gets expert coaching, and the environment multiplies that tiny edge into a massive advantage.

We also know severe deprivation bludgeons native intelligence.

J. McVicker Hunt studied a destitute Iranian orphanage in the 1970s.

Infants received almost no attention.

They couldn't sit up at age two.

The deprivation physically suppressed their genetic potential.

But Hunt instituted tutored human enrichment, playing language games.

And by 22 months, the infants were thriving.

Early intervention in impoverished environments works.

It's the philosophy behind Project Head Start.

You can't, though, magically fast-forward a normal infant into a genius: the Mozart effect, the idea that playing classical music to a baby boosts IQ, has been debunked.

But a child's mindset matters immensely.

Carol Dweck's research.

A fixed mindset, believing intelligence is unchanging, makes students give up easily.

A growth mindset, believing intelligence is like a muscle, builds grit and success.

So how does environment explain group differences in gender and race?

And let's be explicitly clear, we are putting politics entirely aside and strictly reporting what the psychological data shows.

Let's start with gender.

Regarding general intelligence, overall g, males and females are virtually identical.

But there are specific differences.

Females are better spellers, excel at verbal fluency and locating objects, and are better emotion detectors, like in Rosenthal's film clip study.

Males tend to excel in spatial ability tests, like mentally rotating 3D blocks, and math problem solving, though females excel at math computation.

Evolutionary psychologists argue ancestral fathers navigating and tracking needed spatial rotation, while mothers gathering plants and raising infants needed object memory and emotion detection.

Other researchers point to social expectations.

In highly gender-equal cultures like Sweden, the math gap virtually disappears.

And there's male variability.

The male bell curve is flatter and wider, so boys outnumber girls at both extreme highs and lows.
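Here is a rough sketch of why a wider curve with the same average puts more people at both extremes. The two standard deviations are hypothetical values chosen only to show the shape of the effect, not measured figures for any group.

```python
from statistics import NormalDist

# Two bell curves with the same mean; only the spread differs (hypothetical SDs).
narrower = NormalDist(mu=100, sigma=14)
wider = NormalDist(mu=100, sigma=16)

def share_above(dist, cutoff):
    """Fraction of the distribution scoring above the cutoff."""
    return 1 - dist.cdf(cutoff)

def share_below(dist, cutoff):
    """Fraction of the distribution scoring below the cutoff."""
    return dist.cdf(cutoff)

# How overrepresented is the wider curve at each extreme?
high_ratio = share_above(wider, 130) / share_above(narrower, 130)
low_ratio = share_below(wider, 70) / share_below(narrower, 70)

print(round(high_ratio, 2), round(low_ratio, 2))
```

Both ratios come out above 1 and match each other, because the curves are symmetric around the same mean. A small difference in spread shows up as a much larger difference out in the tails.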

Moving to racial and ethnic differences, the text notes the bell curve for white Americans centers around 100, black Americans around 85, and Hispanic Americans roughly midway.

But geneticists emphasize that under the skin, the races are remarkably alike.

Individual differences within a race are much greater than differences between races.

Let me use a baking analogy.

One giant bowl of dough, identical recipe.

Put half in a warm proofing oven, half in a cold fridge.

The loaves in the warm oven vary slightly because of yeast, but they're massive.

The fridge loaves vary slightly, but they're tiny.

The variation within a batch is the yeast, but the massive difference between the batches is 100% environmental temperature.

The textbook uses a similar flower box analogy.
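The baking picture can be put into numbers. This is a toy sketch with made-up loaf sizes: each batch gets the same spread of "yeast" effects, and only the temperature baseline differs.

```python
from statistics import mean, pvariance

# Hypothetical loaf sizes: the same dough, so the same spread of yeast effects.
yeast_effects = [-2, -1, 0, 1, 2]

warm_oven = [30 + effect for effect in yeast_effects]  # warm environment baseline
cold_fridge = [10 + effect for effect in yeast_effects]  # cold environment baseline

# Within each batch the environment is constant, so all variation comes from the yeast.
within_warm = pvariance(warm_oven)
within_cold = pvariance(cold_fridge)

# The gap between batch averages comes entirely from the temperature baseline.
between_gap = mean(warm_oven) - mean(cold_fridge)

print(within_warm, within_cold, between_gap)
```

Inside either batch, differences are fully explained by the yeast, yet the gap between batches has nothing to do with it. High heritability within groups says nothing about the cause of a difference between groups.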

The psychological community applies this logic to racial gaps.

Systemic disparities in wealth and education are the different environments.

And evidence supports this.

Deaf kids score lower on spoken language tests because of the environmental language barrier.

Black and white infants score equally well on early novel stimuli tests.

And in college, where black and white students are exposed to comparable educational environments, black students' scores increase dramatically faster, shrinking the aptitude gap.

The soil, not the seed.

So are the tests themselves biased?

We have to use two definitions of bias.

Everyday bias means the test detects performance differences caused by cultural experience.

Early tests asked immigrants questions like why people buy fire insurance. If you'd never owned property, you missed the question.

That's cultural bias.

But scientific bias hinges on predictive validity.

Does it predict future behavior accurately for one group, but poorly for another?

Major U.S. aptitude tests are not scientifically biased.

A score of 95 predicts the same academic outcome, regardless of race or gender.
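One way to picture that check is to fit a separate prediction line for each group and see whether the same score leads to the same predicted outcome. The sketch below uses invented scores and GPAs and a hand-rolled least-squares helper (`fit_line` is a hypothetical name); real validity studies are far more involved.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented data: group B scores lower on average, but the score-to-GPA relation is the same.
group_a_scores, group_a_gpa = [85, 95, 105, 115], [2.60, 2.80, 3.10, 3.50]
group_b_scores, group_b_gpa = [75, 85, 95, 105], [2.30, 2.50, 2.80, 3.20]

slope_a, intercept_a = fit_line(group_a_scores, group_a_gpa)
slope_b, intercept_b = fit_line(group_b_scores, group_b_gpa)

# Predicted outcome for the same score of 95 in each group.
predicted_a = slope_a * 95 + intercept_a
predicted_b = slope_b * 95 + intercept_b

print(round(predicted_a, 2), round(predicted_b, 2))
```

In this made-up data the two groups have different average scores but the same prediction line, so the test would count as unbiased in the scientific sense. If the lines diverged, the same score would mean different things for different groups.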

But we still see capable people underperforming due to stereotype threat, coined by Claude Steele.

Steven Spencer gave a difficult math test to men and women.

Women performed worse because the cultural stereotype that women are worse at math hijacked their working memory with anxiety.

But when he explicitly told them the test had no gender differences, the anxiety vanished and their scores equaled the men's.

The expectation of negative evaluation drains cognitive resources.

But simple self-affirmation exercises, like having students write about their personal values for 15 minutes, boost GPAs significantly.

The bottom line on testing is a paradox.

Tests literally discriminate between abilities.

But historically, civil service tests were created to reduce subjective discrimination like nepotism and racial bias in hiring.

They were an attempt at an objective playing field.

Wow, we've covered an immense amount of ground.

From Spearman's g to Gardner's savants, from brain cortex thickness to the Scottish archives, and the debates over heritability and stereotype threat.

To synthesize, intelligence is real and measurable.

It's heavily influenced by genetics, but requires a nurturing environment to bloom.

Practical and emotional intelligence and 10 years of gritty practice are just as critical for navigating life.

I want to leave you with a final provocative thought.

Artificial intelligence models are now scoring in the 99th percentile on the exact same standardized human aptitude tests we've been discussing.

The SAT, the bar exam.

If a machine can effortlessly ace a test designed to measure human intelligence,

will our society be forced to redefine what intelligence actually means?

Will we value emotional and creative intelligence above all else, simply because those are the only domains the machines haven't conquered yet?

It's a fascinating frontier.

And on behalf of the Deep Dive, acting as your last-minute lecture team today, I want to give a massive, warm thank you to you, the listener.

Thank you for your curiosity, your dedication to learning, and for sitting down at the study table with us.

It's been a privilege unpacking this with you.

So the next time you hear someone throw around an IQ score, just remember, the diagnostic landscape of the brain is murky, dynamic, and wonderfully complex.

See you next time on the Deep Dive.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

Mental capacity has long been debated in psychology, with fundamental questions about whether intelligence represents a unified general ability or emerges as distinct, independent skills. Intelligence itself is best understood as the capacity to acquire knowledge through experience, apply reasoning to novel problems, and adjust behavior to changing environments, though psychologists caution against reifying this construct into a fixed biological entity. Multiple theoretical frameworks compete to explain intelligence's structure: Spearman's general intelligence factor proposes that performance across cognitive domains correlates because of an underlying unitary ability, while Thurstone's model identifies seven separate mental clusters including spatial, numerical, and verbal reasoning. Gardner's multiple intelligences theory diverges further, proposing eight autonomous cognitive systems supported by evidence from savant syndrome, where individuals with otherwise low general intelligence demonstrate extraordinary capability in singular domains. Sternberg's triarchic approach categorizes intelligence into analytical reasoning for academic problems, creative adaptation to novel situations, and practical problem-solving in everyday contexts. Emotional intelligence—the capacity to recognize, interpret, regulate, and productively employ emotional information—often predicts life outcomes more powerfully than traditional IQ measures. Neurobiological research reveals modest correlations between brain volume and intelligence scores, with high performers typically showing greater gray matter density in regions supporting memory and attention, plus enhanced neural processing speed. Intelligence assessment began with Binet's mental age concept and evolved through the Stanford-Binet revision and Wechsler's widely-used scales, which separately measure verbal and performance domains. 
Reliable and valid testing requires standardization against normative distributions, consistency of measurement, and demonstrated predictive accuracy. Intelligence demonstrates remarkable stability after early childhood, becoming highly consistent by age seven. While heritability estimates suggest genetic factors account for roughly fifty to seventy-five percent of intelligence variation, environmental conditions profoundly shape cognitive development, as evidenced by severe deficits from institutional deprivation and modest gains from early intervention programs. Individual differences emerge across gender lines with females typically excelling in verbal domains and males in spatial reasoning, though average general intelligence remains equivalent. Group-level score differences reflect primarily environmental and educational disparities rather than genetic variation, and while standardized measures may embed cultural knowledge assumptions, stereotype threat—the anxiety of confirming negative group stereotypes—independently suppresses performance across all demographic groups.
