Chapter 25: Quantitative Genetics and Multifactorial Traits

Audio Overview

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to the Deep Dive.

So for a while now, our genetics discussions, they've really focused on that straightforward Mendelian world, haven't they?

Absolutely.

One gene, a clear outcome, you know, like type A blood or whether a pea is wrinkled or smooth, very distinct categories.

Exactly.

But let's be honest, most traits we actually care about in the real world, things like human height or crop yield, even disease susceptibility, they just aren't that simple.

They don't fit neatly into those boxes.

That's right.

And that's the major shift we're diving into today.

We're moving from that simpler world into what we call quantitative inheritance.

It's messier, sure, but it's absolutely fundamental.

Quantitative meaning we measure them, like with numbers.

Precisely.

Think height in inches, weight in pounds, milk yield in gallons.

These traits show continuous variation, a whole spectrum of possibilities.

And typically they're polygenic.

Polygenic, meaning multiple genes are involved working together.

Exactly.

Lots of genes contributing.

And crucially, they're often multifactorial as well.

Okay, multifactorial.

That brings in the environment, right?

Genes plus life.

Yeah, pretty much.

It's the interplay.

Your genes might give you the potential to be tall, but your actual height?

That depends heavily on things like nutrition during childhood. It's genetics and environment interacting.

Makes sense.

Now, before we get lost in that continuous spectrum, the sources mention a couple of other related types of polygenic traits.

Good point.

Yeah, it helps to distinguish them.

First, you have meristic traits.

These are still polygenic, influenced by multiple genes and the environment.

But the outcome is countable, like whole numbers only.

You got it.

Think of the number of eggs a hen lays, or scales on a fish, or seeds in a pea pod.

You can't have, you know, 3.7 eggs.

Okay.

And the other type.

Those are threshold traits.

These are really fascinating because on the surface they look Mendelian again.

You either have the condition, like type two diabetes, or you don't.

Discrete outcomes.

But underneath, it's still polygenic.

Exactly.

There's an underlying continuous scale of genetic liability or predisposition.

It involves many genes, maybe some environmental factors too.

Your risk increases along this continuous scale.

Until bang, you cross a certain threshold and the disease phenotype appears.

Precisely.

So our mission today, really, is to unpack the Mendelian basis that still underlies all this complexity.

Okay.

And then look at the tools geneticists use to figure out, well, how much is genes versus how much is environment?

Okay, let's start with the history.

Because wasn't there a big debate early on?

Like, could Mendel's discrete factors even explain this smooth, continuous variation?

Oh, absolutely.

A huge debate around the turn of the 20th century.

But folks like William Bateson and G. Udny Yule proposed the solution, the multiple-factor or multiple-gene hypothesis.

They basically said, what if it's multiple Mendelian factors acting together?

And the proof came from wheat.

Yeah, the classic experiment was by Herman Nilsson-Ehle in 1909, studying wheat grain color.

He crossed a true-breeding red-grain wheat, let's call it AABB, with a true-breeding white one, aabb.

Okay, so the F1 generation, they'd all be heterozygous, AaBb.

What did they look like?

They were intermediate pink.

Now, if this was just simple incomplete dominance with one gene, you'd expect the F2 generation, when you self-cross the F1s, to show a 1:2:1 ratio of red to pink to white.

Right.

But that's not what he found, was it?

Not at all.

He found five distinct phenotypic classes, different shades from white to full red, and they appeared in a very specific ratio, 1:4:6:4:1.

Ah, the classic sixteenths ratio.

That points towards two genes, right, segregating independently.

Exactly.

That ratio was the giveaway.

And it led directly to this crucial idea of additive alleles.

Additive alleles, meaning each positive allele, like the uppercase A or B, just adds a little bit to the trait, like adding little dollops of pink.

That's a great analogy.

The intensity of the red color depended simply on the number of these additive uppercase alleles the plant had.

It didn't matter if it was AAbb or aaBB or AaBb.

They all had two additive alleles, so they looked the same: the intermediate pink.

Four additive alleles, AABB, gave full red.

Zero additive alleles, aabb, gave white.

So it's not about dominance in the usual sense.

It's just accumulation.

Each contributing allele adds a fixed amount.

That's the fundamental concept, yes.

It connects Mendel's simple rules to the continuous variation we see.

Each additive allele contributes equally, and their effects pile up.

Okay, so if that's the case, and if we can maybe ignore the environment for a moment, can we estimate how many genes are involved in a trait like this?

Yes, under certain simplifying assumptions.

If the environment's effect is minimal and all additive alleles contribute equally, you can use a couple of simple formulas based on the F2 generation.

Like looking at the extremes.

Right.

The proportion of F2 individuals that look exactly like one of the original extreme parents, the darkest red or the whitest white in Nilsson-Ehle's case, is equal to $1/4^n$, where $n$ is the number of gene pairs involved.

So in the wheat example, 1/16 were fully red and 1/16 were white.

$1/16$ equals $1/4^2$.

So $n$ equals two gene pairs, makes sense.

And the other related rule is that the number of distinct phenotypic categories you see in the F2 should be $2n + 1$.

Okay, so for $n$ equals two genes, that's $2(2) + 1 = 5$ categories, which is exactly what Nilsson-Ehle saw.

Precisely.
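The two rules of thumb above can be sketched in a few lines of Python. This is only an illustration under the same idealized assumptions stated in the discussion (independent assortment, equal additive effects, no environmental noise); the function names are hypothetical and chosen just for this example.

```python
from fractions import Fraction
from math import comb

def f2_class_counts(n_gene_pairs):
    """Relative F2 frequencies of individuals carrying 0..2n additive alleles.

    Assumes the F1 is heterozygous at every locus, so each of the 2n allele
    slots in an F2 individual is additive with probability 1/2.
    """
    slots = 2 * n_gene_pairs
    return [comb(slots, k) for k in range(slots + 1)]

def extreme_parent_fraction(n_gene_pairs):
    """Expected fraction of F2 matching one extreme parent: 1/4^n."""
    return Fraction(1, 4 ** n_gene_pairs)

def phenotype_class_count(n_gene_pairs):
    """Expected number of distinct F2 phenotypic classes: 2n + 1."""
    return 2 * n_gene_pairs + 1

print(f2_class_counts(2))          # [1, 4, 6, 4, 1]
print(extreme_parent_fraction(2))  # 1/16
print(phenotype_class_count(2))    # 5
```

With two gene pairs the sketch reproduces the five wheat color classes and their 1:4:6:4:1 proportions.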

But, and this is a big but, those formulas really only work in ideal conditions.

Equal contribution, no environmental noise.

Which almost never happens in reality, right?

The environment always muddies the waters.

Exactly.

And that's why for most quantitative traits, we absolutely have to turn to statistical analysis.

We can't track every single allele perfectly.

Because you just can't measure every single individual in a population.

You need to work with samples.

Right.

You take a representative sample.

And if your sample is large enough and represents the population well, the measurements for a quantitative trait usually follow a normal distribution.

The bell curve.

Everyone's seen that shape.

Yep.

And the center of that bell curve is the mean, the arithmetic average.

Simple enough.

But the average alone doesn't tell the whole story.

You could have two groups with the same average height, but one group is all very similar and the other has really tall and really short people.

Exactly.

And that's where variance comes in.

Variance is key.

Forget the complex formula for a second.

Conceptually, it measures the spread or dispersion of the data points around that mean.

So it tells you how much individual variation there is within the group.

Yes.

And crucially for us, variance is often how we detect the effects of genetic segregation.

When genes are shuffling around in the F2 generation, it typically increases the variance compared to the more uniform F1 generation.

Okay.

So variance quantifies the spread.

And then we often use the standard deviation, which is just the square root of the variance.

Right.

The standard deviation is useful because it gets us back into the original units of measurement, inches, pounds, whatever.

And there's that handy rule of thumb for normal distributions.

About 68% of your data falls within one standard deviation of the mean.

Gives you a good sense of the typical range.
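As a small sketch of the point about two groups sharing a mean but differing in spread, the snippet below uses Python's standard `statistics` module. The heights are made-up numbers for illustration, and `summarize` is a hypothetical helper name.

```python
import statistics

def summarize(values):
    """Return (mean, sample variance, sample standard deviation)."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # divides by n - 1
    std_dev = statistics.stdev(values)      # square root of the sample variance
    return mean, variance, std_dev

# Hypothetical heights in inches: same mean, very different spread.
similar_group = [66, 67, 68, 69, 70]
mixed_group = [58, 63, 68, 73, 78]

for label, heights in (("similar", similar_group), ("mixed", mixed_group)):
    mean, variance, std_dev = summarize(heights)
    print(f"{label}: mean={mean}, variance={variance}, sd={std_dev:.2f}")
```

Both groups average 68 inches, but the variance and standard deviation separate them immediately, and the standard deviation is back in inches.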

And then if you're looking at two traits at once, like does higher body weight correlate with more eggs laid?

We use tools like covariance and more intuitively the correlation coefficient.

Yes.

That number between negative one and plus one that tells you how strongly two things trend together.

Exactly.

Positive R means as one trait increases, the other tends to increase.

Negative R means as one goes up, the other goes down.

And R near zero means, well, not much relationship.

But, disclaimer time: correlation doesn't prove causation.

Just because two things vary together doesn't mean one causes the other.

The eternal mantra of statistics.

Absolutely crucial to remember.
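Here is a minimal sketch of how covariance turns into a correlation coefficient, written out by hand so each step is visible. The hen data are invented purely for illustration, and the function names are our own.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Correlation coefficient r: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical hens: body weight in pounds and eggs laid in a season.
body_weight = [4.0, 4.5, 5.0, 5.5, 6.0]
eggs_laid = [210, 205, 198, 190, 185]

r = correlation(body_weight, eggs_laid)
print(f"r = {r:.3f}")
```

In this made-up data set r comes out strongly negative: heavier hens lay fewer eggs. As noted above, that alone says nothing about whether weight causes the drop.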

Okay.

Let's make this concrete with that tomato example from the sources.

They cross a big 18 ounce tomato parent with a small six ounce one.

Right.

The P1 generation.

Then they look at the offspring. The F1 generation, all genetically uniform heterozygotes, have an average weight of about 12 ounces.

Okay.

Intermediate.

And the F2 generation, where all the gene mixing happens.

Interestingly, the F2 also averages around 12 ounces.

The mean hasn't really changed between F1 and F2.

But the variation must have changed, right?

Because the F1s are all AaBbCc, whatever, they're genetically identical.

Any variation there is mostly environmental wiggle.

Precisely.

The F1 variance was small, maybe around 1.3 in the example.

But in the F2 generation, where you get all possible combinations of alleles segregating out, the variance explodes.

It shot up to about 4.3.

That jump in variance, that's the statistical signature of polygenic inheritance showing itself in the F2.

That's it.

Exactly.

That increased spread is the genetic segregation made visible through statistics.
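The pattern described here, an unchanged mean with a larger F2 variance, falls straight out of the additive model. The sketch below computes it exactly for the genetic part only: it assumes equal additive effects, independent assortment, and no environmental variance, and the choice of three gene pairs is hypothetical.

```python
from math import comb

def f2_additive_allele_distribution(n_gene_pairs):
    """Probability of carrying k additive alleles in the F2, for k = 0..2n."""
    slots = 2 * n_gene_pairs
    return {k: comb(slots, k) / 2 ** slots for k in range(slots + 1)}

def mean_and_variance(distribution):
    """Mean and variance of a {value: probability} distribution."""
    mean = sum(k * p for k, p in distribution.items())
    variance = sum(p * (k - mean) ** 2 for k, p in distribution.items())
    return mean, variance

n = 3  # hypothetical number of gene pairs
f1 = {n: 1.0}  # every F1 plant is heterozygous at all loci: exactly n additive alleles
f2 = f2_additive_allele_distribution(n)

print(mean_and_variance(f1))  # (3.0, 0.0)
print(mean_and_variance(f2))  # (3.0, 1.5)
```

In real data the F1 variance is not zero, because the environment still adds its wiggle, but the jump from F1 to F2 is the part that segregation contributes.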

And we can even use those earlier formulas roughly.

If maybe 1/72 of the F2 tomatoes were as small as the six-ounce parent or as large as the 18-ounce parent.

Okay.

1/72 isn't a neat $1/4^n$.

It's between $1/64$, which is $1/4^3$, and $1/256$, which is $1/4^4$.

So $n$ is three or four.

Right.

So you'd estimate that tomato weight in this cross is likely controlled by maybe three or four major gene pairs.

It's an estimate, but it gives you a ballpark.
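The same ballpark can be reached by solving $1/4^n = p$ for $n$ directly. This is a rough sketch that inherits all the idealized assumptions of the formula; `estimate_gene_pairs` is a hypothetical helper name.

```python
from math import log

def estimate_gene_pairs(extreme_fraction):
    """Solve 1/4^n = extreme_fraction for n (n need not be a whole number)."""
    return log(1 / extreme_fraction) / log(4)

print(round(estimate_gene_pairs(1 / 16), 2))  # 2.0
print(round(estimate_gene_pairs(1 / 72), 2))  # 3.08
```

A result just above three fits the "three or four gene pairs" reading, since real data rarely land on an exact power of four.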

Okay.

This sets us up perfectly for the really big question, especially for breeders or human geneticists.

How much of the variation we actually see in a population is down to the genes?

$V_G$ versus the environment.

Yes, that's the core concept of heritability.

It's defined as the proportion of the total phenotypic variation, $V_P$, in a specific population in a specific environment that can be attributed to genetic factors.

Right.

And it's super important to understand what it isn't.

If we say human height has a heritability of, say, 0.65, that does not mean 65% of your personal height is genetic.

Absolutely not.

It's a population statistic.

It means that in the population studied, about 65% of the differences in height between individuals could be accounted for by genetic differences among them.

It's about the variance, not the individual's trait value.

And it can change if the environment changes or if you look at a different population.

Critically important distinctions.

So we define the total phenotypic variance, $V_P$, as being made up of genetic variance, $V_G$, and environmental variance, $V_E$.

But there's also a potential interaction term.

Ah, the genotype by environment interaction variance, $V_{G \times E}$.

That sounds complicated.

It means that different genotypes might respond differently to the same environmental change.

Think of two varieties of corn.

Variety A might yield amazingly well in rich soil, but terribly in poor soil.

Variety B might be mediocre in both.

Their response curves to the environment are different.

That difference in response is $V_{G \times E}$.

So the total picture is $V_P = V_G + V_E + V_{G \times E}$.

Correct.
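The corn example can be made concrete with a toy table. The yields below are invented, and the "interaction contrast" is just one simple way to show non-parallel responses; it is not a full variance-component estimate of the interaction term.

```python
# Hypothetical yields (bushels per acre) for two corn varieties in two soils.
yields = {
    "A": {"rich": 180, "poor": 60},
    "B": {"rich": 110, "poor": 100},
}

def environmental_response(variety):
    """Change in yield when moving from poor soil to rich soil."""
    return yields[variety]["rich"] - yields[variety]["poor"]

def interaction_contrast(variety_1, variety_2):
    """Difference between the two varieties' responses to the environment.

    Zero means the response lines are parallel (no G x E in this toy table);
    a nonzero value means the genotypes respond differently.
    """
    return environmental_response(variety_1) - environmental_response(variety_2)

print(environmental_response("A"))     # 120
print(environmental_response("B"))     # 10
print(interaction_contrast("A", "B"))  # 110
```

Variety A gains far more from rich soil than variety B does, which is exactly the kind of genotype-specific response the interaction term captures.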

And to make it more nuanced, that genetic variance, $V_G$ itself, can be broken down further.

Okay.

Into what?

Mainly into additive variance, dominance variance, and interactive variance, also called epistatic variance.

Additive variance comes from those additive alleles we talked about, the simple cumulative effects.

Dominance variance arises from dominant recessive interactions at a single locus.

Interactive variance comes from interactions between different loci, that is, epistasis.

Oh, okay.

So why break $V_G$ down like that?

Because it leads to two different measures of heritability.

Broad-sense heritability, $H^2$, is the proportion of total variance due to all genetic factors.

$H^2 = V_G / V_P$.

Okay.

That gives the overall genetic contribution.

Yes.

But for practical purposes, especially in breeding, we're often more interested in narrow-sense heritability, $h^2$.

This only considers the additive genetic variance, $h^2 = V_A / V_P$.
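As a rough sketch, the two ratios ($H^2 = V_G / V_P$ and $h^2 = V_A / V_P$) can be applied to the tomato variances quoted earlier. Two assumptions are ours and not from the discussion: the F1 variance is treated as a stand-in for the environmental variance, and the additive share of 2.0 is a made-up figure used only to show the second ratio.

```python
def broad_sense_heritability(v_g, v_p):
    """H^2: share of phenotypic variance due to all genetic variance."""
    return v_g / v_p

def narrow_sense_heritability(v_a, v_p):
    """h^2: share of phenotypic variance due to additive variance only."""
    return v_a / v_p

# Assumption: the genetically uniform F1 varies only for environmental
# reasons, so its variance estimates V_E, and the G x E term is ignored.
v_e = 1.3        # F1 variance from the tomato example
v_p = 4.3        # F2 variance from the tomato example
v_g = v_p - v_e  # about 3.0

print(f"H^2 is roughly {broad_sense_heritability(v_g, v_p):.2f}")

# Hypothetical: suppose 2.0 of that genetic variance is additive.
v_a = 2.0
print(f"h^2 is roughly {narrow_sense_heritability(v_a, v_p):.2f}")
```

Because the additive variance is only one part of the genetic variance, the narrow-sense value can never exceed the broad-sense value for the same population.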

Why focus just on the additive part?

Why leave out dominance and interaction effects?

Because the additive effects are the ones that are reliably transmitted from parents to offspring.

Dominance depends on having specific allele pairs like Aa, and interactions depend on specific combinations of alleles at different genes.

These combinations get broken up during meiosis and sexual reproduction.

Ah.

So the additive variance is what breeders can most effectively select for.

It's the predictable transmissible part of genetic variation.

Exactly.

$V_A$ determines how much offspring resemble their parents for the trait on average.

That's why $h^2$ is so useful for predicting the outcome of artificial selection.

Artificial selection is choosing the best individuals to breed for the next generation.

Right.

And narrow sense heritability directly predicts the response to selection.

There's a concept called realized heritability calculated after a selection experiment.

Yeah.

$h^2 = R/S$.

R is the response, how much the average trait value changed after one generation of selection.

And S is the selection differential, how different the selected parents were from the original population average.

Precisely.

So $h^2$ tells you how effectively selection can shift the population mean.
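A quick sketch of the realized heritability calculation, response R divided by selection differential S. All three means below are hypothetical numbers chosen for illustration, and the function names are our own.

```python
def selection_differential(population_mean, selected_parent_mean):
    """S: how far the chosen parents sit from the original population mean."""
    return selected_parent_mean - population_mean

def selection_response(population_mean, offspring_mean):
    """R: how far the offspring mean moved from the original population mean."""
    return offspring_mean - population_mean

def realized_heritability(response, differential):
    """Realized h^2 = R / S."""
    return response / differential

# Hypothetical one-generation selection experiment (trait units arbitrary).
population_mean = 12.0
selected_parent_mean = 15.0
offspring_mean = 13.2

s = selection_differential(population_mean, selected_parent_mean)
r = selection_response(population_mean, offspring_mean)
print(f"S = {s:.1f}, R = {r:.1f}, realized h^2 = {realized_heritability(r, s):.2f}")
```

Here the offspring recover 40% of the gap the breeders opened up, so the realized heritability is about 0.40.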

The sources mention that amazing long-term corn experiment in Illinois, selecting for high and low oil content since 1896.

That's incredible.

Isn't it?

Over a century of selection.

And what it showed is fascinating.

As they selected for, say, high oil content generation after generation, they successfully increased the average oil percentage dramatically.

But what happened to the heritability?

The narrow sense heritability gradually decreased.

It started reasonably high, maybe around .34 for high oil.

But after 76 generations, it had dropped to about .12.

Why would it drop?

Because the selection process was using up the additive genetic variance.

They were gradually fixing the alleles that contribute to the high oil content in the population.

As $V_A$ gets smaller, $h^2$ gets smaller, even if the total variance hasn't changed much.

Eventually, if all individuals become homozygous for the high oil alleles, $V_A$ and thus $h^2$ would drop to zero.

There's no more additive genetic variation left for selection to act upon.

Wow.

So success in selection actually eliminates the heritability that made the selection possible in the first place.

In a sense, yes.

It also explains something often observed.

Traits highly essential for survival and reproduction, like fertility or conception rate, often have very low narrow sense heritability.

Why low?

You'd think they'd be strongly genetic.

Natural selection has already been acting on them for millennia.

It has likely already optimized the additive genetic component, removing most of the $V_A$.

There's not much additive variation left for artificial selection to easily grab onto.

That makes a lot of sense.

Okay, shifting gears to humans.

We can't do selection experiments, so how do we estimate heritability?

The classic method involves twins, right?

Yes, the twin study approach.

It compares monozygotic (MZ), or identical, twins with dizygotic (DZ), or fraternal, twins.

The logic being, MZ twins develop from a single fertilized egg, so they share, theoretically, 100% of their genes.

Any differences between them should be purely environmental, right?

Whereas DZ twins develop from two separate eggs fertilized by two separate sperm, so they're like regular siblings, sharing on average 50% of their genes.

So you look at a specific trait.

If both twins in a pair share the trait, that's called concordance, and if MZ twins show significantly higher concordance for a trait than DZ twins do.

The inference is that the trait has a substantial genetic component.

The greater the difference in concordance between MZ and DZ pairs, the higher the inferred heritability.

Like, blood types have very high MZ concordance, almost 100%, suggesting very high genetic control.

Whereas something like getting measles might have high concordance in both MZ and DZ pairs if they grew up together, suggesting a strong environmental influence, exposure to the virus.
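To make the concordance comparison concrete, here is a small sketch. It assumes a pairwise definition of concordance (pairs where both twins show the trait, divided by pairs where at least one does), which is one common convention and not necessarily the one the sources use. The twin data are invented.

```python
def pairwise_concordance(pairs):
    """Fraction of twin pairs in which both twins show the trait,
    among pairs where at least one twin shows it."""
    affected_pairs = [pair for pair in pairs if pair[0] or pair[1]]
    if not affected_pairs:
        raise ValueError("no pair has an affected twin")
    both_affected = sum(1 for a, b in affected_pairs if a and b)
    return both_affected / len(affected_pairs)

# Hypothetical data: each tuple is (twin 1 affected, twin 2 affected).
mz_pairs = [(True, True)] * 8 + [(True, False)] * 2 + [(False, False)] * 5
dz_pairs = [(True, True)] * 3 + [(True, False)] * 7 + [(False, False)] * 5

mz = pairwise_concordance(mz_pairs)
dz = pairwise_concordance(dz_pairs)
print(f"MZ concordance = {mz:.2f}, DZ concordance = {dz:.2f}")
```

In this made-up sample the MZ value (0.80) is well above the DZ value (0.30), the pattern that would point toward a substantial genetic component.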

Exactly.

It's a powerful comparative tool.

However, modern genomics has thrown a few wrenches in the classic twin study assumptions.

Oh, how so?

We always hear MZ twins are identical.

Well, mostly identical at the sequence level, but we now know they aren't perfectly identical.

They can accumulate differences in things like copy number variation, CNV, where segments of DNA are duplicated or deleted differently.

Or one twin might develop somatic mosaicism, meaning genetic changes occurred in some cells after the initial split of the embryo.

So small genetic differences can exist even between MZ twins.

Yes.

And perhaps even more significantly, we now understand the importance of epigenetic differences.

Epigenetics.

Modifications to DNA, like methylation, that affect gene activity without changing the DNA sequence itself.

Precisely.

While MZ twins might start out epigenetically very similar, their epigenetic patterns can diverge significantly over their lifetime due to different environmental exposures, diets, lifestyles, etc.

These epigenetic changes can alter gene expression and lead to phenotypic differences, even with the same DNA sequence.

So some differences between MZ twins that we used to assume were purely environmental might actually be due to these acquired epigenetic differences blurring the lines a bit.

It certainly adds another layer of complexity to interpreting twin data.

It highlights that the interplay between genes and environment is incredibly dynamic, right down to the molecular level.

Okay, so twin studies give us estimates, but they have limitations.

The ultimate goal would be to find the actual genes involved, right?

The specific DNA location.

Absolutely.

And that brings us to quantitative trait loci, or QTLs.

Finding the specific genes for polygenic traits is hard because each one might have only a small effect.

So a QTL isn't necessarily a single gene?

Not necessarily.

It's defined as a specific region on a chromosome that is known to contain one or more genes contributing to a quantitative trait.

QTL mapping is the process of finding these regions.

How does that work, generally?

The basic strategy involves crossing two parental lines that are very different for the trait you're interested in.

Maybe one line selected for high yield and another for low yield.

You create an F1 generation, then an F2 mapping population.

And in that F2 population, you look for links between DNA markers and the trait.

Exactly.

You genotype all the F2 individuals using hundreds or thousands of molecular markers spread across the genome, things like RFLPs, SNPs, microsatellites, which are just known variable DNA sequences.

Then you measure the quantitative trait in each F2 individual.

And you use statistical analysis to see if any specific DNA markers tend to be inherited together, or co-segregate, with the high or low values of the trait more often than expected by chance.

That's the core idea.

If a specific marker consistently shows up in the F2 individuals with, say, high yield, it suggests that marker is physically close on the chromosome to a QTL that influences yield.

It's linked.
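The core idea of marker and trait co-segregation can be sketched with a single marker. This is a deliberately simplified illustration with invented data: real QTL mapping tests many markers and uses formal statistics, not just a comparison of group means. The genotype labels "HH", "HL", and "LL" (marker alleles inherited from the high-yield or low-yield parent) are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def trait_means_by_genotype(individuals):
    """Average trait value for each marker genotype class."""
    groups = defaultdict(list)
    for genotype, trait_value in individuals:
        groups[genotype].append(trait_value)
    return {genotype: mean(values) for genotype, values in groups.items()}

# Hypothetical F2 plants: (marker genotype, yield in arbitrary units).
f2_plants = [
    ("HH", 52), ("HH", 55), ("HH", 50),
    ("HL", 46), ("HL", 44), ("HL", 47), ("HL", 45),
    ("LL", 39), ("LL", 41), ("LL", 40),
]

means = trait_means_by_genotype(f2_plants)
effect = means["HH"] - means["LL"]
for genotype in ("HH", "HL", "LL"):
    print(f"{genotype}: mean yield = {means[genotype]:.1f}")
print(f"HH minus LL = {effect:.1f}")
```

If the trait averages step down cleanly from HH to HL to LL, as they do in this toy sample, the marker is a candidate for being linked to a QTL; a marker unlinked to any QTL would show roughly equal means in all three classes.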

The tomato fruit weight story is a great example here, isn't it?

Trying to figure out the genetic basis for that huge difference between tiny wild tomatoes and giant cultivated ones.

A classic example.

It took years of work, but researchers eventually mapped over two dozen QTLs contributing to fruit size and shape.

And they actually nailed down one major one, right?

The fw2.2 QTL?

Yes, fw2.2, on chromosome 2.

They found it accounts for a remarkable 30% of the variation in fruit weight between the wild and cultivated types.

And they even pinpointed the responsible gene within that QTL, a gene called ORFX.

So what does ORFX do?

It turns out ORFX encodes a protein that negatively regulates cell division during fruit development.

It acts like a brake.

Ah, so the big cultivated tomatoes just have versions of the ORFX gene that put the brakes on less or maybe apply them later.

Essentially, yes. The alleles found in cultivated tomatoes lead to lower expression or less effective function of this negative regulator.

Fewer brakes mean more cell division, leading to larger fruit.

It's a beautiful example of how variation in a single gene acting within a polygenic system can have a major quantitative effect.

And there are other QTLs too, like ones controlling the number of seed compartments, which also affects size and shape.

Right, like the QTLs for locule number.

It's a combination of factors.

This level of detail leads us to the sort of cutting edge now.

Expression QTLs, or eQTLs.

What are these?

So eQTLs take it a step further.

They aren't necessarily loci containing genes that code for proteins directly involved in the trait itself.

Instead, eQTLs are genomic regions that specifically regulate the expression level of other genes.

So they're like genetic volume knobs, controlling how much product other genes make.

That's a perfect way to put it.

Often eQTLs are found in those noncoding parts of the genome, what used to be called junk DNA.

These regions can contain sequences that affect things like transcription factor binding or chromatin structure, or even how pre-mRNA is spliced.

And variations in these eQTL regions can lead to differences in how much protein is produced from a target gene somewhere else.

Exactly.

And this is proving incredibly important for complex human diseases like asthma, cardiovascular disease, autoimmune disorders.

Researchers can link specific DNA variants like SNPs in these regulatory regions to variations in the expression levels of nearby or even distant genes.

So you're not just looking at the gene's code, but at the genetic factors controlling its activity level.

Right.

It helps connect genetic variation to functional consequences at the gene expression level, which might be closer to the disease mechanism.

You can find driver genes whose expression levels are influenced by eQTLs and seem central to disease networks.

It opens up new possibilities for understanding disease and maybe even finding new therapeutic targets.

Wow.

Okay.

So we've really gone on a journey here.

We started with the realization that most traits aren't simple Mendelian outcomes.

Right.

They're quantitative, polygenic, often multifactorial.

We saw how the concept of additive alleles provides the Mendelian underpinning.

Then we needed statistics, variance and standard deviation, to even measure and analyze this variation, especially to separate genetic from environmental influences using heritability.

Broad sense, narrow sense, realizing that heritability itself can change with selection.

Then we looked at twin studies and their modern complexities before finally getting to the genomic tools, QTLs and now eQTLs, that let us pinpoint the actual DNA regions, sometimes even genes or regulatory elements, involved in these complex traits.

It's a progression from phenotype to statistical abstraction and finally back down to the specific molecular DNA level.

Quite a story.

It really is.

Okay.

As a final thought for our listeners to maybe chew on, think about that genotype by environment interaction, the $V_{G \times E}$ term.

If the best set of genes in one environment might actually be suboptimal or detrimental in another, what does that really imply about the whole idea of genetic improvement or optimization?

How crucial is considering the specific context, the environment, whether you're breeding crops, raising livestock, or even thinking about personalized medicine in humans?

Something to ponder.

Definitely food for thought.

It underscores that complexity we've been talking about.

Always a pleasure exploring these topics.

We really hope this deep dive helped clarify the fascinating world of quantitative genetics for you.

Thanks for tuning in.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

Quantitative inheritance addresses traits that show continuous variation across populations rather than segregating into discrete categories, encompassing phenotypes ranging from human height to agricultural yield. These characteristics are polygenic, meaning their expression depends on the cumulative effects of alleles at multiple loci combined with environmental influences. Meristic traits, which are counted in whole numbers such as seed quantity per pod, represent a partial exception to continuous variation, while threshold traits like Type II diabetes manifest as a limited number of phenotypic classes despite being controlled by an underlying continuous genetic liability. The multiple-gene hypothesis explains how many genes of small individual effect can collectively produce the observed phenotypic spread when alleles contribute additively to the trait. Determining the number of loci involved requires analyzing F2 populations for the proportion of offspring displaying extreme parental phenotypes or observing the total count of distinct phenotypic classes generated. Since complete population measurement is rarely feasible, quantitative genetics relies fundamentally on statistical methods applied to representative samples. The mean establishes central tendency through simple arithmetic averaging, while variance quantifies the dispersion of values by computing average squared deviations from the mean. Standard deviation, derived as the square root of variance, expresses this dispersion in original measurement units. Geneticists employ correlation coefficients, calculated from covariance, to evaluate relationships between two different quantitative traits. Heritability, the cornerstone of quantitative genetic analysis, estimates what proportion of phenotypic variation within a specific population results from genetic differences rather than environmental variation. 
Critically, heritability estimates are population and environment specific and are not intrinsic to a trait itself. Total phenotypic variance partitions into genotypic variance, environmental variance, and interaction variance between genotype and environment. Broad-sense heritability encompasses all genotypic variance, whereas narrow-sense heritability isolates only additive variance and provides superior predictions of how populations will respond to selection pressures. The selection differential and observed response to selection allow direct estimation of narrow-sense heritability in artificial selection experiments. Twin studies leverage the comparison between phenotypic similarities in monozygotic twins, who share identical genotypes, and dizygotic twins, who share half their alleles, to estimate heritability in human populations. Recent evidence demonstrates that identical twins show minor genotypic differences through somatic mosaicism and copy number variation, plus substantial epigenetic differences including differential DNA methylation that accumulate throughout life and influence gene expression patterns. Quantitative Trait Loci mapping identifies genomic regions harboring polygenes by tracking cosegregation between molecular markers like SNPs and quantitative phenotypes in segregating families or populations. Expression QTLs represent a specialized category, functioning as genetic variants that regulate target gene expression levels and frequently underlie susceptibility to complex diseases including asthma and other multifactorial disorders.
