Chapter 7: Control of Gene Expression

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Imagine this for a second.

Every single cell in your body, right, from the neurons firing in your brain to the cells working away in your liver, they all have the exact same genetic instruction manual, the exact same DNA.

Yeah, it's kind of wild when you think about it.

So how on earth does a brain cell know it's supposed to be a brain cell and a liver cell knows its job?

They're doing vastly different things.

How is that possible?

That's the million dollar question, isn't it?

Welcome to the deep dive.

Today, we're jumping right into one of biology's most fundamental processes, how cells control which genes they switch on and which they keep off.

It's all about gene expression, and it's, well, way more intricate and dynamic than you might initially think.

Absolutely.

Understanding this is really a shortcut to being genuinely well informed about how life works at its core.

So our mission today is basically to unpack a key chapter, chapter seven from a real heavyweight textbook, Molecular Biology of the Cell.

We want to distill the core concepts, right, the mechanisms, the structures, those key experiments.

Exactly.

And define the technical terms simply, connect it all back to actual living systems and research.

So by the end, you'll have a solid grasp of this complex symphony playing out inside every single one of your cells.

OK, so we'll be looking at everything from how these special proteins literally read your DNA code to the really surprising roles of non-coding RNAs, these master conductors we're still learning so much about.

It's a fascinating journey.

All right, let's dive in.

The Grand Orchestra of Gene Control: An Overview

OK, so let's start with that central paradox we just mentioned.

You've got cells like a neuron and a liver cell.

Totally different jobs, totally different looks.

Right.

Like Figure 7-1 in the book shows, visually distinct.

But inside, identical DNA genomes.

It's like they have the same musical score.

But they're playing completely different instruments or maybe even entirely different pieces based on that score.

Exactly.

Different performances.

And, you know, for a while, biologists actually wondered if cells maybe lost genes as they specialized.

Seemed logical, right?

Yeah.

Jettison the stuff you don't need.

But then came these landmark experiments.

Cloning a whole frog from just the nucleus of an adult skin cell.

Mind blowing.

Or growing an entire carrot plant from one single root cell.

These showed that generally differentiation happens without altering the DNA sequence itself.

So the genes are still there.

They're all still there.

It proved that a cell's identity isn't about which genes it possesses, but about which genes it actively expresses, which ones it turns on.

OK, so what is being expressed then?

Is it totally different in every cell?

Well, there's a lot of overlap, actually.

Many gene products are common to almost all cells.

Think about basic structural proteins.

The polymerases needed for copying DNA and RNA, ribosomal proteins, the essentials.

The housekeeping stuff.

Exactly.

But then you have the specialists.

Hemoglobin is the classic example.

Red blood cells make tons of it.

Other cells make none.

Or that tyrosine aminotransferase in liver cells you mentioned.

Figure 7-3B highlights that.

Highly specific.

And here's where it gets really interesting.

A typical human cell expresses maybe 30 to 60 percent of its roughly 25,000 genes.

That many?

Wow.

Yeah, about 20,000 protein-coding ones, maybe 5,000 non-coding RNA genes.

And critically, the level of expression, how much product is made varies for almost every single gene between different cell types.

It's not just on or off.

It's like a volume knob for each gene.

Precisely.

And this unique spectrum of messenger RNAs, the mRNAs, acts like a fingerprint.

So you can tell a cell type just by reading its mRNAs.

Absolutely.

If you sequence all the mRNAs in a cell, you can unambiguously identify it.

We're even discovering new subtypes of cells this way, refining classifications we thought were settled.

Think about Figure 7-5 showing how this works.

OK, that makes sense for identifying stable cell types.

But what about changes?

How does, say, a liver cell respond to a signal like starvation?

Good question.

That involves external cues like maybe glucocorticoid hormones released during stress or intense exercise.

The cell needs to alter its gene expression profile in response.

So how does it control all this?

Where are the switches?

There are multiple control points, like a whole series of checkpoints along the production line from DNA to RNA to protein.

Figure 7-6 lays it out nicely.

OK, walk us through them.

All right.

First, and arguably most important, is transcriptional control, deciding if and how often a gene is transcribed into RNA.

Step one.

Then, RNA processing control, how that initial RNA transcript is modified, like splicing out introns.

Got it.

Next, RNA transport and localization, which finished mRNAs get shipped out of the nucleus.

And where do they go in the cytoplasm?

Location, location, location.

Then, translational control, which mRNAs actually get translated into protein by the ribosomes.

OK.

And finally, controls over the lifespan and activity of the products themselves.

mRNA degradation control, protein degradation control, and protein activity control.

So seven levels of control.

That's a lot.

It is.

But for most genes, the main event, the primary control point, is transcriptional control.

Why is that?

It's just the most efficient.

Why waste energy making RNA and potentially even protein if you don't need the final product?

Transcriptional control prevents the cell from making all those superfluous intermediates.

Stops the process right at the source.

Makes sense.

Transcription Control: The DNA-Binding Maestros

All right.

So if transcriptional control is king, how do cells actually do it?

How do they decide transcribe this gene, but not that one?

It largely comes down to proteins called transcription regulators.

Yeah.

These are the real workhorses.

The DNA binding maestros, as you called them.

Exactly.

They recognize very specific short sequences of DNA, usually just five to maybe 12 nucleotide pairs long.

These target sequences are called cis-regulatory sequences.

Cis, meaning on the same molecule, right?

Same chromosome.

Precisely.

And it's amazing.

Something like 10% of all our protein-coding genes are dedicated just to making these transcription regulators.

Wow.

That's a significant investment.

It is.

And these regulators have this incredible ability to read the DNA double helix from the outside.

Yeah.

They don't need to unwind it.

They recognize the unique patterns of chemical groups like hydrogen bond donors and acceptors or hydrophobic patches exposed in the grooves of the DNA helix, mostly in the major groove, which is wider and more information rich.

Figures 7-7 and 7-8 show this beautifully.

So they're feeling the shape and chemistry of the groove.

Kind of, yeah.

Figure 7-9 really emphasizes the importance of that major groove.

And do these proteins all look the same?

How do they recognize different sequences?

They have specific structural motifs, recurring 3D shapes that allow them to make these precise contacts.

Panel 7-1 gives a great overview.

Like what?

Well, there's the helix-turn-helix, the homeodomain, leucine zippers, zinc fingers, even motifs using beta sheets.

Okay.

These motifs typically insert a part of their structure, often an alpha helix or a beta sheet, right into the major groove, and the specific amino acid side chains sticking out from that structure determine the exact DNA sequence it binds to.

It's incredibly specific.

A molecular lock and key.

Pretty much.

And often they enhance their binding power through teamwork.

Many regulators work as dimers, two units bound together.

Like holding hands.

Exactly.

They can be identical units, homodimers, or different ones, heterodimers.

This effectively doubles the length of the DNA sequence they recognize.

Ah, so that makes binding even more specific and stronger.

Figure 7-10 shows that doubling.

Consequently, much higher affinity and specificity, like gripping a rope with two hands instead of one.

Okay.

And this teamwork, does it change how they bind?

Yes.

It often leads to something called cooperative binding.

Imagine two regulators that individually bind DNA fairly weakly, but when they bind near each other on the DNA, they can also interact weakly with each other.

Those weak protein-protein interactions stabilize their binding to the DNA significantly.

So the whole complex is much more stable than the individual parts.

Exactly.

And this often results in a binding pattern that's more like an on-off switch.

You see an S-shaped curve instead of a gradual one if you plot binding versus concentration, like Figure 7-11.

It becomes more all or none.
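One way to picture that S-shaped curve is the Hill equation, a standard simplification that is not taken from the chapter itself. The sketch below assumes a dissociation constant of 1 (arbitrary units) and compares a Hill coefficient of 1 (no cooperativity) with an illustrative value of 4; the function name and both parameter values are assumptions chosen for the example.

```python
def fraction_bound(conc, kd=1.0, n=1.0):
    """Hill equation: fraction of DNA sites occupied at regulator concentration conc.

    kd is the concentration giving half-maximal binding; n is the Hill
    coefficient (n = 1 means independent binding, n > 1 means cooperative).
    """
    return conc**n / (kd**n + conc**n)


# Compare independent binding (n = 1) with cooperative binding (n = 4).
for conc in (0.25, 0.5, 1.0, 2.0, 4.0):
    independent = fraction_bound(conc, n=1)
    cooperative = fraction_bound(conc, n=4)
    print(f"conc={conc:<5} n=1: {independent:.2f}   n=4: {cooperative:.2f}")
```

With the larger Hill coefficient, occupancy stays low below the midpoint and rises sharply above it, which is the more switch-like behavior described here.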

Interesting.

But wait, isn't our DNA usually wrapped up tightly around proteins?

Nucleosomes.

Doesn't that get in the way?

Excellent point.

DNA is packaged into nucleosomes and that absolutely plays a regulatory role.

It's not just packaging.

It's part of the control system.

How so?

Well, generally regulators bind less tightly to DNA wrapped in a nucleosome.

The binding site might be facing inwards or the DNA might be too rigid.

So it is an obstacle.

It can be.

But here's the cool part.

The DNA at the ends of a nucleosome isn't locked down tight.

It kind of breathes.

It transiently unwraps for milliseconds.

A tiny window of opportunity.

Exactly.

And if a regulator, sometimes called a pioneer factor (Figure 7-13), binds during that brief window, it can prevent the DNA from rewrapping.

Holding the door open.

Kind of.

And this makes it much easier for a second regulator to bind nearby.

It's another form of cooperative binding, but mediated by the nucleosome structure itself.

Figure 7-12 illustrates this.

So the nucleosome structure itself enables a kind of sequential binding.

Precisely.

And this is probably why many key regulatory sequences in eukaryotes are found in nucleosome-free regions.

They've been actively opened up by these cooperative binding events.

Okay.

This is way more dynamic than I pictured.

Oh, completely.

It's crucial to get away from the static pictures in textbooks.

These regulators are constantly binding and dissociating, often very rapidly.

Like Figure 7-14 tries to convey.

So high affinity doesn't mean stuck forever.

Not at all.

It just means they stay bound longer on average before falling off, but there's constant turnover, constant replacement.

Proteins are essentially scanning the DNA via weak transient binding until they hit their high affinity target.

It's a frantic state of motion inside the cell.

Not just sitting there waiting, constant searching.

Switching Genes On and Off: Bacterial Versus Eukaryotic Styles

Okay.

Let's compare how bacteria and eukaryotes handle this gene switching.

Bacteria seem simpler, right?

Generally, yes.

They often use a system called operons.

A great example is the tryptophan operon in E. coli, shown in Figure 7-15.

It's a set of genes, five in this case, needed to make the amino acid tryptophan.

They're all grouped together and transcribed as a single mRNA molecule.

Very efficient.

Okay.

So how is it controlled?

By a protein called the tryptophan repressor.

It's an allosteric protein, meaning its shape and activity change when it binds to another molecule.

And that molecule is tryptophan.

You got it.

When tryptophan levels are high, tryptophan binds to the repressor, changing its shape.

Figure 7-16.

This new shape allows the repressor to bind to a specific DNA sequence called the operator, which sits right near the promoter.

Blocking the way.

Exactly.

It physically blocks RNA polymerase from binding and starting transcription.

So when there's plenty of tryptophan, the cell stops making more.

When tryptophan levels drop, the repressor releases and the genes switch back on.

A neat little negative feedback loop.

Simple and effective.

Do they have activators too?

They do.

Some bacterial promoters are inherently weak, meaning RNA polymerase doesn't bind well on its own.

Activator proteins can help.

Like the CAP protein.

Figure 7-17.

That's a classic example.

CAP binds near the promoter, but only when it first binds a small molecule called cyclic AMP, or cAMP.

And cAMP levels go up when?

When glucose, the preferred food source, is scarce.

So CAP binding helps RNA polymerase latch onto these weak promoters, activating genes needed to use alternative food sources.

So repressors turn things off.

Activators turn things on, often responding to small molecules.

Precisely.

And sometimes you get both working on the same operon.

Like the famous lac operon.

The perfect example.

The lac operon is controlled by both the lac repressor, which binds when lactose is absent, and the CAP activator, which binds when glucose is absent.

Figure 7-18 shows the logic.

So the operon is only fully on when?

When lactose is present, so the repressor falls off, and glucose is absent, so CAP binds and activates.

It's like a little logic gate.

Condition A AND condition B must be true.

Exactly.

It shows how cells can integrate multiple signals to make a very specific decision.

Simple, but powerful processing.
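That logic gate can be written out as a tiny truth table. This is only a sketch of the on/off logic as described in the conversation; the function name is hypothetical, and real expression levels are graded rather than strictly Boolean.

```python
def lac_operon_on(lactose_present: bool, glucose_present: bool) -> bool:
    """Boolean sketch of lac operon control as described in the discussion."""
    repressor_bound = not lactose_present  # repressor sits on the operator without lactose
    cap_bound = not glucose_present        # CAP (with cAMP) binds only when glucose is scarce
    return cap_bound and not repressor_bound


# Print the full truth table for the two inputs.
for lactose in (False, True):
    for glucose in (False, True):
        state = "ON" if lac_operon_on(lactose, glucose) else "off"
        print(f"lactose={lactose!s:<5} glucose={glucose!s:<5} -> {state}")
```

Only one of the four input combinations, lactose present and glucose absent, switches the operon fully on.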

You mentioned DNA looping earlier.

Does that happen in bacteria too?

It does.

Yes.

Even though their genomes are pretty compact, sometimes a regulator needs to bind quite far from the promoter.

DNA looping allows these distant regulators to physically contact the RNA polymerase machinery.

Figure 7-19 gives an example.

But while looping is kind of an exception in bacteria, it's basically the rule in eukaryotes.

Ah, okay.

So eukaryotes are much more complex.

Definitely.

For starters, eukaryotic RNA polymerase II, the one that transcribes protein-coding genes, needs a whole host of helper proteins called general transcription factors, about 27 subunits.

Way more complex than the bacterial polymerase.

And the gene control region, all the DNA involved in regulating a single gene, is often enormous.

It includes the core promoter where polymerase binds, plus all the cis-regulatory sequences, which can be scattered over tens, even hundreds of thousands of base pairs.

Look at Figure 7-20.

Wow.

How do regulators binding way out there influence the promoter?

That's where looping comes in big time.

And eukaryotic regulators rarely work alone.

They assemble in large groups, often recruiting other huge protein complexes called co-activators and co-repressors.

And these co-activators and co-repressors, they don't bind DNA themselves?

Figure 7-21.

Generally, no, they're recruited by the DNA-bound regulators.

Co-activators help activate transcription.

Co-repressors help shut it down.

So how do activators activate?

What do they actually do?

They do several things.

They help attract and position the general transcription factors and RNA polymerase at the promoter.

But crucially, they also recruit those co-activators, many of which modify the local chromatin structure.

Ah, back to chromatin.

So they loosen it up.

Exactly.

The co-activators can add chemical tags to histones, covalent modifications.

They can physically slide or even evict nucleosomes using chromatin remodeling complexes, or swap out standard histones for variants.

All of this makes the underlying DNA, especially the promoter, more accessible.

Figure 7-22 shows these mechanisms.

So it's like sculpting the chromatin landscape.

That's a great way to put it.

Figure 7-23 shows a nice example of how a cascade of histone modifications can prepare the promoter.

Okay.

So that's about getting transcription started.

Anything else?

Yes.

Another key control point, especially for genes that need to be switched on rapidly, involves releasing paused polymerase.

Paused, like it starts and then stops.

Exactly.

For some genes, RNA polymerase binds, starts transcribing maybe 50 nucleotides or so, and then just stalls.

It sits there poised.

Figure 7-24C.

Waiting for a signal.

Precisely.

Activators are needed to give it the signal to release the pause and continue transcribing the rest of the gene.

It allows for very quick responses.

Clever.

And what happens when multiple activators bind near the same gene?

You often get transcriptional synergy.

The combined effect is much greater than just adding up the effects of each activator individually. It's multiplicative.

Figure 7-25.

So one plus one doesn't equal two.

It equals like 10.

Something like that.

It makes the regulatory response much steeper and more sensitive to the combination of signals.

Very powerful.
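To put rough numbers on additive versus multiplicative, here is a small arithmetic sketch. The fold-activation values are made up for illustration and do not come from the chapter; the function names are hypothetical.

```python
def additive_fold(fold_a: float, fold_b: float) -> float:
    """Combined fold activation if each activator's gain over baseline simply adds."""
    return 1 + (fold_a - 1) + (fold_b - 1)


def synergistic_fold(fold_a: float, fold_b: float) -> float:
    """Combined fold activation if the two effects multiply (transcriptional synergy)."""
    return fold_a * fold_b


# Illustrative assumption: each activator alone raises transcription 5-fold.
alone = 5.0
print("additive:   ", additive_fold(alone, alone))
print("synergistic:", synergistic_fold(alone, alone))
```

Under these assumed numbers, simple addition would give a 9-fold increase over baseline, while multiplication gives 25-fold.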

You also mentioned condensates like little workshops.

Yeah, this is a really exciting area.

It seems transcription regulators, co-activators, and even RNA polymerase itself can come together to form these dynamic, non-membrane-bound droplets called biomolecular condensates.

Figure 7-26.

What do they do?

They seem to concentrate all the necessary machinery in one place, increasing the local concentration of factors and making transcription initiation much more efficient and organized, like setting up a dedicated temporary workshop right where you need it.

Fascinating.

Okay.

What about eukaryotic repressors?

How do they work?

Are they just blocking polymerase like in bacteria?

Sometimes, but they have many more tricks up their sleeve.

Yeah.

Figure 7-27.

They might compete directly with activators for the same DNA binding site.

Okay.

Competitive binding.

Or they might bind near an activator and mask its activating surface.

Hide its functional part.

Right.

Or they can interact directly with the general transcription factors, preventing assembly of the initiation complex.

Interfering with the core machinery.

And crucially, many repressors recruit chromatin remodeling complexes to pack the chromatin up more tightly, or histone deacetylases to remove activating marks from histones.

Reversing the activating modifications.

Exactly.

And perhaps most powerfully, they can recruit histone methyltransferases that add specific methyl groups to histones, like H3K9me3 or H3K27me3, which are strong signals for forming compact, silent heterochromatin.

The really tightly packed stuff.

Yes.

And this heterochromatin can sometimes spread and become self-reinforcing, leading to very stable long-term gene silencing that's hard to reverse.

So much more diverse mechanisms than just blocking polymerase.

Definitely.

And finally, to keep all this organized.

Right.

With regulators potentially affecting genes miles away.

How do you stop them interfering with the wrong gene?

That's where insulator DNA sequences come in and also barrier sequences.

They act like boundary elements.

Figure 7-28.

Pretty much.

They often work by organizing the chromatin into loops.

This keeps a gene and its specific regulatory elements together in their own neighborhood, preventing enhancers from accidentally activating the wrong gene next door and stopping the spread of repressive heterochromatin into active regions.

So they define regulatory domains.

Exactly.

Essential for maintaining order in a complex genome.

Cell Identity and Memory: How Cells Remember Who They Are

Okay.

So we've talked about turning genes on and off, but a really critical aspect, especially in multicellular organisms like us, is cell memory.

How does a cell, once it becomes, say, a liver cell, stay a liver cell?

And how do its daughter cells also know to be liver cells after division?

That's the core question.

The cell needs to remember the specific pattern of gene expression that defines its identity and pass that memory on.

How does that work?

Can we see it in action?

A fantastic example comes from the fruit fly embryo, Drosophila, and a gene called even-skipped, or Eve.

In the very early embryo, different transcription regulators are present in gradients across the embryo.

Figure 7-29.

Like molecular coordinates.

Exactly.

Providing positional information.

Now the Eve gene has this amazing modular control region, Figure 7-30.

Different segments of this region are responsible for turning the gene on in specific, narrow stripes across the embryo.

Seven stripes in total.

So each stripe has its own control module.

Pretty much.

And each module integrates information from those gradients.

Let's take stripe 2, for example, Figures 7-31 and 7-32.

For Eve to be expressed only in that precise stripe, it needs the right concentration of specific activators, like Bicoid and Hunchback, to be present.

And it needs specific repressors, like Krüppel and Giant, to be absent or below a certain threshold within the stripe, while they are present just outside it.

Wow.

So it's reading the combination of multiple signals.

Precisely.

It's combinatorial control in action.

The cell integrates all these inputs to make a sharp on-off decision in a very specific location.

Figure 7-33.

It shows how complex patterns arise from combining these regulators.
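A toy model can make the stripe 2 logic concrete. Everything numeric in the sketch below is an assumption invented for illustration: the gradient shapes, the length scales, and the 0.2 and 0.5 thresholds are not measured values from the chapter, and the real Giant protein also has a posterior domain that this sketch ignores. Position runs from 0 (anterior) to 1 (posterior).

```python
import math

# All gradient shapes and thresholds below are invented for illustration.

def bicoid(x):     # activator, highest at the anterior end (x = 0)
    return math.exp(-x / 0.3)

def hunchback(x):  # activator, high in the anterior half
    return 1 / (1 + math.exp((x - 0.5) / 0.05))

def giant(x):      # repressor, anterior domain only in this toy model
    return 1 / (1 + math.exp((x - 0.3) / 0.03))

def kruppel(x):    # repressor, band around the middle of the embryo
    return math.exp(-((x - 0.5) / 0.08) ** 2)

def eve_stripe2_on(x):
    """Stripe 2 module: both activators high enough AND both repressors low enough."""
    activators_ok = bicoid(x) > 0.2 and hunchback(x) > 0.5
    repressors_ok = giant(x) < 0.2 and kruppel(x) < 0.2
    return activators_ok and repressors_ok

# Scan the embryo in 1% steps and report where the module is on.
positions = [i / 100 for i in range(101)]
stripe = [x for x in positions if eve_stripe2_on(x)]
print("Eve stripe 2 is on at positions:", stripe)
```

Even with these crude gradients, the AND-style combination of two activators and two repressors carves out a narrow band, bounded on one side by Giant and on the other by Krüppel.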

That's incredibly precise.

And often the regulators themselves are activated by signals coming from outside the cell.

Figure 7-34.

Like hormones.

Hormones, growth factors, signals from neighboring cells.

These external cues trigger internal signaling pathways that might lead to, say, making a new regulator, activating an existing one by phosphorylation, removing an inhibitor, or allowing a regulator to enter the nucleus.

Lots of ways to activate them.

Yeah.

And the effect depends on the cell type.

Take the mammalian alpha-globin gene, part of hemoglobin, Figure 7-35.

It has multiple enhancer modules.

They only work in red blood cell precursors because only those cells have the correct combination of other transcription regulators needed to cooperate with the factors binding those enhancers.

Ah.

So the context matters enormously.

Absolutely.

And this combinatorial power is immense.

Just a handful of transcription regulators, acting in different combinations, can generate a vast diversity of cell types.

Figure 7-36.

Like using a small alphabet to make countless words and sentences.

That's a perfect analogy.

The combination is key.

If it's all about the combination, could you, like, change the combination and change the cell type?

Can you reprogram a cell?

That's been one of the most revolutionary discoveries in modern biology.

The answer is yes.

Seriously.

Scientists have experimentally converted liver cells into functional nerve cells just by forcing them to express three key neuron-specific transcription regulators, Figure 7-37.

Wow.

Or converting fibroblasts, connective tissue cells, into muscle cells by expressing a single regulator called MyoD, or even inducing eye structures to form on the leg of a fly by mis-expressing the eyeless gene, Figure 7-38.

Eyes on the leg.

That's wild.

But perhaps the most striking example is the creation of induced pluripotent stem cells, or iPS cells, Figure 7-39.

Taking a normal adult cell, like a skin cell.

And by introducing just a few key master transcription regulators (the main ones are Oct4, Sox2, and Klf4), you can rewind its developmental clock.

Back to a stem cell state.

Back to a pluripotent state where it can potentially differentiate into any cell type in the body.

These master regulators coordinate the expression of thousands of other genes, completely resetting the cell's identity.

Figure 7-40.

That's incredible.

It really underscores the power of these regulators.

It does.

And sometimes a single regulator can coordinate the response of many genes simultaneously.

Like that glucocorticoid receptor you mentioned earlier, Figure 7-41.

Exactly.

When the hormone binds, the activated receptor can rapidly switch on a whole battery of genes, maybe metabolic enzymes in the liver during starvation.

It completes the necessary combination for activation at many different promoters.

But its effect would be different in, say, a skin cell.

Right.

Because the other regulators present in the skin cell combination are different.

Context is everything.

Okay.

So how does this memory, the cell identity, actually persist through cell division?

What are the mechanisms?

The simplest and probably most common mechanism is positive feedback.

Figure 7-42.

How does that work?

A master transcription regulator, once turned on, not only activates genes specific to that cell type, but also activates transcription of its own gene.

Ah, so it ensures its own continued production.

Exactly.

This creates a self-sustaining circuit.

When the cell divides, the regulator protein gets distributed to both daughter cells, where it continues to stimulate its own production, thus maintaining the cell identity.

Figure 7-40B shows this kind of self-sustaining loop.
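The memory effect of a positive feedback loop can be sketched with a one-variable simulation. This is a generic textbook-style toy model, not the chapter's own: the equation, the half-activation constant of 0.4, the pulse strength, and the function name are all assumptions picked so that the circuit has a stable "off" state and a stable "on" state.

```python
def simulate(pulse_duration, pulse_strength=0.5, total_time=20.0, dt=0.01):
    """Euler integration of dx/dt = signal + x^2 / (K^2 + x^2) - x.

    x is the level of a regulator that activates its own gene. The external
    signal is present only for the first pulse_duration time units.
    Returns the regulator level at the end of the simulation.
    """
    k = 0.4  # regulator level giving half-maximal self-activation (assumed)
    x = 0.0  # the regulator starts switched off
    for step in range(round(total_time / dt)):
        t = step * dt
        signal = pulse_strength if t < pulse_duration else 0.0
        self_activation = x**2 / (k**2 + x**2)
        x += dt * (signal + self_activation - x)
    return x


print("long pulse, final level: ", round(simulate(pulse_duration=5.0), 3))
print("brief pulse, final level:", round(simulate(pulse_duration=0.2), 3))
```

In this sketch, a long enough signal pushes the regulator past a threshold, and the loop then holds it on long after the signal is gone, while a very brief signal fades back to off. That persistence is the "memory".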

Clever.

Are there other circuit designs?

Yes.

Cells use various network motifs.

There are things like flip-flop devices, Figure 7-43, that can stably exist in one of two states, creating decisive, irreversible switches.

And feed-forward loops, Figures 7-44 and 7-45, that can filter out transient signals or measure the duration of a signal.

Cells are sophisticated information processors.
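Here is a minimal sketch of how a feed-forward loop can ignore a brief signal. It assumes one common arrangement, which may differ in detail from the book's figures: input X switches on an intermediate regulator Y, and the target gene Z needs both X and enough accumulated Y. The rate constants, threshold, and function name are illustrative assumptions.

```python
def z_turns_on(signal_duration, total_time=10.0, dt=0.01, y_threshold=0.5):
    """Coherent feed-forward loop with AND logic: Z needs X and accumulated Y.

    X is present for signal_duration time units. Y is produced while X is
    present and decays otherwise. Returns True if Z is ever switched on.
    """
    y = 0.0
    z_on = False
    for step in range(round(total_time / dt)):
        t = step * dt
        x = 1.0 if t < signal_duration else 0.0
        y += dt * (x - y)              # Y rises toward 1 with X, decays without it
        if x > 0 and y > y_threshold:  # AND gate: X present and Y above threshold
            z_on = True
    return z_on


print("brief signal (0.3 time units):    ", z_turns_on(0.3))
print("sustained signal (2.0 time units):", z_turns_on(2.0))
```

Because Y takes time to build up, a short blip of X ends before Y crosses the threshold, so Z never fires. Only a sustained signal gets through.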

Okay.

Positive feedback is one way to remember.

What else?

Another crucial layer, especially in vertebrates, is inherited modifications to the DNA or chromatin itself, epigenetic inheritance.

One major form is DNA methylation.

Adding a methyl group to the DNA.

Specifically, to cytosine bases.

Usually those followed by guanine, CG sequences, Figure 7-46.

And the key is how this pattern is maintained.

After DNA replication, one strand is old and potentially methylated.

The other is new and unmethylated.

An enzyme called maintenance methyltransferase recognizes these hemimethylated sites and adds a methyl group to the new strand, faithfully copying the pattern.

Figure 7-47.
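The copying step can be sketched with sets of methylated CG sites, one set per strand. The site numbers and function names are hypothetical, and the model assumes the enzyme acts on every hemimethylated site without error, which is an idealization.

```python
def replicate(duplex):
    """Semiconservative replication: each daughter keeps one old strand
    and gets one new, completely unmethylated strand."""
    old_top, old_bottom = duplex
    return (old_top, frozenset()), (frozenset(), old_bottom)


def maintenance_methylate(duplex):
    """Methylate the partner strand wherever only one strand carries the mark."""
    top, bottom = duplex
    marked = top | bottom  # hemimethylated or fully methylated CG sites
    return (marked, marked)


# Hypothetical CG sites (numbered positions) methylated on both strands.
parent_sites = frozenset({3, 17, 42})
parent = (parent_sites, parent_sites)

hemimethylated = replicate(parent)
daughters = [maintenance_methylate(d) for d in hemimethylated]
print("pattern preserved in both daughters:", all(d == parent for d in daughters))
```

Sites that were unmethylated on both parental strands stay unmethylated, so the daughters inherit the parent's pattern rather than gaining new marks.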

So the pattern gets passed down.

The methylation usually means?

Usually means gene repression.

Dense methylation can directly interfere with the binding of transcription factors, and importantly, it recruits proteins that condense chromatin, forming repressive heterochromatin.

Figure 7-48.

It helps lock genes in an off state, making vertebrate gene silencing generally less leaky than in bacteria.

Right.

You mentioned CG sequences.

Are they common?

Actually, most CGs have been lost from vertebrate genomes over evolutionary time, partly due to DNA repair mechanisms.

But there are regions called CG islands.

Figure 7-49.

CG islands?

Yeah, stretches of DNA, maybe a thousand base pairs long, that are unusually rich in CGs and have been protected from methylation, especially in the germ line.

Figure 7-50.

There are about 20,000 in the human genome.

And where are they found?

Often near the promoters of genes, especially housekeeping genes that are widely expressed.

And here's the interesting thing.

They tend to remain unmethylated, even if the gene isn't active.

Why is that significant?

Because proteins that bind to these unmethylated CG islands often help keep the chromatin in a more open, promoter-friendly state.

It allows RNA polymerase to potentially bind, even if it doesn't start transcribing yet.

It keeps the gene poised, ready for activation.

So lack of methylation at these islands is like a ready signal.

Kind of, yeah.

It facilitates potential activation.

Now, methylation is used to silence some genes very stably, including through a process called...

Genomic imprinting.

I've heard of that.

Breaks Mendelian rules, right?

It does.

It's where the expression of a gene depends on whether it was inherited from the mother or the father.

Only one copy, either paternal or maternal, is active.

Figure 7-51.

There are about 300 such genes known in humans.

How does that work?

Methylation is often key.

Take the Igf2 gene, Figure 7-52A.

On the paternal chromosome, an insulator sequence near the gene gets methylated.

This prevents an insulator-binding protein from blocking the enhancer, so the gene is active.

On the maternal chromosome, the insulator is unmethylated, the protein binds, blocks the enhancer, and the gene is silent.

Wow.

So methylation controls an insulator, which controls an enhancer.

Complex.

It is.

Other imprinted genes involve long non-coding RNAs, like the Kcnq1 gene example, Figure 7-52B.

Okay, that's silencing one copy of a gene.

What about silencing a whole chromosome?

Ah, you're thinking of X inactivation, a truly dramatic example of epigenetic control.

In female mammals, right?

To balance gene dosage with males.

Exactly.

Females have two X chromosomes, males have one X and one Y.

To equalize the expression levels of X-linked genes, one of the two X chromosomes in each female somatic cell is randomly chosen and transcriptionally silenced early in development.

Figure 7-54.

Randomly.

So different cells silence different Xs.

Yep.

This leads to mosaicism.

The classic example is the coat pattern of calico cats.

Different patches of cells have inactivated different X chromosomes carrying different coat color genes.

Figure 7-53B.

How is an entire chromosome silenced?

It's orchestrated by a remarkable long non-coding RNA called Xist.

Figure 7-55.

This RNA is transcribed from the X chromosome that is destined for inactivation, and it literally coats that entire chromosome.

Paints the chromosome.

In a way, yes.

And this Xist coat then recruits a whole host of enzymes that modify histones, methylate DNA, and induce the chromosome to fold into a highly compact, transcriptionally inert structure called a Barr body.

Incredible.

Is every single gene silenced?

Interestingly, no.

About 15 to 20 percent of genes on the inactive X actually escape inactivation.

It seems they reside in regions that form loops extending out from the condensed core of the inactive X, keeping them accessible for transcription.

So even large-scale silencing has nuances.

Absolutely.

So pulling this all together, epigenetic inheritance refers to these heritable changes in phenotype that don't involve changes in the actual DNA sequence.

And the mechanisms we've seen are summarized in Figure 7-56.

We've got positive feedback loops involving transcription factors acting in trans.

We have inherited histone modifications acting in cis, inherited DNA methylation, also cis, and even self-propagating protein aggregation states like prions acting in trans.

A whole toolkit for cellular memory.

Exactly.

These mechanisms allow for stable cell identities crucial for building and maintaining complex multicellular organisms.

They create heritable states distinct from transient, everyday fluctuations in gene expression.

Beyond Transcription: Post-Transcriptional Controls

OK, so we spent a lot of time on transcriptional control because it's often paramount.

But the regulation doesn't stop once transcription begins.

Right.

You mentioned those other steps.

Figure 7-57, controls after transcription starts.

Exactly.

Post-transcriptional controls.

Every step from the initial RNA transcript to the final active protein can be and often is regulated.

A single gene might be subject to multiple layers of control.

OK, what's an example after transcription starts, but before the RNA is finished?

One mechanism is transcription attenuation.

This is where transcription starts normally, but then prematurely halts before the whole gene is copied.

It just stops part way through.

Why?

It's another way to regulate output.

A great example is the HIV virus.

Transcription of its genome by the host cell's RNA polymerase II tends to terminate early.

So it wouldn't make the full viral RNA?

Right.

But HIV produces a protein called Tat, which binds to a specific hairpin loop structure that forms in the nascent viral RNA.

This binding prevents the premature termination, allowing the polymerase to transcribe the entire viral genome.

Very clever hijacking of the host system.

Wow.

OK, what about controls built into the RNA itself?

Yes.

Riboswitches.

These are fantastic.

They're short sequences within an RNA molecule, often in the untranslated regions, that can directly bind a small molecule like a metabolite.

Figure 7-58.

The RNA itself binds the signal.

No protein needed.

Exactly.

And when the small molecule binds, the RNA changes its 3D shape.

This conformational change then affects gene expression.

It might block transcription termination or might block the ribosome binding site, preventing translation.

So the RNA senses the environment and regulates itself.

Pretty much.

They're very economical.

No need to make a regulatory protein.

And their existence lends strong support to the RNA world hypothesis.

The idea that RNA played a much more central catalytic and regulatory role early in evolution.

Makes sense.

OK, what about after the RNA is made, but before translation? Splicing?

Alternative RNA splicing is a huge deal in eukaryotes.

Figure 7-59.

A single gene with multiple exons and introns can be spliced in different ways to produce multiple distinct mRNA molecules.

And therefore multiple different proteins from one gene.

Exactly.

It massively expands the proteome, the repertoire of proteins without needing a correspondingly huge number of genes.

Get this.

The Dscam gene in Drosophila can potentially produce 38,000 different protein variants through alternative splicing.

Figure 7-60.

38,000 from one gene.

That's insane.

It is.

It allows for incredible molecular diversity, crucial for things like neuronal wiring.
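To see where a number like 38,000 can come from, here is a minimal arithmetic sketch. It assumes the commonly cited layout of Dscam, with four clusters of mutually exclusive exons containing 12, 48, 33, and 2 alternatives; those cluster sizes are not stated in this overview, so treat them as an outside assumption.

```python
from math import prod

# Assumed cluster sizes for Drosophila Dscam (not given in the audio overview):
# each mature mRNA carries exactly one alternative from each cluster.
ALTERNATIVES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def isoform_count(clusters):
    """Choices in different clusters are independent, so the counts multiply."""
    return prod(clusters.values())


total = isoform_count(ALTERNATIVES)
print(total)  # 38016
```

Under those assumptions the product is 38,016, which is the figure usually rounded to 38,000.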

And splicing isn't always constitutive.

It can be regulated.

Figure 7-61.

Regulatory proteins can bind near splice sites and either block or enhance the use of that site, ensuring the correct version of the protein is made in the right cell or at the right time.

So cells can control which version gets made.

Yes.

And there's even back-splicing, which creates stable circular RNAs whose functions we're still actively exploring, maybe acting as sponges for miRNAs or as scaffolds.

RNA is just full of surprises.

What about the other end of the mRNA, the 3′ end?

That can be regulated too.

The specific site where the pre-mRNA is cleaved and the poly-A tail is added can be controlled.

This can alter the C-terminus of the encoded protein.

Figure 7-62.

How?

Often by changing the concentration or activity of general RNA processing factors.

A classic example is B lymphocytes switching from making a membrane-bound form of antibody to a secreted form.

The difference lies in where the RNA is cleaved and polyadenylated at the 3′ end.

Same gene, different processing, different protein location.

Precisely.

And the mRNA itself can be chemically modified.

More than just splicing and capping.

Oh, yeah.

There are over a hundred known covalent modifications to mRNA bases.

Figure 7-63.

One of the most studied is N6-methyladenosine, m6A.

Methylation again, but on RNA this time.

Yes.

It's dynamically added and removed by specific enzymes, writers and erasers.

This m6A mark can influence mRNA structure, stability, splicing, or even promote its translation by recruiting specific reader proteins.

It's like another layer of code.

A hidden epitranscriptome.

Exactly.

And even more dramatic is RNA editing.

Editing?

Like changing the sequence after it's been copied from the DNA?

Yes.

Altering the actual nucleotide sequence of the mRNA, which changes the coded message.

Common types involve deamination, changing adenosine to inosine, A to I, or cytidine to uridine, C to U.

Doesn't that change the protein?

It absolutely can.

A famous example is an ion channel in the brain where A to I editing changes a single codon, swapping one amino acid for another.

This tiny change is essential for normal brain function.

Another example is the apolipoprotein B gene, Figure 7-65.

C-to-U editing creates a premature stop codon in gut cells, producing a shorter protein needed for fat absorption, while the liver makes the full-length version needed for lipoprotein transport.
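A small sketch can make the logic of that edit concrete. It uses a made-up toy reading frame, not the real apolipoprotein B sequence, and shows how deaminating one C to U turns a glutamine codon, CAA, into the stop codon UAA, so the ribosome finishes early. The function names are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(mrna, position):
    """Deaminate the cytidine at a 0-based position, turning it into uridine."""
    if mrna[position] != "C":
        raise ValueError("C-to-U editing needs a cytidine at the target position")
    return mrna[:position] + "U" + mrna[position + 1:]


def codons_until_stop(mrna):
    """Read codons from position 0 and return those before the first stop codon."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Toy open reading frame (hypothetical, for illustration only).
toy_mrna = "AUGGCUCAAGGUUUCUAA"
edited = edit_c_to_u(toy_mrna, 6)  # the C of the CAA codon becomes U

print(codons_until_stop(toy_mrna))  # unedited: full-length product
print(codons_until_stop(edited))    # edited: truncated product
```

The unedited message is read through five codons, while the edited one stops after two, which mirrors the long liver form and the short gut form described above.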

So it creates protein diversity from the same gene, like splicing, but by changing the code itself. Why?

Good question.

It might correct mistakes in the DNA, generate useful protein variants, or maybe it even evolved as a defense against viruses.

It's still an area of active research.

OK, so the mRNA is processed, maybe edited, then it has to get out of the nucleus.

Is that controlled?

Yes.

RNA transport is regulated.

Most RNA synthesized in the nucleus actually never makes it out.

It gets degraded.

Only properly processed mRNAs are typically exported.

How is that controlled?

Through the nuclear pore complex, which checks for the right processing signals.

But again, viruses like HIV have tricks.

HIV produces a protein called Rev, Figures 7-66 and 7-67.

We met Tat earlier.

Right.

Rev binds to unspliced or partially spliced viral RNAs, which normally wouldn't be exported because they still contain introns, and mediates their transport out of the nucleus.

This is crucial for the virus to make its structural proteins and package new viral genomes.

It also helps time the production of different viral proteins.

Clever again.

OK, mRNA, it's in the cytoplasm.

Does it matter where it goes?

Absolutely.

mRNA localization is critical, Figure 7-68.

Many mRNAs are actively transported and anchored to specific locations in the cytoplasm before they are translated.

Why?

To ensure proteins are synthesized precisely where they are needed.

This is vital in development for creating asymmetries in the embryo and essential for large polarized cells like neurons, where specific proteins need to be made at distant synapses.

How does the mRNA know where to go?

Usually signals in its 3′ untranslated region, the 3′ UTR, act like a molecular ZIP code that's recognized by motor proteins or anchoring proteins, Figure 7-69.

OK, so it's arrived at the right spot.

Now, translation, is that always automatic?

Nope.

Translational control is another major regulatory point.

Bacteria often control it by regulating access to the Shine-Dalgarno sequence needed for ribosome binding, Figure 7-70.

Blocking or unblocking the ribosome's landing pad?

Pretty much.

Eukaryotes have more complex mechanisms.

There's global control, for instance, under stress conditions like starvation or viral infection.

The cell can phosphorylate an initiation factor called eIF2, Figure 7-71.

This dramatically slows down overall protein synthesis globally.

Saving energy during tough times.

Exactly.

But there's also control of specific mRNAs.

Eukaryotes can use things like upstream open reading frames, uORFs, in the 5′ UTR that can regulate translation of the main protein-coding sequence downstream, or internal ribosome entry sites, IRESs, Figure 7-72, that allow ribosomes to initiate translation in the middle of an mRNA, bypassing the normal 5′ cap requirement, something viruses often exploit.

So lots of ways to control which proteins get made from the available mRNAs.

And finally, once translated, the lifetime of the mRNA itself is controlled: mRNA stability.

How long does an mRNA last?

It varies hugely.

Bacterial mRNAs are typically very short lived, maybe minutes.

Eukaryotic mRNAs range from minutes, for example for growth factors, allowing rapid changes, to hours or even days, for example for stable proteins like beta-globin.

What determines the lifespan?

The main pathway involves the gradual shortening of the poly-A tail at the 3′ end.

Think of it like a timer.

Once the tail gets short enough, the mRNA is rapidly degraded, either by removing the 5′ cap and degrading from that end, or by continuing degradation from the 3′ end, Figure 7-73.

Sequences in the 3′ UTR are critical here, often binding proteins that either stabilize or destabilize the mRNA.

Can you give an example?

A beautiful one is iron regulation, Figure 7-74.

When iron levels are low, a protein called aconitase binds to specific sequences in the mRNAs for ferritin, an iron storage protein, and the transferrin receptor, an iron import protein.

Binding to ferritin mRNA blocks its translation, while binding to transferrin receptor mRNA stabilizes it, preventing its degradation.

So low iron means less storage protein made, more import protein made.

Makes sense.

Exactly.

When iron levels rise, iron binds aconitase, causing it to release the mRNAs, reversing the effects.
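The switch just described can be written as a tiny truth table. This is only a sketch of the logic as stated in the conversation, with a hypothetical function name and a yes-or-no iron level standing in for what is really a graded response.

```python
def iron_response(iron_is_low: bool) -> dict:
    """Summarize the fate of the two mRNAs for a given iron state."""
    # As described above: aconitase sits on both mRNAs only when iron is low;
    # when iron rises, iron binds aconitase and it lets go of the mRNAs.
    aconitase_bound = iron_is_low
    return {
        "aconitase_bound_to_mrna": aconitase_bound,
        "ferritin_translation": "blocked" if aconitase_bound else "active",
        "transferrin_receptor_mrna": "stabilized" if aconitase_bound else "degraded",
    }


for iron_is_low in (True, False):
    label = "low iron" if iron_is_low else "high iron"
    print(label, iron_response(iron_is_low))
```

One binding event gives opposite outcomes on the two messages: less storage and more import when iron is scarce, and the reverse when it is plentiful.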

It's elegant feedback.

And where does all this happen?

You mentioned P bodies.

Right.

These mRNA fates, storage and degradation, often occur within specialized cytoplasmic mRNA condensates.

P bodies, processing bodies, are sites where translationally repressed mRNAs can accumulate, either to be eventually degraded or stored for later reactivation.

Figures 7-75 and 7-76.

Storage depots.

Kind of.

And then there are stress granules, which form rapidly when translation is globally inhibited due to cellular stress.

They seem to be temporary holding sites for stalled translation initiation complexes and RNAs, keeping them safe until conditions improve and translation can resume.

So even after transcription, the cell has this incredibly rich toolkit for controlling every subsequent step.

Absolutely.

It's layers upon layers of regulation.

The unseen conductors: regulation by non-coding RNAs.

OK, so far we've mostly focused on DNA and proteins and the mRNAs that code for proteins.

But there's a whole other universe of regulation involving RNAs that don't code for proteins, the non-coding RNAs, ncRNAs.

We've mentioned a few, like ribosomal RNA, transfer RNA, Xist.

Exactly.

Those are some of the well -known abundant ones.

But what's truly revolutionized biology in recent decades is the discovery of thousands of other ncRNAs with critical regulatory roles.

And some are even enabling amazing new technologies.

Like gene editing.

That's one major application stemming from understanding these systems.

A huge area is RNA interference, or RNAi.

Figure 7-77.

What's the basic idea?

It's a process where small RNA molecules, typically 20 to 30 nucleotides long, act as guides to target other nucleic acid molecules, usually RNAs, for silencing.

Small RNAs controlling other RNAs.

Right.

There are three main classes of these small guide RNAs.

MicroRNAs, miRNAs; small interfering RNAs, siRNAs; and Piwi-interacting RNAs, piRNAs.

OK, let's break those down.

MicroRNAs.

miRNAs are tiny RNAs encoded by our own genome.

Humans have over a thousand different ones, and it's estimated they regulate the activity of at least half of all our protein coding genes.

Half?

That's massive.

How do they work?

They start as longer precursor RNAs.

They get processed down to the mature miRNA.

This miRNA then gets loaded into a protein complex called RISC, the RNA-induced silencing complex, which includes a key protein called Argonaute.

Figures 7-78 and 7-79.

The miRNA within RISC then guides the complex to target mRNAs through base pairing, usually in the 3′ UTR.

And what happens then?

If the pairing is near-perfect, which is rare in animals, RISC cleaves the target mRNA, destroying it.

More commonly in animals, the pairing is less extensive.

This typically leads to translational repression.

The ribosome is blocked, and eventually the mRNA gets destabilized and degraded, often by being shuttled to those P bodies.

So they mostly fine-tune expression levels downwards.

Exactly.

They act like rheostats dampening gene output.

And a single miRNA can often target hundreds of different mRNAs, while a single mRNA can be targeted by multiple different miRNAs, allowing for complex combinatorial control.
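Here is a rough sketch of how partial base pairing lets one small RNA find sites in a 3′ UTR. It assumes the widely used "seed" convention, in which nucleotides 2 through 8 of the miRNA do most of the pairing; that convention is not mentioned in this overview, and the sequences and function names below are invented for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the RNA strand that would pair with the given strand."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, utr, seed_start=1, seed_end=8):
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    The seed is taken as miRNA nucleotides 2-8 (an assumed convention).
    """
    site = reverse_complement(mirna[seed_start:seed_end])
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Hypothetical miRNA and 3' UTR, made up for this example.
toy_mirna = "UACGGAUCCGAUUAGCAUGGCA"
toy_utr = "AAGCGAUCCGUUUACCGAUCCGUAA"

print(seed_sites(toy_mirna, toy_utr))  # [4, 16]
```

Because only a short stretch has to match, the same seed can turn up in many unrelated UTRs, and one UTR can carry sites for several different seeds, which is the combinatorial control mentioned above.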

Wow.

Okay.

What about siRNAs, small interfering RNAs?

siRNAs are primarily involved in cellular defense, particularly against viruses and transposable elements, jumping genes, that produce double-stranded RNA, dsRNA, during their life cycle.

dsRNA is the trigger.

Yes.

An enzyme called Dicer chops up any long dsRNA it finds into short 22-nucleotide siRNA duplexes.

These siRNAs are then loaded into RISC, just like miRNAs.

The siRNA guides RISC to find and destroy any other RNA molecules in the cell that have a complementary sequence, typically the viral or transposon RNA.

The system can even be amplified, making it a very potent defense.

Like a cellular immune system against foreign RNA.

That's a great analogy.

In plants, this RNAi response can even spread systemically, making the whole plant resistant.

Amazing.

Does RNAi only work on RNA?

Can it affect DNA or transcription?

Ah, yes, it can.

siRNAs can also guide complexes to modify chromatin and induce transcriptional gene silencing.

Oh.

In some organisms, siRNAs associate with Argonaute and other proteins to form a complex called RITS, RNA-induced transcriptional silencing.

RITS uses an siRNA to find complementary sequences in nascent RNA transcripts as they're being made at the DNA level, Figure 7-80.

While the gene is being transcribed.

Exactly.

And once targeted, RITS recruits enzymes that modify the nearby histones, adding repressive marks like H3K9me3, and sometimes even methylate the DNA itself.

This effectively converts the gene's locus into silent heterochromatin.

So RNAi can lead to long -term silencing at the DNA level too.

Yes.

It's crucial for controlling transposons and maintaining the silenced structure of important regions like centromeres.

Okay.

miRNAs, siRNAs. What was the third type?

piRNAs.

Piwi-interacting RNAs, piRNAs.

These are slightly longer than miRNAs and siRNAs, and are most prominent in the germ line, the cells that give rise to sperm and eggs.

Protecting the next generation.

Precisely.

Their main job seems to be silencing transposable elements in germ cells, preventing these jumping genes from causing mutations that would be passed on.

This is especially critical during stages when other epigenetic silencing marks are temporarily erased.

They work with Piwi proteins, a subclass of Argonaute proteins, to both cleave transposon mRNAs and promote repressive chromatin formation at transposon DNA sequences.

A dedicated germ line defense force.

You could say that.

And like RNAi, these small RNA pathways are not just fascinating biology.

They're incredibly useful tools.

For research.

Yes.

Scientists routinely use synthetic siRNAs to deliberately knock down the expression of specific genes to figure out what they do.

And as we mentioned, there's huge therapeutic potential.

RNAi-based drugs are already approved for treating certain genetic diseases.

So understanding these natural pathways opened up new technologies.

Absolutely.

And RNAi isn't the only defense.

Cells also have sequence-specific DNA-binding proteins, like the KRAB-ZFP family, that directly recognize transposon sequences in the genome and recruit silencing machinery to shut them down.

Multiple layers of defense.

Always.

And speaking of defense systems turned into tools.

CRISPR.

Ah, the gene editing revolution.

That came from bacteria.

It did.

CRISPR-Cas is fundamentally an adaptive immune system in bacteria and archaea against invading viruses, the bacteriophages.

How does it work in bacteria?

When a bacterium survives a viral infection, it incorporates small pieces of the viral DNA into a specific region of its own genome called the CRISPR locus.

This locus acts like a memory bank of past infections.

Like vaccination cards.

Sort of.

The CRISPR locus is then transcribed into RNA, which gets processed into small CRISPR RNAs, crRNAs.

Each crRNA contains a sequence matching a past invader.

These crRNAs then associate with Cas, CRISPR-associated, proteins, often a nuclease like Cas9.

And this complex searches for?

It searches the cell for DNA sequences matching the crRNA guide.

If it finds a match, usually in an invading viral genome, the Cas protein cuts the viral DNA, neutralizing the threat.

An RNA-guided DNA missile.

Exactly.

And the profound insight was realizing this system could be reprogrammed.

Scientists figured out how to design custom guide RNAs to target almost any DNA sequence in any organism and use the Cas enzyme to make a precise cut at that location.
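As a rough sketch of what "finding the match" means for a custom guide, the snippet below scans one DNA strand for the guide's sequence. It adds one detail that the conversation leaves out and that should be read as an outside assumption: the commonly used Cas9 from Streptococcus pyogenes also needs a short NGG motif, the PAM, right after the matching site. The sequences and the function name are invented for illustration.

```python
import re


def find_cas9_sites(dna, guide):
    """Return 0-based starts where the guide sequence is followed by an NGG PAM.

    Only the given strand is scanned; a fuller search would also check the
    reverse complement. The NGG rule is an assumption specific to one Cas9.
    """
    # A lookahead is used so that overlapping candidate sites are not skipped.
    pattern = re.compile(f"(?={re.escape(guide)}[ACGT]GG)")
    return [match.start() for match in pattern.finditer(dna)]


# Hypothetical 20-nucleotide guide and target DNA.
toy_guide = "GACGTTAGCCTAGGATCCAA"
toy_dna = "TTT" + toy_guide + "TGG" + "ACGT" + toy_guide + "TAA" + "CC"

print(find_cas9_sites(toy_dna, toy_guide))  # [3]
```

The guide sequence occurs twice in the toy DNA, but only the first copy sits next to an NGG motif, so only that one is reported as a cut site.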

Which the cell then repairs, allowing for edits.

Precisely.

It's revolutionized genetic engineering, all from studying a bacterial immune system.

Incredible.

Okay.

One last category, long non-coding RNAs, lncRNAs.

You called them the dark matter.

Yeah, because for a long time, we didn't know what most of them did.

These are defined as ncRNAs longer than 200 nucleotides.

And there are thousands of them, maybe 5,000 or more in humans.

Figure 7-82.

More than protein coding genes.

No, wait, fewer, but still a lot.

Still a surprisingly large number.

And their functions are incredibly diverse and many are still unknown.

But we're starting to see some common themes or modes of action.

Such as?

One major role is acting as scaffolds.

They can bind multiple proteins and hold them together in a specific arrangement to form a functional complex.

Think of the telomerase RNA component, or Xist coating the X chromosome.

Okay.

Architectural roles.

Another is acting as guides.

They can use base pairing to bind to specific DNA or RNA targets, thereby recruiting the proteins bound to the lncRNA to those specific locations.

Bringing the right machinery to the right place.

Exactly.

And a third emerging role is as organizers of cellular structures, particularly biomolecular condensates.

Some lncRNAs seem to be crucial for seeding or maintaining the structure of compartments like the nucleolus or P bodies.

Helping to form those dynamic workshops.

Seems so.

There are other roles too.

Some act as antisense regulators.

Some act as sponges that bind and sequester miRNAs.

They can act locally near where they're transcribed, in cis, or diffuse and act elsewhere, in trans.

It's a really dynamic and complex field, a major frontier in understanding genome regulation.

Definitely sounds like it.

Outro.

Wow.

Okay.

That was quite the journey.

An incredible deep dive into gene expression control.

It really covers a lot of ground, doesn't it?

From the basic switches to these elaborate epigenetic and RNA based networks.

We've gone from transcription factors, literally reading DNA grooves through chromatin remodeling, cell memory, reprogramming.

All the way to RNA interference, CRISPR, and the mysteries of long non -coding RNAs.

It really highlights how cells, starting with that identical genetic blueprint, manage to orchestrate such vast differences in what they look like and what they do, that constant, intricate symphony.

It is.

And what strikes me is the sheer sophistication of it all.

These interlocking control networks essentially allow a single cell to compute its behavior, integrating signals from its past and its present environment.

Making decisions based on complex information.

Exactly.

And yet even with all this detailed knowledge about the parts list and many of the mechanisms, we're really still just beginning to understand how it all comes together to specify a complete cell type, let alone build an entire organism like us from a single fertilized egg.

There's still so much more to learn about how the whole system integrates.

Absolutely.

The complexity suggests a depth of biological programming that frankly still challenges our full comprehension.

It's humbling, really.

So maybe the next time you just sit and think or, you know, digest your lunch, take a moment to appreciate the silent, unbelievably complex symphony playing out inside every single one of your cells.

That constant adaptation, that memory, the orchestration that makes life possible.

We really hope this deep dive has given you a fresh appreciation for the molecular marvels that make you, well, you.

It's been fascinating to explore with you.

Thank you so much for joining us on this exploration of gene control.

Keep learning, keep questioning, and we'll see you next time on the deep dive.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary: What this audio overview covers

Gene regulation at the molecular level involves a sophisticated interplay of proteins, chromatin dynamics, and chemical modifications that enable cells to express different genes despite possessing identical genetic material. Transcription factors recognize and bind to specific DNA sequences within promoters and enhancers, functioning as molecular switches that either activate or block transcription of target genes. These regulatory proteins rarely act in isolation; instead, they assemble into multi-protein complexes that integrate signals from diverse cellular pathways, allowing cells to respond appropriately to changing environmental and developmental cues. The physical accessibility of DNA plays a critical role in determining which genes can be transcribed, as nucleosomes and higher-order chromatin structures can effectively silence genes by blocking transcription factor access. Histone modifications and DNA methylation patterns establish stable epigenetic states that persist through cell divisions, enabling cells to maintain their identity while still retaining the ability to activate or silence genes when biological circumstances demand it. Beyond transcriptional regulation, cells employ multiple post-transcriptional mechanisms to fine-tune protein production, including selective removal of introns through alternative splicing, controlled degradation of messenger RNA molecules, spatial sequestration of transcripts to specific cellular locations, and suppression of gene expression through microRNA-mediated pathways. Regulatory elements such as enhancers, silencers, and insulators function as information integration hubs that coordinate expression across large genomic distances, often with the assistance of mediator proteins that physically bridge regulatory proteins and RNA polymerase. 
Feedback mechanisms and genetic switches generate robust yet flexible control systems that allow cells to adopt distinct developmental pathways and maintain stable phenotypic states. Master regulatory proteins can reprogram cell fate, as exemplified by transcriptional networks that govern developmental processes and the mechanisms underlying induced pluripotent stem cell technology. Transcriptional memory systems ensure that once developmental decisions are made, they remain stable across numerous cell divisions, providing long-term cellular memory without requiring continuous signaling input.
