Chapter 12: Gene Transcription and RNA Modification
Welcome to Last Minute Lecture.
This free chapter overview is designed to help students review and understand key concepts.
These summaries supplement, not replace, the original textbook and may not be redistributed or resold.
For complete coverage, always consult the official text.
Welcome to the Deep Dive.
Today we're diving into something pretty fundamental, how life actually, you know, uses its blueprints.
Yeah, we all hear about DNA, the big instruction manual.
Right, but the real action, the dynamic part, is how those instructions get read and turned into, well, into us.
Exactly.
So today we're exploring gene transcription and RNA modification.
We're pulling key ideas from chapter 12 of Brooker's Genetics, seventh edition.
And our goal here isn't just to list facts, but to really get a feel for the mechanisms, the clever experiments,
the way information actually flows from DNA out into the cell.
Making it accessible, hopefully.
It all starts with what they call the central dogma, right?
The basic idea, DNA makes RNA, RNA makes protein.
And transcription is that critical first step, making an RNA copy from the DNA template.
Crucially, without changing the original DNA. It's like photocopying a page from a rare book: you use the copy and leave the original safe.
Okay, so let's break down that DNA blueprint.
What exactly is a gene at the molecular level?
Well, simply put, it's a stretch of DNA that codes for a functional product.
That could be an RNA molecule itself, or, you know, a polypeptide that becomes a protein.
It's a specific instruction set.
Yeah.
And it needs signals, right?
Like punctuation.
Absolutely.
You've got the promoter, which is basically the start here signal for transcription.
Okay.
And then the terminator, which, well, says stop here.
Simple enough.
Start, stop.
But biology is never that simple, is it?
Huh?
Rarely.
There are also regulatory sites.
Think of these less like on-off switches and more like dimmer switches.
Ah, controlling how much transcription happens.
Exactly.
Or when it happens. They allow for really fine-tuned control, which is essential.
And when the RNA is made, it reads only one side of the DNA helix.
That's right.
One strand is the template strand.
That's the one the RNA polymerase actually reads.
And the other?
That's the non-template strand, sometimes called the coding strand, because its sequence looks almost exactly like the RNA sequence.
Oh, right.
Except with T instead of U.
Exactly.
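To make the strand relationship concrete, here is a minimal Python sketch. The example sequence and the function names (`coding_from_template`, `transcribe`) are invented for illustration and do not come from the chapter; the sketch only shows that the RNA matches the coding strand with U in place of T.

```python
# Illustrative only: sequence and function names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def coding_from_template(template_3to5: str) -> str:
    """Return the coding (non-template) strand, written 5'->3',
    for a template strand written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)

def transcribe(template_3to5: str) -> str:
    """Build the RNA 5'->3' by pairing against the template,
    using U wherever DNA would use T."""
    return coding_from_template(template_3to5).replace("T", "U")

template = "TACGGATTC"                     # template strand, read 3'->5'
coding = coding_from_template(template)    # coding strand, 5'->3'
rna = transcribe(template)                 # RNA transcript, 5'->3'
print(coding, rna)
```

Swapping every U in `rna` back to T gives the coding strand again, which is why that strand gets its name.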
So this whole process, transcription, it happens in stages.
Yep.
Three universal stages, whether you're a bacterium or a human.
Initiation, getting started.
Elongation, building the RNA chain.
And termination, stopping.
And proteins drive this.
RNA polymerase is the main enzyme, I guess.
That's the key player, yes.
Along with various transcription factors, proteins that help recognize the signals and manage the process.
It sounds so neat now.
But figuring this out must have been a journey.
How did we even realize DNA was the template for RNA?
Especially back in the day.
Oh, it's a great story.
Back in 1956, Volkin and Astrachan were looking at bacteria infected with viruses.
Okay.
And they noticed the new viral RNA being made had a base composition remarkably similar to the viral DNA, not the bacterial DNA.
Ah, a huge clue.
It pointed to DNA as the template for RNA.
Then Jacob and Monod had this, well, incredible insight.
The mRNA hypothesis.
Exactly.
They proposed this idea of a messenger RNA carrying the code from DNA to the ribosomes.
And this was before anyone had actually isolated mRNA.
Wow.
That's some serious scientific foresight.
Truly remarkable.
It really set the stage for understanding gene expression.
So focusing on bacteria first, those promoters, the start signals, there's a numbering system.
Yeah.
The spot where transcription actually starts is labeled plus one.
Anything before that upstream gets negative numbers like negative 10 or negative 35.
And those numbers aren't random, are they?
The negative 10 and negative 35 spots are important.
Extremely important.
They contain specific sequences.
The negative 35 sequence is often TTGACA.
And the negative 10 sequence, the famous Pribnow box, is typically TATAAT.
And these are like ideal sequences, the consensus.
Yes.
The most common version found across many genes.
And here's what's cool.
The closer a gene's promoter is to that consensus.
The better transcription works.
Generally, yes.
Stronger binding, faster initiation.
Experiments, like with the lac operon promoter, showed that if you mutate the promoter to be less like the consensus, transcription slows right down.
Shows how specific these interactions are.
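The idea that promoters closer to the consensus generally work better can be sketched as a simple match count. This is a rough illustration under stated assumptions: the two example promoters are hypothetical, the function names are invented, and a plain match count ignores things that matter in real promoters, such as the spacing between the two boxes.

```python
# Consensus sequences named in the discussion above.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"

def matches(site: str, consensus: str) -> int:
    """Count positions where the site agrees with the consensus."""
    return sum(1 for a, b in zip(site, consensus) if a == b)

def promoter_score(box35: str, box10: str) -> int:
    """Crude similarity score: total matches across both boxes (max 12)."""
    return matches(box35, CONSENSUS_35) + matches(box10, CONSENSUS_10)

# Hypothetical promoters, for illustration only.
strong = promoter_score("TTGACA", "TATAAT")   # perfect consensus
weak = promoter_score("TTTACA", "TATGTT")     # three mismatches
print(strong, weak)
```

A higher score here stands in for "more like the consensus"; it is not a measured transcription rate.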
And the machine reading this is the RNA polymerase holoenzyme.
Right.
In bacteria, it's made of the core enzyme, the part that actually links the RNA nucleotides together, and this crucial extra piece called the sigma factor.
And sigma factor is the guide dog, finding the promoter.
That's a great way to put it.
Its main job is promoter recognition.
It has a specific shape, a helix-turn-helix motif, that lets it latch onto those negative 35 and negative 10 sequences really specifically.
So initiation starts with the holoenzyme kind of scanning the DNA.
Loosely, yeah.
Then sigma factor spots the promoter and it clamps down tighter.
That's the closed complex.
Closed because the DNA is still double stranded.
Exactly.
To start making RNA, you need to unwind it.
This happens mostly at the negative 10 region.
Because it's TATAAT, lots of A's and T's.
You got it.
AT pairs only have two hydrogen bonds versus three for GC pairs, so they're easier to pull apart.
That creates the open complex.
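The hydrogen-bond arithmetic behind that point is easy to check. The sketch below is a simplification: it counts only two bonds per AT pair and three per GC pair, as described above, and ignores base stacking, which also contributes to helix stability. The GC-rich comparison sequence is made up.

```python
# Hydrogen bonds per base pair, keyed by the base on one strand.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def hydrogen_bonds(seq: str) -> int:
    """Total hydrogen bonds holding this stretch of duplex DNA together."""
    return sum(H_BONDS[base] for base in seq.upper())

pribnow = hydrogen_bonds("TATAAT")   # the negative 10 consensus
gc_rich = hydrogen_bonds("GCGCGC")   # hypothetical GC-rich stretch, same length
print(pribnow, gc_rich)
```

For six base pairs, the AT-only box has a third fewer hydrogen bonds to break than the GC-only stretch.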
And then RNA synthesis starts.
A short piece gets made and then, interestingly, the sigma factor usually pops off.
That's the signal that we're moving from initiation to the next stage.
Elongation.
Okay, so elongation.
The core enzyme is now chugging along the DNA.
Yep.
Sliding along, keeping a bubble of about 17 base pairs of DNA unwound.
And it's building the RNA chain always in the five prime to three prime direction.
Adding nucleotides that match the template strand.
With U in place of T, and G pairing with C.
Correct.
And, you know, something quite neat is that on a single chromosome, different genes might use different DNA strands as the template.
Oh, really?
So the polymerase can read left to right on one gene and right to left on another, depending on how the promoter is oriented.
Precisely.
It adds flexibility.
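One way to picture genes running in opposite directions is with a reverse complement. In this hedged sketch, the chromosome fragment is invented and `reverse_complement` is just a helper name; it shows what the transcript would look like depending on which strand serves as the template.

```python
def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Hypothetical stretch of the top strand, written 5'->3'.
top = "ATGCCTAAG"

# Gene read left to right: the bottom strand is the template,
# so the RNA looks like the top strand with U for T.
rna_left_to_right = top.replace("T", "U")

# Gene read right to left: the top strand is the template,
# so the RNA looks like the bottom strand (5'->3') with U for T.
rna_right_to_left = reverse_complement(top).replace("T", "U")

print(rna_left_to_right, rna_right_to_left)
```

Either way the RNA is built five prime to three prime; only the choice of template strand changes.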
Okay, so it's elongating.
How does it know when to stop?
Termination, you mentioned two ways in bacteria.
All right.
First is Rho-dependent termination.
This needs a helper protein, the Rho protein.
What does Rho do?
It's a helicase, basically an unwinder.
It recognizes a specific sequence on the growing RNA called the rut site.
As the RNA gets longer, sometimes it folds into a stem loop structure.
This can make the RNA polymerase pause.
Giving Rho time to catch up.
Exactly.
Rho travels along the RNA, catches up to the paused polymerase, and then unwinds the RNA from the DNA template, releasing the transcript.
Clever.
And the other way, Rho-independent.
Also called intrinsic termination.
This one doesn't need Rho.
The RNA sequence itself does the trick.
Again, the RNA forms a stem loop, which causes the polymerase to pause.
But right after the stem loop, the RNA sequence is really rich in uracils.
Lots of U's pairing with A's in the DNA template.
Right.
And those AU pairs are weak.
Remember, only two hydrogen bonds.
So while the polymerase is paused, the weak connection in that U-rich region just isn't strong enough to hold the RNA on.
Falls off.
Pretty much.
It spontaneously dissociates.
There's also a protein, NusA, that seems to help stabilize that pause, making termination more efficient.
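The two features of an intrinsic terminator, a stem loop followed by a run of U's, can be searched for with a toy scanner. This is only a sketch under loud assumptions: it requires a perfectly paired stem of fixed length, the thresholds (`stem`, `loop_range`, `min_u`) are arbitrary, and the example transcripts are invented. Real prediction tools score folding energy instead.

```python
def rna_reverse_complement(rna: str) -> str:
    """Sequence that would base-pair with `rna`, written 5'->3'."""
    return rna.translate(str.maketrans("ACGU", "UGCA"))[::-1]

def find_intrinsic_terminator(rna, stem=5, loop_range=(3, 8), min_u=4):
    """Return the index where the U-rich tract starts, or None.

    Looks for: left arm, short loop, right arm that pairs with the
    left arm, immediately followed by at least `min_u` uracils.
    """
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop                      # start of right arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]
            if (len(right) == stem
                    and right == rna_reverse_complement(left)
                    and tail == "U" * min_u):
                return j + stem
    return None

# Hypothetical transcripts: stem GCCGC / GCGGC around a UUCG loop.
with_terminator = "AAGG" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUU"
without_u_tract = "AAGG" + "GCCGC" + "UUCG" + "GCGGC" + "ACGACG"

hit = find_intrinsic_terminator(with_terminator)
miss = find_intrinsic_terminator(without_u_tract)
print(hit, miss)
```

The second transcript has the same stem loop but no U tract, so the scanner reports nothing, mirroring the point that the pause alone is not enough.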
Okay, so bacteria have these elegant, relatively simple systems.
But then we look at eukaryotes, like us, and things get way more complicated.
Why?
Well, think about it.
Eukaryotic cells are bigger.
They have internal compartments, organelles.
And multicellular organisms need different genes active in different cells at different times.
Precisely.
You need much, much more sophisticated control over gene expression.
Development, cell specialization.
It demands layers of regulation bacteria just don't need.
And part of that complexity is having multiple RNA polymerases.
Yes.
Eukaryotes have three main ones, each specialized.
RNA polymerase 1 mostly transcribes ribosomal RNA genes.
The structural RNAs for ribosomes.
Right.
RNA polymerase 3 handles transfer RNA genes and one type of rRNA, the 5S rRNA.
But the one we focus on most for protein coding genes is RNA polymerase 2.
Pol2 makes the messenger RNA, the mRNA.
Correct.
And interestingly, despite these specialized roles, if you look at their structures, all three eukaryotic polymerases share core similarities with the single bacterial one.
Suggests a common ancestor, evolutionary conservation.
Absolutely.
The basic machine is conserved, but adapted for more complex tasks.
So eukaryotic promoters must also be more complex.
Definitely.
For Pol2 genes, you have a core promoter, right near the start site.
This often includes the famous TATA box, usually about 25 base pairs upstream.
TATA.
That's the one.
The core promoter is enough to allow for what's called basal transcription, just a very low level of activity.
But you need more for real gene expression.
Yes.
That's where regulatory elements come in.
Things like enhancers that boost transcription and silencers that inhibit it.
And these can be really far away from the gene itself.
Thousands of base pairs away, upstream or downstream, even within introns.
But they still influence that core promoter.
Do they loop the DNA around?
That's part of it, yeah.
These DNA sequences, enhancers, silencers, the TATA box, are called cis-acting elements because they're on the same DNA molecule as the gene they control.
Cis for same side?
Right.
And the proteins that bind to these sites are called trans-acting factors.
They can come from genes anywhere else in the genome.
So the factors diffuse and find their target DNA sequences.
Exactly.
It creates this intricate network of control.
Okay.
Let's talk eukaryotic initiation.
With Pol2, it sounds like a big production.
It is.
You need Pol2 itself, of course, but also a whole suite of proteins called general transcription factors, or GTFs, and this huge complex called mediator.
Wow.
How does it assemble?
It usually starts with a GTF called TFIID, which contains the TATA-binding protein, TBP, recognizing and binding the TATA box.
That's the anchor point.
Kind of, yeah.
Then others join in a specific order.
TFIIB, then Pol2 arrives escorted by TFIIF, then TFIIE and TFIIH.
Altogether, that's the preinitiation complex.
Still a closed complex at this point.
Yes.
Then TFIIH plays a couple of crucial roles.
It acts as a helicase to unwind the DNA at the start site, creating the open complex.
Just like the unwinding at the Pribnow box in bacteria.
Similar principle, yeah.
And TFIIH also has kinase activity.
It adds phosphate groups to a region on RNA Pol2, the carboxyl-terminal domain, or CTD.
Phosphorylation.
What does that do?
It's like flicking a switch.
It causes Pol2 to release its grip on some of the GTFs, particularly TFIIB, allowing it to escape the promoter and start elongation.
And where does mediator fit in?
The big complex.
Mediator is fascinating.
It's like a central coordinator.
It physically bridges RNA Pol2 and the GTFs at the core promoter with those distant regulatory factors bound to enhancers or silencers.
So it integrates all the signals.
Exactly.
It helps regulate that CTD phosphorylation, really controlling the transition from initiation to elongation.
Roger Kornberg won a Nobel Prize for figuring out a lot of this.
Amazing complexity.
Okay.
So Pol2 gets going.
How does it terminate in eukaryotes?
Is it like bacteria?
It's different.
For Pol2 genes, termination usually happens quite a bit downstream from a specific signal in the RNA, the polyadenylation signal sequence, AAUAAA.
And there are theories about how it actually stops.
Two main models.
The allosteric model suggests that passing that signal sequence somehow changes Pol2, makes it less stable, and it eventually just falls off the DNA.
Well, it runs out of steam.
Sort of.
The other is the torpedo model.
This one's a bit more dramatic.
After the RNA is cut near the polyadenylation signal,
an exonuclease, an enzyme that chews up RNA from an end, latches onto the remaining bit of RNA still attached to the polymerase.
And it starts degrading it.
Yes.
Chasing after the polymerase like a torpedo.
When it catches up and crashes into Pol2.
It knocks it off the DNA.
That's the idea.
And these models might not be mutually exclusive.
Reality could involve bits of both.
Okay, so we have our primary RNA transcript.
But in eukaryotes, it's often not ready yet, right?
This brings us to RNA modification.
And a big surprise came in the 70s.
A huge surprise.
Scientists discovered that the sequence of a gene in the DNA wasn't always a direct continuous match to the sequence of the final mRNA.
They weren't collinear.
What was going on?
They found that the initial RNA transcript, the pre-mRNA, contained stretches of sequence that were cut out before the RNA went to the ribosome.
These are the introns, intervening sequences.
Exactly.
And the parts that were kept and joined together to make the final mature mRNA were called exons.
So this cutting and pasting process is RNA splicing.
Precisely.
And it's just one type of RNA modification.
There's also processing by cleavage, capping the front end, polyadenylation, adding a tail, sometimes even RNA editing, changing bases, and base modification.
Lots of ways to shape that initial transcript.
Let's look at processing first, cutting RNA into pieces.
Good example is ribosomal RNA.
In eukaryotes, a large precursor, the 45S rRNA, is transcribed.
Then, in the nucleolus, it gets precisely cleaved into the mature 18S, 5.8S, and 28S rRNAs that form the ribosome.
Okay.
What about tRNA?
Transfer RNA needs processing, too.
Oh, yes.
Quite a bit.
The precursor tRNA gets trimmed at both ends.
Enzymes called exonucleases chew back from the ends, while endonucleases cut internally.
Like RNase P.
Exactly.
RNase P is a classic endonuclease.
It creates the precise five prime end of the mature tRNA.
Then other enzymes trim the three prime end.
And finally, tRNA nucleotidyltransferase adds the crucial CCA sequence at the three prime end.
That's where the amino acid attaches for protein synthesis.
Right.
And tRNAs also get extensive base modifications, changing some bases into unusual forms.
And RNase P led to another massive discovery, didn't it, about RNA itself?
It absolutely did.
For the longest time, the central dogma of biochemistry was that proteins are the enzymes, the catalysts.
But in the 1980s, Sidney Altman and his group found that the catalytic activity of RNase P wasn't in its protein part, but in its RNA part.
Wow.
RNA acting as an enzyme.
Yes.
This led to the concept of ribozymes, RNA molecules with catalytic function.
It completely changed our view of RNA.
It wasn't just a passive messenger.
That's incredible.
Okay.
Back to splicing, removing introns.
You said some can do it themselves.
Yes.
Some introns, called Group I and Group II introns, are self-splicing.
They act as their own ribozymes.
Thomas Cech won a Nobel for showing this with Group I introns in Tetrahymena rRNA.
No external proteins needed.
But that's not the common way for mRNA, is it?
No.
Most pre-mRNA splicing in eukaryotes requires a huge, complex machine called a spliceosome.
It's assembled from smaller units called snRNPs, small nuclear ribonucleoproteins.
Think U1, U2, U4, U5, U6 snRNPs.
Each contains small nuclear RNA and several proteins.
And what does the spliceosome actually do?
Three main things.
It recognizes the boundaries between introns and exons, the splice sites.
It holds the pre-mRNA in the right shape, and it catalyzes the cutting and joining reactions.
How does it know where to cut?
The snRNPs recognize specific short sequences at the five prime splice site, the three prime splice site, and an internal branch point within the intron.
U1 binds the five prime site, U2 binds the branch point.
Then others join, looping the intron out.
Exactly.
The intron forms a characteristic lariat structure.
Then the spliceosome makes two precise cuts and ligates the two exons together.
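The net result of those cuts and the ligation can be sketched as removing intron spans from a string. This is a deliberately simplified illustration: the exon and intron sequences are invented, the intron coordinates are handed to the function rather than recognized from splice-site signals, and the lariat intermediate is not modeled.

```python
def splice(pre_mrna: str, introns):
    """Remove introns and join the flanking exons.

    `introns` is a list of (start, end) half-open spans, sorted and
    non-overlapping. Returns (mature_mrna, excised_introns).
    """
    exons, excised, pos = [], [], 0
    for start, end in introns:
        exons.append(pre_mrna[pos:start])      # keep the exon before this intron
        excised.append(pre_mrna[start:end])    # cut the intron out
        pos = end
    exons.append(pre_mrna[pos:])               # final exon
    return "".join(exons), excised

# Hypothetical pre-mRNA built from three exons and two introns.
exon1, intron1 = "AUGGCC", "GUAAGUUUCAG"
exon2, intron2 = "GAAUUU", "GUGAGUCCAG"
exon3 = "CCCUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

i1_start = len(exon1)
i1_end = i1_start + len(intron1)
i2_start = i1_end + len(exon2)
i2_end = i2_start + len(intron2)

mature, removed = splice(pre_mrna, [(i1_start, i1_end), (i2_start, i2_end)])
print(mature)
```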
And is there RNA catalysis here too, like in RNase P?
Yes.
It turns out the U6 snRNA, possibly along with U2, forms the catalytic core of the spliceosome.
It functions as a metalloribozyme using magnesium ions, and interestingly, its active site looks remarkably similar to that of self-splicing Group II introns.
Suggesting a common evolutionary origin for splicing itself.
Amazing.
But why bother with introns at all?
It seems like a lot of extra work transcribing them just to cut them out.
That was a big question.
Why this apparent waste of energy?
The answer turns out to be a huge advantage.
Alternative splicing.
Meaning you can splice the same pre-mRNA in different ways.
Precisely.
You can skip certain exons or choose between mutually exclusive exons.
So from one single gene, one pre-mRNA, you can generate multiple different mature mRNAs.
And therefore multiple different proteins.
Exactly.
It vastly increases the coding potential of the genome.
You don't need a separate gene for every single protein variant.
So you get protein diversity, specialized functions.
Absolutely.
Think of the alpha-tropomyosin gene.
It has multiple exons.
In smooth muscle, it's spliced one way.
In striated muscle, it's spliced differently, including different exons.
Resulting in slightly different protein versions suited for each muscle type.
Yes.
Subtle functional differences tailored to the cell's needs, all from one gene.
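How quickly exon skipping multiplies the possible messages can be sketched by enumerating combinations. The exon names below (`E1` to `E4`) and the choice of which ones are optional are hypothetical; they are not the real exon layout of any tropomyosin gene.

```python
from itertools import combinations

def enumerate_isoforms(exon_order, optional):
    """List every isoform obtained by skipping any subset of the
    optional exons, keeping the remaining exons in genomic order."""
    isoforms = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            isoforms.append(tuple(e for e in exon_order if e not in skipped))
    return isoforms

# Hypothetical gene: E1 and E4 are always included, E2 and E3 may be skipped.
exon_order = ["E1", "E2", "E3", "E4"]
isoforms = enumerate_isoforms(exon_order, optional=["E2", "E3"])

for iso in isoforms:
    print("-".join(iso))
```

Two optional exons already give four possible mRNAs from one gene; choosing exactly one of E2 or E3 would correspond to the mutually exclusive case mentioned above.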
And this isn't just random.
It's controlled.
Oh, definitely.
It's regulated by proteins called splicing factors.
Yeah.
Some act as repressors, causing an exon to be skipped.
Others act as enhancers, promoting the use of a particular splice site.
It's another layer of intricate gene regulation.
Okay.
Besides splicing, what about the ends of the mRNA? Capping?
Right.
The five prime cap.
Very early on, as the mRNA is being made, a special modified guanine nucleotide is added to the very beginning, the five prime end.
It's a 7-methylguanosine cap.
That's important.
Critically important.
It's needed for the mRNA to get exported out of the nucleus.
It's recognized by the ribosomes to start translation.
And it even helps with the splicing of the first intron, a real multitasker.
And the other end,
the three prime end.
That gets the polyA tail, a long string of adenine nucleotides, sometimes hundreds, added after transcription.
It's not coded in the DNA gene itself.
How does that happen?
That AAUAAA polyadenylation signal sequence we mentioned earlier gets recognized.
An enzyme cuts the RNA just downstream of it, and then another enzyme, polyA polymerase, starts adding A's, one after another.
And the tail's function?
Is it like the cap?
Similar roles, yeah.
It helps with nuclear export.
It greatly increases the stability of the mRNA out in the cytoplasm, protects it from degradation.
And it also plays a role in efficient translation initiation.
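The cleave-then-add-a-tail sequence described above can be sketched as string operations. Several things here are assumptions made for illustration: the function name, the example transcript, the fixed `cut_offset` of 20 nucleotides past the signal, and the short `tail_length`. In the cell the cut position varies and the tail is added enzymatically, not in one step.

```python
def polyadenylate(transcript: str, signal: str = "AAUAAA",
                  cut_offset: int = 20, tail_length: int = 200) -> str:
    """Cleave downstream of the polyadenylation signal and add a polyA tail.

    Returns the transcript unchanged if no signal is found.
    """
    idx = transcript.find(signal)
    if idx == -1:
        return transcript
    # Cut a fixed distance past the end of the signal (assumed value).
    cut = min(len(transcript), idx + len(signal) + cut_offset)
    return transcript[:cut] + "A" * tail_length

# Hypothetical transcript: 12 nt, then the signal, then 30 nt downstream.
transcript = "GCGC" * 3 + "AAUAAA" + "CU" * 15
processed = polyadenylate(transcript, tail_length=10)
print(processed)
```

The tail appears only in the processed RNA, which matches the point that it is not coded in the gene itself.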
Interesting.
But you mentioned that in bacteria,
polyadenylation does the opposite.
It triggers degradation.
Isn't that wild?
Same modification, totally different outcome depending on the organism. Evolution finding different uses for the same tool.
Okay, one more modification.
RNA editing.
This sounds like actually changing the message after it's written.
That's exactly what it is, changing the base sequence of the RNA molecule itself.
It can involve adding or deleting bases, like the uracils added in trypanosome mitochondria.
Which apparently caused a lot of disbelief initially.
It did.
Or it can involve chemically converting one base to another.
C to U is common, or A to I, inosine, which the ribosome reads as G.
Is there a good example in mammals?
The classic one is the apolipoprotein B gene.
In the liver, the full-length mRNA is made, producing a large protein.
But in intestinal cells, a single C in the mRNA is edited to a U.
C to U, what does that change?
It changes a codon for glutamine, CAA, into a stop codon, UAA.
So the intestinal cells make a much, much shorter truncated version of the protein with a completely different function related to fat absorption.
Wow.
So one gene, two very different proteins, just through a single base edit in the RNA.
Exactly.
Like alternative splicing, it's another way to generate protein diversity from the same gene.
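The effect of that single C to U edit can be traced with a tiny translator. This is a toy example: the mRNA is an invented five-codon message, not the real apolipoprotein B sequence, and the codon table holds only the codons this example needs (all taken from the standard genetic code).

```python
# Partial codon table; None marks a stop codon.
CODONS = {"AUG": "Met", "GCU": "Ala", "CAA": "Gln", "GGU": "Gly", "UAA": None}

def translate(mrna: str):
    """Read codons from the start until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid is None:      # stop codon ends translation
            break
        peptide.append(amino_acid)
    return peptide

def edit_c_to_u(mrna: str, pos: int) -> str:
    """Convert the C at `pos` into a U, as in C-to-U RNA editing."""
    if mrna[pos] != "C":
        raise ValueError("no C at this position")
    return mrna[:pos] + "U" + mrna[pos + 1:]

unedited = "AUGGCUCAAGGUUAA"            # hypothetical message with a CAA codon
edited = edit_c_to_u(unedited, 6)       # CAA becomes UAA

full_length = translate(unedited)
truncated = translate(edited)
print(full_length, truncated)
```

One base change in the RNA turns the glutamine codon into a stop, so the edited message yields a shorter product from the same gene.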
So if we pull back and compare bacteria and eukaryotes,
the difference in complexity is huge.
It really is.
Bacteria.
Usually one RNA polymerase, simple promoters, negative 35, negative 10.
Sigma factor for initiation, Rho or intrinsic termination, very rare splicing, no capping.
PolyA tail often means destroy, streamlined.
Eukaryotes.
Three polymerases, complex promoters with TATA boxes and regulatory elements far away.
Lots of GTFs plus mediator for initiation,
complex termination involving cleavage and polyadenylation signals, extensive splicing via the spliceosome, five prime capping, polyA tail for stability and translation, sometimes RNA editing.
It's all about layers of regulation and the potential for diversity, reflecting the complexity of eukaryotic life.
It's just staggering.
The sophistication of it all.
How the cell takes that static DNA code and interprets it, modifies it, optimizes it in so many dynamic ways.
It really is remarkable molecular machinery.
It leaves you wondering, doesn't it, with all this plasticity from things like alternative splicing and RNA editing, what else might we uncover about how cells generate so much functional diversity?
Yeah.
How much more complexity is hidden in these post -transcriptional steps?
How could understanding that better change things like medicine or biotechnology in the future?
Lots to think about.
Definitely food for thought.
Well, thank you for joining us on this Deep Dive, and thank you, our listeners, for being part of the Deep Dive family.
Keep exploring the wonders of science.
ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.