Chapter 14: Transcription: Synthesis of RNA

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome to the Deep Dive.

Imagine for a moment this incredible orchestration happening inside every single cell in your body.

Yeah, it's like a bustling molecular factory, isn't it?

Totally.

Constantly converting your genetic blueprint, your DNA, into the actual workhorse molecules of life.

And today, we're diving into one of the most fundamental steps in that factory.

Transcription.

Exactly.

We're going to unpack a key chapter from Marks' Basic Medical Biochemistry: A Clinical Approach.

Focusing specifically on how RNA gets synthesized from a DNA template.

Our mission really is to guide you step by step through the concepts, the pathways.

And importantly, the clinical examples that make this stuff, you know, stick.

We want to make it understandable, memorable.

You'll walk away, hopefully, with a clearer picture of how RNA is made and also the, well, surprising differences between bacteria and us.

And how tiny errors, I mean, really tiny, can have huge health impacts.

We'll try to paint vivid pictures with words so you can almost see these molecular machines working.

Visualize it, yeah.

Okay, so let's start right at the beginning.

What is transcription, fundamentally?

At its core, it's simply the synthesis of RNA using DNA as a guide, a template.

Like making a working copy from a master blueprint.

Precisely.

The DNA is the blueprint, RNA is the copy.

It's the very first step in expressing a gene, turning that genetic info into something functional.

And the main enzyme doing this copying, the scribe, we could call it.

That's RNA polymerase.

It's the central player here.

Right.

It creates a single-stranded RNA molecule.

And the sequence of this RNA, it's almost identical to one of the DNA strands.

The coding strand or sense strand?

Exactly, but with one key difference you absolutely have to remember.

RNA uses uracil, U, instead of thymine, T.

So where DNA has a T in the coding strand, the RNA copy gets a U.

Got it.

A pairs with U in RNA, not T.

And the direction matters too, doesn't it?

Oh, absolutely.

The DNA template strand, the one being read, is read from 3' to 5'.

And the new RNA molecule is built, synthesized, in the opposite direction, 5' to 3'.
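The directionality just described can be sketched in a few lines of Python. This is only an illustration of the base-pairing bookkeeping, not a model of the enzyme. The function name `transcribe` and the example sequences are made up for this sketch and do not come from the chapter.

```python
# Pairing rules when a DNA template is read and RNA is built:
# A in the template pairs with U in the RNA (not T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, written 5'->3', made from a DNA template strand
    that is written 3'->5' (the direction in which it is read)."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical example sequences, chosen only for illustration.
coding_5_to_3 = "ATGGCATTC"    # coding (sense) strand, 5'->3'
template_3_to_5 = "TACCGTAAG"  # template strand, 3'->5'

rna = transcribe(template_3_to_5)
print(rna)  # AUGGCAUUC

# The RNA matches the coding strand, with U wherever the coding strand has T.
print(rna == coding_5_to_3.replace("T", "U"))  # True
```

Writing the template 3' to 5' means the output string comes out already oriented 5' to 3', which mirrors the antiparallel reading and writing directions.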

It sounds a bit like DNA replication, but you mentioned a key difference between RNA polymerase and DNA polymerase.

Yes, a really critical one.

RNA polymerases can just start synthesis from scratch.

They don't need a primer molecule like DNA polymerases do.

Ah, okay, so they just begin.

They just begin.

And another thing, while they do have some proofreading, it's much less thorough than DNA polymerases.

Meaning more mistakes.

Meaning a higher error rate, yeah.

Maybe around one error in every 100,000 bases added.

It's a trade-off, really.

Less accuracy, maybe, but it gets the job done.
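To put the quoted error rate in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes errors occur independently at each base, which is a simplification, and the transcript length used is a hypothetical value picked for illustration.

```python
# Error rate quoted in the discussion: roughly 1 error per 100,000 bases.
ERROR_RATE = 1 / 100_000


def expected_errors(length: int, error_rate: float = ERROR_RATE) -> float:
    """Average number of misincorporated bases in a transcript of `length` bases."""
    return length * error_rate


def prob_error_free(length: int, error_rate: float = ERROR_RATE) -> float:
    """Chance that a transcript contains no errors, assuming independent errors."""
    return (1 - error_rate) ** length


# Hypothetical transcript length, chosen only for illustration.
length = 2_000
print(f"{expected_errors(length):.3f}")  # 0.020
print(f"{prob_error_free(length):.3f}")  # 0.980
```

Under these assumptions, the large majority of transcripts of this size would be error-free, and an occasional faulty copy is one among many.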

So with our huge genome, how does RNA polymerase know where to start copying a gene?

Does it just guess?

Huh, no, thankfully not.

There are specific DNA sequences called promoters.

Promoters, like signals?

Exactly, like signals.

They tell RNA polymerase, bind here, start copying from this point.

Okay.

And they don't just signal the start.

They also influence how often a gene gets transcribed, which controls its expression level.

And there are other sequences involved, too.

Yes, things like promoter proximal elements and enhancers.

These can fine-tune the frequency, sometimes from really far away on the DNA.

They help regulate things precisely.

That's a great foundation.

Now, it gets really interesting when you compare how this works in different life forms, like bacteria versus us.

Oh, hugely different worlds.

Let's start with bacteria.

Their system is beautifully simple, in a way.

How so?

They have just one type of RNA polymerase.

It does everything: makes mRNA, rRNA, tRNA, all three major types.

One enzyme fits all.

Pretty much.

And crucially, bacteria don't have a nucleus.

Right, that's a big difference.

So the DNA is just out in the cytoplasm.

Exactly.

And this means ribosomes, the protein-making machines, can jump onto the mRNA while it's still being made by the RNA polymerase.

Wow.

So transcription and translation happen at the same time.

Yes.

It's called coupled transcription-translation.

It allows bacteria to make proteins very, very quickly.

Super efficient.

And knowing this difference is actually useful in medicine, isn't it?

Absolutely.

This leads us to a clinical point, someone called Isabelle S.

Isabelle had HIV, but then also developed tuberculosis caused by Mycobacterium tuberculosis.

Part of her treatment was the antibiotic rifampin.

And rifampin works because it specifically targets and inhibits the bacterial RNA polymerase.

Ah, so it stops the bacteria from making RNA but leaves our own human polymerases alone.

Precisely.

It exploits that difference between the bacterial and eukaryotic machinery.

Selective toxicity.

Clever.

A very effective strategy against bacterial infections like TB.

Yes, definitely.

Now, switching gears to eukaryotes, the complexity brings its own vulnerabilities.

Sometimes tragically so.

You mean things that can go wrong with our polymerases?

Yes.

Consider the case of Catherine T.

She accidentally ate Amanita phalloides mushrooms.

Oh no, the death cap.

The very one.

These mushrooms contain a deadly toxin, alpha-amanitin.

And what does it do biochemically?

It's a potent inhibitor, specifically of eukaryotic RNA polymerase II.

Polymerase II.

That's the one that makes mRNA, right?

Yeah.

The instructions for proteins.

Exactly.

So alpha-amanitin effectively shuts down the production of nearly all protein-coding instructions in our cells.

That sounds catastrophic.

It is.

It leads to severe GI problems, electrolyte chaos, and eventually terrible liver and kidney damage, often irreversible.

The mortality rate is significant, 10 to 20 percent.

Catherine was lucky.

Hers was mild.

But it really shows how vital RNA pol II is.

We can't live without it.

So you mentioned eukaryotes have more complex machinery.

We have three polymerases.

That's right.

We divide the labor.

Polymerase I mostly makes ribosomal RNAs.

The components of ribosomes.

Polymerase III makes small RNAs, like transfer RNAs, and one type of rRNA, the 5S rRNA.

And polymerase II, the one alpha-amanitin hits, is for mRNA.

Primarily mRNA, yes.

And also microRNAs, which are important gene regulators.

And unlike bacteria, this all happens inside the nucleus.

So transcription is in the nucleus, translation is out in the cytoplasm.

No coupling.

The RNA has to be processed first, then shipped out, which adds more layers of control.

Let's talk about that processing.

Once pol II makes that first RNA copy, the pre-mRNA, it's not ready yet, is it?

Not at all.

That initial primary transcript is pretty raw.

It needs significant modification before it can leave the nucleus and be translated.

Often described as getting a head and a tail.

That's a good way to think of it.

Let's start with the head.

The 5' cap.

OK.

While pol II is still chugging along making the RNA, a special modified guanosine nucleotide is added to the very beginning, the 5' end.

And it's added backwards, sort of.

It is.

It's a unique 5'-to-5' triphosphate linkage.

And that guanine is methylated at position seven.

Sometimes the ribose sugars nearby get methylated too, giving cap zero, one, or two.

Do those different caps, like cap one versus cap two, actually mean something different for the mRNA?

It seems they might, yeah.

It's an area of active research, but these subtle differences could influence the mRNA's stability or how well it gets translated.

It's like fine-tuning signals.

And what's the main job of this cap?

Two key things.

It protects the mRNA from being chewed up by enzymes.

Like a protective helmet.

Exactly.

And it's crucial for the ribosome to recognize the mRNA and know where to start protein synthesis.

And there's a nutrient link here too.

Something about vitamins.

Right.

Those methyl groups for the cap come from a molecule called S-adenosylmethionine, or SAM.

OK.

And to regenerate SAM after it donates its methyl group, your body needs folate and vitamin B12.

So a B12 or folate deficiency could actually impact mRNA capping.

It could potentially affect it, yes.

It shows how interconnected metabolism and gene expression are.

OK.

So that's the cap at the 5' end.

What about the tail?

That's the 3' poly(A) tail.

This happens after transcription finishes.

After the stop codon.

Right.

The polymerase transcribes past the stop codon, and then there's usually a signal sequence in the RNA.

AAUAAA.

A signal for what?

A signal for cleavage.

Yeah.

The RNA gets cut nearby, and then a different enzyme, poly(A) polymerase, tacks on a long string of adenine nucleotides, maybe over 200 As.

And it does this without a DNA template, just adds As.

Just adds As.

No template needed for this step.
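The cleavage-and-tailing step just described can be sketched as a small string operation in Python. This is a toy illustration only. The function name, the `cleavage_offset` of 15 bases (the discussion only says the cut is "nearby"), and the example sequence are all assumptions made for this sketch.

```python
POLY_A_SIGNAL = "AAUAAA"  # polyadenylation signal named in the discussion


def cleave_and_polyadenylate(pre_mrna, cleavage_offset=15, tail_length=200):
    """Cut the transcript a short distance after the first AAUAAA signal and
    append a run of As. No template is consulted when the tail is added.

    cleavage_offset and tail_length are illustrative values (assumptions).
    Returns None if no signal is found; this sketch simply treats such a
    transcript as unprocessed.
    """
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None
    cut_site = signal_start + len(POLY_A_SIGNAL) + cleavage_offset
    return pre_mrna[:cut_site] + "A" * tail_length


# Hypothetical transcript: some upstream sequence, the signal, then
# downstream sequence that extends past the cut site.
upstream = "CCGUGC"
downstream = "GCUAGCUAGCUAGCUAGCUAGCUA"
pre_mrna = upstream + POLY_A_SIGNAL + downstream

mature = cleave_and_polyadenylate(pre_mrna)
print(len(mature))                  # 227 (27 bases kept + 200 As)
print(mature.endswith("A" * 200))   # True
```

The tail is generated purely by repetition, which reflects the point that poly(A) polymerase adds As without reading any DNA.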

And this tail, like the cap, is for protection.

Yes.

It helps protect the 3' end from degradation, and it's also a binding site for proteins involved in regulating the mRNA's lifespan and translation.

And again, this process has to be incredibly precise.

Absolutely critical.

Let's go back to Lisa N., the girl with beta-plus thalassemia.

The blood disorder.

Yes.

In her case, the problem was a single base change in that polyadenylation signal.

AAUAAA became AACAAA.

Just one letter difference.

Just one letter.

But it drastically reduced the amount of properly processed beta-globin mRNA.

She only made about 10% of the normal amount.

Wow.

A tiny change.

A huge effect on protein levels.

It really underscores the need for precision.

It really does.

Every step matters.

Okay, so we have a cap, we have a tail, but the message itself often needs editing, right?

This thing called splicing.

Yes, splicing.

Yeah.

This is maybe the most complex modification.

Eukaryotic genes are often interrupted.

Interrupted.

Yeah, the coding sequences, called exons, are often broken up by non-coding sequences called introns.

So the pre-mRNA has both exons and introns.

Correct.

The introns are like extra bits that don't actually code for the final protein.

They need to be cut out.

And the exons need to be stitched together perfectly.

Perfectly.

That's splicing.

Removing the introns and joining the exons together in the correct order.

It has to be incredibly accurate.

One base off and the whole protein code downstream can be scrambled.

How does the cell manage that accuracy?

There are specific sequences at the boundaries, the junctions between introns and exons.

Usually introns start with GU at the 5' end and end with AG at the 3' end.

Okay, signal sequences again.

Right.

And the actual cutting and pasting is done by a huge molecular machine called the spliceosome.

Spliceosome.

It's made of small nuclear ribonucleoproteins, snRNPs for short.

Sometimes called snurps.

Snurps.

I like that.

Yeah, it's memorable.

These snurps recognize the splice sites and orchestrate the whole process.
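The GU...AG rule and the exon-joining step can be sketched in Python as follows. This is a toy string model, not a description of how the spliceosome works chemically. The function name `splice`, the coordinate convention, and the example sequences are assumptions made for illustration.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the flanking exons in order.

    `introns` is a list of (start, end) index pairs (end exclusive) into
    pre_mrna. Each intron must begin with GU and end with AG, otherwise
    this sketch refuses to splice and raises ValueError.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical two-exon transcript, chosen only for illustration.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCCUUAG", "UUCUAA"
pre_mrna = exon1 + intron + exon2

mature = splice(pre_mrna, [(6, 18)])
print(mature)  # AUGGCCUUCUAA

# A single-base change at the 5' splice site (GU -> GA) blocks splicing here.
mutant = exon1 + "GAAAGUCCUUAG" + exon2
try:
    splice(mutant, [(6, 18)])
except ValueError as error:
    print("splicing failed:", error)
```

Raising an error for a mutated boundary is a deliberately strict choice for the sketch. In cells, the outcome of a splice-site mutation can be more varied than a simple failure.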

So what happens if splicing goes wrong?

Because of mutation, maybe?

Very serious consequences.

For instance, some forms of beta-zero thalassemia, where basically no functional beta-globin is made.

The zero meaning none.

Exactly.

These can be caused by mutations right at those critical GU or AG splice sites.

If those are mutated, normal splicing just fails.

Completely.

Leading to no protein.

Right.

But it's not just genetic mutations.

Autoimmune diseases can also target this process.

How so?

Remember Sarah L.?

She had systemic lupus erythematosus, or SLE.

Lupus, right.

An autoimmune disease.

In SLE, the immune system mistakenly attacks the body's own components.

And one common target? The snRNPs.

The very components of the spliceosome.

So the body is attacking its own mRNA editing machinery.

That's right.

It's a fundamental attack on how the cell processes its genetic instructions.

What kind of impact does that have?

I mean, attacking the spliceosome seems like it would cause widespread problems.

It absolutely does.

If splicing is compromised across many genes, it leads to widespread cellular dysfunction.

Sarah's symptoms.

The fatigue, chest pain, joint pain, the classic butterfly rash.

They reflect this kind of systemic disruption.

It really connects molecular machinery failure to clinical disease.

It's clear mRNA processing is intricate.

But you mentioned other RNA types earlier.

Ribosomal RNA, rRNA.

Right.

The backbone of the ribosome.

In eukaryotes, RNA polymerase I makes a big precursor molecule, the 45S pre-rRNA.

45S. That S means?

Svedberg unit.

It's a measure of sedimentation rate, basically related to size and shape.

So 45S is pretty large.

And it gets chopped up.

Exactly.

This large precursor is cleaved and processed into those smaller, mature rRNAs.

The 18S, 28S, and 5.8S rRNAs.

Which then combine with proteins.

To form the ribosomal subunits.

That's the small 40S subunit and the large 60S subunit.

Which come together to make the working ribosome.

The complete 80S ribosome, ready for protein synthesis.

Okay.

And the third type, transfer RNA, tRNA, the adapters.

Yes, the adapters.

Synthesized by RNA polymerase III, their job is crucial.

Read the mRNA code and bring the correct amino acid.

And they have a very specific shape, don't they?

They do.

A characteristic cloverleaf shape in 2D, which folds up into an L shape in 3D.

What are the key parts of that structure?

Well, there are loops like the D-loop and the TΨC loop.

But most importantly, there's the anticodon loop.

Anticodon that pairs with the mRNA.

Precisely.

It has a 3-nucleotide sequence, the anticodon, that base pairs with the complementary 3-nucleotide codon on the mRNA.

That's how it reads the code.
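Codon-anticodon pairing can be sketched in Python as a reverse complement, because the two RNA strands pair in antiparallel orientation. This sketch assumes strict Watson-Crick pairing and ignores wobble pairing at the third codon position. The function name `anticodon_for` is made up for illustration.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon_5_to_3: str) -> str:
    """Return the anticodon, written 5'->3', that pairs with the given codon
    (also written 5'->3'). Strict Watson-Crick pairing only; wobble is ignored."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon_5_to_3.upper()))


print(anticodon_for("AUG"))  # CAU
```

Reversing before complementing keeps both sequences in the conventional 5' to 3' notation.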

And where does the amino acid attach?

At the 3' end of the tRNA, there's always a CCA sequence there.

And the specific amino acid gets attached to that final A.

So these different loops and the CCA end, they're like the tRNA's toolkit for doing its job correctly.

That's a great way to think about it.

The loops help with recognition by enzymes that attach the amino acid, ensuring accuracy.

The whole structure is optimized for its adapter role.

And tRNAs also get modified after being transcribed.

Extensively.

Bases get modified, like uracil turned into thymine, or uridine into pseudouridine.

And precursor tRNAs get cleaved by enzymes like RNase P to produce the final functional molecule.

We've covered the different RNAs and how they're made and processed.

But how does the cell control all this?

Decide which genes to transcribe, when, how much?

Ah, gene regulation.

The control room.

It's incredibly complex, especially in eukaryotes.

We already mentioned promoters, the start signals.

Right, the TATA box.

The TATA box is a key promoter element.

In bacteria, it's called the Pribnow box, usually around the -10 position.

In eukaryotes, it's the Hogness box, often around -25.

But not all eukaryotic genes have a TATA box.

No, only about 12.5% or so.

There are other important promoter elements, too, like the initiator element, Inr, DPE, BRE.

They all help position RNA polymerase.

Why have an AT -rich sequence like TATA for a promoter?

Is there an advantage?

There is.

AT base pairs only have two hydrogen bonds holding the DNA strands together, while GC pairs have three.

Ah, so it's easier to pull the strands apart at an AT-rich region.

Exactly.

It requires less energy to melt the DNA open, which is the first step needed for the polymerase to access the template strand.
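The hydrogen-bond argument can be made concrete with a quick count in Python. This is a crude proxy: real strand separation also depends on base stacking and other factors that this sketch ignores. The function name and the two six-base example sequences are assumptions chosen for illustration.

```python
# Hydrogen bonds per base pair, as stated in the discussion:
# A-T pairs have two, G-C pairs have three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding a duplex together, given one of its strands."""
    return sum(H_BONDS[base] for base in strand.upper())


# Hypothetical six-base stretches, chosen only for illustration.
print(hydrogen_bonds("TATAAA"))  # 12
print(hydrogen_bonds("GCGCGC"))  # 18
```

For stretches of equal length, the AT-rich one is held together by fewer hydrogen bonds, which fits the idea that it is easier to open.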

Makes sense.

So we have the DNA sequences, the promoters, but proteins are involved in regulation, too.

Absolutely.

We talk about cis-acting elements, those are the DNA sequences themselves.

And trans-acting factors, those are typically proteins that bind to the cis-acting sequences.

Okay.

And in eukaryotes, these factors are pretty elaborate.

Very.

You have the basal transcription factors or general transcription factors, often labeled TFII-something, like TFIID, TFIIB, et cetera, for polymerase II.

And their job is?

They assemble at the promoter, help recruit RNA polymerase II, and get a low baseline level of transcription going.

TFIID contains the TATA-binding protein, TBP, which recognizes the TATA box.

So that's the basic on-switch.

Yeah.

But how do you turn the volume up, get high levels of transcription when needed?

That's where gene-specific transcription factors, sometimes called transactivators, come in, along with co-activators.

And where do they bind?

They bind to other cis-acting sequences, like promoter proximal elements near the start site, or enhancers, which can be thousands of base pairs away.

Thousands.

How do they influence the polymerase from so far away?

The DNA can actually loop around, bringing those distant enhancers close to the promoter region.

These specific factors then interact with the basal machinery, often recruiting other proteins.

Sort of like building a bigger, more powerful complex at the start site.

Exactly.

It significantly boosts the rate of transcription initiation.

Think of the basal factors getting the engine idling, and the specific factors and enhancers stepping on the gas.

You mentioned TFIIH earlier as being interesting.

Yes.

TFIIH is a great example of complexity.

It's part of the basal machinery, but it has multiple jobs.

It acts as a helicase, using ATP to unwind the DNA at the promoter.

Melting it open.

Right.

And it's also a kinase. It phosphorylates RNA polymerase II, which is thought to help it escape the promoter and start elongating the RNA chain.

And problems with TFIIH cause disease.

They do.

Mutations in its helicase parts are linked to xeroderma pigmentosum, a DNA repair disorder causing extreme sun sensitivity and cancer risk.

And going back to Lisa N. and thalassemia, promoter mutations can cause that too.

Yes.

Besides the poly(A) signal mutation, some forms of beta-plus thalassemia are caused by mutations right in the beta-globin gene's promoter.

A change in the TATA box, for example, can mess up where transcription starts and reduce the overall amount of beta-globin mRNA made, maybe down to 20-25% of normal.

Which leads to the disease symptoms.

Which leads to the anemia and other symptoms, yes.

Okay, this leads to a sort of bigger picture question.

Why are genomes, human genomes, so much bigger than bacterial ones?

It feels like there's a lot more DNA than just genes.

That's a great observation.

There are several reasons.

First, we're diploid: two copies of each chromosome.

Bacteria are usually haploid.

Just one.

Okay, that doubles things but doesn't explain the huge difference.

Right.

Then you have introns.

Those non-coding sequences within eukaryotic genes add a lot of length.

Our initial pre-mRNAs can be 10 times longer than the final spliced mRNA.

Bacteria mostly don't have introns.

Still feels like there's more.

There is.

A massive chunk of eukaryotic genomes is made of repetitive DNA.

Sequences repeated over and over.

Doing what?

Well, some have structural roles.

Highly repetitive DNA, short sequences repeated millions of times, is often found at centromeres and telomeres.

Usually not transcribed.

What other types?

Moderately repetitive DNA.

Repeated maybe hundreds or thousands of times.

This includes some functional genes needed in large amounts like histone genes, but also a lot of non-coding sequences including mobile elements.

Mobile elements like jumping genes.

Kind of.

Things like Alu sequences.

These are short interspersed elements, SINEs. They make up maybe 6-8% of our entire genome.

Wow.

And LINEs, long interspersed elements, another 5% or so.

These can and sometimes do copy themselves and jump around the genome.

And that jumping can cause problems.

It definitely can.

If a LINE inserts itself into the middle of an important gene, it can disrupt it.

This has been shown to cause some cases of hemophilia by disrupting the factor VIII gene.

And Alu sequences?

They can cause trouble too.

Because there are so many similar Alu sequences scattered around, sometimes the DNA recombination machinery gets confused during meiosis and uses two different Alu repeats as alignment points.

Leading to?

Leading to deletions or duplications of the DNA between them.

This mechanism is known to cause some cases of familial hypercholesterolemia by deleting part of the LDL receptor gene.

So this repetitive DNA isn't just inert filler.

It has real consequences.

Absolutely.

It contributes to genome evolution, but also to disease.

It's fascinating though.

It leads to that paradox, right?

The frog paradox.

Ah, yes.

Some frogs, and other amphibians and some plants too, have way more DNA per cell than humans do.

But they aren't necessarily more complex organisms.

Exactly.

That extra DNA is mostly massive amounts of repetitive sequences.

It really shows that genome size doesn't directly correlate with biological complexity.

It's more about the genes you have and how you regulate them.

We've covered so much.

From the nuts and bolts of transcription to these large-scale genome features, maybe we can quickly bring together some of the clinical connections we've touched on.

Good idea.

We saw how thalassemias are often caused by mutations that disrupt transcription or RNA processing, affecting initiation at the promoter, splicing, or polyadenylation, like in Lisa N. The severity depends on how much functional globin chain gets made.

Right.

And we touched on HIV.

Yes.

Understanding the HIV life cycle is key to treatment.

The virus uses reverse transcriptase to make DNA from its RNA genome.

Which is backwards from normal transcription.

Right.

Hence retrovirus.

Then integrase inserts that viral DNA into our host genome.

Our own RNA polymerases then transcribe the viral genes.

Finally, a viral protease cuts the resulting viral proteins into functional pieces.

And drugs target these specific viral enzymes?

Precisely.

Reverse transcriptase inhibitors, integrase inhibitors, protease inhibitors, they all exploit vulnerabilities in the viral life cycle based on our understanding of these molecular processes.

It really ties everything together.

What a journey.

From RNA polymerase starting its work to the complex processing and the vast landscapes of our genome, we've really delved into RNA's origin story.

We certainly have.

Transcription is that essential first step, turning the blueprint into action.

Highly regulated, incredibly diverse across life.

We've hit the polymerases, the promoters, the capping, tailing, splicing, all crucial parts.

So hopefully you listening feel you've taken a deep dive into this fundamental process.

The bridge between DNA and function.

You've seen the precision involved.

And the sometimes devastating consequences when that precision fails, even slightly.

It really is elegant, isn't it?

Thinking about these countless copies being made right now inside you.

Mind-bogglingly complex and efficient.

So here's a final thought to leave you with.

Given this complexity, how much more do you think we still have to discover about how all this is controlled?

And what secrets might that hold for understanding and maybe one day treating more human diseases?

It feels like we're still just scratching the surface, doesn't it?

There's so much more intricate regulation yet to uncover.

A lot more deep dives ahead, perhaps.

Thank you so much for joining us today.

We hope you feel a bit more informed and maybe even amazed by the molecular world working inside you.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary: What this audio overview covers

RNA polymerases catalyze the fundamental process of converting genetic instructions encoded in DNA into functional RNA molecules, a central requirement for all cellular protein synthesis and gene regulation. Unlike DNA polymerases, these enzymes initiate synthesis independently without requiring primers but sacrifice extensive proofreading mechanisms, resulting in higher mutation rates that organisms tolerate in RNA due to its temporary nature. Prokaryotic systems employ a single RNA polymerase guided by sigma factors that recognize specific promoter sequences like the Pribnow box, enabling efficient gene activation in response to cellular needs. Eukaryotic transcription involves three specialized polymerases with distinct functions: Polymerase I synthesizes ribosomal RNA molecules, Polymerase II generates messenger RNA and regulatory microRNA species, and Polymerase III produces transfer RNA and certain other small RNAs. Gene expression is controlled through cis-acting DNA elements such as promoters and enhancers that regulate transcription frequency, working in concert with trans-acting transcription factors that facilitate polymerase recruitment and activation. Following transcription in eukaryotes, nascent RNA undergoes extensive processing modifications essential for stability and function, including addition of a protective 5-prime methylated guanosine cap, attachment of a poly-adenine tail at the 3-prime end, and precise removal of non-coding intron sequences by the spliceosome complex. Spliceosomes, composed of small nuclear ribonucleoproteins, execute exact intron excision while preserving exon sequences through catalytic snRNA components, while exon shuffling mechanisms generate protein structural diversity throughout evolutionary time. 
Specialized RNA molecules receive particular attention: ribosomal RNA undergoes methylation-directed processing within the nucleolus, transfer RNA adopts a cloverleaf secondary structure and incorporates modified nucleosides like pseudouridine and dihydrouridine, and transfer RNA molecules receive a universal CCA sequence addition enabling amino acid attachment during translation. Clinical applications illuminate transcriptional diseases, from beta-thalassemia caused by mutations disrupting regulatory sequences to alpha-amanitin toxicity that irreversibly poisons eukaryotic Polymerase II, alongside selective bacterial targeting by rifampin and autoimmune conditions like systemic lupus erythematosus affecting splicing machinery. Repetitive DNA sequences including Alu and LINE elements influence transcriptional regulation and associate with genetic diseases, demonstrating transcription's profound impact on human physiology.
