Chapter 6: How Cells Read the Genome: From DNA to Protein

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome, welcome back to the Deep Dive.

Today we're embarking on one of the most fundamental and frankly incredible journeys in all of biology.

We're talking about how our cells actually read their own instruction manual, the genome.

It's this amazing process, isn't it?

Taking that DNA blueprint.

Exactly.

And turning it into the proteins that basically define everything about us.

How we look, how we work.

It really is astounding when you think about how far we've come.

I mean, the DNA structure itself, that was only figured out in the early 1950s.

Now we can sequence entire genomes, humans, sure, but even like Neanderthals.

Right.

And that progress doesn't just give us biochemical details or evolutionary clues, though it does that brilliantly.

It also shows us something profound.

Biology, despite looking incredibly complicated, isn't infinitely complex.

There are these unifying principles.

That's our mission today, really, to cut through all that information, maybe some of the complexity you feel, and give you a clear path, a shortcut to understanding these core mechanisms.

And what's so fascinating, as you said, is how universal this is.

Across all life, whether it's the simplest bacterium or us, the basic flow of genetic information is remarkably similar.

Like the central dogma, right?

Exactly.

That core idea, DNA makes RNA, RNA makes protein.

It's fundamental.

But, and this is a big but for eukaryotes like us, there are really significant variations.

Our RNA gets processed a lot.

In the nucleus, before it even goes out.

Precisely.

Things like RNA splicing add layers of control, quality checks.

It's crucial for understanding how our cells handle these huge, sometimes kind of messy, genomes.

And just to note, while we're focusing on protein production, sometimes that RNA molecule is the final product.

These non-coding RNAs, they do all sorts of structural, even catalytic, jobs.

It's not like a neat dictionary, is it?

The genome in multicellular organisms?

Oh, definitely not.

It reflects billions of years of, well, evolutionary chaos sometimes.

You've got coding bits, the exons, broken up by these long non-coding introns, vast stretches of DNA with no obvious job.

Genes for proteins that work together might be scattered across different chromosomes.

It's a lot.

Yet the cell just handles it.

Thousands of times a second.

It's this incredible molecular precision.

Researchers struggle, but the cell just does it.

So plenty of mysteries left, mind you.

Absolutely mind boggling.

Yeah.

Okay, so let's dive into step one.

DNA to RNA, transcription.

And one powerful thing here is efficiency, right?

Like making copies of a recipe.

Exactly.

The DNA is the master cookbook.

You don't take the whole book to the kitchen.

You make RNA copies of just the recipe you need.

Lots of them, potentially.

And then each RNA copy can make many protein molecules.

Right.

So the cell can really dial protein production up or down.

Lots of protein.

Transcribe and translate like mad.

Just a little.

Dial it back.

But there's a challenge here.

A theme we'll keep seeing.

Accuracy.

Base pairing isn't perfect.

No, it's not.

A correct match is only slightly more stable than a wrong one.

So the cell needs layers of proofreading, error correction.

It's constantly managing imperfection.

Okay, so transcription.

Copying DNA into RNA.

Yeah, it's literally transcribing.

Same language nucleotides, just a different format.

RNA instead of DNA.

RNA is chemically a bit different, you know.

Ribose sugar, not deoxyribose.

And uracil, U, instead of thymine, T.

But U still pairs with A, just like T does.

Correct.

G with C, A with U.

But the big structural difference, RNA is single-stranded.

And that lets it fold up.

Into complex 3D shapes, just like proteins.

That's key to its versatility: structure, catalysis, all sorts of things that DNA can't really do.

So who's the enzyme doing the copying?

That's RNA polymerase, the workhorse.

Similar to DNA polymerase in some ways.

Uses a template, base pairing, builds 5' to 3'.

Right, those fundamentals are the same.
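To make the copying rule concrete, here is a minimal Python sketch of template-directed transcription. It assumes the template strand is written 3' to 5', so the RNA comes out 5' to 3'. The function name `transcribe` and the example sequence are made up for illustration and are not from the lecture.

```python
# Base-pairing rules for copying a DNA template into RNA:
# template A -> U, T -> A, G -> C, C -> G.
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') made from a DNA template written 3'->5'."""
    return "".join(PAIR[base] for base in template_3_to_5.upper())

# Hypothetical template strand, written 3'->5'.
template = "TACGGCATT"
print(transcribe(template))
```

The RNA that comes out matches the non-template (coding) DNA strand, with U in place of T.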

But key differences too.

The new RNA strand doesn't stay stuck to the DNA, it peels off immediately.

So the DNA double helix reforms behind it.

Exactly.

And RNA molecules are much shorter, usually just copying one gene or a small group of genes, not whole chromosomes like in DNA replication.

Makes sense, they're temporary messages.

Precisely.

RNA polymerases are these amazing molecular machines.

They catalyze the bonds, step along the DNA, unwind it just ahead.

Like a little bulldozer.

Yeah, a very precise one.

Unwinds, copies, lets the DNA zip back up behind.

Uses those energy-rich ribonucleoside triphosphates, ATP, CTP, UTP, GTP, as fuel and building blocks.

And because that RNA peels off right away, you can have multiple polymerases working on the same gene simultaneously, like an assembly line.

Wow.

So you can get tons of RNA copies quickly.

Over a thousand transcripts per gene per hour, potentially.

Very efficient.

Does RNA polymerase need a primer, like DNA polymerase?

No, it doesn't.

It can start a chain from scratch, and it's slightly less accurate, maybe one mistake every 10 ,000 bases compared to one in 10 million for DNA polymerase.

Which is okay, because RNA isn't the permanent genetic store.

Exactly.

A mistake in RNA is temporary.

It also has a sort of modest proofreading ability, can back up and snip out a wrong base.

And they're very processive.

Once they start, they tend to finish the whole RNA molecule.

Interestingly, structurally, they look quite different from DNA polymerases, suggesting maybe independent evolution.
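As a rough feel for the error rates quoted above (about one mistake per 10,000 bases for RNA polymerase, one per 10 million for DNA polymerase), the Python sketch below estimates how many errors to expect in one copy. It assumes errors are independent and uniform along the sequence, which is a simplification, and the 1,500-nucleotide transcript length is a made-up example.

```python
# Per-base error rates as quoted in the discussion.
RNA_POL_ERROR = 1 / 10_000
DNA_POL_ERROR = 1 / 10_000_000

def expected_errors(length_nt: int, error_rate: float) -> float:
    """Average number of wrong bases in a copy of the given length."""
    return length_nt * error_rate

def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Chance that a copy has no errors, assuming independent errors."""
    return (1 - error_rate) ** length_nt

transcript_len = 1_500  # hypothetical mRNA length in nucleotides
print(expected_errors(transcript_len, RNA_POL_ERROR))
print(prob_error_free(transcript_len, RNA_POL_ERROR))
print(prob_error_free(transcript_len, DNA_POL_ERROR))
```

Under these assumptions, most transcripts of that length come out error-free, and the faulty ones are temporary anyway.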

So we have different types of RNA being made.

The main ones for proteins are messenger RNAs, mRNAs.

That's right.

They carry the message, the code for building a specific protein.

But lots of genes, maybe 15% in yeast, thousands in humans, produce RNA as the final product.

The non-coding RNAs.

Yep.

Huge variety.

Ribosomal RNA, rRNA, forms the ribosome core.

Transfer RNA, tRNA, the adapters in translation.

Small nuclear RNAs, snRNAs, for splicing.

MicroRNAs and siRNAs for gene regulation.

A whole zoo.

It really is.

Each chunk of DNA transcribed is a transcription unit.

In us, usually one gene per unit.

Bacteria are more efficient.

Sometimes several related genes are transcribed together on one long mRNA.

But most of the RNA in a cell isn't even mRNA.

Surprisingly, no.

mRNA is only like 3-5% of total RNA.

Most of it is rRNA, making up those ribosomes.

There are thousands of different mRNAs, but usually only 10-15 copies of each specific type floating around.

So how does RNA polymerase know exactly where to start and stop on that huge DNA molecule?

Crucial question.

That's all about signals in the DNA sequence.

It differs a bit between bacteria and eukaryotes.

So let's start with bacteria.

It's a bit simpler.

Initiation, starting transcription, is the key regulatory point, deciding which genes get turned on and how much.

Makes sense.

Control the start, control the output.

Precisely.

Bacterial RNA polymerase needs a little helper protein called a sigma factor.

The core enzyme plus sigma factor makes the holoenzyme.

And sigma factor is the guide.

It's like the GPS.

It recognizes specific DNA sequences called promoters, which signal the start site.

The holoenzyme slides along DNA until sigma latches onto a promoter.

Then what happens?

It clamps down, unwinds the DNA helix locally, and creates a transcription bubble, exposing the template strand.

Sigma helps hold that bubble open.

And RNA synthesis begins.

Yep.

The first few bases are added, sometimes with a bit of stuttering, abortive initiation, making short, useless bits.

But eventually, the polymerase breaks free from the promoter, releases the sigma factor, and shifts into high-speed elongation mode, maybe 50 nucleotides a second.

Until it hits a stop sign.

A terminator sequence.

Often, this sequence, when transcribed into RNA, folds into a hairpin shape, which helps physically dislodge the RNA from the polymerase.

Then the core enzyme is free to find another sigma factor and start again.

Exactly.

The whole process involves really complex shape changes in the proteins and the DNA.

Are these promoter and terminator signals always the same?

No, there's variation.

We talk about consensus sequences, the most common base at each position, or sequence logos, which show the frequency.

Subtle differences determine promoter strength.

Stronger promoter means more RNA, more protein.

And the promoter sequence is asymmetric, which is vital.

It ensures the polymerase binds in only one direction, defining which DNA strand is the template.

This can differ gene by gene.

Okay, so that's bacteria.

How does it work in us, in eukaryotes?

You said it's more elaborate.

Much more.

First off, we don't have just one RNA polymerase.

We have three main types in the nucleus.

Three?

Wow.

Yep.

Pol I makes most ribosomal RNA.

Pol III makes tRNA and some other small RNAs.

But the star player for us today is RNA polymerase II, Pol II.

It makes all the protein-coding mRNAs and lots of non-coding RNAs, too.

Does Pol II look like the bacterial one?

Structurally quite similar, actually, but its operation is way different.

Two big things.

It needs a whole team of helper proteins, the general transcription factors, not just one sigma factor.

Okay, more helpers. And the second thing?

It has to deal with chromatin.

Our DNA isn't naked like in bacteria.

It's wrapped around histones, packaged tightly.

Pol II needs help navigating that.

Right.

So tell me about these general transcription factors, TFIIA, TFIIB, etc.

Think of them as the setup crew.

They have to assemble at the promoter before Pol II can even bind correctly.

They position Pol II, help pry open the DNA helix, and then launch the polymerase on its way.

And they're general because they're needed for almost all Pol II genes?

Exactly.

The assembly often starts with one factor, TFIID, binding to a specific DNA sequence called the TATA box, usually a bit upstream from the start site.

The TATA box?

I've heard of that.

It's a key landmark.

The TATA-binding protein, TBP, actually bends the DNA quite sharply, creating a physical signpost for other factors to gather around.

That's like planting a flag.

A very distorted flag, yeah.

Then the other factors and Pol II assemble, forming this big pre-initiation complex.

One factor, TFIIH, is crucial.

It uses ATP energy to unwind the DNA at the start site.

And it does something else important.

Phosphorylation.

Yes.

It phosphorylates a long floppy tail domain on Pol II, the CTD or C-terminal domain.

That phosphorylation acts like a switch, kicking Pol II loose from the general factors so it can start transcribing.

And that tail is important later, too.

Hugely important.

It becomes a binding platform, a scaffold for all sorts of RNA processing factors that need to act on the new RNA as it emerges.

It's multitasking central.

But wait, you said this is just part of the story in a real cell because of chromatin.

Right.

In vivo, you also need transcriptional activators binding to distant enhancer sequences,

a huge mediator complex that links everything together,

and chromatin remodelers and histone modifying enzymes to loosen up the DNA packaging.

Wow.

It's a whole committee meeting just to start transcribing one gene.

It really is.

A highly coordinated one, though.

Okay, so Pol II escapes the committee and starts moving. Elongation.

Is it smooth sailing?

Not always.

It can move jerkily, pause sometimes.

It needs elongation factors, proteins that ride along, help it stay attached to the DNA, and navigate through those nucleosomes and tricky sequences.

And pausing can be regulatory.

Yes.

Sometimes the cell deliberately pauses the polymerase near the start as a control point.

And as it moves, enzymes associated with it modify the histones it passes, leaving a kind of memory trace.

We're still figuring out all the implications of that.

Transcription also creates tension.

Supercoiling.

Yeah, imagine twisting a rope.

As Pol II unwinds the DNA helix ahead, it causes overwinding, positive supercoils, in front and underwinding, negative supercoils, behind.

And that tension needs to be relieved.

Or the polymerase would stall.

In eukaryotes, enzymes called topoisomerases act like swivels, relieving this tension.

A bit of tension might actually help pop DNA off nucleosomes, though.

Bacteria have a special topoisomerase, DNA gyrase, that actively pumps negative supercoils into the DNA using ATP.

Why negative?

Negative supercoiling makes it easier to unwind the helix locally, which is energetically favorable for things like starting transcription.

So gyrase primes the DNA in a way.

Okay, now a key difference you mentioned.

Yeah.

In eukaryotes, transcription is tightly coupled with RNA processing.

Absolutely crucial.

Bacterial mRNA is basically ready to go as it comes off the DNA.

Eukaryotic pre-mRNA needs serious editing in the nucleus first.

Capping, splicing, polyadenylation.

Exactly.

Gotta add a special 5' cap to the beginning, snip out the non-coding introns via splicing, and add a poly-A tail to the end.

Quality control checks.

That's a great way to think about it.

Does the mRNA have a proper start?

Are the introns gone?

Does it have a proper end?

Only then can it leave the nucleus.

And splicing allows for alternative versions.

Yes, alternative splicing.

You can splice the same pre-mRNA in different ways to get related but distinct proteins from one gene.

It's a major source of protein diversity in eukaryotes.

Amazing flexibility.

So, first modification.

Capping.

Happens right away.

Almost immediately.

As soon as Pol II makes about 20-30 nucleotides, the 5' end gets modified.

A special guanine nucleotide is added backwards, 5'-to-5'.

Three enzymes do this, and they actually hitch a ride on that phosphorylated Pol II CTD tail.

Perfectly positioned.

Right there.

Ready to act.

This cap is vital.

It marks the 5' end, distinguishes mRNA from other RNAs, binds proteins for processing and export, and later helps initiate translation in the cytoplasm.

There's even a newly found alternative cap involving NAD+, which may be linking RNA processing to the cell's energy status.

Fascinating stuff.

Okay, then splicing.

Removing the introns.

This is a big surprise discovery, right?

Huge.

Back in 1977.

Bacterial genes were just continuous coding sequences.

Eukaryotic genes, like the beta-globin gene or the massive factor VIII gene, were chopped up into exons and introns.

So the cell transcribes everything.

Introns and exons.

Then snips out the introns and stitches the exons together.

The intron is removed as a weird loop structure.

A lariat.

Sounds complex.

Oh it is.

It involves five small nuclear RNAs, snRNAs, and hundreds of proteins forming a dynamic machine called the spliceosome, plus lots of ATP energy per splice.

Why so complex?

Accuracy is paramount.

You absolutely cannot be off by even one nucleotide. And flexibility: it needs to handle introns of vastly different sizes and sequences.

And the payoff is alternative splicing.

Exactly.

Generating protein variants, like different forms of alpha-tropomyosin for different muscle types from the same gene.

Not all detected variants are functional, but the potential is enormous.

How does the spliceosome know exactly where to cut and paste?

It recognizes short consensus sequences at the 5' splice site, the 3' splice site, and an internal branch point nucleotide within the intron.

But these signals are short and variable.

Right.

So the cell uses other cues.

The core machinery involves those snRNAs, U1, U2, U4, U5, U6, each complexed with proteins to form snRNPs.

Love that name.

These snRNPs assemble step-by-step onto the pre-mRNA.

U1 binds the 5' splice site, U2 binds the branch point, then others join.

It's a dynamic assembly process.

Highly dynamic.

Lots of rearrangements, checking and double-checking.

U6 actually displaces U1 at the 5' splice site.

ATP hydrolysis fuels these rearrangements, not the splicing chemistry itself.

So ATP is for accuracy checks, kinetic proofreading again.

Exactly.

Those rearrangements allow multiple checks of the splice sites before the reaction happens.

Incorrect pairings tend to fall apart during these ATP-driven steps.

And the catalysis, the cutting and joining, is done by RNA?

Amazingly, yes.

The snRNAs themselves form the catalytic active site.

The spliceosome is another ribozyme, like the ribosome, a huge hint about the RNA world.

It's complexity, runaway bureaucracy, but with benefits like accuracy.

Precisely.

And after the successful splice, proteins called the exon-junction complex, EJC, are deposited near the splice site, marking it as complete and influencing the mRNA's future.

Given the huge introns and small exons, how does the cell avoid skipping an exon or using a wrong site?

Good question.

Errors like exon skipping or using cryptic sites are dangerous.

Two main strategies help.

One, co-transcriptional loading.

Splicing components jump from the Pol II tail onto the nascent RNA as it's made.

This helps define exons in order, reducing the chance of skipping one.

So it keeps track as the RNA emerges.

Right.

Second, exon definition.

The machinery looks for the relatively uniformly sized exons.

Special SR proteins bind to sequences within exons, marking them, and helping recruit U1 and U2 machinery to the correct nearby splice sites.

Marking the exons makes them easier to find and connect properly.

Exactly.

Whereas other proteins preferentially bind the introns, it creates contrast.

And mutations can mess this up, causing disease.

Absolutely.

A mutation might weaken a real splice site or create a new cryptic one.

Leads to exon skipping, intron retention, aberrant proteins.

Beta thalassemia is a classic example.

Maybe 10% of human genetic disease mutations affect splicing.

But this flexibility also allows regulated alternative splicing.

Correct.

Cells deliberately control splicing patterns to make different proteins in different tissues or at different developmental stages.

It's controlled variation.

Okay, capping and splicing done.

What about the end of the mRNA?

The 3' end?

Also processed.

Specific sequences in the DNA get transcribed into signals in the RNA, like AAUAAA.

Proteins riding on the Pol II tail recognize these signals.

More hitchhikers on the tail.

Yep.

Proteins like CstF and CPSF bind the RNA signals, then recruit other factors to cleave the RNA at a specific spot, releasing it from the still transcribing Pol II.

And then the poly-A tail gets added.

Right.

An enzyme called poly-A polymerase, PAP, adds about 200 adenine nucleotides, one by one, without a template.

Poly-A binding proteins bind the growing tail and help determine its final length.

What happens to Pol II?

It's still going, right?

It is, but the RNA it's making now lacks a 5' cap.

A 5'-to-3' exonuclease starts chewing up this uncapped RNA, eventually catches up to Pol II, and helps terminate transcription.

Clever mechanism.

And genes can have multiple poly-A sites.

Yes, allowing another layer of regulation to produce different mRNA 3' ends, which can affect stability or translation.

So, after all this processing, capping, splicing, polyadenylation, how does the cell ensure only the good mRNAs get out of the nucleus?

It's all about the proteins bound to the mRNA.

A mature, export-ready mRNA will have a cap binding complex, exon-junction complexes marking successful splices, and poly-A binding proteins.

The absence of snRNP proteins also signals splicing is done.

So it's like a passport with all the right stamps.

Exactly.

If it lacks the right stamps, or it still has intron-binding proteins, it's retained in the nucleus and degraded by the nuclear RNA exosome.

Degradation is the default.

Export requires positive signals.

How do they get out?

Through nuclear pores?

Yes, the nuclear pore complexes.

NPCs.

These are complex gates.

Small stuff diffuses, but large mRNA protein complexes need active transport, mediated by specific nuclear transport receptors.

You can actually see this happening.

In some specialized systems, yes.

You can visualize these large mRNA protein granules moving through the NPC.

It's a dynamic process.

Proteins hop on and off as the mRNA matures and gets exported.

And it's fast.

Ten milliseconds.

Still a mystery how.

Still working on that.

And crucially, some proteins that bind in the nucleus stay bound in the cytoplasm, influencing where the mRNA goes, how long it lasts, how well it's translated.

The journey shapes its destiny.

Okay, before we follow mRNA into the cytoplasm for translation, let's quickly revisit non-coding RNAs, especially ribosomal RNA, rRNA.

It's the most abundant.

By far. About 80% of total RNA.

It's made by RNA polymerase I, which lacks that CTD tail.

So no capping or polyadenylation.

And cells need loads of ribosomes.

Millions per cell generation.

So we have hundreds of copies of the rRNA genes to keep up.

Humans have around 200 copies per haploid set.

And the rRNA is processed too.

Extensively.

Three of the four eukaryotic rRNAs are carved out of a single large precursor transcript.

It undergoes chemical modifications, methylations and pseudouridylations, guided by small nucleolar RNAs, snoRNAs.

They act as guides, base pairing to the precursor RNA to position the modifying enzymes precisely.

Many snoRNAs are cleverly hidden within the introns of ribosomal protein genes.

Very efficient packaging.

And all this rRNA synthesis and processing happens in the nucleolus.

That's right, the ribosome factory.

It's the most obvious structure in the nucleus, but it's not membrane bound.

A biomolecular condensate, like a dense dynamic blob.

Exactly.

Formed by the aggregation of rRNA genes, precursor rRNAs, enzymes, snoRNPs, ribosomal proteins, everything needed.

You can see nucleoli fusing, showing that fluid-like behavior.

Its size reflects ribosome production.

Absolutely.

Bigger nucleolus means more ribosomes being made.

It forms around the clustered rRNA gene loops from different chromosomes.

Ribosome assembly itself is incredibly complex, needing hundreds of helper proteins.

Similar dynamic rearrangements as splicing, but ribosomes are stable machines, reused thousands of times.

And the nucleolus makes other things too.

Yes.

Other RNPs, like the U6 snRNP for splicing, telomerase, tRNAs.

It's a general RNP assembly hub.

Are there other condensates like this in the nucleus?

Yes.

Cajal bodies and interchromatin granule clusters, or speckles.

Also dynamic, membrane-less.

They concentrate components to speed things up.

Like assembly lines or storage depots?

Kind of.

Cajal bodies seem involved in maturing and recycling snRNPs and snoRNPs.

Speckles are more like stockpiles of mature splicing factors.

They become really important when cells need to divide rapidly.

So the nucleus isn't just a bag of enzymes.

It's highly organized with these dynamic compartments.

Definitely.

And splicing itself happens co-transcriptionally at thousands of sites, forming transient factories, condensates of transcription and splicing machinery, often near those speckle depots.

It's all about efficiency through local concentration.

Okay.

That's a fantastic overview of getting from DNA to a mature, processed RNA ready for the next stage.

The complexity, the regulation, the quality control.

It's stunning.

It really is.

And the roles of RNA beyond just carrying a message, catalysis, structure, regulation, are central themes.

Right.

So now the mRNA is exported to the cytoplasm.

Time for the main event.

Yeah.

Translation.

Making protein.

This is the real translation, right?

Going from nucleotides to amino acids.

Exactly.

The coding problem.

Four nucleotide letters, 20 amino acid letters.

How do you map one to the other?

Cracked in the swinging '60s.

And the answer is the genetic code.

Read in triplets.

Codons.

Groups of three consecutive nucleotides.

Since there are four bases, four by four by four gives 64 possible codons.

More than the 20 amino acids.

So it's redundant.

Highly redundant.

Most amino acids have multiple codons.

Plus, three stop codons signal the end.

And one start codon, AUG, signals the beginning and also codes for methionine.

And the reading frame is critical.

Absolutely.

You have to start at the right point and read in strict groups of three.

Shift by one base and you get complete gibberish.
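The codon arithmetic and the reading-frame point can be checked with a short Python sketch. The table below is the standard genetic code written in the usual U, C, A, G order. The `translate` helper and the example mRNA are invented for illustration, and the sketch ignores start-codon scanning entirely: it just reads triplets from a chosen offset until a stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna: str, frame: int = 0) -> str:
    """Read triplets starting at offset `frame`; stop at the first stop codon."""
    protein = []
    for i in range(frame, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(len(CODON_TABLE))                 # 4 * 4 * 4 possible codons
print(len(set(AMINO) - {"*"}))          # distinct amino acids encoded
print(AMINO.count("*"))                 # stop codons

mrna = "AUGGCCUUUAAGUAA"                # hypothetical message
print(translate(mrna, frame=0))         # correct frame
print(translate(mrna, frame=1))         # shifted by one base
```

Shifting the start by a single base gives a completely different amino acid sequence from the same nucleotides, which is why setting the frame correctly matters so much.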

So how does the codon connect to the amino acid?

Not directly.

No.

Through adapter molecules, transfer RNAs, tRNAs, small RNA molecules, about 80 nucleotides.

They bridge the gap.

Perfectly put.

They fold into a specific L shape.

One end has the anticodon, three bases that pair with the mRNA codon.

The other end carries the specific amino acid matching that codon.

And the redundancy.

Multiple codons for one amino acid involves wobble.

Yes.

Wobble base pairing at the third codon position.

The rules are a bit looser there, allowing one tRNA anticodon to recognize more than one codon.

Clever efficiency.

Do tRNAs get processed like mRNAs?

Yes.

Quite extensively.

Made by Pol III as precursors, then trimmed.

Some even have introns removed by a unique protein-based splicing mechanism.

Quality control ensures only correctly folded ones mature.

Plus, lots of chemical modifications.

About one in ten bases is modified.

Affects structure, accuracy of codon recognition, and amino acid attachment.

And what attaches the amino acid to the correct tRNA?

The heroes here are the aminoacyl-tRNA synthetases.

Usually 20 different enzymes, one for each amino acid.

They're the matchmakers.

The crucial matchmakers.

They use ATP energy to link the amino acid to its correct tRNA partner with a high-energy bond.

That energy is then used later to form the peptide bond in the protein.

So it's a two-adapter system.

Synthetase matches amino acid to tRNA.

Then tRNA matches codon on mRNA.

That classic experiment swapping the amino acid after it was attached proved the tRNA is the true adapter reading the code.

How accurate are these synthetases?

Mixing up amino acids would be bad.

Very accurate.

They use a two-step check.

First, the correct amino acid fits best in the active site.

Second, there is an editing pocket.

An editing pocket?

Yeah.

After attachment, the enzyme tries to shove the amino acid into this pocket.

The correct one usually doesn't fit, but slightly incorrect ones do.

And if they fit, they get hydrolyzed, snipped off.

Hydrolytic editing.

Like DNA proofreading.

Exactly.

Boosts accuracy enormously.

Maybe one mistake in 40,000.

Plus, they recognize the tRNA structure itself, adding another layer of fidelity.

Okay.

Synthetases have charged the tRNAs.

Now, building the protein chain, where are amino acids added?

Always to the C-terminal end of the growing chain.

So proteins are made N-terminus to C-terminus.

The energy comes from that bond to the tRNA?

Yes.

The growing chain stays attached to a tRNA.

Peptidyl-tRNA.

Adding the next amino acid breaks that bond, but immediately forms a new, similar high-energy bond with the incoming amino acid.

Each amino acid brings its own activation energy for the next addition.

Neat.

And the machine that coordinates all this.

The ribosome.

A massive complex of proteins and ribosomal RNAs.

Millions per cell.

Incredibly complex and efficient.

Synthesizes an average protein in about a minute.

Assembled in the nucleolus.

Exported.

Subunits join up in the cytoplasm.

Right.

Large and small subunits come together on an mRNA to start translation.

Are bacterial and eukaryotic ribosomes very different?

Eukaryotic subunits are larger, with more components.

Small subunit handles the tRNA-codon matching accuracy.

Large subunit catalyzes the peptide bond formation.

And they move along the mRNA.

Yep.

Pulling the mRNA through.

Reading codons three bases at a time.

Starts near the 5' end.

Stops at a stop codon.

Releases the protein, and the subunits separate.

Eukaryotes add maybe four amino acids per second.

Bacteria, up to 20 per second.

Does the ribosome have specific slots for the tRNAs?

Three main sites.

A site for aminoacyl tRNA, the incoming one.

P site for peptidyl-tRNA, holding the growing chain.

And E site for exit, where the spent tRNA leaves.

This ensures the reading frame is maintained.

So walk me through the cycle.

How is each amino acid added?

Okay, four basic steps repeated.

Step one.

A new charged tRNA enters the A site, matching the codon there.

Step two.

Peptide bond formation.

The polypeptide chain attached to the tRNA in the P site is transferred and linked to the amino acid on the tRNA in the A site, catalyzed by the large subunit.

Step three.

Large subunit shifts forward relative to the small subunit, moving the tRNAs into hybrid E/P and P/A sites.

Step four.

Small subunit catches up, moving the mRNA exactly three nucleotides.

This ejects the empty tRNA from the E site and resets the A site for the next incoming tRNA.

Cycle repeats.

This whole cycle is driven and made accurate by elongation factors, EF-Tu (EF1 in eukaryotes) and EF-G (EF2 in eukaryotes).

They use GTP hydrolysis.

GTP hydrolysis, again, for energy and accuracy.

Yes.

EF-Tu brings the new tRNA to the A site.

It checks the codon-anticodon match.

A correct match triggers a shape change in the ribosome.

Induced fit.

Exactly.

The small subunit RNA folds around a correct match.

This triggers EF-Tu to hydrolyze its GTP.

Incorrect tRNAs usually fall off before GTP hydrolysis.

First proofreading step.

And there's a second check.

Kinetic proofreading again.

There's a slight delay after GTP hydrolysis before the amino acid is added.

Incorrectly bound tRNAs are more likely to dissociate during this delay.

Two checks give amazing accuracy, 99.99%.
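To see what 99.99% per-step accuracy means for a whole protein, here is a small Python estimate. It assumes each amino acid is added independently with the same accuracy, which is a simplification, and the 400-residue chain length is a made-up example rather than a figure from the lecture.

```python
# Per-amino-acid accuracy quoted in the discussion.
PER_RESIDUE_ACCURACY = 0.9999

def fraction_error_free(n_residues: int, accuracy: float = PER_RESIDUE_ACCURACY) -> float:
    """Fraction of chains of this length made with no wrong amino acid,
    assuming every addition is an independent event."""
    return accuracy ** n_residues

for n in (100, 400, 1000):  # hypothetical protein lengths
    print(n, round(fraction_error_free(n), 4))
```

Even at this accuracy, the error-free fraction drops as chains get longer, which helps explain why the cell pays an energy cost for two checks rather than one.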

Induced fit and kinetic proofreading.

Common themes for accuracy at an energy cost.

Absolutely fundamental principles.

Protein synthesis is very energy hungry because of these fidelity mechanisms.

And you mentioned the ribosome is a ribozyme.

The RNA does the work.

That was the huge discovery from the structural work around 2000.

The ribosome is mostly RNA.

The RNAs form the core, the shape, position the tRNAs, and crucially catalyze the peptide bond formation.

So the proteins are just scaffolding?

Largely, yes.

They stabilize the RNA structure, help with conformational changes, but the catalytic site is RNA.

Proteins are nowhere near where the peptide bond forms.

Wow.

A relic of the RNA world.

A powerful piece of evidence for it, yes.

The spliceosome, too.

RNA catalysis is still fundamental in modern cells.

How does translation start?

Setting that reading frame seems critical.

Utterly critical.

It starts at the AUG codon.

Usually using a special initiator tRNA carrying methionine.

So all new proteins start with methionine, though it might get removed later.

Generally, yes.

In eukaryotes, this initiator Met-tRNA loads onto the small subunit with initiation factors, eIFs.

This complex binds the mRNA's 5' cap.

And scans for the first AUG.

Right.

It moves along the mRNA, 5' to 3', using ATP-powered helicases to unwind structure until it finds the first AUG in a good context.

The Kozak sequence helps.

Then factors leave.

The large subunit joins, ready to go.

Leaky scanning allows variation sometimes.

Yes.

If the first AUG context isn't perfect, the ribosome might skip it and start at a later one, producing different protein versions.
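
The scanning idea can be sketched in a few lines of code. This is a toy model, not how any real annotation tool works: the function names are made up for illustration, and the "good context" rule is reduced to the two Kozak positions usually considered most important, a purine three bases before the AUG and a G right after it. Real leaky scanning is probabilistic; here every weak AUG is simply listed as a possible start, and scanning stops at the first strong one.

```python
def kozak_strength(mrna, aug_index):
    """Score an AUG's context from 0 to 2 using a simplified Kozak rule."""
    score = 0
    # Position -3 (three bases upstream of the A in AUG) should be a purine.
    if aug_index >= 3 and mrna[aug_index - 3] in "AG":
        score += 1
    # Position +4 (the base right after the AUG) should be a G.
    if aug_index + 3 < len(mrna) and mrna[aug_index + 3] == "G":
        score += 1
    return score


def candidate_start_sites(mrna, strong_score=2):
    """Scan 5' to 3', collecting AUGs until the first one in a strong context."""
    sites = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] == "AUG":
            score = kozak_strength(mrna, i)
            sites.append((i, score))
            if score >= strong_score:
                break  # a strong context captures essentially all ribosomes
    return sites


# Hypothetical mRNA: a weak AUG first, then an AUG in a strong context.
example_mrna = "GGCUUAUGCCCGCCACCAUGGCGUAA"
print(candidate_start_sites(example_mrna))
```

In this made-up sequence the first AUG scores 0 and the second scores 2, so a scanning ribosome could occasionally start at the first but would mostly skip on to the second, giving two protein versions from one mRNA.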

How do bacteria start?

No 5' cap there.

They have a ribosome binding site, the Shine-Dalgarno sequence, just upstream of the AUG.

It base pairs directly with the 16S rRNA in the small subunit, positioning the AUG correctly.

Which allows for polycistronic mRNAs, multiple proteins from one transcript.

Exactly.

Eukaryotic mRNAs are typically monocistronic, one protein per mRNA.
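
The bacterial strategy can be sketched the same way. The snippet below is a simplified illustration under stated assumptions: it treats "GGAGG" as the core of the Shine-Dalgarno motif, uses a hypothetical `max_gap` of 12 nucleotides between the motif and the AUG, and does plain string matching rather than modeling base pairing with the 16S rRNA. The function name and the example sequence are invented for this sketch.

```python
SD_CORE = "GGAGG"  # simplified Shine-Dalgarno core, an assumption for illustration


def bacterial_start_sites(mrna, max_gap=12):
    """Return indices of AUGs that have an SD core a short distance upstream."""
    starts = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] != "AUG":
            continue
        # Look only at the stretch just upstream of this AUG.
        window_start = max(0, i - max_gap - len(SD_CORE))
        if SD_CORE in mrna[window_start:i]:
            starts.append(i)
    return starts


# Hypothetical polycistronic transcript: two short coding regions,
# each with its own ribosome binding site.
polycistronic = "UUAGGAGGUAACAUAUGAAAUAAUUCAGGAGGACUUCAUGCCCUGA"
print(bacterial_start_sites(polycistronic))
```

Because each AUG is found by its own upstream motif rather than by scanning from the 5' end, an internal start site is just as reachable as the first one, which is what makes polycistronic messages possible.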

And stopping.

Stop codons UAA, UAG, UGA.

Right.

No tRNA recognizes them.

Instead, release factors, which are proteins, bind to the A site when a stop codon arrives.

And they trigger release?

They cause the ribosome to add a water molecule instead of an amino acid to the peptide chain, cutting it free from the tRNA.

Then the whole complex dissociates.
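
Putting start and stop together, here is a minimal sketch of reading an mRNA codon by codon until a stop codon appears. The codon table is deliberately partial, just the handful of standard-code entries the example needs, and the `translate` function and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Partial codon table (standard genetic code), only what the example uses.
CODON_TABLE = {
    "AUG": "Met", "AAA": "Lys", "GCG": "Ala",
    "CCC": "Pro", "UUC": "Phe", "GGC": "Gly",
}


def translate(mrna, start):
    """Read codons in frame from `start` until the first stop codon."""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return peptide  # no tRNA matches; a release factor frees the chain
        peptide.append(CODON_TABLE[codon])  # KeyError if codon is not in the partial table
    raise ValueError("no in-frame stop codon found")


example = "GGAUGAAAGCGUUCUAAGG"
print(translate(example, 2))  # in frame, starting at the AUG
print(translate(example, 3))  # shifted by one base
```

Starting at the AUG gives a four-residue peptide, while shifting the start by a single base hits a stop codon immediately, which is a small reminder of why setting the reading frame correctly matters so much.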

Where does the new protein go as it's being made?

It snakes through a tunnel in the large subunit.

It's lined with mostly hydrophilic surfaces, kind of like Teflon, letting the chains slide out without getting stuck.

And translation is usually happening on polysomes.

Multiple ribosomes on one mRNA.

Yes.

Polyribosomes, or polysomes.

As soon as one ribosome moves clear of the start site, another can hop on.

Greatly amplifies protein production from a single mRNA molecule.

Most mRNAs have 10-20 ribosomes translating simultaneously.

Is the genetic code absolutely universal?

Almost.

Remarkable universality.

Strong evidence for a common ancestor.

But a few minor exceptions exist.

Some fungi, ciliates, and especially mitochondria often have slightly different codon assignments.

And translation recoding.

Using a stop codon for an amino acid.

Yeah, like incorporating selenocysteine at specific UGA codons, guided by structures in the mRNA.

It adds another layer of complexity.

And many antibiotics target bacterial protein synthesis, right?

Exploit the differences between bacterial and eukaryotic ribosomes.

Exactly.

Drugs like tetracyclines, streptomycin, erythromycin.

They bind to specific sites on the bacterial ribosome, inhibiting different steps, tRNA binding, translocation, peptide bond formation.

Vital medicines and also useful lab tools to selectively block protein synthesis in bacteria, or even in our mitochondria and in chloroplasts, which have bacteria-like ribosomes.

Okay, we've made a polypeptide chain.

But it's not a protein yet, it needs to fold.

Right.

Folding into the correct 3D shape, binding cofactors, getting modified, maybe assembling with other subunits.

All that info is encoded in the amino acid sequence, guiding it to the lowest energy state.

Does folding start right away?

Some initial folding, like alpha helices, can start even as the chain emerges from the ribosome exit tunnel.

Enzymes might even modify the N-terminus as it appears.

But major folding happens after release.

And most proteins need help folding correctly, chaperones.

Yes, molecular chaperones are crucial.

They prevent misfolding, aggregation, proteins getting stuck in wrong shapes.

How do they work?

They bind to exposed hydrophobic patches, signs of improper folding, and use ATP energy to help the protein try again, giving it multiple chances to find the right conformation.

Many are heat shock proteins, HSPs, made more under stress.

Like HSP70 and HSP60.

Key families.

HSP70s often bind early as the protein emerges, preventing aggregation.

HSP60s form an isolation chamber.

An isolation chamber.

Yeah, a barrel-like structure.

A misfolded protein enters, the chamber closes, using ATP, providing a protected environment for it to refold correctly away from other sticky proteins.

Then it's released.

It's energy-intensive, ensuring accuracy.

Are there other folding aids?

Translation speed itself might play a role.

Ribosome pauses at tricky spots could give segments time to fold properly before the next part emerges.

The mRNA sequence might influence folding kinetics.

And assembly with partner proteins?

Yes.

Sometimes a newly made subunit folds correctly only when it binds to its already folded partners.

Being made on polysomes can help keep subunits close for efficient assembly.

But what if, despite all this help, a protein still misfolds or gets damaged?

The cell has a disposal system.

The ubiquitin-proteasome pathway.

Ubiquitin.

That's the tag for destruction.

It is.

Exposed hydrophobic patches signal trouble.

Enzymes attach chains of the small protein ubiquitin to the faulty protein.

This polyubiquitin chain is the signal recognized by the proteasome.

And the proteasome is the shredder.

The molecular shredder.

A big barrel-shaped complex found in the cytosol and nucleus.

Also degrades bad proteins kicked out of the ER.

How does it work?

Does it just chew everything up?

No, it's controlled.

The core cylinder contains the proteases inside.

Caps at each end act as gatekeepers.

What do the caps do?

They recognize the polyubiquitin tag.

They also have an unfoldase ring using ATP.

It unfolds the tagged protein and threads it into the core cylinder for degradation into short peptides.

And it's processive.

Holds on until the job is done.

Yes.

Crucial property.

Prevents release of partially degraded, potentially sticky fragments.

The cap also provides a second check.

Needs both the ubiquitin tag and an unfoldable region to really commit to degradation.

So a constant tug of war between chaperones trying to fix proteins and the proteasome trying to destroy them.

A vital balance.

A significant fraction of newly made proteins actually fail quality control and get degraded quickly.

But the proteasome doesn't just handle misfolded proteins, right?

It controls the levels of normal proteins too.

Absolutely.

Regulated destruction is key for controlling many cellular processes.

Think of proteins like cyclins that need to disappear at specific points in the cell cycle.

How is that controlled?

How do you target a normal protein for destruction?

Various ways.

You can activate the ubiquitin ligase, the E3 enzyme that does the tagging, maybe by phosphorylation.

Or you can modify the target protein itself. Phosphorylate it.

Unmask a hidden degradation signal.

Or cleave it to create a destabilizing N-terminus.

So you can activate the tagger or activate the tag on the target.

Exactly.

Even that common N-terminal modification on many proteins might act as a degradation signal if the protein becomes damaged or misfolds, exposing it.

Wow.

What an intricate multi-layered system.

From DNA sequence to regulated protein degradation.

It really encompasses so much of cell biology.

Transcription, processing, export, translation, folding, quality control, degradation.

Billions of years of evolution optimizing this flow.

The cell invests enormous energy and resources into making proteins correctly and controlling their levels precisely.

It's all about balance.

And dynamic regulation.

Cells constantly adjust these processes based on their needs, which leads us perfectly into what we can explore next time.

How cells regulate gene expression at all these different steps.

Sounds fascinating.

But before we wrap up, that connection you made earlier.

The RNA world.

Ah, yes.

It comes back to that fundamental chicken and egg problem.

DNA needs proteins to be replicated and transcribed.

But proteins need DNA, and RNA, instructions to be made.

How did it start?

The RNA world hypothesis offers a solution.

It suggests that early life used RNA for both jobs, storing genetic information and catalyzing reactions as ribozymes.

DNA and proteins came later, taking over roles RNA wasn't optimally suited for in the long run.

But RNA didn't disappear.

Not at all.

It still plays those ancient catalytic roles in fundamental machines like the ribosome and the spliceosome.

They're like molecular fossils, molecular holdovers from that earlier era.

So this deep dive not only shows us cellular precision, but maybe even glimpses of life's origins.

RNA is the original master molecule.

Fascinating thought to mull over, isn't it?

Truly mind-expanding.

Well, thank you for being part of our deep dive family today as we navigated this incredible journey from gene to protein.

We really hope it's deepened your appreciation for these fundamental biological wonders.

It's been a pleasure diving deep with you all.

Until next time, keep that curiosity alive and keep exploring.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

The process by which cells transform genetic instructions into functional proteins represents one of biology's most fundamental achievements, operating through a coordinated sequence of transcription and translation events. RNA polymerase enzymes catalyze the synthesis of messenger RNA from the DNA template, with the transcription process initiated at specific promoter regions recognized by transcription factors that position the enzyme correctly on the chromosome. Prokaryotic and eukaryotic cells execute this process with notable differences in complexity, particularly evident in eukaryotes where three distinct RNA polymerases handle different gene categories and mRNA undergoes substantial modification before leaving the nucleus. These modifications—including the addition of a protective 5' cap structure, removal of introns through the splicing machinery, and attachment of a 3' poly-adenine tail—are essential for mRNA stability and translation efficiency. A single eukaryotic gene can yield multiple protein products through alternative splicing mechanisms that exclude or include particular exons based on cellular context. Beyond protein-coding sequences, cells produce diverse noncoding RNA molecules including ribosomal RNA, transfer RNA, small nuclear RNA, and microRNA, each serving distinct functions in translation, RNA processing, and gene regulation. The translation machinery reads the genetic code carried by mRNA through interactions between ribosomal structures and transfer RNA molecules, with each codon matched to the corresponding amino acid through precise base-pairing recognition. Translation proceeds through initiation, elongation, and termination phases, each requiring specific protein factors that facilitate ribosome movement and ensure accurate protein assembly. 
Following synthesis, newly made proteins frequently undergo post-translational modifications such as phosphorylation and glycosylation, which activate biological function or direct proteins to appropriate cellular compartments. Gene expression regulation operates across multiple control points—at the transcriptional level through factor availability, post-transcriptionally through RNA processing choices, during translation through initiation factor control, and post-translationally through protein modification—enabling cells to precisely tune protein levels in response to internal signals and external environmental changes.

Using this chapter to study? Last Minute Lecture is free and student-run. If it helped, consider supporting the project.
