Chapter 17: Gene Expression: From Gene to Protein

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to the Deep Dive.

Today we are really getting into the weeds, not just looking at the hood of the car, but taking the engine apart piece by piece.

Yeah, that's a fair analogy.

We're dealing with the biological engine today.

The actual machinery that builds life.

Right.

So we are tackling chapter 17 of Campbell Biology.

That's the 12th edition.

And the title of this chapter is Gene Expression from Gene to Protein.

It is.

And I have to say, going through this source material, this is the chapter where biology stops being about, you know, memorizing names of bones or looking at pretty flowers.

It starts feeling a lot more like computer science.

It really does.

It does feel that way.

It's the shift from the hardware, the physical structures of the cell to the software.

It's the code.

Yeah.

The source material kicks off with this massive question.

One that I think you, the listener, probably take for granted because we hear about DNA so often.

We hear it all the time.

Right.

We know DNA makes us who we are.

But strictly speaking, DNA is just a molecule.

It's just sitting there in the nucleus.

It is a chemical.

So the central question the chapter poses is, how does a simple change in that DNA result in a dramatic change in appearance?

Like, how do you get an albino donkey?

Or a glowing tobacco plant?

Right.

It is all about the connection between the genotype and the phenotype.

And we should probably lock those terms down right now for everyone.

Definitely.

So genotype is your genetic makeup.

It's the literal sequence of nucleotides written in the DNA.

The letters.

Yes, the letters.

Phenotype, on the other hand, is the physical trait you can actually see,

like the white fur on that donkey or the glowing leaves on the plant.

And gene expression?

Gene expression is the bridge between them.

Precisely.

Gene expression is the process by which DNA directs the synthesis of proteins.

Because proteins are the things that actually do the work.

They do everything.

DNA is just the blueprint.

The proteins are the structure.

They are the workers.

They are the enzymes.

And our mission for this deep dive is to decode that entire process.

We're going to track the history of how we figured this out, which involves, surprisingly, a lot of very moldy bread.

A surprising amount of mold, yes.

I love that.

Major scientific breakthroughs often come from things we just throw in the trash.

It is often the humblest organisms that teach us the most.

But yeah, after the history, we are going to look at the central dogma, the flow of information.

We'll break down the mechanics of transcription, which is writing the code.

And translation?

Building the protein.

And finally, we will look at what happens when that code breaks.

Mutations.

And the modern revolution of CRISPR.

It's a huge journey.

From the microscopic molecular level, right up to the organism level.

So let's rewind.

Let's go all the way back to 1902.

The text introduces us to a British physician,

Archibald Garrod.

Right.

And to set the stage for you, this is before we knew what DNA even looked like.

We didn't know about the double helix.

Watson and Crick were decades away.

Roughly 50 years away, yeah.

But Garrod was incredibly observant.

He was studying metabolic diseases,

specifically one called alkaptonuria.

Which has a very distinct, very memorable symptom.

Black urine.

Very vivid.

Extremely.

When the urine of these patients is exposed to air, it turns black.

And that's because it contains a chemical called alkapton.

Okay, so alkapton turns black in the air.

Right.

Now, Garrod knew that most people don't have this reaction.

Most bodies break down alkapton before it ever leaves the system.

But his patients couldn't.

They couldn't.

So he makes this brilliant deductive leap.

He suggests that the reason they can't break it down is that they are missing the specific tool to do the job.

The enzyme.

Exactly.

He proposed that they lacked a specific enzyme.

And the crucial part is he noticed this condition ran in families.

It followed a pattern of inheritance, like Mendel's peas.

Yes.

So he connected those two ideas.

An inherited trait results in a missing enzyme.

He actually called these inborn errors of metabolism.

That phrase is just so precise.

Inborn errors of metabolism.

He's essentially saying you have a typo in your manual that prevents you from building this one specific tool.

That is exactly it.

He was the very first person to suggest that genes dictate phenotypes through enzymes.

But there is a catch here.

Always a catch.

He couldn't prove it experimentally.

No.

He was observing patients, describing symptoms, making a really solid logical inference.

But he wasn't running controlled lab tests on the DNA itself.

Because the technology just wasn't there.

Exactly.

So proof required something a bit faster than waiting for humans to have babies and tracking their urine.

It required a model organism.

It required bread mold.

Neurospora crassa.

Let's talk about Beadle and Tatum in the 1930s and 40s.

Looking at their experimental design in the chapter, it is incredibly elegant.

But first, why mold?

Why not fruit flies or mice?

It comes down to efficiency and clarity.

Neurospora has a massive advantage for a geneticist.

It is a haploid organism.

Okay, let's unpack haploid for you listening, just in case it's been a minute since your last biology class.

Sure.

So humans are diploid.

We have two copies of every chromosome.

One from mom, one from dad.

Which provides a kind of safety net.

Right.

If you get a broken gene from your mother, the working copy from your father can often cover for it.

It masks the error.

That is why you can carry a disease gene but not actually show symptoms.

You're a carrier.

The recessive trait hides behind the dominant one.

Exactly.

But Neurospora is haploid.

It only has one copy of each gene.

Period.

There is no backup drive.

So if you break a gene in Neurospora...

The effect is immediate.

It is obvious.

You see the phenotype right away.

Because there is nowhere for the mutation to hide.

Okay, so Beadle and Tatum have their subject.

How did they break the genes?

They were not subtle about it.

They bombarded the mold with X-rays.

Just the blunt instrument approach.

Let's nuke it and see what happens.

It works, though.

The X-rays cause mutations.

Random changes in the DNA structure.

But then they had a new problem.

How do you figure out what you just broke?

And this is the part of the source material that is just so clever.

Figure 17.2 in the book maps this out.

They used the mold's diet to diagnose it.

The minimal medium versus complete medium setup.

Walk us through that.

Let's start with the wild-type Neurospora.

This is the normal, healthy stuff you find in nature.

It is a survivalist.

You can put it on what they called minimal medium.

Which is what, exactly?

It is basically just agar mixed with some inorganic salts, a little glucose for sugar, and the vitamin biotin.

That is it.

Very basic.

But from those basic raw ingredients, the wild-type Neurospora mold can synthesize everything else it needs.

All 20 amino acids, all the vitamins, lipids, proteins.

It is a complete metabolic factory.

It takes the raw bricks and builds the whole house.

But the mutants, the ones that got hit by the X-rays, some of them lost that ability.

Yes.

Beadle and Tatum found that some mutants couldn't grow on the minimal medium anymore.

They just starved.

Because their factory is broken somewhere.

Right.

However, if they took those same starving mutants and put them on complete medium, which is the minimal stuff plus all the amino acids and nutrients ready to go, the mutants grew fine.

So that told them the machinery to process food is broken, but if we just hand you the finished product, you survive.

Precisely.

It was a nutritional defect.

But that is still a really broad diagnosis.

They needed to know specifically what was broken.

Was it the ability to make lysine?

Or arginine?

Which specific metabolic pathway got destroyed by the X-ray?

So they ran a process of elimination.

They took the mutant mold and put it in a series of vials.

Imagine a whole row of test tubes.

Each tube has the minimal medium plus just one specific nutrient.

Just one.

Right.

Tube one has extra valine.

Tube two has extra lysine.

Tube three has extra arginine.

And so on.

So if the mold grows in the tube with added arginine, but it dies in all the other tubes...

Then you know exactly what happened.

The mutation destroyed the specific gene responsible for making arginine.

It is pure logic.

If I give you the finished product and you live, but I give you raw materials and you die, your specific assembly line for that product is broken.

And that beautifully logical setup led to the work of Srb and Horowitz.

They took this even deeper by focusing specifically on that arginine pathway.

This part of the chapter reads exactly like a logic puzzle.

I want to walk through this slowly because it really proves the step-by-step nature of how genes work.

It cements the idea that genes control specific chemical reactions.

They really do.

See, synthesizing arginine isn't a one-step magic trick.

It's a metabolic pathway.

Think of it as an assembly line with three distinct stations.

Okay.

Let's lay out the first step.

Lay out the stations.

Station 1 converts a starting precursor substance into an amino acid called ornithine.

Precursor to ornithine.

Then station 2 converts that ornithine into citrulline.

Ornithine to citrulline.

And finally, station 3 converts citrulline into arginine.

Okay.

So precursor becomes ornithine.

Ornithine becomes citrulline.

Citrulline becomes arginine.

Three steps.

Three different enzymes required to make those jumps.

Exactly.

Now, Srb and Horowitz found three different classes of mutants among their mold.

Let's take them one by one.

Class 1 mutants.

These guys couldn't grow on the minimal medium alone.

But if you gave them ornithine or citrulline or arginine, they grew.

So let's trace the logic there.

If they can grow when given ornithine, it means everything after ornithine works perfectly fine.

They can take that ornithine, make citrulline, and then make arginine.

Right.

So the blockage must be before ornithine.

Station 1 is broken.

Spot on.

The enzyme that turns the precursor into ornithine is missing.

Okay.

What about Class 2?

Class 2 mutants wouldn't grow if you gave them ornithine.

They still starved.

They would only grow if you gave them citrulline or arginine.

Meaning they cannot turn ornithine into citrulline.

Station 2 is broken.

Even if you pile up a mountain of ornithine at their door, they can't process it.

It just sits there.

And the Class 3 mutants.

Let me guess.

They only grow if you give them arginine directly.

Yes.

They couldn't convert citrulline into arginine.

Station 3 was broken.

This is so elegant.

It proved that genes don't just vaguely control metabolism.

Specific genes control specific steps in a pathway by producing specific enzymes.
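
The process-of-elimination logic for the three mutant classes can be written down as a short sketch. This is only an illustration of the reasoning, assuming exactly one broken step per mutant; the names `PATHWAY` and `blocked_step` are made up for this example and do not come from the chapter.

```python
# Order of compounds in the arginine pathway described above.
PATHWAY = ["precursor", "ornithine", "citrulline", "arginine"]


def blocked_step(grows_on):
    """Return (substrate, product) for the step a mutant cannot perform.

    grows_on: set of supplements that rescue growth when added to
    minimal medium. Assumes exactly one step in the pathway is broken.
    """
    # The earliest supplement that rescues growth sits just after the block:
    # everything downstream of it must still work.
    for i in range(1, len(PATHWAY)):
        if PATHWAY[i] in grows_on:
            return PATHWAY[i - 1], PATHWAY[i]
    return None  # nothing rescues growth, so this pathway does not explain it


# Class 1: rescued by ornithine, citrulline, or arginine -> station 1 is broken.
print(blocked_step({"ornithine", "citrulline", "arginine"}))
# Class 2: rescued only by citrulline or arginine -> station 2 is broken.
print(blocked_step({"citrulline", "arginine"}))
# Class 3: rescued only by arginine -> station 3 is broken.
print(blocked_step({"arginine"}))
```

Each result names the conversion the mutant cannot carry out, which is the same conclusion reached by walking through the tubes by hand.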

And that solidified a huge concept.

The one-gene, one-enzyme hypothesis.

It was our first really solid theoretical framework.

One gene contains the instructions for one enzyme.

But as with all things in biology, we had to refine that later, right?

As we learned more about what proteins actually do, we realized not everything is an enzyme.

Right.

An enzyme acts as a catalyst for a chemical reaction.

Right.

But what about keratin?

Keratin.

The stuff in your hair and nails.

That is a structural protein.

Or insulin.

Insulin is a signaling protein.

A hormone.

They aren't enzymes, but they are absolutely gene products.

So the rule had to be updated to one gene, one protein.

And then we realized some proteins are actually composites.

Hemoglobin is the classic example.

Yes.

Hemoglobin is the protein in your red blood cells that carries oxygen.

It is made of four subunits.

Two alpha chains and two beta chains.

And each of those chains is a separate polypeptide.

Coded by a separate gene.

Exactly.

So the definition got even more granular.

One gene, one polypeptide.

Now, even that has some asterisks today, which we will get to later when we talk about RNA splicing.

But one gene, one polypeptide serves as the working model for understanding the flow of this whole chapter.

It's the baseline.

So we have the what?

Genes make polypeptides.

Now we need the how.

And this brings us to what's called the central dogma.

Coined by Francis Crick.

One of the men who discovered the double helix structure.

The central dogma describes the direction of traffic in a cell.

DNA to RNA to protein?

Yes.

You cannot go backward. Well, usually.

Yeah.

And you definitely cannot skip the middle man.

I like to think of it like a master document.

DNA is the original irreplaceable blueprint kept locked in the vault, which is the nucleus.

It is way too valuable to risk damaging.

You never take the original blueprint to the dirty construction site.

Exactly.

You make a disposable copy.

And that copy is RNA.

Specifically, messenger RNA.

The process of making that copy is called transcription.

Transcription.

Because you are transcribing, essentially rewriting the code from the DNA dialect into the RNA dialect.

It is the exact same language.

Nucleic acids.

Just a slightly different form.

You're just changing the font, so to speak.

And then you take that mRNA copy out to the construction site, the ribosome in the cytoplasm, and you actually build the machine.

And that step is called translation.

Because you are translating.

You are switching from the language of nucleotides, the A, C, G, and U, into an entirely different language.

The language of amino acids.

The terminology really matters here.

Transcription is copying.

Translation is changing languages.

So let's look at the math of that language, the genetic code.

We have four letters in the DNA alphabet.

A, T, C, and G.

But we have 20 different amino acids that we use to build proteins.

How do we map four letters to 20 amino acids?

It's a classic mapping problem.

If one nucleotide coded for one amino acid, you would only get four.

Way short of 20.

Right.

If you use pairs like AA, AT, AC, and so on, that is four squared.

16 combinations.

Still short.

You'd be missing four amino acids.

So nature went to triplets.

Three letters at a time.

Four cubed.

64 combinations.

That is plenty.

It gives us enough codes for all 20 amino acids, plus a few special stop signals, plus a lot of redundancy.

And these three-letter words are called codons.

Codons.

Yes.
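
The counting argument can be checked by brute force. The sketch below simply enumerates every possible word of one, two, and three letters over a four-letter alphabet; the helper name `count_words` is hypothetical.

```python
from itertools import product

BASES = "ACGU"        # the four RNA nucleotides
AMINO_ACIDS = 20      # how many amino acids need a code word


def count_words(length):
    """Count every possible nucleotide 'word' of the given length."""
    return len(list(product(BASES, repeat=length)))


for length in (1, 2, 3):
    total = count_words(length)
    verdict = "enough" if total >= AMINO_ACIDS else "too few"
    print(f"word length {length}: {total} combinations ({verdict})")
```

Only the three-letter words clear the bar of 20, which is the whole argument for a triplet code.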

Let's talk about the template.

Because DNA is a double helix.

Two strands.

But we only read one of them, right?

For any given gene, yes.

We only read what is called the template strand.

The other strand is called the non-template strand.

Or, more commonly, the coding strand.

This is confusing naming convention number one in biology.

Why is the strand we don't read called the coding strand?

Because of how complementary base pairing works.

The mRNA is built to be a complement to the template strand, which means the mRNA ends up looking exactly like the non-template strand.

Oh!

Except with uracil instead of thymine.

Exactly.

U instead of T.

So if you look at the coding strand of the DNA, you can read the genetic message exactly as it will appear in the mRNA.

It's like looking at the file on your computer screen before you hit print.
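
Here is a small sketch of that template-versus-coding relationship. The 12-base sequence is invented purely for illustration, and the strands are written so that paired bases line up position by position (coding strand 5' to 3', template strand 3' to 5'); the function names are hypothetical.

```python
# Base-pairing rules: DNA to DNA, and DNA template to RNA.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def template_of(coding):
    """Template strand (3'->5') that pairs with a coding strand (5'->3')."""
    return "".join(DNA_PAIR[base] for base in coding)


def transcribe(template):
    """mRNA (5'->3') built as the complement of the template strand."""
    return "".join(RNA_PAIR[base] for base in template)


coding = "ATGGAAGAGTAA"            # made-up example sequence
template = template_of(coding)     # what RNA polymerase actually reads
mrna = transcribe(template)

print("coding:  ", coding)
print("template:", template)
print("mRNA:    ", mrna)
# The mRNA matches the coding strand with U in place of T.
print(mrna == coding.replace("T", "U"))
```

The final comparison is the point of the naming convention: reading the coding strand and swapping T for U gives the message directly.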

Okay, that actually makes perfect sense.

So we have 64 codons, but only 20 amino acids.

You mentioned redundancy.

Yes.

This is a built-in safety feature of life.

For example, the codons GAA and GAG both code for the amino acid glutamic acid.

It's a fail-safe.

It is.

If a tiny mutation happens and changes that last letter from an A to a G, you still get glutamic acid.

The final protein won't even know the difference.

It's redundant.

But there is no ambiguity.

Never.

GAA will always mean glutamic acid.

It will never randomly surprise you and ask for valine instead.

It is a very rigid, unambiguous system.
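
One way to picture "redundant but never ambiguous" is a lookup table: several keys may point to the same value, but a key can only ever have one value. The sketch below uses a deliberately tiny slice of the standard codon table (only a handful of codons), and the helper name `codons_for` is an assumption made for this example.

```python
from collections import defaultdict

# A small slice of the standard genetic code, not the full 64-entry table.
CODON_TABLE = {
    "GAA": "Glu",  # glutamic acid
    "GAG": "Glu",  # glutamic acid again: redundancy
    "CCG": "Pro",
    "GGC": "Gly",
    "AUG": "Met",
}


def codons_for(table):
    """Group codons by the amino acid they specify."""
    by_amino_acid = defaultdict(list)
    for codon, amino_acid in sorted(table.items()):
        by_amino_acid[amino_acid].append(codon)
    return dict(by_amino_acid)


# Redundancy: one amino acid, more than one codon.
print(codons_for(CODON_TABLE)["Glu"])
# No ambiguity: looking up a codon gives exactly one answer.
print(CODON_TABLE["GAA"])
```

The mapping runs many-to-one from codons to amino acids, never one-to-many.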

The textbook uses a really great analogy for how this code is actually read by the cell.

The reading frame.

The example sentence is, The red dog ate the bug.

All three-letter words.

Right.

If you read it correctly, grouping it by threes from the start, it makes perfect sense.

The spacing is everything.

But if you skip the first letter T and you just keep grouping by threes from there.

You get, Her edd oga tet heb ug.

Which is total gibberish.

That is what we call a frameshift.

The cell reads the mRNA in non-overlapping triplets.

Once it starts, it just marches forward three steps at a time.

If you offset that grid by even one single notch, the entire message downstream is completely destroyed.
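
The reading-frame idea is easy to try on the example sentence itself. This is a minimal sketch; `read_frame` is a hypothetical helper that strips the spaces and regroups the letters into non-overlapping triplets from a chosen starting offset.

```python
def read_frame(message, offset=0):
    """Split a message into non-overlapping triplets, starting at `offset`."""
    letters = message.replace(" ", "").lower()[offset:]
    return [letters[i:i + 3] for i in range(0, len(letters), 3)]


sentence = "The red dog ate the bug"

# Correct frame: every triplet is a real word.
print(read_frame(sentence, offset=0))
# Shifted by one letter: every triplet downstream is scrambled.
print(read_frame(sentence, offset=1))
```

Nothing about the letters changed between the two calls. Only the starting point moved, and that alone turns the whole message into nonsense.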

Before we get into the nuts and bolts of the machinery that does all this, I want to highlight the universality of this code.

The fact that this specific code, like CCG, meaning proline, is true in me, it's true in you, and it's true in the bacteria sitting on this microphone right now.

It is arguably the strongest evidence we have for a single common ancestor of all life on Earth.

This translation system was developed very, very early on, and it was so successful that evolution effectively locked it in.

It hasn't changed in billions of years.

The text shows this visually with the glowing plants.

Figure 17.7.

Oh, that's a classic experiment.

They took a gene from a firefly, the luciferase gene, which makes them glow.

And they inserted it into the DNA of a tobacco plant.

And the plant cells read that firefly gene.

They transcribed it into mRNA.

They translated it on ribosomes.

They produced the firefly enzyme perfectly, and the tobacco plant glowed in the dark.

The plant didn't say, hey, wait, this is firefly code.

I can't read this.

No.

It's like taking a standard CD from a Sony player and putting it into a Panasonic player.

The data format is identical.

It works across the board.

They also show a mosquito larva expressing a jellyfish protein, green fluorescent protein, or GFP.

Same concept.

It really is profound.

Okay, let's get mechanical.

How does the cell actually write the note?

This is transcription.

This is happening deep inside the nucleus.

The main player here is a massive enzyme called RNA polymerase.

I always imagine this as a heavy-duty machine moving along a DNA train track.

That's a great way to picture it.

It pries the two strands of the DNA double helix apart, and it joins RNA nucleotides together to form the mRNA transcript.

It builds from the five prime direction to the three prime direction.

Now unlike DNA replication, which we know requires a primer to get started, RNA polymerase can just start from scratch, right?

It can.

No primer needed.

It just needs to know exactly where to begin.

Because it can't just attach anywhere randomly.

It needs to find the start of the gene.

And that starting location is called the promoter.

It is a very specific sequence of DNA upstream of the actual gene.

It acts like a landing strip.

The text highlights a specific part of the promoter in eukaryotes.

It's called the TATA box.

T-A-T-A.

Thymine, adenine, thymine, adenine.

Why those specific letters?

Is there a structural or chemical reason?

There is a very good chemical reason.

Adenine and thymine bind to each other across the helix with only two hydrogen bonds.

Guanine and cytosine use three.

So an A-T-rich area is physically easier to pull apart.

It is literally a soft spot in the DNA armor where the enzyme can get a foothold to pry the strands open.
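
The hydrogen-bond arithmetic can be tallied directly. This is a rough sketch that counts only the bonds between paired bases (two for an A-T pair, three for a G-C pair) and ignores every other contribution to helix stability; the name `hydrogen_bonds` and the comparison sequence are made up for illustration.

```python
# Hydrogen bonds each base forms with its partner across the helix.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand):
    """Total hydrogen bonds holding this stretch of double helix together."""
    return sum(H_BONDS[base] for base in strand)


print(hydrogen_bonds("TATA"))  # an A-T-only stretch
print(hydrogen_bonds("GCGC"))  # a G-C-only stretch of the same length
```

For the same length, the all A-T stretch has fewer bonds to break, which is the sense in which it is a soft spot.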

That is such a cool structural detail.

Okay, so the TATA box is the beacon.

But in eukaryotes like us, the RNA polymerase is a bit of a diva.

It will not just bind to the promoter by itself.

No, it needs an entourage.

These are special proteins called transcription factors.

These factors have to bind to the TATA box first.

They create a sort of docking platform.

Only after they're in place can RNA polymerase II swoop in and bind to form the transcription initiation complex.

Figure 17.8 shows these three stages really well.

Once it is docked, we launch.

This is the elongation phase.

The polymerase moves downstream along the DNA.

It unwinds the helix, exposing about 10 to 20 DNA bases at a time.

It grabs complementary RNA nucleotides floating around in the nucleus and just snaps them onto the three prime end of the growing RNA chain.

And as it moves forward, the new RNA molecule peels away from the DNA template, right?

It doesn't stay stuck to the track.

Correct.

The DNA double helix zips right back up naturally behind the polymerase.

So you have this single-stranded tail of RNA growing out of the machine, dangling on the side as it moves.

And it moves fast.

About 40 nucleotides per second in eukaryotes.

Finally, we hit the end of the gene.

Termination.

This differs between bacteria and us.

In bacteria, it's very simple.

There is a terminator sequence in the DNA.

The polymerase hits it, gets knocked loose, falls off.

And then it's gone.

And the mRNA is completely ready to go.

But in human cells, in eukaryotes, it is messier.

Much messier.

The polymerase transcribes a specific sequence called the polyadenylation signal.

The letters for that are AAUAAA in the RNA.

Right.

And then, about 10 to 35 nucleotides past that signal, a whole separate set of proteins sweeps in and physically cuts the RNA transcript free from the polymerase.

But wait, in our cells, that piece of RNA, the primary transcript, isn't ready for primetime yet.

The text calls it pre-mRNA.

It has the gene's information, but it needs a major makeover before it can leave the nucleus.

This whole step is called RNA processing.

This is the editing room?

Let's talk about the ends of the molecule first.

We modify both the front and the back.

Yes.

The 5' end, the very beginning of the transcript, gets a 5' cap.

This is a specially modified form of a guanine nucleotide.

Like putting a hard hat on the molecule.

Exactly.

It does a few things.

It protects the RNA from being chewed up by hydrolytic enzymes in the cytoplasm.

And it acts as an attach-here signal for the ribosome later on.

Okay, what about the 3' end, the back end?

That gets a poly-A tail.

An enzyme comes in and just adds 50 to 250 adenine nucleotides in a row.

Just A-A-A-A.

Is that just padding?

It sounds like filler.

It essentially is padding.

It helps export the mRNA out of the nucleus through the nuclear pores.

But crucially, it acts like a timed fuse.

A fuse.

Once the mRNA is out in the cytoplasm, enzymes immediately start nibbling away at that tail.

The longer the tail, the longer the mRNA survives before the actual coding message gets eaten.

The longer it survives, the more protein the cell can make from it.

If the tail is too short, the message gets destroyed before it can even be read.

Wow.

It's a built-in self-destruct timer to control how much protein gets made.

Exactly.

Now the most confusing part of the editing room involves cutting out the middle of the message.

Splicing.

This was a huge shock to biologists when it was first discovered.

We had assumed genes were continuous lines of code.

But eukaryotic genes are split.

They are interrupted by these long stretches of non-coding nucleotides.

The text calls them introns.

Introns.

Short for intervening sequences.

They're literally in the way.

The parts that actually code for the protein are called exons because they are eventually expressed and they exit the nucleus.

So the gene is like a TV movie that has a bunch of commercial breaks spliced into the actual scenes.

That's a great way to view it.

So RNA splicing is the process of precisely cutting out all the commercials, the introns, and taping the movie scenes, the exons, together to make one seamless continuous film.

And what does the cutting?

A huge molecular complex called the spliceosome.

It is made of proteins and small RNA molecules.

It recognizes specific splice sites at the start and end of the intron, cuts it out, and fuses the exons together.

And here's a really cool point the text makes.

It is actually the RNA molecules inside the spliceosome that do the cutting, not the proteins.

They act as ribozymes.

Which totally challenged the old biological dogma that all catalysts had to be proteins.

RNA can have catalytic power too.

Because it's single-stranded, it can fold up into complex 3D shapes and act just like an enzyme.

But I have to ask, why do we have this seemingly inefficient system?

Why spend all the energy to copy a huge piece of DNA into RNA just to immediately chop half of it up and throw it away?

It seems incredibly wasteful.

Evolution loves options.

Having introns allows for a phenomenon called alternative RNA splicing.

Meaning you can edit the movie differently.

Exactly.

You can take the exact same pre-mRNA transcript and treat different segments as exons or introns, depending on the context.

Maybe in a brain cell, segment A is kept as an exon.

But in a muscle cell, that same segment A is treated as an intron and cut out.

So one single gene can give rise to two or more entirely different polypeptides, depending on how the splice system edits it.

Yes.
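
Alternative splicing can be sketched as one transcript edited two different ways. Everything specific here is invented for illustration: the segment names, the short sequences, and the `splice` helper. Real splice-site recognition is far more involved than picking names from a list.

```python
def splice(segments, keep):
    """Join the segments whose names are in `keep`, in transcript order."""
    return "".join(seq for name, seq in segments if name in keep)


# A made-up pre-mRNA, listed segment by segment from 5' to 3'.
pre_mrna = [
    ("exon1", "AUGGCU"),
    ("intron1", "GUAAGUUUCAG"),
    ("segmentA", "GAAGAG"),      # kept in one cell type, removed in another
    ("intron2", "GUGAGUCCAG"),
    ("exon3", "CCGUAA"),
]

# The "brain cell" edit keeps segment A as an exon.
brain_mrna = splice(pre_mrna, {"exon1", "segmentA", "exon3"})
# The "muscle cell" edit treats segment A as an intron and drops it.
muscle_mrna = splice(pre_mrna, {"exon1", "exon3"})

print(brain_mrna)
print(muscle_mrna)
```

The same starting transcript yields two different mature mRNAs, and so two different polypeptides, without any change to the gene itself.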

That completely explains why humans are so incredibly complex, despite having only about 20,000 genes.

Which is roughly the same number of genes as a tiny nematode worm.

Right.

A microscopic worm is much simpler than a human, but it has the same number of genes.

The difference is, we extensively remix our genes through alternative splicing.

We get a lot more bang for our genetic buck.

OK.

The mRNA is capped.

It's tailed.

It's spliced.

It's polished.

It leaves the nucleus through a pore.

It enters the cytoplasm.

Now we are ready to build.

Translation.

This is where the code becomes a physical reality.

But the ribosome doesn't speak nucleotide.

We need an interpreter.

And that interpreter is transfer RNA.

tRNA.

Figure 17.1c shows this beautifully.

Describe the tRNA for us.

So it is a single strand of RNA.

Only about 80 nucleotides long.

But it doesn't stay a straight line.

It folds back on itself because of hydrogen bonding between its own bases.

Drawn flat in 2D, it looks like a cloverleaf.

But in real 3D space, it twists and folds into an L-shape.

And it has two very important business ends.

One end, the 3 prime end, sticks out at the top of the L.

This end grabs a specific amino acid from the cytoplasm.

That is the cargo.

And the other end.

The other end has a loop at the bottom containing the anticodon.

The anticodon.

This is the matchmaker.

Exactly.

The anticodon is a nucleotide triplet that is perfectly complementary to a specific mRNA codon.

Yeah.

For instance, if the mRNA says GGC, which is the codon for glycine, the tRNA with the anticodon CCG will swoop in and bind to it.

And that specific tRNA will always be carrying glycine on its other end.

Always.

It ensures that the chemical reality of the amino acid perfectly matches the genetic code written in the mRNA.

Now, there is a nuance here that the chapter calls wobble.

Yes.

Remember when we did the math?

We have 61 codons that code for amino acids, but we do not have 61 different tRNAs. Cells only have about 45.

So some of the tRNAs have to double up.

They have to read more than one codon.

They do.

The base pairing rules between the third base of the mRNA codon and the corresponding base of the tRNA anticodon are relaxed.

That third position is the wobble position.

How does that work in practice?

Well, a tRNA with U in the wobble position of its anticodon can safely pair with either an A or G in the third position of the mRNA codon.

It's flexible.

It allows the cell to be highly efficient with its tRNA inventory.
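
The relaxed pairing rule can be sketched as a small matching function. It models only the single wobble case mentioned above (a U in the anticodon's wobble position accepting A or G) and no other relaxed pairings. The anticodon is written so that it lines up position by position with the codon, the same way GGC and CCG were lined up earlier. The function name `pairs` is an assumption made for this example.

```python
# Strict base pairing: codon base -> the anticodon base it normally pairs with.
STRICT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pairs(codon, anticodon):
    """True if the anticodon can read the codon.

    `anticodon` is written aligned with the codon, so index 2 is the
    wobble position in both strings.
    """
    # The first two positions must follow the strict rules.
    for c, a in zip(codon[:2], anticodon[:2]):
        if STRICT[c] != a:
            return False
    c3, a3 = codon[2], anticodon[2]
    if STRICT[c3] == a3:
        return True
    # Relaxed rule from the discussion: U in the anticodon also accepts G.
    return a3 == "U" and c3 == "G"


print(pairs("GGC", "CCG"))  # the glycine example: strict match
print(pairs("AAA", "UUU"))  # lysine codon, strict match
print(pairs("AAG", "UUU"))  # the other lysine codon, read through wobble
print(pairs("AAC", "UUU"))  # not a lysine codon: no match
```

One anticodon covering both AAA and AAG is how a cell can get by with fewer tRNAs than there are codons.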

Okay.

Let's talk about the factory floor where all this comes together.

The ribosome.

It is a massive complex.

It consists of two distinct subunits, a large one and a small one, and it is made entirely of ribosomal RNA, rRNA, and various proteins.

It is the structure that physically brings the mRNA and the tRNAs together.

It has three parking spots inside it for the tRNAs, the A-site, the P-site, and the E-site.

I think of it as like a tiny mechanical assembly line.

That's the best visual.

Let's go through them.

The P-site, which stands for peptidyl-tRNA binding site, is right in the middle.

It holds the tRNA that is currently carrying the growing polypeptide chain.

Okay.

The middle holds the growing chain.

The A-site, the aminoacyl-tRNA binding site, is the arrival gate.

It holds the newly arriving tRNA that is carrying the very next amino acid to be added to the chain.

A for arrival. And the E-site?

E is for exit.

This is where the discharged empty tRNAs leave the ribosome to go get recharged with another amino acid.

Let's run the machine.

Step one of translation is initiation.

First, the small ribosomal subunit binds to the five prime cap of the mRNA.

It also binds to a very special initiator tRNA, which always carries the amino acid methionine.

So it docks, and then what?

The small subunit scans downstream along the mRNA until it finds the start codon, AUG.

AUG, the universal start signal.

Once it hits AUG, the initiator tRNA binds to it.

Then the large ribosomal subunit clamps down firmly on top, sandwiching the mRNA.

This forms the complete translation initiation complex.

And this step requires energy, by the way, in the form of GTP.

So the machine is assembled.

Step two, elongation.

The actual cycle of building.

Imagine the ribosome now.

The P-site in the middle is occupied by that first tRNA carrying methionine.

The A-site is empty, waiting.

So a new tRNA comes into the A-site.

Right.

If its anticodon matches the next mRNA codon, it stays.

Now the ribosome performs a molecular magic trick.

It snips the entire growing polypeptide chain off of the tRNA sitting in the P-site.

And it forms a peptide bond, attaching that whole chain to the single new amino acid sitting on the tRNA in the A-site.

It passes the baton.

Exactly.

And the really wild part, that peptide bond formation is catalyzed by the RNA itself, not by a protein enzyme.

Another ribozyme in action.

So now the tRNA in the A-site is holding the whole chain, which is now one amino acid longer.

Yes.

Then comes translocation.

The ribosome ratchets forward down the mRNA by exactly one codon.

Three nucleotides.

Right.

The empty tRNA that was in the P-site gets shoved into the E-site.

And it exits.

The tRNA holding the chain gets moved from the A-site into the P-site.

And the A-site is now perfectly empty and aligned with the next codon, ready for the next player.

And this cycle just repeats.

Fast.

Very fast.

Adding amino acid after amino acid.

Until it hits step three, termination.

The ribosome eventually slides over a stop codon.

UAG, UAA, or UGA.

Does a special stop tRNA come in?

No, there is no tRNA for a stop codon.

Instead, a special protein called...

...a release factor binds directly to the A-site.

A release factor.

It is shaped almost exactly like a tRNA, so it fits perfectly into the A-site.

But it carries a water molecule instead of an amino acid.

Oh, a molecular Trojan horse.

Great way to put it.

It uses that water to hydrolyze the bond between the polypeptide and the final tRNA in the P-site.

The completed polypeptide chain is released into the cytoplasm to go fold up into its 3D shape.

And the entire ribosomal assembly breaks apart, ready to be reused.
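The three steps just described (find AUG, add one amino acid per codon, stop at UAG, UAA, or UGA) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function name `translate` is made up for illustration, amino acids are written as one-letter codes, and the sketch ignores the cap, the tRNAs, GTP, and the release factor entirely. It only tracks the reading of codons.

```python
from itertools import product

# Standard genetic code, listed in the usual U, C, A, G order.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Toy translation: scan to the first AUG, then read codon by codon."""
    start = mrna.find("AUG")          # initiation: locate the start codon
    if start == -1:
        return ""                     # no start codon, nothing is made
    peptide = []
    # elongation: step forward exactly three nucleotides at a time
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":         # termination: stop codon reached
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGGGCGUGUAAUU"))  # MGV
```

In the example call, the two leading bases are skipped, AUG gives methionine (M), GGC gives glycine (G), GUG gives valine (V), and UAA ends the chain.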

Now, normally, it isn't just a single ribosome working on an mRNA transcript, right?

That would be too slow.

Way too slow.

You usually have polyribosomes.

Figure 17.20 shows an electron micrograph of this.

You see the single mRNA strand, and there's a whole line of ribosomes attached to it, moving along it like beads on a string.

Mass production.

Absolutely.

As soon as the first ribosome moves far enough past the start codon, a second one hops on, then a third.

A single cell can make massive quantities of a needed polypeptide very quickly this way.

Briefly, how do these newly built proteins know where to go?

Some stay in the cytoplasm to work, but some need to go to the cell membrane or be secreted entirely out of the cell.

That routing is handled by the signal mechanism.

Proteins destined for the endomembrane system or secretion have a specific sequence of amino acids at the very beginning of their chain.

It's called a signal peptide.

Like a zip code.

Exactly.

As soon as that signal peptide emerges from the ribosome, a complex called the SRP, the signal recognition particle, sees it.

The SRP grabs the whole thing.

It grabs the whole ribosome complex and physically drags it over to the endoplasmic reticulum.

The ER.

The ribosome docks there and finishes translating the protein, effectively pushing the growing chain right through a pore into the ER, where it can be processed for shipping.

Okay, before we talk about what happens when errors occur, I want to touch on the scientific skills exercise in this chapter.

It talks about sequence logos.

This is a fascinating way of visualizing genetic data.

It is.

Suppose you are a researcher.

And you want to know exactly what DNA sequence the bacterial ribosome looks for to bind to the mRNA before the start codon.

You could just align 149 different genes from E. coli and list all the letters.

But looking at a wall of A's, C's, T's, and G's is impossible to read.

Right.

So a sequence logo makes it visual.

In a logo, you look at one position, say 10 bases before the start codon.

You stack the letters A, C, G, and T on top of each other.

The height of each letter tells you how frequent it is at that spot across all 149 genes.

So if the A is drawn really tall, it means almost every single gene has an A at that exact position.

Exactly.

And the overall height of the entire stack of letters tells you the predictive power or the conservation of that spot.

Meaning?

If the stack is very tall, it means the sequence is highly conserved.

Evolution has kept it strict because it's fundamentally important for binding.

If the stack is very short, it implies the base varies randomly between genes.

So it probably doesn't matter much for the function.

It is a brilliant way to see the actual signal hidden inside the noise of the DNA.
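For anyone who wants to see the arithmetic behind a logo, here is a minimal Python sketch. It assumes the common convention where the stack height at a position is 2 bits minus the Shannon entropy of the four bases there, and each letter's height is its frequency times that stack height. The small-sample correction used in published logos is left out, the function name `logo_heights` is hypothetical, and the four-sequence alignment is an invented toy, not the 149 E. coli genes from the exercise.

```python
from collections import Counter
from math import log2


def logo_heights(aligned_seqs):
    """Return, for each alignment column, the height in bits of each base."""
    heights = []
    for column in zip(*aligned_seqs):      # walk the alignment column by column
        counts = Counter(column)
        total = len(column)
        freqs = {base: counts[base] / total for base in "ACGT"}
        # Shannon entropy of the column (0 bits = fully conserved, 2 = random)
        entropy = -sum(f * log2(f) for f in freqs.values() if f > 0)
        stack_height = 2.0 - entropy       # conservation of this position
        heights.append({base: f * stack_height for base, f in freqs.items()})
    return heights


# Toy alignment: positions 1 to 3 are identical, position 4 is random.
toy = ["AGGA", "AGGC", "AGGG", "AGGT"]
for position, column in enumerate(logo_heights(toy), start=1):
    print(position, column)
```

The first position comes out as a full 2-bit A, the tall stack the hosts describe, while the last position collapses to zero height because every base is equally likely there.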

Okay, let's talk about when that signal gets scrambled.

Mutations.

The ultimate source of all genetic diversity in evolution, but also the source of a lot of human suffering.

We are focusing specifically on point mutations.

Changes in a single nucleotide pair.

Just one single letter in the billion-letter book.

Let's look at substitutions first, where one letter is swapped for another.

Sometimes you get lucky and it's a silent mutation.

Right.

Because of the redundancy we talked about.

Yes, you change GGC to GGU in the mRNA.

Both of those codons mean glycine, so no harm done.

The protein is exactly the same.

But sometimes it is a missense mutation.

You actually change the amino acid.

And the classic textbook example of this is sickle cell disease.

Walk us through what happens there.

A single thymine is changed to an adenine in the DNA template for the beta globin gene.

That causes the mRNA codon to change, which changes a glutamic acid into a valine in the beta globin.

That's the final hemoglobin protein.

Just one amino acid difference.

But glutamic acid is hydrophilic.

It loves water.

Valine is hydrophobic.

It hates water.

Because of that one single switch, the hemoglobin molecules end up sticking to each other when oxygen levels are low.

They form these long, rigid fibers that physically distort the red blood cell into a sickle shape.

One letter change.

And the sickled cells clog capillaries and carry less oxygen.

It just goes to show how fragile the structural design of these proteins can be.

Then there's the nonsense mutation.

This is usually worse.

This happens when a substitution changes a normal amino acid codon into a stop codon prematurely.

Oh, so the ribosome just ejects the protein early.

Yes.

The translation is cut short.

The resulting polypeptide is truncated and almost always completely non-functional.

It's like stopping a sentence right in the middle.

Exactly.
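The three kinds of substitution can be told apart mechanically by comparing what the codon meant before and after the change. The Python sketch below is an illustration under assumptions: `classify_substitution` is a made-up name, amino acids are one-letter codes, and the tyrosine codon in the nonsense example is an arbitrary pick, not one from the chapter. The glycine and glutamic acid examples are the ones discussed above.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(original_codon, mutant_codon):
    """Label a single-codon change as silent, missense, or nonsense."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "silent"      # same amino acid thanks to redundancy
    if after == "*":
        return "nonsense"    # amino acid codon became a premature stop
    return "missense"        # a different amino acid is inserted


print(classify_substitution("GGC", "GGU"))  # silent   (glycine stays glycine)
print(classify_substitution("GAG", "GUG"))  # missense (glutamic acid to valine)
print(classify_substitution("UAC", "UAA"))  # nonsense (tyrosine to stop)
```

The middle call is the sickle cell change at the mRNA level: GAG (glutamic acid) becomes GUG (valine).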

Those are substitutions.

But insertions and deletions seem like they're usually much worse because they cause the frameshift we discussed earlier.

Right.

Remember "the red dog ate the bug" becoming "her edd oga tet heb ug"?

If you insert one extra base or delete one base, you shift the entire reading frame.

Every single amino acid downstream of the error is going to be wrong.

It creates a completely gibberish protein that the cell usually has to destroy immediately.

Unless you insert or delete exactly three bases.

Ah, because the reading frame is three.

Right.

If you insert three, you just add one extra amino acid to the chain.

The rest of the sentence downstream still makes sense.

It might still mess up the protein's folding, but it isn't a guaranteed frameshift disaster.
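The sentence analogy is easy to replay in code. This short Python sketch, with the hypothetical helper name `read_in_triplets`, reads a string three letters at a time and shows what a one-letter deletion does compared with a three-letter deletion. It is a word game standing in for codons, not a model of a real gene.

```python
def read_in_triplets(letters):
    """Group letters into threes from the left, the way a ribosome reads codons."""
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))


message = "thereddogatethebug"

# Normal reading frame.
print(read_in_triplets(message))                    # the red dog ate the bug

# Delete one letter (the first "t"): every triplet downstream is shifted.
print(read_in_triplets(message[1:]))                # her edd oga tet heb ug

# Delete exactly three letters ("red"): one word is lost, the frame survives.
print(read_in_triplets(message[:3] + message[6:]))  # the dog ate the bug
```

Losing one letter scrambles everything after it, while losing three removes one "word" and leaves the rest readable.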

What causes these typos in the first place?

They can happen spontaneously during DNA replication.

But they're also caused by mutagens.

Physical and chemical agents.

Physical mutagens, like X-rays or UV light from the sun.

UV light is particularly nasty.

It can cause two adjacent thymine bases on a DNA strand to fuse together.

That creates a thymine dimer that physically buckles the DNA and confuses the DNA replication machinery.

And chemical mutagens.

Some are nucleotide analogs.

They look chemically similar to normal DNA bases, but they pair incorrectly, causing massive errors when the DNA is copied.

Others physically insert themselves into the double helix and distort it.

This naturally leads us to the final segment of the chapter.

Because we used to just watch mutations happen and document the diseases.

But now we can actually make them on purpose.

CRISPR-Cas9.

This is undoubtedly the biggest breakthrough in biotechnology in our lifetime.

And the crazy part is...

It wasn't invented from scratch in a lab.

It was found...

In bacteria.

Yes.

Bacteria get infected by viruses too.

Phages.

And they have an immune system to fight them.

They store tiny snippets of viral DNA in their own genome, like a rogue's gallery of past attackers.

This archive is the CRISPR region.

And the weapon they use is Cas9.

Cas9 is a nuclease.

A protein that cuts DNA.

But it is not just a blind axe swinging around.

It uses a guide RNA.

The bacterium transcribes a piece of that viral archive into RNA.

And...

Loads it into the Cas9 protein.

Cas9 then uses that guide RNA to hunt down the exact matching sequence in an invading virus and chop it up.

Guided molecular scissors.

Scientists looked at this and realized something incredible.

We can program the guide RNA.

We can put any sequence we want in there.

Any sequence.

We can tell Cas9 to go cut a very specific gene in a human cell.

Or a plant cell.

Or a mouse.

Figure 17.28 diagrams this mechanism beautifully.

The Cas9-guide RNA complex binds to the target.

And cuts both strands of the DNA.
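The "programmable" part is essentially a sequence search, which can be sketched in Python. This is a deliberately simplified picture: it only looks for an exact match to the guide on either DNA strand, and real Cas9 targeting has additional requirements that this sketch does not model. The names `reverse_complement` and `find_target_sites`, the short toy sequences, and the choice to report minus-strand positions along the reverse-complemented strand are all assumptions made for illustration.

```python
def reverse_complement(dna):
    """Return the opposite strand, read in its own 5' to 3' direction."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna))


def find_target_sites(dna, guide_rna):
    """List (strand, start) for every exact match to the guide sequence.

    The guide is RNA, so U is swapped for T before searching. Positions on
    the "-" strand are counted along the reverse-complemented sequence.
    """
    target = guide_rna.replace("U", "T")
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        start = seq.find(target)
        while start != -1:
            sites.append((strand, start))
            start = seq.find(target, start + 1)   # keep looking for more hits
    return sites


toy_dna = "TTGACCTGAAGCTTAGGC"                     # invented example sequence
print(find_target_sites(toy_dna, "CCUGAAG"))       # [('+', 4)]
print(find_target_sites(toy_dna, "GCCUAAGC"))      # [('-', 0)]
```

Changing the guide string is all it takes to point the search at a different site, which is the idea behind programming the guide RNA.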

So we target a gene.

And Cas9 cuts it.

Then what happens?

The cell panics.

Yeah.

Its DNA is broken.

It desperately tries to repair the cut.

This leads to two possible outcomes that scientists use.

Outcome one.

The cell repairs it sloppily.

It just jams the ends back together.

Usually adding or deleting a few random bases in the process.

This causes a frameshift.

And completely breaks the gene.

That is called a knockout.

Why would we want to break a gene?

It is incredibly useful for research.

If you want to know what a mystery gene does.

You break it.

And you see what happens to the organism.

Okay.

But outcome two is the holy grail of medicine.

Gene repair.

Yes.

If you inject the CRISPR system.

But you also provide a template piece of normal healthy DNA.

The cell can use that template to flawlessly fix the cut.

You can effectively rewrite the broken sequence.

You could theoretically go in and turn the mutated sickle cell hemoglobin gene.

Back into the normal hemoglobin gene.

That is immense power.

And the text makes sure to note Jennifer Doudna's warning.

She's one of the co-discoverers of CRISPR.

We are literally rewriting the source code of life now.

That carries a massive ethical weight.

It absolutely does.

We are crossing a major threshold in human history.

So bringing it all home to wrap up the deep dive.

We started with the question.

What is a gene?

And we've seen that definition evolve significantly over time.

From Mendel's idea of a discrete unit of inheritance.

To Morgan finding that there are specific loci on chromosomes.

To Beadle and Tatum's.

One gene, one enzyme.

And now the textbook leaves us with the modern functional definition to close the chapter.

A gene is a region of DNA that can be expressed to produce a final functional product.

Which is either a polypeptide or an RNA molecule.

And that covers everything.

It covers the proteins.

But it also covers the tRNAs, the rRNAs, the ribozymes inside the spliceosomes.

It captures the full, messy, beautiful complexity of it all.

It has been quite a ride.

From Archibald Garrod looking at black urine.

To glowing tobacco plants.

To editing our own genomes.

The underlying takeaway really is that universality.

Yeah.

The fact that the exact same genetic dictionary, where AUG means start and UAA means stop, works in the mold growing on an old piece of bread.

And in the neurons firing in your brain right now.

We are all running the exact same operating system.

Here's a thought to leave you with.

If our environment changes rapidly.

And our genes naturally take many generations to adapt through evolution.

Could our newfound ability to edit our own genome.

With CRISPR.

Become the primary driver of human evolution going forward.

Something to chew on.

Thank you for diving deep with us.

Hopefully the next time you look in the mirror.

You see all that microscopic machinery humming away underneath.

That never stops.

This has been the Last Minute Lecture team.

Thanks for listening.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

The conversion of genetic information into functional proteins unfolds through interconnected molecular processes that constitute the central dogma of molecular biology. Transcription initiates when RNA polymerase recognizes and binds to promoter regions, with transcription factors facilitating this recognition and subsequent synthesis of messenger RNA along the DNA template strand. The transcription process proceeds through initiation, elongation, and termination phases, generating a primary transcript that requires substantial modification in eukaryotic cells. Nascent eukaryotic RNA undergoes extensive processing including the addition of a 5 prime cap structure at one end, polyadenylation at the other to increase stability and translational efficiency, and intron splicing that removes non-coding sequences while joining exons together to produce mature mRNA. Prokaryotic systems bypass these modifications, allowing transcription and translation to occur simultaneously in the cytoplasm. The genetic code serves as the critical bridge between nucleotide sequence and amino acid composition, operating through a triplet codon system wherein each three-nucleotide sequence specifies either a particular amino acid or a termination signal. Translation occurs at ribosomes, where transfer RNA molecules bearing specific anticodons deliver their attached amino acids in response to mRNA codon sequences. The ribosome catalyzes the formation of peptide bonds through initiation complex formation, sequential elongation cycles that progressively extend the polypeptide chain, and termination upon recognition of stop codons. Mutations represent alterations in the DNA sequence that propagate through gene expression pathways with varying consequences depending on their nature and location. 
Point mutations affect individual nucleotides, potentially altering a single amino acid or producing a silent change, while frameshift mutations shift the reading frame downstream, causing dramatic alterations to all subsequent amino acids and frequently producing nonfunctional proteins. Understanding these mechanisms reveals how genetic information orchestrates the production and modification of proteins that execute cellular functions and shape organismal phenotypes.
