Chapter 6: Gene Expression: Translation and Protein Synthesis

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Welcome back to the Deep Dive.

You know, we spend so much time talking about DNA, about replication and transcription, all about, you know, copying the blueprints and then making the blueprints.

But there's this critical moment in every cell's life when all of that abstract information, all that code stored in a sequence has to become, well, real.

It has to become physical functional reality.

Right.

It's kind of like typing up a really important contract.

You can spend days drafting and editing, but if one single typo slips through, just one letter changed, it can completely derail the meaning of a crucial sentence.

That is a perfect analogy.

It really sets the stakes for what we're talking about today because in the cell, it's exactly the same.

You change one nucleotide, just one letter in that messenger RNA, and you can fundamentally alter the entire protein sentence the cell is trying to build.

Absolutely.

So today our mission is to do a definitive deep dive into translation.

This is that essential process where the base sequence that's encoded in messenger RNA gets converted into the specific linear sequence of amino acids in a polypeptide chain.

This is where rubber meets the road, biologically speaking.

It really is.

And for this deep dive, we're using the foundational molecular biology presented in the chapter on gene expression from our source stack.

And we're tailoring it for you, the listener, to really understand this complex flow of information from start to finish.

Because translation is arguably the single most important step where genetic information actually becomes functional.

It's where code becomes the molecules, the proteins, that do basically everything in the cell from structure to catalysis.

And if it fails?

Well, the consequences are immediate.

And they're severe.

Our sources really emphasize this, that understanding the precise sequence is non-negotiable for things like diagnosing genetic diseases.

Like the cystic fibrosis example they brought up.

Exactly.

The activity example highlights cystic fibrosis, which is the most common fatal genetic disease in the U.S.

And a mutation there, it often comes down to a single misread, a single incorrect codon that leads to a completely non-functional protein.

It's disastrous.

So we really need to map this whole pipeline from the building blocks all the way to the final destination of the finished product.

So this deep dive, we've structured it around answering the biggest questions.

First, what are proteins actually made of?

What's their chemistry and structure?

Then how was the genetic code itself figured out?

How was it deciphered?

And what are its universal rules?

And after that, what are the key players, the actual machinery?

We're talking about the structures and functions of the critical RNA components like tRNA and rRNA.

And then the process itself.

How is the polypeptide chain actually built?

So initiation, elongation, and termination, the play-by-play.

Finally, you have this incredibly complex molecule.

How does the cell solve the logistics problem?

How does it make sure that protein gets sorted and shipped to the right address, whether that's the nucleus or the mitochondria or even outside the cell?

It's a huge challenge, especially in eukaryotes.

Okay, let's unpack all of this.

And I think we have to start naturally with the end product.

We have to start with the protein itself because you can't really appreciate the complexity of the machinery until you understand the amazing complexity of the molecules it's tasked with building.

That's a great place to begin.

So proteins.

At their core, they're these high molecular weight workhorses.

They're massive, nitrogen-containing organic compounds.

And they have these incredibly intricate three-dimensional shapes.

They do.

And they're usually composed of one or more of these long, chain-like subunits that we call polypeptides.

So what are the individual links in that chain, the building blocks?

These are the amino acids.

What's fascinating is that almost all of them, with a few exceptions, share a common structure.

Okay.

Every single one has a central carbon atom.

We call it the alpha carbon.

And attached to that, you always find three things.

An amino group, which is NH2, a carboxyl group, COOH, and just a single hydrogen atom.

So that backbone is always the same.

Always the same.

Which means when they link up, they form a very predictable, repeating backbone structure.

But wait, if the backbone is always the same, how do we get 20 chemically distinct amino acids?

Where does the variety come from?

The variety comes entirely from the fourth thing attached to that alpha carbon.

It's called the R group, or the side chain.

And that's different for every single one.

Exactly.

We like to call the R group the amino acid's personality.

It can be tiny, like just another hydrogen atom in glycine, or it can be this large, bulky, complex ring structure.

And that personality determines its chemistry.

It determines everything.

The R group dictates whether the amino acid is acidic, or basic, or polar, and loves water, or hydrophobic, and hates water.

And those properties are what drive how the entire finished protein will eventually fold up.

Now, you mentioned an exception.

The source materials pointed out one amino acid that breaks this rule.

Yes, that would be proline.

It's really unique.

Its R group actually loops back and forms a covalent bond with its own amino group.

What does that do to the structure?

It creates this rigid, kinked ring structure.

So wherever you find a proline in a polypeptide chain, it often forces a very sharp, specific turn or bend in the chain.

It's essential for certain protein architectures.

So you have these 20 specific building blocks.

How does the ribosome actually link them together into a chain?

They're joined by what's called a peptide bond.

It's a very strong covalent bond that forms between the carboxyl group of one amino acid and the amino group of the next one in line.

And that reaction releases a water molecule.

It's a dehydration reaction, exactly.

And this bond creates that stable, linear backbone of the entire polypeptide.

And this linkage, this process of adding one to the next, it creates a kind of directionality to the chain, right?

A polarity.

Why does that direction matter?

It matters fundamentally because the synthesis, the building of the chain, always proceeds in one direction.

And that's defined by chemistry.

The end of the chain that has a free amino group is called the N-terminus.

And we define that as the start of the polypeptide because that's the end that's made first by the ribosome.

And the other end?

The other end is the C-terminus.

It has a free carboxyl group, and that's where the very last amino acid gets added.

This N-to-C directionality is, it's immutable.

And it's critical for things like protein sorting, which often relies on a sequence located right at that N-terminus.

So just knowing the linear sequence, what we call the primary structure, is really just the first step.

The real magic of biology is how that one-dimensional sequence dictates this incredibly complex three-dimensional form.

Yes, that's the heart of protein function.

The fold is everything.

Let's walk you through the four hierarchical levels of protein structure.

So we start with the level one.

Primary structure, this is simply the linear order of the amino acids in the polypeptide chain.

That's it.

And that sequence is determined directly by the base pair sequence of the gene that encodes it.

Directly and uniquely.

If you change a single amino acid here at the primary level, you're fundamentally altering the raw material for all the subsequent levels of folding.

Okay, so level two is the secondary structure.

Right.

These are the regular, predictable sort of local folding patterns that emerge almost immediately as the chain is being made.

And these are driven by bonds within the backbone itself, not the R groups.

Exactly.

It's all about weak bonds, mostly hydrogen bonds, forming between the NH and CO groups of the backbone amino acids that are near each other in the chain.

The R groups aren't really involved here.

And the two big examples of secondary structure are the alpha helix and the beta-pleated sheet.

Those are the canonical forms, yes.

The alpha helix is this right-handed spiral, like a coil.

It's stabilized by hydrogen bonds that form between an NH group and the CO group of an amino acid that's exactly four positions down the chain.

So it's like a spiral staircase with the steps being these hydrogen bonds.

That's a perfect way to think of it.

The beta-pleated sheet is different.

It's much flatter.

It involves segments of the polypeptide folding back and forth in a sort of zigzag pattern, like a folded fan.

And those parallel zigzags are linked together.

Yes.

They're linked by hydrogen bonds between their backbones, which creates this very strong, rigid, sheet-like structure.

And many large proteins are actually mosaics of these two structures.

So we'll have some rigid alpha-helical segments and some flat beta-pleated sheets, all connected by these little unstructured loop regions.

Precisely.

Which brings us to level three, the tertiary structure.

This is the big one, the final 3D shape.

This is the ultimate, specific three-dimensional shape, or conformation, of a single completed polypeptide chain.

And while the primary structure dictates it, the tertiary structure is actively driven by interactions between the R groups themselves.

This is where their chemical personalities really come into play.

And what kind of R group interactions are we talking about here?

Well, you have your standard weak forces.

So hydrogen bonds between polar R groups, ionic interactions, sometimes called salt bridges, between charged acidic and basic R groups.

But there are stronger bonds too, right?

Yes, you can get stronger covalent bonds called disulfide bridges or disulfide bonds.

These form between the R groups of two cysteine amino acids.

And on top of all that, the environment is critical.

You mean the fact that the cell is mostly water.

Exactly.

The cell is an aqueous environment.

So the protein folds to minimize the energetic friction.

The polar and charged R groups, they want to be on the outside interacting with water.

But the non-polar, hydrophobic R groups get tucked away tightly into the core of the protein, hiding from the water.

So it literally folds itself to hide its oily parts from the water.

That's a great way to put it.

And finally, level four is the quaternary structure.

And this level doesn't apply to all proteins.

No, this only exists in proteins that are made of multiple polypeptide chains, what we call multi-subunit proteins.

The quaternary structure is just the final assembled shape of how all those different chains interact and fit together to function as a single unit.

And the textbook example is always hemoglobin.

Right.

Hemoglobin is a perfect example of a hetero-multimeric protein.

It's made of four separate chains, two alpha chains and two beta chains.

And it's the precise way those four chains fit together that creates the final functional oxygen-carrying molecule.

Now, a really important point from the source material is that this whole folding process doesn't wait until the chain is finished.

It happens co-translationally.

Yes, that's a crucial point.

Folding begins during the synthesis process.

As the N-terminal end of the protein is emerging from the ribosome, it's already starting to fold into its secondary and tertiary structures.

Which means sometimes, with all that complexity, the folding process might need a little help to avoid mistakes.

And that's where chaperones come in.

Molecular chaperones are helper proteins.

They interact with the folding polypeptide just temporarily to guide it toward its correct functional shape and make sure it doesn't misfold or clump together with other proteins.

But they're not part of the final structure?

Never.

They're strictly aids.

You can think of them as highly skilled coaches, but they never actually play in the game.

That's a fantastic overview of the final product.

So now let's pivot to the instruction manual that dictates all of the structure.

The genetic code itself.

This was the central intellectual challenge facing molecular biologists back in the late 1950s.

The question was simple.

How do four chemical letters, the nucleotides A, C, G, and U, reliably specify 20 biological letters, the amino acids?

It's a math problem, first and foremost.

A one-letter code would only give you four possibilities.

Not enough.

Right.

A two-letter code, that's four squared, so 16 possibilities.

Still not enough for 20 amino acids.

But a three-letter code, four cubed, gives you 64 possible combinations.

And 64 is way more than 20.

So that mathematical reality immediately suggested that the code must be degenerate, which just means that multiple combinations of bases, multiple three-letter codons, must specify the same amino acid.
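The counting argument above can be checked in a few lines. This is only a sketch of the arithmetic; the names `BASES`, `AMINO_ACIDS`, and `codon_count` are made up for illustration.

```python
# Four RNA bases and twenty amino acids, as stated in the discussion.
BASES = "ACGU"
AMINO_ACIDS = 20


def codon_count(length: int) -> int:
    """Number of distinct 'words' of the given length over a four-base alphabet."""
    return len(BASES) ** length


for length in (1, 2, 3):
    combinations = codon_count(length)
    verdict = "enough" if combinations >= AMINO_ACIDS else "not enough"
    print(f"{length}-letter code: {combinations} combinations, {verdict} for {AMINO_ACIDS} amino acids")
```

Three is the smallest word length that covers all 20 amino acids, and the surplus is what leaves room for degeneracy.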

But proving it was a triplet code and not something more complicated required a really elegant experiment.

And that came from Francis Crick, Sydney Brenner, and their colleagues in 1961 using the T4 bacteriophage.

This is just a story of beautiful, beautiful logic.

They were using these rII mutants of the T4 phage.

Now the normal wild-type phage can infect and grow on two different strains of E. coli, B and K12.

But the mutant can't.

The rII mutant can only grow on the B strain.

It can't grow on K12.

So this gave them a perfect screening system.

They could look for revertants, mutants that regain the ability to grow on that K12 host.

And to create their mutants, they used a chemical called proflavin.

Yes, and proflavin is important because it specifically causes frameshift mutations.

It either adds a single base, which we'll call a plus mutation, or deletes a single base, a minus mutation.

And that completely scrambles the reading frame.

Totally.

Imagine the message is read in three-letter words.

The big red fox.

If you delete the B, the ribosome now reads: the igr edf ox.

It's complete nonsense from that point on.

So a single deletion, a minus mutation, resulted in a non-functional mutant protein.

Correct.

But they reasoned that if the code is read continuously, maybe a second, nearby mutation of the opposite type, so an addition or plus mutation, could fix it. It could restore the correct reading frame after that second mutation.

So the protein would only be wrong in that little scrambled section between the two mutations.

Exactly.

And if that scrambled part wasn't critical, the phage might revert to looking like the wild type and be able to grow on K12 again.

And they proved it.

They showed that one plus and one minus mutation together could indeed cause a reversion.

But the real clincher, the part that proved it was a triplet, came when they started combining mutations of the same sign.

This is the brilliant part.

They found that a single plus or a single minus was mutant and, well, so were two pluses or two minuses together.

Still mutant, because shifting the frame by two is still completely out of sync with a frame of three.

But when they created triple mutants of the same sign, so three pluses or three minuses, they often got functional revertants.

Because adding three bases or deleting three bases is like adding or deleting one whole three-letter word.

It gets the rest of the message back in the right frame.

Precisely.

The simplest, most powerful explanation was that the genetic code is read in groups of three nucleotides.

It is a triplet code.

That experiment just elegantly proved the fundamental architecture of the system.
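To make the plus and minus logic concrete, here is a small sketch that reads an English sentence in three-letter words. The sentence and the helper names (`read_in_triplets`, `delete_letters`, `insert_letter`) are invented for illustration; they stand in for the phage gene and the proflavin-induced mutations, not for any real sequence.

```python
def read_in_triplets(message):
    """Read the message three letters at a time, dropping an incomplete final word."""
    usable = len(message) - len(message) % 3
    return [message[i:i + 3] for i in range(0, usable, 3)]


def delete_letters(message, positions):
    """Minus mutations: remove the letters at the given indices of the original message."""
    drop = set(positions)
    return "".join(ch for i, ch in enumerate(message) if i not in drop)


def insert_letter(message, position, letter):
    """Plus mutation: insert one letter before the given index."""
    return message[:position] + letter + message[position:]


# Hypothetical "gene": THE BIG RED FOX ATE THE FAT CAT, written without spaces.
WILD_TYPE = "THEBIGREDFOXATETHEFATCAT"

one_minus = delete_letters(WILD_TYPE, [3])           # frame lost from here on
minus_then_plus = insert_letter(one_minus, 8, "A")   # opposite sign restores the frame
two_minus = delete_letters(WILD_TYPE, [3, 6])        # shifted by two, still out of frame
three_minus = delete_letters(WILD_TYPE, [3, 6, 9])   # shifted by three, frame restored

for label, message in [
    ("wild type", WILD_TYPE),
    ("one minus", one_minus),
    ("minus + plus", minus_then_plus),
    ("two minus", two_minus),
    ("three minus", three_minus),
]:
    print(f"{label:>12}: {' '.join(read_in_triplets(message))}")
```

In the minus-plus and triple-minus cases only the short stretch between the mutations is scrambled, and everything downstream reads as the original words again, which is the pattern the revertants showed.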

So once the geneticists proved it was a triplet, it was a race for the biochemists to actually crack the code.

This was the work of Marshall Nirenberg, Har Gobind Khorana, and others in the mid-1960s.

Right.

They needed ways to synthesize specific RNA sequences and then see what proteins they would make in a cell-free system.

So they started with the simplest method, homopolymer synthesis.

Meaning they made an mRNA out of just one repeating base.

Exactly.

For example, they made poly-U, just a long string of uracils.

And when they put that in their system, it directed the synthesis of a polypeptide made entirely of the amino acid phenylalanine.

So UUU must code for phenylalanine.

Yep.

And from that, they figured out that AAA codes for lysine and CCC codes for proline.

A great start, but still a lot of codons to go.

So next, they moved to method two, random copolymers.

Here, they made mRNAs with a known ratio of two different bases.

Say, 75% A and 25% C.

By looking at how often each amino acid was incorporated into the final protein, they could use probability to figure out the base composition of the codons.

But not the sequence.

Not the sequence.

They might know a codon was made of two As and one C, but they didn't know if it was AAC, ACA, or CAA.
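The probability reasoning behind the random copolymer method can be sketched as follows, assuming each position in a codon is filled independently according to the base ratio in the mix. The function names are hypothetical; the 75% A and 25% C ratio is the one from the discussion.

```python
from collections import defaultdict
from itertools import product


def codon_frequencies(base_fractions):
    """Expected frequency of every triplet when bases are incorporated at random."""
    frequencies = {}
    for triplet in product(base_fractions, repeat=3):
        probability = 1.0
        for base in triplet:
            probability *= base_fractions[base]
        frequencies["".join(triplet)] = probability
    return frequencies


def group_by_composition(frequencies):
    """Pool codons that share a base composition, e.g. AAC, ACA, and CAA."""
    groups = defaultdict(float)
    for codon, probability in frequencies.items():
        groups["".join(sorted(codon))] += probability
    return dict(groups)


frequencies = codon_frequencies({"A": 0.75, "C": 0.25})
for composition, total in sorted(group_by_composition(frequencies).items()):
    print(f"{composition}: {total:.4f}")
```

Codons with the same composition are equally likely whatever their order, so matching amino acid incorporation rates to these expected frequencies gives the composition of a codon but cannot separate AAC from ACA or CAA.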

So still some ambiguity.

To solve some of that, Khorana developed method three, known-sequence copolymers.

This was a brilliant piece of chemistry.

He could synthesize precise, repeating sequences.

For instance, an mRNA that went 5'-UCUCUC-3'.

And what did that make?

It made a polypeptide of alternating leucine and serine.

So that proved that UCU and CUC encoded those two amino acids, but they still couldn't tell which was which.
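A quick sketch shows why a repeating UC copolymer narrows things down to exactly two codons and yet leaves the assignment open. The helper `codons_in_frame` is a hypothetical name, and the 18-nucleotide message is just a short stand-in for a long synthetic RNA.

```python
def codons_in_frame(rna, frame):
    """Split an RNA string into complete codons, starting at offset `frame` (0, 1, or 2)."""
    usable_end = frame + (len(rna) - frame) // 3 * 3
    return [rna[i:i + 3] for i in range(frame, usable_end, 3)]


poly_uc = "UC" * 9  # 5'-UCUCUC...-3', 18 nucleotides

for frame in range(3):
    print(f"frame {frame}: {' '.join(codons_in_frame(poly_uc, frame))}")
```

Whatever frame translation starts in, the message is read as UCU and CUC in strict alternation, so the product must be an alternating polypeptide of two amino acids. Nothing in the experiment says which codon belongs to which amino acid.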

So the final definitive step that really blew the whole thing open was method four, the ribosome binding assay developed by Nirenberg and Philip Leder.

This was the ultimate solution.

Instead of long RNA chains, they used tiny, synthetic trinucleotides, just three bases long, an unambiguous single codon.

And what did they do with them?

They mixed one of these trinucleotides with ribosomes and a whole collection of specifically charged tRNAs, where only one type of amino acid was radioactively labeled at a time.

And the key discovery was that only the tRNA that was complementary to that three base codon would actually stick to the ribosome.

So if they used the trinucleotide CUC and only the leucine-carrying tRNA stuck to the ribosome, then CUC must be the codon for leucine.

They had definitively cracked it.

This method allowed them to assign the specific sequence for about 50 codons, wiping out all the previous ambiguity and leading to the complete map of the genetic code.

And that map revealed seven essential characteristics of the code that govern life's instructions.

Let's run through them.

First, we've established it's a triplet code.

Three nucleotides make one codon.

Second, it's continuous or comma-free.

The ribosome just reads it sequentially, three at a time, without skipping any bases.

Third, it's non-overlapping.

Once the reading frame is set, the ribosome reads a group of three, then the next group of three, and so on.

Bases aren't shared between codons.

Fourth, and this is a profound one, the code is almost universal.

The same 61 codons specify the same 20 amino acids in nearly every organism on Earth, from bacteria to humans.

It's one of the strongest pieces of evidence for a common evolutionary ancestry.

And it's also incredibly useful for biotechnology.

Hugely.

It means we can take a human gene, put it into a bacterium, and that bacterium's ribosome will read the code correctly and make the human protein.

That's how we produce massive amounts of things like insulin.

Now, you said almost universal.

The source did highlight a few minor exceptions.

Yes, there are slight tweaks.

They're mostly found in the genomes of organelles like mitochondria and chloroplasts.

For example, in our own mitochondria, a codon that's normally a stop signal is reassigned to code for an amino acid.

But compared to the billions of years of conservation, these are really minor variations.

Okay, fifth property.

The code is degenerate or redundant.

Which it has to be, given the math.

We have 64 codons for only 20 amino acids.

So, most amino acids are specified by more than one codon, anywhere from 2 to 6.

The only exceptions are methionine, AUG, and tryptophan, UGG.

They each have only one codon.

And there's a pattern there, right?

There is.

Often, if the first two bases of a codon are the same, the identity of the third base doesn't matter as much.

It can be U, C, A, or G, and still code for the same amino acid.
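The degeneracy numbers can be tallied directly from the standard codon table. The sketch below builds that table from a compact 64-letter string (one-letter amino acid codes, `*` for stop, bases ordered U, C, A, G), which is a common convention and not something taken from the chapter; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (or stop).
degeneracy = Counter(CODON_TABLE.values())

# First-two-base "boxes" where the third base makes no difference at all.
fourfold_boxes = [
    first + second
    for first, second in product(BASES, repeat=2)
    if len({CODON_TABLE[first + second + third] for third in BASES}) == 1
]

for amino_acid, count in sorted(degeneracy.items(), key=lambda item: item[1]):
    print(amino_acid, count)
print("boxes where the third base is irrelevant:", fourfold_boxes)
```

Methionine (M) and tryptophan (W) come out with a single codon each, no amino acid has more than six, and in eight of the sixteen first-two-base combinations any third base gives the same amino acid.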

Okay, sixth property.

There are dedicated start and stop signals.

Yes, AUG is the almost universal start codon, and specifies methionine.

And you have three essential stop signals.

UAG, UAA, and UGA.

These are called nonsense, or chain-terminating, codons because they don't specify any amino acid at all.

They just tell the ribosome, stop here.

And that entire continuous stretch of code, from the start AUG to the first stop codon, has a name.

That's called the open reading frame, or ORF.
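As a rough sketch of what "from the start AUG to the first stop codon" means in practice, the code below finds the first AUG in a made-up mRNA and reads in-frame triplets until it hits UAG, UAA, or UGA. The sequence, the function names, and the tiny lookup table are illustrative; the table holds only the codon assignments already mentioned in this discussion.

```python
STOP_CODONS = {"UAG", "UAA", "UGA"}

# Only the assignments named so far; a full table would have 61 sense codons.
KNOWN_CODONS = {"AUG": "Met", "UUU": "Phe", "AAA": "Lys", "CCC": "Pro"}


def find_orf(mrna):
    """Return the codons from the first AUG through the first in-frame stop, or None."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            return codons
    return None  # ran off the end without meeting a stop codon


def translate(orf_codons):
    """Translate the sense codons of an ORF, leaving out the terminating codon."""
    return [KNOWN_CODONS[codon] for codon in orf_codons[:-1]]


mrna = "GGAUGUUUAAACCCUGAGG"  # hypothetical message with a short leader and trailer
orf = find_orf(mrna)
print(orf)
print(translate(orf))
```

The two leader bases and the trailing bases after UGA fall outside the ORF, and the stop codon itself contributes no amino acid.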

And finally, number seven, wobble.

I think this is one of the most mechanically brilliant and efficient parts of the whole system.

The wobble hypothesis, which was also proposed by Francis Crick, explains how the cell deals with that degeneracy we just talked about.

It says that the base at the five prime end of the tRNA's anticodon, that's the one that pairs with the third position of the mRNA's codon, is less structurally constrained.

It can wobble.

Meaning it can form non-standard base pairs.

Exactly.

So instead of needing 61 different tRNAs for the 61 sense codons, a cell can get by with far fewer, because a single tRNA can recognize multiple synonymous codons that differ only in that third position.

For example?

For example, if the wobble base on the tRNA is a G, it can pair with either a U or a C in the codon. Or, even better, a modified base called inosine, which is often found in that wobble position, can pair with A, U, or C.

It's incredibly efficient.

Nature using structural flexibility to achieve chemical efficiency.

It's just, it's amazing.

It is.
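Here is a minimal sketch of how wobble lets one anticodon cover several codons. The G and inosine rules are the ones just described; the entries for U, C, and A in the wobble position come from the standard statement of the wobble rules and are an addition here. Anticodons are written 5' to 3', and the example anticodons are chosen only for illustration.

```python
# Strict pairing used at codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases that each anticodon 5' (wobble) base can pair with.
# G and I are from the discussion above; U, C, and A are assumed from the standard rules.
WOBBLE = {"G": "UC", "I": "AUC", "U": "AG", "C": "G", "A": "U"}


def codons_recognized(anticodon):
    """List the codons (5'->3') that a tRNA anticodon (5'->3') can read."""
    wobble_base, middle, last = anticodon
    # Pairing is antiparallel: the anticodon's 3' base pairs with the codon's first base.
    first = WATSON_CRICK[last]
    second = WATSON_CRICK[middle]
    return sorted(first + second + third for third in WOBBLE[wobble_base])


print(codons_recognized("GAA"))  # wobble base G
print(codons_recognized("IGC"))  # wobble base inosine
```

With G in the wobble position one tRNA reads two synonymous codons, and with inosine it reads three, which is why fewer than 61 tRNAs are enough.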

So now that we know the code, we need to meet the molecular machines that execute it.

So section three, the critical RNA components, tRNA and the ribosomes.

Let's start with the transfer RNA, the tRNA.

This is the physical translator, the adapter molecule.

Its job is to bring a specific amino acid to the ribosome that matches the codon being read.

The whole accuracy of translation really hinges on two points of fidelity here.

First, that the correct amino acid gets attached to the correct tRNA in the first place.

Second, that the tRNA's anticodon correctly binds to the mRNA's codon inside the ribosome.

Structurally, tRNAs are pretty small, right?

Only 75 to 90 nucleotides long.

They are.

And because of internal base pairing, they fold up into this classic cloverleaf model.

When you draw it in 2D, it creates these four main loops and stems.

And the most critical part is loop two, because that's where you find the three-nucleotide anticodon sequence.

But in 3D, it's not a cloverleaf.

No, the functional tertiary structure is a compact L shape, which is essential for it to fit into the ribosome's binding sites.

And one other critical feature.

All functional tRNAs end with the exact same sequence at their three prime end.

5'-CCA-3'.

That's the attachment point for the amino acid.

Now, let's talk about the definitive proof that it's the tRNA itself and not the amino acid it's carrying that determines the specificity.

This is one of those legendary trick the cell experiments.

Oh, this is the landmark work from von Ehrenstein, Weisblum, and Benzer.

It's just brilliant.

They took the tRNA for cysteine.

Let's call it tRNA-Cys.

They charged it correctly with cysteine.

But then they used a chemical to alter the cysteine that was already attached, turning it into alanine.

So they created this hybrid molecule, an alanine amino acid attached to a cysteine-recognizing tRNA.

Exactly.

They created Ala-tRNA-Cys.

Then they put this manipulated tRNA into a cell-free system that was synthesizing hemoglobin.

And what happened?

They observed that alanine was inserted into the hemoglobin protein at every single position where the mRNA code called for cysteine.

Wow.

The conclusion was just unassailable.

The amino acid itself was irrelevant to the ribosome.

Codon recognition specificity lies entirely in the tRNA anticodon.

Which means that the process of attaching the correct amino acid to the tRNA in the first place, this aminoacylation or charging, it must be surgically precise.

Because if that enzyme makes a mistake, the ribosome can't catch it.

It has to be perfect.

And this process is catalyzed by a family of 20 highly specific enzymes called the aminoacyl-tRNA synthetases.

There's one unique synthetase for each of the 20 amino acids.

And it's an energy-intensive process, right?

It requires ATP.

It does.

First, the specific amino acid and ATP bind to the synthetase enzyme.

The amino acid gets activated when ATP is hydrolyzed, forming a high-energy intermediate.

Then the correct uncharged tRNA binds to that same enzyme.

And the enzyme transfers the amino acid over.

The enzyme transfers the amino acid from the intermediate to that CCA tail at the three prime end of the tRNA, forming a covalent bond.

And that bond is very high-energy, which is what will later provide the power to form the peptide bond.

So now we have our charged tRNAs.

What about the physical factory where all this happens?

The ribosomes.

Ribosomes are these enormous ribonucleoprotein complexes.

They're made of two subunits, a large one and a small one, that only come together when translation is actively happening.

And they're different sizes in bacteria and eukaryotes.

They are.

In bacteria, you have the 70S ribosome, which is made of a 50S large subunit and a 30S small subunit.

In eukaryotes, like us, we have the larger 80S ribosome made of a 60S large and a 40S small subunit.

And that S value, the Svedberg unit, it refers to how fast they sediment in a centrifuge, which is why the subunit numbers don't add up arithmetically.

Right.

It's a measure of size and shape, not just mass.

When they combine, their shape changes, so they sediment differently.

And these subunits are made of both protein and ribosomal RNA or rRNA.

Yes.

In bacteria, the small subunit has 16S rRNA and the large has 23S and 5S rRNA.

In eukaryotes, it's 18S in the small subunit and 28S, 5.8S, and 5S in the large one.

And as we'll get to, that RNA isn't just a structural scaffold, it's the catalyst.

The ribosome organizes this whole process using three critical binding sites for the tRNAs.

Three pockets, right at the interface of the two subunits.

You have the A, or aminoacyl, site, which is the entry point for the next incoming charged tRNA.

And the P site.

The P, or peptidyl, site, which holds the tRNA that's currently attached to the growing polypeptide chain.

And finally, the E, or exit, site, which is where the now uncharged tRNA binds for a moment before it gets ejected from the ribosome.

So A for arrival, P for polypeptide, E for exit.

That's a great way to remember it.

Okay.

We have the components, the mRNA template, the charged tRNA adapters, and the ribosomal factory.

We are ready for the assembly line.

Let's dig into the mechanism of translation, starting with initiation.

Initiation is all about finding the correct start codon, that AUG, and getting the very first amino acid, the initiator tRNA, loaded into the P site to kick things off.

And this is where prokaryotes and eukaryotes really differ.

Let's start with bacteria.

In bacteria, you need the 30S small subunit, the mRNA, a special initiator tRNA, called tRNA-fMet, and three protein initiation factors, IF1, 2, and 3.

And it's all powered by GTP.

How do they find the right start site on the mRNA?

They use a direct recognition system.

There's a specific sequence in the mRNA leader, just upstream of the AUG, called the Shine -Dalgarno sequence.

It's like an address label.

It's exactly like an address label.

It's a purine-rich sequence, usually something like 5'-AGGAGG-3'.

And the reason that's the label is because it's complementary to a sequence at the 3' end of the 16S rRNA in the small ribosomal subunit.

So the ribosome's own RNA physically base pairs with the messenger RNA.

It literally sticks to it, which positions the 30S subunit perfectly so that the AUG start codon is sitting right in the future P site.

It's an incredibly elegant alignment mechanism.

And the initiator tRNA in bacteria is also special.

It is.

The methionine it carries is chemically modified with a formyl group.

So we call it formylmethionine, or fMet.

And this special initiator tRNA is the only one that can go directly into the P site to start things off.

Why is that?

Because IF1 physically blocks the A site.

So it's forced to go into the P site, chaperoned there by IF2.

This ensures the growing chain starts in the right place for the next cycle.

So to assemble the whole complex, the 30S subunit binds the mRNA, then the initiator tRNA comes in to form the 30S initiation complex.

Right.

And then the 50S large subunit binds, GTP is hydrolyzed, the initiation factors all leave, and you're left with the complete functional 70S initiation complex ready for the next step.

Now let's contrast that with eukaryotic initiation.

No Shine-Dalgarno sequence, and the initiator methionine isn't modified.

Eukaryotes use a totally different strategy called the scanning model.

Instead of an internal address label, the ribosome starts at the very beginning of the mRNA at the 5' end.

Which relies on that 5' cap structure.

It absolutely does.

A complex of eukaryotic initiation factors, especially one called eIF4F, recognizes and binds to that 5' cap.

Then the 40S small subunit, which is already carrying the initiator Met-tRNA, loads onto that 5' end.

And then it just starts moving down the line.

It starts scanning.

It moves along the mRNA, reading the sequence, looking for the first AUG codon it encounters.

But it's not just any AUG, is it?

No, it has to be in the right context.

Yeah.

The AUG needs to be embedded in a specific sequence called the Kozak sequence.

That context helps the 40S subunit know that, yes, this is the true start codon.

And what about the poly(A) tail, way down at the 3' end?

How does that get involved?

This is a really cool part of the efficiency.

A protein bound to the poly(A) tail physically interacts with one of the initiation factors at the 5' cap.

And this interaction loops the mRNA into a circle.

Why would it do that?

This looping dramatically stimulates initiation.

It means that once a ribosome finishes translating and falls off the 3' end, it's right next to the 5' end, ready to hop back on and start another round immediately.

It's a recycling mechanism.

That is incredibly efficient.

So the engine is started.

Now we move to elongation, the step-by-step addition of amino acids.

Okay, so the initiator tRNA is sitting in the P site.

That means the next codon in the sequence is exposed in the vacant A site.

So step one is bringing in the next tRNA.

Step one is aminoacyl tRNA binding.

The correct charged tRNA for that next codon is brought to the A site by an elongation factor called EF2, which is bound to GTP.

And if it's the right match?

If the codon and anticodon match correctly, EF-Tu triggers the hydrolysis of its GTP.

That provides a burst of energy and it also acts as a final proofreading step.

Then EF-Tu gets released and recycled.

Okay, so now the P site and the A site are both occupied.

Step two must be forming the bond.

Step two is peptide bond formation.

This is the moment the chain actually grows.

The bond between the amino acid and its tRNA in the P site is broken.

At the same time, a new peptide bond is formed between that now free amino acid and the amino acid on the tRNA in the A site.

So the whole growing polypeptide chain gets transferred over to the tRNA in the A site.

Precisely.

And this reaction is catalyzed by peptidyl transferase.

But here's the profound discovery.

I know where you're going with this.

For decades, everyone assumed this enzyme was a protein.

But in 1992, Harry Noller and his colleagues proved that the catalytic activity is located entirely within the 23S rRNA of the large subunit.

So the ribosome is a ribozyme.

The RNA itself is the enzyme.

The proteins are just structural scaffolding.

One of the deepest revelations in all of molecular biology.

So the result of this step is an uncharged tRNA in the P site and a longer peptidyl tRNA in the A site.

Which leads to step three, translocation.

The ribosome has to move.

Yes.

The ribosome physically ratchets over exactly one codon, three nucleotides, toward the 3' end of the mRNA.

This movement requires another factor, EF-G, and more GTP.

And as it moves, everything shifts over one spot.

Everything shifts.

The tRNA with the growing chain moves from the A site into the P site.

The uncharged tRNA moves from the P site into the E site.

And from the E site, it gets ejected.

And then the A site is vacant again, ready for the next cycle to begin.
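If it helps to see those three steps as bookkeeping, here is a rough sketch in Python that tracks what sits in the A, P, and E sites during each cycle. Everything named here is hypothetical and chosen for illustration: the function, the lookup table standing in for codon-anticodon matching, and the labeling of each tRNA by the codon it reads. The factors and GTP are left out entirely.

```python
def elongation_cycle(codons, decode):
    """Trace one elongation cycle per codon and return the final chain
    plus the list of uncharged tRNAs ejected from the E site.

    codons: codons that follow the start codon, in 5' to 3' order.
    decode: dict mapping codon -> amino acid (a stand-in for
            codon-anticodon matching; hypothetical, partial table).
    """
    # After initiation, the initiator tRNA carrying Met sits in the P site.
    p_site = {"trna": "tRNA(AUG)", "chain": ["Met"]}
    ejected = []
    for codon in codons:
        # Step 1: aminoacyl-tRNA binding. The charged tRNA enters the A site.
        a_site = {"trna": f"tRNA({codon})", "chain": [decode[codon]]}
        # Step 2: peptide bond formation. The whole chain is transferred
        # onto the A-site tRNA, leaving the P-site tRNA uncharged.
        a_site["chain"] = p_site["chain"] + a_site["chain"]
        p_site["chain"] = []
        # Step 3: translocation. P -> E, A -> P, and the A site is vacant again.
        e_site = p_site
        p_site = a_site
        ejected.append(e_site["trna"])  # the E-site tRNA leaves the ribosome
    return p_site["chain"], ejected


# Hypothetical two-codon example.
chain, ejected = elongation_cycle(["GCU", "UGG"], {"GCU": "Ala", "UGG": "Trp"})
```

After two cycles the chain held in the P site is Met, Ala, Trp, and two uncharged tRNAs have left through the E site.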

And this happens over and over.

And we should also mention polysomes.

Since this process is fast, you can have many ribosomes all translating the same mRNA at the same time, like beads on a string, which dramatically increases the rate of protein production.

Okay, that cycle continues until the very end, the signal to stop.

Termination.

Termination happens when one of the three stop codons, UAG, UAA or UGA, slides into the A site.

And the crucial detail here is that there is no tRNA with an anticodon that recognizes a stop codon.

So nothing binds.

Well, not a tRNA.

Instead, proteins called release factors, or RFs, bind in the A site.

They are shaped in a way that molecularly mimics a tRNA, so they can fit into that site and recognize the stop codon.

And what does the release factor do when it binds?

Its binding triggers the peptidyl transferase activity of the ribosome one last time.

But instead of forming a new peptide bond, it catalyzes the hydrolysis of the bond linking the finished polypeptide chain to the tRNA in the P site.

And the protein is released.

The completed polypeptide is set free.
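Putting elongation and termination together, a minimal sketch of "read codons until a stop codon arrives" might look like the following. The table is the standard genetic code built from the usual U, C, A, G ordering, with `*` marking the three stop codons; the function name and the short example mRNA are assumptions made for illustration, and release factors are represented only by the early return.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, start=0):
    """Translate from `start` in steps of three nucleotides.

    Returns the peptide (one-letter codes) when an in-frame stop codon
    is reached, or None if the message ends without one.
    """
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            # No tRNA reads a stop codon; this is where a release factor
            # would bind and the finished chain would be set free.
            return "".join(peptide)
        peptide.append(amino_acid)
    return None


# Hypothetical message: AUG GCU UGG UAA
peptide = translate("AUGGCUUGGUAA")
```

Here the loop stops at UAA and hands back the three-residue peptide, while a message with no in-frame stop codon returns None.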

And then the whole complex, the ribosome, the mRNA, the last tRNA, needs to be disassembled and recycled for the next round.

In bacteria, this involves another fascinating factor called the ribosome recycling factor, or RRF.

Okay, so the protein is made, it's folded, it's been released.

But for eukaryotes especially, the job isn't quite done.

These proteins have to be sorted and delivered to their correct locations.

This is the problem of protein sorting.

And the fundamental framework for understanding this is the signal hypothesis, which was proposed by Günter Blobel.

It says that proteins that are destined for the ER and maybe for secretion out of the cell are synthesized with an extra stretch of amino acids at their N-terminal end.

This is the signal sequence.

And since the N-terminus is made first, the signal sequence is the first part of the protein to emerge from the ribosome.

What recognizes it?

As that signal sequence emerges into the cytoplasm, it's immediately recognized by something called the signal recognition particle, or SRP.

This is a complex made of both RNA and protein.

And when the SRP binds to that signal sequence, it does something amazing.

It pauses translation.

It just stops the ribosome in its tracks.

Stops it dead.

The whole SRP-ribosome complex then moves to the surface of the endoplasmic reticulum membrane, where it docks with an SRP receptor.

So it brings the stalled ribosome to its destination.

Exactly.

Once it docks, the SRP is released and translation resumes.

But now the growing polypeptide chain is being threaded directly through a channel in the ER membrane right into the ER lumen as it's being made.

And once it's inside, that signal sequence that acted as the address label needs to be removed.

Correct.

An enzyme called signal peptidase cleaves it off.

The protein is now free inside the ER where it can be further modified and then sent on its way through the secretory pathway to the Golgi apparatus for final sorting and packaging.

It's an incredibly choreographed system.

Now before we wrap up, we have to revisit this concept from Box 6.1 in our source material because it ties everything together.

The sequence, the code, the tRNAs, and the timing of folding.

I'm talking about how silent mutations can still affect a protein's function.

This is such a fascinating modern genetic concept.

A silent mutation is a change in the DNA that alters a codon to a synonymous one.

So the primary amino acid sequence of the protein is completely unchanged.

So logically, the final protein should be identical?

Logically, yes.

But often it's not.

The final three-dimensional structure is different, and the question is how can that possibly be?

And the answer comes down to speed?

It's all about speed and co-translational folding.

We know the code is degenerate, but the different tRNAs that read those synonymous codons are not present in the cell in equal amounts.

Some codons are common and are read by abundant tRNAs, and some are rare, read by much less abundant tRNAs.

So if a silent mutation happens to swap a common fast codon for a rare slow one...

The ribosome stalls at that point.

It has to wait longer for that rare tRNA to show up.

And because the protein is folding as it emerges, that stall, that pause, gives the part of the chain that's already out extra time to explore different folding conformations before the next part of the chain comes along.

So a change in timing leads to a different folding pathway, which results in a different final tertiary structure.

Even though the primary sequence is identical.

The source material highlights the human MDR1 gene.

Silent mutations in this gene that only swapped fast codons for slow ones resulted in a protein with an altered 3D structure and different drug binding properties.

The cell got all the right ingredients, but they were added at the wrong pace and it changed the final dish.

That's a perfect analogy.

It shows that genetic information isn't just in the sequence itself, it's also in the kinetics of how that sequence is read.
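To make that concrete, here is a small sketch comparing two synonymous messages. The dwell times are invented numbers, not measurements, and the names `DWELL`, `AMINO`, and `profile` are hypothetical; the only biology assumed is that CUG and CUA both encode leucine. The idea is simply that the peptide comes out the same while the timing profile does not.

```python
# Hypothetical relative dwell times per codon (arbitrary units).
# A rare codon read by a scarce tRNA is given a longer wait.
DWELL = {"AUG": 1.0, "CUG": 1.0, "CUA": 8.0, "GGC": 1.0}

# Amino acids for the codons used here (one-letter codes).
AMINO = {"AUG": "M", "CUG": "L", "CUA": "L", "GGC": "G"}


def profile(codons):
    """Return the peptide and the per-codon dwell times for a message."""
    peptide = "".join(AMINO[c] for c in codons)
    dwell_times = [DWELL[c] for c in codons]
    return peptide, dwell_times


wild_type = ["AUG", "CUG", "GGC", "CUG"]
silent_mutant = ["AUG", "CUA", "GGC", "CUG"]  # CUG -> CUA, still leucine

wt_peptide, wt_dwell = profile(wild_type)
mut_peptide, mut_dwell = profile(silent_mutant)
same_sequence = wt_peptide == mut_peptide      # identical primary sequence
extra_pause = sum(mut_dwell) - sum(wt_dwell)   # added waiting time at the rare codon
```

Both messages give the same four residues, but the mutant spends longer at the second codon, which is the kind of pause that can change how the emerging chain folds.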

So let's bring it all home for you, the learner.

Let's summarize the absolute highest yield principles from this entire deep dive.

First, the bedrock of structure.

The primary amino acid sequence dictates all subsequent levels of protein folding and structure.

That's number one.

Second, the genetic code is a triplet code.

It's continuous, not overlapping, almost universal, but also highly degenerate.

Third, the tRNA is the essential adapter molecule, and the specificity of which amino acid gets added lies entirely in the tRNA's anticodon, not the amino acid itself.

Fourth, the big one.

The ribosome's 23S rRNA acts as the peptidyl transferase ribozyme.

It is the true catalyst that forms the peptide bond.

The RNA is the enzyme.

Fifth, successful initiation requires specific sequence recognition.

That's the Shine-Dalgarno interaction in prokaryotes, or the 5' cap scanning mechanism and the Kozak sequence in eukaryotes.

And sixth, termination doesn't use a tRNA.

It requires protein release factors that mimic tRNAs to recognize the stop codons, which leads to the release of the polypeptide and the recycling of the entire machine.

Okay, so here's where it gets really interesting, building on that idea of silent mutations and folding speed.

We talked about how changing the translation rate can actually change a protein's final function.

Right, even if the primary sequence is the same.

So if the speed of translation is so critical for proper folding, what might happen if scientists could artificially manipulate the abundance of specific rare tRNAs inside an engineered cell?

That's a fascinating question for synthetic biology.

I mean, if you could synthetically increase the concentration of a rare tRNA, you could speed up translation at a spot where it's naturally slow.

Or the opposite, you could repress a common tRNA to introduce a stall, a pause, where one never existed before.

So could we literally program an engineered organism to fold its proteins into completely new shapes that it would never naturally achieve, just by manipulating the pace of the message without ever changing a single amino acid in the sequence?

It seems possible.

You could potentially generate functionally novel proteins for medicine or industry purely through kinetic control, by controlling the rhythm of translation.

It's a perfect bridge from the textbook mechanics we've discussed to the absolute cutting edge of synthetic biology and protein engineering.

A fantastic thought to end on.

We really hope this deep dive into translation has given you the clarity and the depth you need to master this foundational process, including some of those surprising insights.

Thank you so much for joining us.

And from the entire Last Minute Lecture team, a very warm thank you.

We hope this is helpful for your studies.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

Protein synthesis represents the culmination of gene expression, where messenger RNA sequences are decoded into chains of amino acids that fold into functional proteins capable of performing virtually every biological process. Amino acids, the building blocks of proteins, share a common structure consisting of a central carbon bonded to an amino group, a carboxyl group, a hydrogen atom, and a distinctive side chain that determines each amino acid's chemical properties. These twenty standard amino acids link together through peptide bonds, forming polypeptides that adopt increasingly complex structural levels including primary sequences, secondary structures stabilized by hydrogen bonding between backbone atoms, tertiary arrangements shaped by interactions among side chains, and quaternary assemblies when multiple polypeptide chains combine. The genetic code serves as the translation system itself, a triplet mechanism in which three-nucleotide sequences called codons specify individual amino acids, with 64 possible codons accommodating all twenty amino acids plus stop signals. Landmark experiments by Crick and Brenner using frameshift mutations, combined with Nirenberg and Khorana's synthetic messenger RNA approach, revealed this code to be nonoverlapping and comma-free, with degeneracy allowing multiple codons to specify the same amino acid. The wobble hypothesis explains this redundancy by showing how transfer RNA molecules can recognize several synonymous codons through flexible base pairing at the third codon position, while the code remains nearly universal across all organisms. Transfer RNA molecules, with their characteristic cloverleaf secondary structure and three-dimensional L-shaped conformation, possess anticodon loops that pair with messenger RNA codons and separate attachment sites where aminoacyl-tRNA synthetases bind specific amino acids to their corresponding transfer RNAs. 
Ribosomes, the catalytic machines of translation, exist in two sizes (70S in prokaryotes and 80S in eukaryotes) and consist of ribosomal RNA and dozens of proteins assembled into distinct subunits. Protein synthesis itself unfolds through three coordinated stages: initiation, where ribosomes assemble on messenger RNA using either the Shine-Dalgarno sequence recognition mechanism in bacteria or the scanning model involving the five-prime cap in eukaryotes; elongation, whereby charged transfer RNAs sequentially enter the acceptor site, peptide bonds form through the catalytic ribozyme activity of peptidyl transferase, and the ribosome translocates using GTP hydrolysis and elongation factors; and termination, when release factors recognize stop codons and disassemble the translation complex. Following synthesis, many proteins contain signal sequences that direct them toward the endoplasmic reticulum through signal recognition particles, ensuring correct cellular localization and enabling secretion or organellar targeting through the endomembrane system.
