Chapter 3: Amino Acids, Peptides, and Proteins: Structure, Properties, and Purification

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

Have you ever just stopped to wonder how incredibly complex life is?

I mean, right down at the cellular level, how do cells actually do anything?

It's kind of mind-boggling when you think about it.

Yeah, like generating light in a firefly or carrying oxygen in our blood, even just the structure of hair.

It's all happening molecule by molecule.

Absolutely.

And, you know, behind almost every single one of those processes, you find proteins.

They're the real workhorses, the molecular machines doing the heavy lifting.

The unsung heroes.

Definitely.

They have this almost endless diversity of functions.

I mean, they mediate virtually every process in a cell.

If you want to understand the molecular mechanism of any biological process, you pretty much always end up studying one or more proteins.

They're just foundational.

Welcome to the Deep Dive, everyone.

This is where we take source material and really pull out the most important insights for you.

And today we're diving deep into those fundamental building blocks of life: amino acids, peptides, and proteins.

That's right.

We're drawing heavily from chapter three of the classic textbook, Lehninger Principles of Biochemistry. Our mission today: to really get a handle on how these molecular machines are built, how scientists study them, and what their structure tells us about life itself.

And this is so important, whether you're studying for an exam, maybe brushing up on your biochem, or just really curious.

Understanding proteins is absolutely key.

It unlocks so many mysteries in biology, in medicine.

We'll try to cut through some of the jargon, give you the core concepts, and hopefully some fascinating details that stick with you, things that help you grasp these molecular mechanisms.

Let's do it.

OK, let's unpack this.

When we talk proteins, the starting point is always their fundamental units: amino acids.

What is it that makes them so special, considering the huge range of things proteins do?

Well, what's truly fascinating is that despite this incredible diversity of proteins, from enzymes speeding up reactions to keratin in your hair, they're all built from a common set.

Just 20 standard amino acids.

Only 20.

And each of these 20 has a unique bit called a side chain, or an R group.

And that R group dictates its specific chemical properties.

You can think of them like the alphabet.

The alphabet used to write the language of protein structure.

So 20 letters, but do they have some common features, a shared structure?

They absolutely do.

All 20 are what we call alpha amino acids.

That means they have a central carbon atom, the alpha carbon, and it's bonded to both a carboxyl group and an amino group.

The part that makes them different, the real differentiator, is that R group.

It varies in, well, structure, size, electrical charge, and that really influences how the amino acid behaves, especially things like its solubility in water.

And wasn't there one that's kind of the odd one out, a bit different structurally?

Oh yeah, you're thinking of proline.

It's definitely the exception.

Its side chain actually loops back and bonds to its own amino group.

Yeah, it forms this rigid ring structure.

So any part of a protein that contains proline becomes much less flexible.

It acts almost like a structural kink or stiffener, which is crucial for certain protein shapes.

Okay, so they mostly share this core structure.

The R group adds the variety.

Where does the next level of complexity come in?

Is it about their 3D shape?

Exactly.

That brings up a really important point about their three-dimensional arrangement.

You see, for all the amino acids except glycine...

Glycine's the simplest one, right?

Just a hydrogen R group.

Precisely.

Because its R group is just hydrogen, its alpha carbon isn't chiral.

But for all the other 19, that alpha carbon is a chiral center.

This means they can exist in two forms that are mirror images of each other.

We call them enantiomers.

Like your left and right hands.

Mirror images, but you can't lay one perfectly on top of the other.

That's the perfect analogy.

They're non-superposable mirror images.

And here's the kicker, biologically speaking.

The amino acids found in proteins are almost exclusively the L-stereoisomer, the L-form.

Wow, exclusively.

Right.

It's not random chance.

Cells specifically synthesize the L-forms.

Why?

Because the enzymes that build proteins, their active sites, are themselves asymmetric.

They're built to recognize and work with only the L-form.

It ensures stereospecificity.

Like trying to put a right glove on your left hand.

Exactly.

It just doesn't fit.

This strict handedness is absolutely fundamental to protein structure and function everywhere in life.

Okay, so we have these 20 L-amino acids.

How do biochemists keep them straight?

How do they categorize them to make sense of their different properties?

Well, knowing the properties of each is key, but yeah, grouping them helps a lot.

We usually put them into five main classes.

And this is based on their R group: specifically, its polarity, meaning how much it likes or dislikes water, and its charge at physiological pH, which is around pH 7.

Makes sense.

What are the groups?

Okay, first up, the non-polar aliphatic R groups.

Think of these as like the oily or greasy parts.

They tend to hate water.

Hydrophobic.

Exactly.

Hydrophobic.

They tend to cluster together on the inside of proteins, away from the water.

This hydrophobic effect is actually a major driving force that shapes the protein's overall 3D structure.

Glycine is in this group.

Smallest R group, just hydrogen, allows for lots of flexibility.

Proline, as we said, adds rigidity.

And methionine is here too.

It's one of the two sulfur-containing ones.

Then you have the aromatic R groups.

That's phenylalanine, tyrosine, and tryptophan.

They have these bulky ring structures.

They're also relatively non-polar, so they contribute to that hydrophobic core too.

But tyrosine and tryptophan are a little bit more polar because they have hydroxyl or nitrogen groups, so they can sometimes do hydrogen bonding.

Now, I remember reading somewhere that these aromatic ones are useful in the lab.

Something about UV light.

Oh, absolutely.

That's a great point.

Those aromatic rings, especially in tryptophan and tyrosine, absorb ultraviolet light really strongly at a specific wavelength, 280 nanometers.

And this is super useful because it ties into the Lambert-Beer law.

Basically, the amount of light absorbed is proportional to the concentration of the protein.

So biochemists can just shine UV light at 280 nanometers through a protein solution, measure the absorbance, and get a quick, pretty accurate estimate of the protein concentration.

It's a workhorse technique.
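For readers following along in text, the Lambert-Beer relationship just described (absorbance = extinction coefficient x concentration x path length) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 1 cm default path length, and the example extinction coefficient are made up for the sketch and do not come from the textbook.

```python
def concentration_from_a280(absorbance, molar_extinction, path_length_cm=1.0):
    """Lambert-Beer law, A = epsilon * c * l, solved for c in mol/L."""
    if molar_extinction <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (molar_extinction * path_length_cm)


# Hypothetical protein: assumed extinction coefficient of 50,000 M^-1 cm^-1
# at 280 nm, measured in a standard 1 cm cuvette.
conc_molar = concentration_from_a280(absorbance=0.5, molar_extinction=50_000)
print(f"{conc_molar * 1e6:.1f} micromolar")
```

In practice the extinction coefficient depends on how many tryptophan and tyrosine residues the protein has, so the same absorbance can mean very different concentrations for different proteins.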

Love it.

Okay, what's next?

Next are the polar, uncharged R groups.

This includes serine, threonine, asparagine, and glutamine.

Their R groups are more water-loving, more hydrophilic.

Why is that?

Because they have groups, like hydroxyls on serine and threonine or amide groups on asparagine and glutamine, that can form hydrogen bonds with water molecules.

So you often find these amino acids on the surface of proteins interacting happily with the surrounding water.

Okay, so they like water.

Makes sense for the surface.

Now, you mentioned methionine had sulfur, cysteine has sulfur too, right?

Is it in this group?

Ah, cysteine.

It's often grouped here because it's polar and uncharged overall, but it's really special.

It's the other sulfur-containing amino acid.

What's special about its sulfur?

Its R group has a sulfhydryl group, SH.

And this is critical because two cysteine residues can react.

They can be oxidized to form a covalent bond between their sulfur atoms.

This link is called a disulfide bond, or a disulfide bridge.

Ah, I've heard of those.

Yeah, they're super important.

They act like molecular staples.

They can link different parts of the same polypeptide chain together, or even link two different chains.

They add a lot of stability and structural integrity to many proteins, especially those that exist outside of cells.

Okay, disulfide bonds, got it.

What are the last two groups?

The last two are based on charge.

We have the positively charged basic R groups.

That's lysine, arginine, and histidine.

At physiological pH, their side chains carry a significant positive charge because they have amino or other nitrogen-containing groups.

So they're hydrophilic too.

Very hydrophilic, yes.

Often found on the protein surface, interacting with water or binding to negatively charged molecules like DNA.

Histidine is particularly interesting.

Its R group's pKa is close to neutral pH.

What does that mean, practically?

It means histidine can easily gain or lose a proton right around physiological pH.

So it often plays a crucial role in enzyme active sites, acting as a proton shuttle donating or accepting protons during a reaction.

Very important for catalysis.

Okay, histidine, the proton shuttle.

And the last group.

Finally, the negatively charged acidic R groups, aspartate and glutamate.

Each of these has a second carboxyl group in its side chain.

So they lose a proton from that second group.

Exactly.

At pH 7, that second carboxyl group is typically deprotonated, giving the side chain, and thus the amino acid, a net negative charge.

Like the basic ones, they're hydrophilic and often involved in binding positive ions or molecules.
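As a study aid, the five classes just walked through can be written down as a small lookup table. This is a sketch, not part of the lecture: the names `R_GROUP_CLASSES`, `classify`, and `class_composition` are invented here, and alanine, valine, leucine, and isoleucine are not named in the discussion above but are placed in the non-polar aliphatic class following the usual textbook grouping.

```python
# The five R-group classes, written with one-letter amino acid codes.
R_GROUP_CLASSES = {
    "nonpolar aliphatic": "GAPVLIM",  # Gly, Ala, Pro, Val, Leu, Ile, Met
    "aromatic": "FYW",                # Phe, Tyr, Trp
    "polar uncharged": "STCNQ",       # Ser, Thr, Cys, Asn, Gln
    "positively charged": "KRH",      # Lys, Arg, His
    "negatively charged": "DE",       # Asp, Glu
}


def classify(residue):
    """Return the R-group class name for a one-letter amino acid code."""
    code = residue.upper()
    if len(code) == 1:
        for class_name, members in R_GROUP_CLASSES.items():
            if code in members:
                return class_name
    raise ValueError(f"not one of the 20 standard amino acids: {residue!r}")


def class_composition(sequence):
    """Count how many residues of a sequence fall into each class."""
    counts = {class_name: 0 for class_name in R_GROUP_CLASSES}
    for residue in sequence:
        counts[classify(residue)] += 1
    return counts


print(classify("H"))
print(class_composition("MKDEF"))  # hypothetical five-residue sequence
```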

Wow, okay.

So the R group really defines so much.

You mentioned pH affects charge.

Can we talk a bit more about their acid-base properties?

That seems really fundamental.

It is, absolutely.

Amino acids are ampholytes, or amphoteric.

Fancy words meaning they can act as both weak acids and weak bases.

Because they have both the carboxyl group, which is acidic, and the amino group, which is basic.

Precisely.

And at neutral pH, say around 7, they exist mainly as zwitterions.

Zwitterions?

Sounds German.

It is.

It means hybrid ion.

It's a molecule that has both a positive charge on the protonated amino group, NH3+, and a negative charge on the deprotonated carboxyl group, COO-.

But the net charge is zero.

Okay, so positive and negative, but balanced out.

Exactly.

You can see this clearly if you do a titration curve for an amino acid like glycine.

You start at low pH, everything's protonated.

As you add base and raise the pH, first the carboxyl group loses its proton, then much later at higher pH, the amino group loses its proton.

And those points where they lose protons, those are the pKa values.

Correct.

Glycine has two pKa values.

And there's a specific pH value called the isoelectric point, or pI.

That's the pH where the amino acid exists predominantly as the zwitterion, with a net charge of exactly zero.

How do you find the pI?

For a simple amino acid like glycine, with no ionizable R group, the pI is just the average of its two pKa values.

For amino acids with ionizable R groups, it gets a bit more complicated, but the principle is the same.

Find the pH where the positive and negative charges balance perfectly.
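The averaging rule for a simple amino acid is easy to check numerically. The sketch below uses the commonly tabulated pKa values for glycine (about 2.34 for the carboxyl group and 9.60 for the amino group); those numbers are not stated in the discussion above, and the function name is made up for the example.

```python
def isoelectric_point_simple(pka_carboxyl, pka_amino):
    """pI of an amino acid with no ionizable R group: the mean of its two pKa values."""
    return (pka_carboxyl + pka_amino) / 2


# Commonly tabulated values for glycine (assumed here, not given in the lecture).
GLYCINE_PKA_CARBOXYL = 2.34
GLYCINE_PKA_AMINO = 9.60

glycine_pi = isoelectric_point_simple(GLYCINE_PKA_CARBOXYL, GLYCINE_PKA_AMINO)
print(f"glycine pI is about {glycine_pi:.2f}")
```

This shortcut only works when there are exactly two ionizable groups; with an ionizable side chain, the two pKa values that bracket the zero-charge form are the ones to average.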

So this pI thing.

If a protein is in a solution where the pH is above its pI, it'll have a net negative charge, and below its pI, a net positive charge.

Got it.

Because if the pH is above the pI, more groups will be deprotonated, so the net charge is negative.

If the pH is below the pI, more groups will be protonated, so the net charge is positive.
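To see the sign of the charge flip on either side of the isoelectric point, here is a rough sketch based on the Henderson-Hasselbalch relationship. It treats each ionizable group independently and uses glycine's commonly tabulated pKa values (2.34 and 9.60) as the example; the function names and those values are assumptions for illustration, and a real protein would need every ionizable side chain included.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a group that has lost its proton."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


def net_charge(ph, acidic_pkas, basic_pkas):
    """Average net charge at a given pH.

    Acidic groups (e.g. carboxyl) are 0 when protonated and -1 when deprotonated.
    Basic groups (e.g. amino) are +1 when protonated and 0 when deprotonated.
    """
    negative = sum(fraction_deprotonated(ph, pka) for pka in acidic_pkas)
    positive = sum(1.0 - fraction_deprotonated(ph, pka) for pka in basic_pkas)
    return positive - negative


# Glycine as a stand-in: one carboxyl group and one amino group (assumed pKa values).
for ph in (2.0, 5.97, 11.0):
    charge = net_charge(ph, acidic_pkas=[2.34], basic_pkas=[9.60])
    print(f"pH {ph:5.2f}: net charge {charge:+.3f}")
```

Below the pI the result is positive, at the pI it is essentially zero, and above the pI it is negative, which is the pattern described in the conversation.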

And this definitely matters for studying them in the lab, right?

For separating them.

Absolutely critical.

The charge of a protein, determined by the pH relative to its pI, is a key property exploited in many purification techniques, like ion exchange chromatography.

And what's really cool is that within the folded protein, the local environment, like nearby charges, can actually tweak the pKa values of individual amino acid residues.

Really?

Imagine a negatively charged group near an acidic residue.

It would make it harder for that acidic residue to lose its proton and become negative.

So its pKa would actually go up.

Enzymes often use these subtle pKa shifts in their active sites to make reactions happen much more efficiently.

Fascinating.

Okay, before we move on from the building blocks, are there other amino acids besides the standard 20 we should know about?

Yeah, definitely.

There are quite a few uncommon amino acids that play important roles.

Some are created by modifying one of the standard 20 after the protein has already been made.

Like an edit.

Kind of.

A good example is 4-hydroxyproline, which is made from proline.

It's essential for the structure of collagen, the main protein in connective tissue.

Without that hydroxylation, collagen isn't stable.

Then there are a few rare amino acids, like selenocysteine, that are actually incorporated during protein synthesis itself, using special tricks in the genetic code.

Wow.

And finally, there are amino acids that exist freely in the cell, but aren't usually part of proteins.

They act as intermediates in the metabolic pathways.

Ornithine and citrulline, for instance, are key players in the urea cycle, which helps us get rid of waste nitrogen.

Okay, so the 20 are the main alphabet, but there are these other important characters too.

Right, so we have our building blocks.

How do they actually link up to make something bigger?

A single amino acid isn't a protein, obviously.

Well, yeah, not at all.

They join together via a specific type of covalent bond called a peptide bond.

Okay, how does that form?

It's a condensation reaction.

The carboxyl group of one amino acid reacts with the amino group of the next one, and a molecule of water is released in the process.
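One way to make the "water is released" point concrete is to track mass: each peptide bond formed removes one water from the total. The sketch below is a simplified illustration; the function name is invented, only three amino acids are included, and the average molecular masses are standard reference values rounded to two decimals rather than numbers from the lecture.

```python
WATER_MASS = 18.015  # g/mol, average mass of H2O

# Average molecular masses (g/mol) of a few free amino acids.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
}


def peptide_mass(sequence):
    """Mass of a linear peptide: free amino acid masses minus one water per peptide bond."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    peptide_bonds = len(sequence) - 1
    return total - peptide_bonds * WATER_MASS


print(f"Gly-Ala dipeptide: {peptide_mass('GA'):.2f} g/mol")
print(f"Gly-Ala-Ser tripeptide: {peptide_mass('GAS'):.2f} g/mol")
```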

Water comes out, they link up.

Simple enough.

Well, chemically, the reverse reaction, adding water back to break the bond, which is called hydrolysis, is actually the thermodynamically favored direction.

But cells have this amazing molecular machine, the ribosome, which uses energy to essentially activate the carboxyl group, making peptide bond formation happen very efficiently during protein synthesis.

Gotcha.

So the cell forces it to happen.

Pretty much.

And when you link two amino acids, you get a dipeptide.

Three makes a tripeptide.

A short chain, maybe up to 20 or so, is often called an oligopeptide.

And longer chains?

Longer chains are polypeptides.

Now, the distinction between a polypeptide and a protein can be a bit fuzzy, but generally we think of polypeptides as having molecular weights below about 10,000.

Proteins are typically larger, often made of one or more polypeptide chains, and have a well-defined 3D structure.

And when people write out these sequences, there's a specific way to read them, isn't there?

A direction.

Yes, absolutely.

There's always one end of the chain that has a free amino group; that's called the amino terminus, or N-terminus.

N for amino, makes sense.

And the other end has a free carboxyl group; that's the carboxyl terminus, or C-terminus.

By convention, sequences are always written starting with the N-terminal residue on the left and ending with the C-terminal residue on the right.

So you read it N to C.

Left to right, N to C.

Got it.

Are these peptide bonds strong?

Do they last?

They are remarkably stable, kinetically speaking.

Under typical conditions inside a cell, the average half-life of a peptide bond before it gets hydrolyzed is about seven years.

Even though hydrolysis is favorable, it happens very slowly without an enzyme catalyst.

This stability is crucial for proteins to function reliably over time.

Seven years.

Okay, that's stable.

Now we usually think about these huge proteins, but what about the little guys?

Do small peptides do anything important on their own?

Oh, absolutely.

Don't underestimate the small peptides.

Many of them have really potent biological effects, even at very low concentrations.

Like what?

Well, think about hormones.

Oxytocin, which is involved in childbirth and social bonding, is just nine amino acids long.

Only nine?

Yep.

Or thyrotropin-releasing factor, which triggers the release of another hormone, is only three residues.

There are also potent toxins, like some mushroom poisons, and even many antibiotics, that are small peptides.

So yeah, they pack a punch.

Okay, so size isn't everything.

But when we do think about proteins, the diversity is huge, right?

Immense.

They range enormously in size.

You have relatively small ones like human cytochrome c, about 104 residues, and then you have absolute giants like titin, a muscle protein which has almost 27,000 amino acid residues.

Wow, 27,000.

That's massive.

That's incredible.

And proteins differ not just in size, but also in their composition.

Some are just a single polypeptide chain, but many are multi-subunit proteins.

Meaning more than one chain.

Exactly.

They consist of two or more polypeptide chains, which might be identical or different, held together usually by non-covalent interactions, though sometimes disulfide bonds are involved too.

Hemoglobin, the protein that carries oxygen in your blood, is a classic example.

It has four subunits, two of one type, alpha, and two of another, beta.

Okay.

And sometimes proteins have extra bits attached too, right?

Not just amino acids.

That's right.

Those are called conjugated proteins.

They contain permanently associated chemical components in addition to the amino acids.

These non-amino acid parts are called prosthetic groups.

Prosthetic, like an artificial limb.

Kind of analogous, yeah.

It's a non-protein part that's essential for the protein's function.

Examples include metal ions, or complex organic molecules like the iron-containing group in hemoglobin that actually binds the oxygen, or lipids attached to proteins forming lipoproteins, which transport fats in the blood.

So proteins can be really complex machines.

Okay, switching gears a bit.

If a cell has thousands of different proteins all mixed together, how on earth do scientists study just one specific protein? That sounds like the ultimate needle-in-a-haystack problem.

It really is.

That's one of the fundamental challenges and activities in biochemistry: protein purification.

The key is to cleverly exploit the differences in the physical and chemical properties of proteins.

Things like size, charge, binding affinity, solubility.

So you need a strategy.

Where do you start?

First step is always to break open the cells or tissues to release all the proteins.

This messy mixture is called the crude extract.

Okay, you've got the soup.

Now what?

Now you start fractionation, separating the mixture into different fractions, hoping to enrich your protein of interest in one fraction while getting rid of others.

An early common step is salting out.

Salting out, like adding salt.

Exactly.

Usually ammonium sulfate.

As you increase the salt concentration, proteins become less soluble and start to precipitate out of the solution.

Different proteins precipitate at different salt concentrations, so you can selectively collect the fraction containing your protein.

Okay, so solubility differences, what else?

After that, you often use dialysis.

You put your protein solution in a bag made of a semi-permeable membrane, one with pores of a specific size.

Then you put that bag in a large volume of buffer.

Small molecules and salts can pass freely through the pores and out into the buffer, while the large protein molecules are trapped inside the bag.

It's a good way to remove the salt from the previous step or change the buffer.

Right, like a molecular sieve.

Precisely.

But the real powerhouse techniques for protein purification rely on column chromatography.

Chromatography, I remember that from chemistry.

Same principle.

You have a column packed with some solid material, the stationary phase.

You apply your protein mixture to the top, and then you flow a buffer, the mobile phase, through the column.

Proteins interact differently with the stationary phase, causing them to move through the column at different rates and separate.

What kinds of interactions are we talking about?

Several types.

One is ion exchange chromatography.

Here the stationary phase has charged groups bound to it.

So it separates based on the protein's charge.

Exactly.

If you use a cation exchange resin, it has negative charges bound to it.

So positively charged proteins, cations, will bind to the resin, while negatively charged or neutral proteins will flow through more quickly.

You can then release the bound proteins by changing the pH or increasing the salt concentration in the buffer.

Okay, so if I had a peptide that was negatively charged overall and one that was positively charged and I put them on a cation exchange column, which one comes off first?

The negatively charged one would come out first.

It wouldn't stick to the negative resin.

The positively charged one would bind and elute later, probably when you increase the salt concentration.

It's a direct application of that pI concept we talked about.

Knowing the charge at a given pH lets you predict how it behaves.

Cool.

What other kinds of chromatography are there?

Another major one is size exclusion chromatography, sometimes called gel filtration.

This separates proteins based purely on their size and shape.

Okay, so bigger ones come out slower.

Actually, counterintuitively, the larger proteins come out first.

Wait, really, how does that work?

The stationary phase here consists of porous beads.

Small proteins can enter these pores, so they have a longer, more convoluted path through the column.

Large proteins are too big to enter the pores, so they just flow around the beads and take a more direct route, exiting the column sooner.

Ah, okay.

They take the express lane because they can't fit in the local streets.

Good analogy.

Then there's affinity chromatography, which can be incredibly specific and powerful.

Affinity, like binding.

Exactly.

Here, the stationary phase has a molecule covalently attached to it called a ligand that binds specifically and tightly to your protein of interest.

So when you pass the crude extract through, only your protein sticks to the ligand.

Everything else washes through.

Wow, that sounds efficient.

How do you get your protein off, then?

You usually elute it by adding a high concentration of the free ligand, which competes for binding and displaces the protein, or by changing the conditions, like pH or salt, to disrupt the binding interaction.

Okay, very targeted.

And all these chromatography methods can be performed using HPLC, high-performance liquid chromatography.

This uses very fine materials for the stationary phase and high -pressure pumps to push the buffer through.

It gives much better resolution and faster separation times.

So you do these steps.

How do you know if it's working?

How do you track the purification?

Great question.

You monitor it quantitatively.

Typically, if your protein is an enzyme, you measure its activity, how much reaction it catalyzes per unit time.

You also measure the total amount of protein in your fraction.

You then calculate the specific activity, which is the units of enzyme activity divided by the milligrams of total protein.

As you purify your protein, you're getting rid of other contaminating proteins.

So the total activity might stay the same or decrease slightly, since you always lose some, but the total amount of protein decreases significantly.

Therefore, the specific activity should increase with each successful purification step.

When it reaches a maximum and constant value, your protein is likely pure.

You often summarize this in a purification table.
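A purification table is really just this arithmetic repeated for each step. The sketch below shows one way to compute it; every number in it is hypothetical, the class and function names are invented for the example, and the fold-purification and percent-yield columns are common additions that the conversation above does not mention explicitly.

```python
from dataclasses import dataclass


@dataclass
class PurificationStep:
    name: str
    total_protein_mg: float
    total_activity_units: float

    @property
    def specific_activity(self):
        """Units of enzyme activity per milligram of total protein."""
        return self.total_activity_units / self.total_protein_mg


def purification_table(steps):
    """Compare each step against the first one (the crude extract)."""
    first = steps[0]
    rows = []
    for step in steps:
        rows.append({
            "step": step.name,
            "specific_activity": step.specific_activity,
            "fold_purification": step.specific_activity / first.specific_activity,
            "yield_percent": 100 * step.total_activity_units / first.total_activity_units,
        })
    return rows


# Hypothetical numbers, chosen only to illustrate the trend.
steps = [
    PurificationStep("crude extract", 10_000, 100_000),
    PurificationStep("salting out", 3_000, 96_000),
    PurificationStep("ion exchange", 400, 80_000),
    PurificationStep("affinity", 5, 60_000),
]

for row in purification_table(steps):
    print(f"{row['step']:<14} {row['specific_activity']:>10.1f} units/mg "
          f"{row['fold_purification']:>8.1f}x {row['yield_percent']:>6.1f}% yield")
```

Total activity drifts down from step to step while specific activity climbs, which is the signature of a purification that is working.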

Right, tracking the numbers.

Yeah.

But beyond just separating, how do you actually look at the proteins?

How do you check the purity or estimate the size?

For that, the go-to technique is electrophoresis, specifically gel electrophoresis.

Running proteins on a gel.

Yep.

You apply an electric field across a gel matrix, usually made of polyacrylamide.

Proteins migrate through the gel based on their charge and size.

The most common type by far is SDS-PAGE.

SDS stands for sodium dodecyl sulfate.

It's a detergent.

What does the detergent do?

Two crucial things.

First, it denatures the proteins, making them unfold into linear chains.

Second, SDS coats the protein with negative charges, overwhelming the protein's own intrinsic charge.

So all proteins become negatively charged rods.

Pretty much.

And the amount of negative charge is roughly proportional to the protein's mass.

This is ingenious because now, when you apply the electric field, all the proteins migrate towards the positive electrode, and their separation is based almost entirely on their size or molecular weight.

Ah, so smaller proteins wiggle through the gel faster.

Exactly.

Smaller proteins move faster and further down the gel.

After running the gel, you stain it to visualize the protein bands.

If your purification worked, you should ideally see just one major band at the expected molecular weight for your protein.

Clever.

Are there other kinds of electrophoresis?

Yes.

Another important one is isoelectric focusing, or IEF.

This separates proteins based on their isoelectric point, their pI.

The pH where their net charge is zero.

Right.

In IEF, you establish a stable pH gradient across the gel.

When you apply the electric field, proteins migrate until they reach the position in the gel where the pH equals their pI.

At that point, they have no net charge.

They stop moving.

So it separates by pI, not size.

Correct.

And you could even combine these techniques for incredibly high resolution in two-dimensional electrophoresis.

Two dimensions?

How?

First, you separate the proteins by their pI using IEF in one direction, say in a thin tube gel.

Then, you take that gel and place it sideways on top of an SDS-PAGE slab gel and run it in the second dimension.

So first dimension is pI, second dimension is size.

Exactly.

This spreads the proteins out across the gel based on two independent properties.

You can resolve thousands of different proteins from a complex cellular extract on a single 2D gel.

It gives you a snapshot of the cell's proteome, its entire protein complement.

Incredible resolution.

Okay.

Okay, so let's say we've purified our protein.

We've checked it on a gel.

What's the next fundamental thing we want to know?

You mentioned primary structure earlier.

What exactly is that?

Primary structure is, well, the most basic level of protein structure.

It's simply the linear sequence of amino acid residues read from the N-terminus to the C-terminus.

It also includes the location of any disulfide bonds that might link different parts of the chain.

So just the order of the letters in the protein alphabet?

Pretty much.

But here's the absolutely crucial point.

This primary structure, this linear sequence, contains the information that largely dictates how the protein folds up into its specific functional three-dimensional shape.

So the sequence determines the fold and the fold determines the function.

That's the central dogma of protein structure, basically.

The sequence is the blueprint.

Change the sequence and you often change the fold and therefore the function.

We see that in diseases, right?

Absolutely.

Many genetic diseases are caused by mutations that lead to a single amino acid change in a critical protein.

Sickle cell anemia is a classic example.

Just one amino acid substitution in hemoglobin changes its structure and function dramatically.

Or Duchenne muscular dystrophy, often caused by deletions that lead to a non-functional protein.

And conversely, if proteins in different species do the same job, their sequences are probably similar.

Generally, yes.

Functionally important regions of a protein tend to be conserved during evolution, meaning their sequences change very little over time across different species.

This similarity reflects their shared ancestry and functional constraints.

So how do we figure out that sequence?

How do we read the blueprint?

It must have been incredibly hard initially.

Monumental is the word.

Frederick Sanger was the first to sequence a protein, insulin, back in 1953.

It was a landmark achievement, won him the Nobel Prize, and proved proteins had defined sequences.

How is it done today?

Still like Sanger did it?

Not really.

Today with the explosion of genomics, we often deduce the protein sequence indirectly by sequencing the gene, the DNA, that encodes it.

We know the genetic code, so we can translate the DNA sequence into the amino acid sequence.
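Reading the gene to predict the protein can be sketched directly. The snippet below builds the standard genetic code from its usual compact representation and translates a coding-strand DNA sequence codon by codon. It is a minimal illustration: the function name is invented, the example sequences are made up, and it ignores real-world details such as introns, alternative start sites, and variant genetic codes.

```python
# Standard genetic code in the conventional compact layout: bases ordered T, C, A, G,
# with one amino acid letter per codon and "*" marking stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence (5' to 3') until the first stop codon."""
    sequence = coding_dna.upper()
    protein = []
    for position in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[position:position + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: ATG (Met), TGG (Trp), AAA (Lys), then a TGA stop codon.
print(translate("ATGTGGAAATGA"))
```

The output is written N-terminus first, matching the left-to-right convention for protein sequences.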

Ah, okay.

Read the gene, predict the protein.

Exactly.

But direct protein sequencing is still very important, especially for identifying modifications made after translation or verifying the predicted sequence.

And the dominant technology for that now is mass spectrometry.

Mass spec, again, it seems to do everything.

How does it help with sequencing?

It's truly revolutionized proteomics.

Mass spec can measure the mass of molecules with incredible accuracy, down to distinguishing a single proton difference.

Wow.

For sequencing, the key technique is called tandem MS, or MS/MS.

Tandem, like one after another.

Precisely.

First, you typically break your protein down into smaller, more manageable peptides, maybe using an enzyme like trypsin.

Then in the mass spectrometer, you select one type of peptide ion based on its mass.

Okay, isolate one peptide.

Then you send that selected ion into a collision cell, where it collides with gas molecules, causing it to fragment, usually at the peptide bonds.

So you break the peptide itself.

Yes.

And then in a second stage of mass analysis, that's the tandem part, you measure the masses of all those fragments.

Because the peptide bonds break in a somewhat predictable way, the difference in mass between successive fragment ions tells you which amino acid residue was just lost.

Like reading off the masses tells you the sequence.

Essentially, yes.

You get a fragmentation spectrum, and by analyzing the mass differences, you can piece together the amino acid sequence of that original peptide.

It's like reading a chemical barcode.
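The "mass differences spell out the sequence" idea can be shown with a toy example. The sketch below builds an idealized ladder of N-terminal fragment masses from a known peptide and then reads the sequence back from the gaps between successive fragments. It is heavily simplified and everything about it is an assumption for illustration: the function names are invented, only seven residues are in the table, the ladder ignores the proton and terminal-group offsets of real fragment ions, and real spectra contain noise, missing peaks, and more than one ion series. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "K": 128.09496,
    "E": 129.04259,
    "F": 147.06841,
}


def fragment_ladder(sequence):
    """Idealized cumulative masses of N-terminal fragments (no ion offsets)."""
    ladder, total = [], 0.0
    for residue in sequence:
        total += RESIDUE_MASS[residue]
        ladder.append(total)
    return ladder


def read_sequence(ladder, tolerance=0.01):
    """Assign a residue to each gap between successive fragment masses."""
    sequence = []
    previous = 0.0
    for mass in ladder:
        difference = mass - previous
        matches = [residue for residue, residue_mass in RESIDUE_MASS.items()
                   if abs(residue_mass - difference) <= tolerance]
        # "?" marks a gap that matches no residue, or more than one, in the table.
        sequence.append(matches[0] if len(matches) == 1 else "?")
        previous = mass
    return "".join(sequence)


ladder = fragment_ladder("GAVEK")  # hypothetical peptide
print([round(mass, 3) for mass in ladder])
print(read_sequence(ladder))
```

One limitation worth knowing: leucine and isoleucine have identical masses, so this kind of mass-difference reading cannot tell them apart on its own.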

That is incredibly powerful.

It is.

And when you couple this with liquid chromatography beforehand, LC-MS/MS, you can analyze thousands of peptides from a complex mixture, like a whole-cell lysate, identify the proteins they came from, determine their sequences, and even quantify their relative amounts and identify modifications.

All in a single experiment, sometimes in just hours.

Mind blowing.

So we can read the sequences.

Can we also write them?

Can we build proteins or peptides from scratch in the lab?

Yes, absolutely.

Chemical synthesis of peptides is a really important tool, especially for making smaller ones, say up to 100 residues or so.

These synthetic peptides are used as drugs, as antigens to raise antibodies, or to study protein function.

How does that work?

Do you just mix the amino acids together?

Not quite.

The big breakthrough was solid phase peptide synthesis, developed by R. Bruce Merrifield, which won him a Nobel Prize too.

Solid phase.

Yeah, the idea is to build the peptide chain step by step while it's chemically attached to an insoluble solid support, like small plastic beads.

You add one protected amino acid at a time, wash away the excess reagents, deprotect the end, and then add the next one.

Cycle after cycle.

Keeps everything manageable.

Exactly.

It revolutionized peptide synthesis.

However, even with this method, achieving perfect yields at every single step is impossible.

If you have, say, 99% efficiency for each amino acid addition, then by the time you've made a 100-residue peptide, only about 37% of the chains have the correct sequence. The rest are incorrect sequences due to missed steps.
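
The arithmetic behind that point can be sketched in a few lines of Python. This is a simplified model, not a description of any real synthesis protocol: it assumes every coupling succeeds independently with the same probability, and that the first residue is already attached to the support, so a chain of n residues needs n - 1 couplings. The function name `overall_yield` is a hypothetical choice for this example.

```python
def overall_yield(step_yield, n_residues):
    """Fraction of chains carrying the full, correct sequence.

    Assumes n_residues - 1 independent coupling steps, each succeeding
    with probability step_yield (first residue already on the support).
    """
    if n_residues < 1:
        raise ValueError("a peptide needs at least one residue")
    return step_yield ** (n_residues - 1)


# Compare a few per-step efficiencies for a 100-residue peptide.
for step in (0.96, 0.99, 0.998):
    print(f"{step:.1%} per step -> {overall_yield(step, 100):.1%} overall")
```

Under these assumptions, 99% per step leaves roughly 37% correct product at 100 residues, which is why small gains in per-step efficiency matter so much for long chains.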

So it's hard to get perfect long chains chemically.

Very hard.

It really highlights how astonishingly fast and accurate biological protein synthesis is inside the cell.

A bacterium can make that same 100-residue protein perfectly in about five seconds.

Nature's technology is still way ahead.

OK, so we have the sequence, the blueprint.

You mentioned earlier that comparing sequences tells us about evolution.

Can we delve into that a bit more?

Absolutely.

Protein sequences are like molecular fossils.

They carry a huge amount of information about evolutionary history.

How so?

Well, if two organisms are closely related, meaning they diverged from a common ancestor relatively recently, their proteins that perform the same function will have very similar amino acid sequences.

As the evolutionary distance between two organisms increases, the number of differences in the sequences of their corresponding proteins also tends to increase.

Some residues might be absolutely critical for the protein's function, so they change very rarely. We call these conserved residues.

Other positions might be less critical and can tolerate substitutions more readily.

These variable positions accumulate changes over time.
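
As a small illustration of conserved versus variable positions, the Python sketch below marks the columns of an alignment where every sequence has the same residue. The function name `conserved_columns` and the three toy sequences are hypothetical, made up for this example. It assumes the sequences are already aligned to equal length, with `-` standing for a gap.

```python
def conserved_columns(aligned):
    """Return a marker string: '*' under columns where all sequences
    share the same residue, ' ' under variable or gapped columns."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for column in zip(*aligned):
        residues = set(column)
        fully_conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if fully_conserved else " ")
    return "".join(marks)


# Toy pre-aligned sequences (hypothetical, not from real proteins).
toy = ["MKVLA", "MRVLG", "MKILA"]
for seq in toy:
    print(seq)
print(conserved_columns(toy))
```

Real analyses usually go further and score how chemically similar the substituted residues are, rather than using a strict all-or-nothing test.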

So the pattern of conserved and variable residues gives clues.

Exactly.

By comparing the sequences of the same protein from many different species, we can identify related proteins called homologues, proteins derived from a common ancestral gene.

Homologues.

Are there different types?

Yes.

We distinguish between paralogues, which are homologous proteins found within the same species, often arising from gene duplication events, and orthologues, which are homologues found in different species that evolved from a common ancestral gene after a speciation event.

Comparing orthologues is usually what we do to build evolutionary trees.

How do you compare them systematically, just line them up?

Pretty much.

But it's done using sophisticated computer algorithms.

These programs align the sequences, introducing gaps where necessary to maximize the identity between residues.

They use scoring systems that reward matches and penalize mismatches and gaps.
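
The discussion doesn't name a specific algorithm, so as an illustration here is a minimal Python sketch of one classic approach, Needleman-Wunsch global alignment. The scores used (+1 for a match, -1 for a mismatch, -2 for a gap) are arbitrary assumptions for this example. Real tools score with substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def pair(i, j):
        # Score for aligning residue a[i-1] with residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy sequences (hypothetical): the second is missing one residue.
total, top, bottom = needleman_wunsch("ACDEF", "ACEF")
print(total)
print(top)
print(bottom)
```

In the toy case the algorithm places a single gap opposite the extra residue, which is the "introducing gaps where necessary" step in miniature.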

And this alignment reveals the relationships.

Yes.

The degree of similarity between aligned sequences reflects their evolutionary relatedness.

We can even find signature sequences, short stretches of amino acids that are uniquely characteristic of a particular group of organisms, providing strong evidence for evolutionary links.

For example, there's a 12-amino-acid insertion in a protein called EF-1α that's found in all archaea and eukaryotes, but not bacteria, suggesting a closer relationship between archaea and eukaryotes.

So you can build family trees for organisms using their proteins.

Absolutely.

By comparing the sequences of multiple proteins, especially ones found in all organisms, like ribosomal proteins, researchers can construct detailed evolutionary trees, or phylogenetic trees.

These trees map out the evolutionary history, connecting different species, sometimes thousands of them.

It's a massive ongoing project to reconstruct the entire tree of life based on this molecular data.

What an incredible journey.

We started with just 20 simple amino acids, these little building blocks, and we've ended up using their sequences to trace the history of life on Earth.

It just hammers home how much information is packed into these molecules.

It really does.

Understanding protein structure, function, how we study them, how they evolved, it's absolutely foundational to all of biochemistry, molecular biology, really all of life science.

The ability to purify, analyze, sequence, even synthesize these molecules has completely transformed our understanding of how life works at its core.

It's amazing how these molecular mechanisms underpin everything.

And maybe here's a final provocative thought to leave you with.

We talked about how nearly all proteins use L-amino acids, right?

Yeah, the specific handedness.

Well, that's not just some minor chemical detail.

It strongly implies that life, as we know it, arose from a specific choice, a specific chiral selection very early on.

What if, purely by chance, life had started with D-amino acids instead?

How fundamentally different would biology, would biochemistry, would we be?

Something to ponder.

Definitely something to think about, a different molecular world entirely.

Well, we hope this deep dive into amino acids, peptides, and proteins has given you some powerful insights and maybe a new appreciation for these incredible molecular machines.

Thanks for joining us.

And thank you for being part of the Last Minute Lecture family.

We look forward to our next deep dive with you.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary

What this audio overview covers

Amino acids serve as the fundamental building blocks from which all proteins are synthesized, and understanding their structure and chemical behavior is essential for comprehending how biological macromolecules function. The twenty standard amino acids found in living organisms each contain an alpha carbon bonded to an amino group, a carboxyl group, a hydrogen atom, and a distinctive side chain that determines its chemical properties. These side chains vary dramatically in character, ranging from nonpolar hydrophobic groups to polar uncharged groups, acidic residues, and basic residues, creating enormous diversity in how amino acids interact with their environment and with one another. The amino acid residues in proteins are almost exclusively L stereoisomers (glycine, which has no chiral center, is the exception), establishing a critical stereochemical standard for biological systems. At physiological pH, amino acids exist as zwitterions, displaying both positive and negative charges simultaneously due to ionization of their functional groups, and their behavior in solutions of varying acidity can be precisely mapped through titration curves that reveal their pKa values and isoelectric points. When amino acids join through condensation reactions, they form peptide bonds, covalent linkages between the carboxyl group of one residue and the amino group of the next. The peptide bond itself possesses unique structural properties, including partial double-bond character due to resonance effects, which restricts rotation around the bond and creates a rigid, planar geometry. Peptide chains develop directionality, with an unreacted amino group at the N-terminus and an unreacted carboxyl group at the C-terminus, following a standardized naming convention. 
Modern analytical techniques enable researchers to determine protein sequences and structures with remarkable precision; Edman degradation chemically removes amino acids sequentially from the N-terminus, while mass spectrometry measures the exact molecular weight of peptide fragments. Proteins themselves are classified by their composition and structural organization, with globular proteins forming compact three-dimensional shapes suited to dynamic cellular functions, while fibrous proteins extend into elongated structures providing mechanical strength and support. Post-translational modifications further diversify protein function beyond what the primary amino acid sequence alone would suggest, enabling cells to regulate protein activity and direct proteins to specific cellular compartments. The remarkable diversity of protein functions—including enzymatic catalysis, structural scaffolding, cellular signaling, and immune defense—emerges directly from the specific ordering of amino acids and the modifications applied to the polypeptide backbone.

Using this chapter to study? Last Minute Lecture is free and student-run. If it helped, consider supporting the project.
