Chapter 3: Amino Acids and the Primary Structures of Proteins

Welcome to Last Minute Lecture.

This free chapter overview is designed to help students review and understand key concepts.

These summaries supplement, not replace, the original textbook and may not be redistributed or resold.

For complete coverage, always consult the official text.

So I want you to imagine a massive meteorite just crashing into the Australian outback.

Oh, the Murchison meteorite.

Yeah, classic.

Right.

So scientists rush to the scene, they secure the rock and they analyze what's inside, and they find something astonishing, which is amino acids.

The literal fundamental building blocks of life, just forged spontaneously in the vacuum of space.

It's wild to think about.

It is.

Yeah.

But there's a massive catch.

The amino acids inside this space rock are a perfect 50-50 mix of two different shapes, like left-handed and right-handed molecules.

Right.

Yet if you look at your own body, or a blade of grass or a bacterium, every single protein is built strictly from left-handed amino acids.

So the question is, why did life on Earth choose the left hand and just completely ignore the right?

I mean, it is one of the most profound molecular mysteries we have, and I think it really goes to show that biology isn't just this chaotic soup of magic.

At the microscopic level, it's governed by very strict, elegant physical rules, and if you want to understand how life works, you really have to understand those rules.

Which is exactly our mission today.

Welcome to a special one-on-one study session deep dive.

We are exploring the molecular building blocks of life based on the foundational concepts of biochemistry.

Yes.

So whether you're prepping for a brutal exam, or you're just insanely curious about how the microscopic machinery inside of you actually operates, you are in the right place.

We're going to unpack the structures, the lab techniques, and the evolutionary history of amino acids so it all makes perfect logical sense.

Absolutely.

And I think the central biochemical theme we need to keep returning to today is just a simple phrase,

structure determines function.

Structure determines function.

Exactly.

If you want to understand how a protein acts as an enzyme catalyst for digestion,

or how it transports oxygen through your blood, or even how it forms those physical cables that pull your chromosomes apart during cell division.

Like the tubulin fibers, right?

Yes, exactly like tubulin.

To understand all that, you have to look at the structure of its individual building blocks.

Okay, so let's unpack this chassis.

Before we get into the wild variety of proteins, what exactly is an amino acid?

Like what is the basic template that they all share?

Okay, picture a single carbon atom dead in the center.

We call this the alpha carbon.

Okay.

It has four chemical arms reaching out bonded to four different things.

The first arm holds an amino group, which contains nitrogen.

The second arm holds a carboxyl group, which has oxygen.

The third arm holds just a simple single hydrogen atom, and the fourth arm holds the R group.

And that R group, or side chain, that's the wild card, right?

That's the only part that swaps out to create the 20 different standard amino acids we use.

Precisely.

But before we even look at the wild cards, we have to look at how that central chassis behaves in the human body.

Under normal physiological conditions, say right around a neutral pH of seven, this molecule does something really interesting.

The amino group acts like a base and snaps up a free proton from the water around it, which gives it a positive charge.

Oh, I see.

And then the carboxyl group acts like an acid and drops a proton, giving it a negative charge.

So you basically end up with a single tiny molecule that has a positive pole on one end and a negative pole on the other, like a microscopic magnet.

Chemists call that a zwitterion.

It comes from the German word for hybrid.

But the geometry of that central carbon is really where we get back to our meteorite.

Oh, right.

The left-handed, right-handed thing.

Exactly.

Because that alpha carbon is attached to four completely different groups, it is chiral.

Chiral meaning handedness.

Like, I could put my left and right hands together, palm to palm, and they match perfectly.

But if I try to lay my left hand directly on top of my right hand, facing the same way, my thumbs point in opposite directions.

Right.

They're mirror images, but you can't superimpose them.

OK, so amino acids exist as L-isomers, the left-handed version, and D-isomers, the right-handed version.

Yes.

And the Murchison meteorite from Australia proved that the universe creates both forms equally in deep space.

So why are we entirely left-handed?

I mean, if space rocks have a 50-50 mix, how did Earth get so picky?

Well, it strongly points to all life on Earth originating from a single common ancestor.

Wow.

As for why that ancestor used the L form instead of the D form, it was almost certainly just an evolutionary coin flip.

Just a random chance.

Pretty much.

Billions of years ago, a primitive pathway just happened to utilize an L-amino acid.

And it worked.

And once that machinery was built, there was no going back.

That random choice got locked in for the next four billion years.

But I was reading that those left-handed molecules don't actually stay perfectly locked in forever, right?

Yeah.

It's like over thousands of years, if an organism dies, those L-amino acids slowly, spontaneously start flipping back into D-amino acids.

Yes.

That process is called racemization.

It's a very slow, highly predictable chemical decay.

I like to think of it as a molecular hourglass.

Yeah.

Because scientists can actually use this to date fossils.

There's this fascinating study on a Stone Age jawbone.

Oh, the Bangool Jaw.

Yes.

By measuring the exact ratio of left-handed to right-handed amino acids trapped inside the fossilized tooth enamel, they could calculate how long that individual had been dead.

It's just wild to me that molecular chiral flips can act as a prehistoric clock.

It is a brilliant application of basic chemistry.

But let's shift from the universal chassis to those wild cards we mentioned earlier, the R groups.

Right, the side chains.

Nature takes that standard alpha carbon template and attaches 20 different side chains to give each amino acid a totally unique chemical personality.

Right, and biochemists usually group them by those personalities.

So let's start with the aliphatics.

We've got glycine, alanine, valine, leucine, and isoleucine.

But what does it actually mean to be aliphatic?

It basically means their side chains are made of simple carbon and hydrogen chains, which are nonpolar.

And because they're nonpolar, they are highly hydrophobic.

They actively avoid water.

I always get tripped up on this concept because, I mean, molecules don't have brains, right?

They don't actually fear water. So what's physically pushing them away?

It's all about thermodynamics and entropy.

Water molecules love to bond with each other.

But when you drop a nonpolar aliphatic amino acid into water, the water molecules can't bond to it.

OK.

So they are forced to form this highly ordered, rigid, ice-like cage around the alien molecule.

And the universe hates order.

It wants chaos.

It wants high entropy.

So to minimize the amount of water trapped in these ordered cages, all the nonpolar amino acids clump tightly together.

It minimizes their surface area.

So if you are a hydrophobic amino acid in a cellular environment, you're going to bury yourself deep inside the core of a folded protein, just as far away from the watery exterior as possible.

You nailed it.

Now, there's one oddball in this aliphatic group, and that's proline.

Oh, proline.

Yeah.

Proline is a structural rebel.

Instead of its side chain dangling off into space, it actually loops back around and covalently bonds to the nitrogen of its own amino group.

It creates a rigid ring.

So if you're trying to build like a flexible, straight protein chain and you throw a proline in there, what happens?

It physically forces a kink, like a sharp turn in the architectural structure.

It's essentially like throwing a bent pipe into your plumbing layout.

Got it.

OK.

Next personality group.

The aromatics.

Phenylalanine, tyrosine, and tryptophan.

They also have big carbon rings, but they're different from proline.

Very different.

They have bulky, resonating double bond rings, and those rings are incredibly useful to us in the laboratory.

How so?

Because of how their electrons are shared around the ring, tryptophan and tyrosine absorb ultraviolet light brilliantly, peaking right at 280 nanometers.

Why does that specific number matter?

Because if you are a scientist who just spent three weeks trying to extract a microscopic amount of protein from a cell culture, you need to know if you actually succeeded.

You can just shine a 280 nanometer UV beam through your clear test tube.

If the light passes straight through, your tube is empty.

But if the light is absorbed, you know immediately, I have protein in this tube.

It's a massive shortcut.

That is incredibly slick.
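
Study note: the 280 nm check can also be turned into a rough concentration estimate. The sketch below is not from the lecture. It assumes the Beer-Lambert law (A = epsilon x c x l) and commonly cited approximate molar absorptivities at 280 nm of about 5500 per tryptophan and 1490 per tyrosine (in M^-1 cm^-1), and it ignores the small contribution of disulfide bonds. The function names and the example sequence are hypothetical.

```python
# Approximate molar absorptivities at 280 nm (M^-1 cm^-1).
# Assumed textbook-style values, not taken from the lecture.
EPSILON_280 = {"W": 5500, "Y": 1490}


def extinction_coefficient(sequence: str) -> int:
    """Estimate a protein's molar absorptivity at 280 nm from its Trp/Tyr count."""
    return sum(EPSILON_280.get(residue, 0) for residue in sequence)


def concentration_molar(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law rearranged for concentration: c = A / (epsilon * l)."""
    if epsilon <= 0:
        raise ValueError("no Trp or Tyr residues: A280 cannot report on this protein")
    return a280 / (epsilon * path_cm)


toy_sequence = "MKWVTFYSLLY"  # hypothetical sequence with 1 Trp and 2 Tyr
epsilon = extinction_coefficient(toy_sequence)
print(f"epsilon(280 nm) ~ {epsilon} M^-1 cm^-1")
print(f"concentration ~ {concentration_molar(0.848, epsilon):.2e} M")
```

A protein with no tryptophan or tyrosine would absorb very little at 280 nm, which is why the helper refuses to divide by a zero coefficient.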

OK.

Moving to the sulfur-containing group.

We have methionine, which I know is famous because it's almost always the very first start amino acid when a new protein is being synthesized.

Yes.

But the other one is cysteine.

Why is sulfur so special here?

Because two cysteine molecules can undergo an oxidation reaction to form a covalent disulfide bridge.

A bridge between two completely different parts of a protein chain.

Exactly.

Think of it like a molecular spot weld.

Oh, I like that.

Yeah.

If a protein needs to survive in a harsh environment, say an antibody floating outside the cell in your highly turbulent bloodstream, those sulfur-to-sulfur covalent bonds lock the folded shape firmly in place.

Wow.

It's the exact same elemental sulfur that forms massive yellow geological deposits at volcanic hot springs, just working on a microscopic scale to keep your biology from falling apart.

From volcanic hot springs to my immune system, that's amazing.

Then we also have serine and threonine, the polar alcohols, right?

Right.

They have hydroxyl groups making them hydrophilic.

Got it.

Then we have the charged amino acids,

the bases, which are positive, like lysine, arginine, and histidine.

Yes.

And histidine is a total superstar in biochemistry.

Really?

Why?

Its side chain contains an imidazole ring, which has a superpower.

Its natural threshold for grabbing or dropping a proton is right around pH 6.0.

Which is very close to the neutral pH of our body.

Exactly. Because it sits right on that knife's edge, histidine can easily toggle its electrical charge on and off, depending on slight shifts in the environment.

This makes it incredibly useful in the active sites of enzymes, where moving protons around is literally how work gets done.

That makes a lot of sense.

Hey!

On the flip side of the bases, we have the negative acids, aspartate and glutamate.

Yes.

People might recognize glutamate because its salt form is monosodium glutamate, MSG.

The flavor enhancer.

Which, by the way, just shows how fundamental these molecules are.

Your taste receptors are literally hardwired to detect these specific amino acids to tell you that you're eating protein.

That's a really great connection.

And finally, we have their uncharged but highly polar amide cousins, asparagine and glutamine.

Because they are highly polar, they love water.

They will rush to the outside surface of a folded protein.

So we have all these different personalities.

But their behavior isn't static, right?

You mentioned earlier that the chassis acts like an acid or a base, depending on the environment.

How does that actually work if the pH of the blood or the cell changes?

To understand that, we have to look at how these molecules ionize.

Biochemists map this out using titration curves.

Okay, lay it on me.

Imagine a beaker containing a solution of the amino acid alanine.

We start with the liquid in the beaker being highly acidic, a pH of 1.

Okay, so a pH of 1 means the room is absolutely flooded with free-floating protons.

Exactly.

Now, I want you to imagine ionization as a molecular game of musical chairs.

I love a good analogy.

The alanine molecule has two available seats that can hold a proton, the carboxyl group and the amino group.

Because the environment is overflowing with protons, every available seat on alanine gets filled.

The molecule has a net positive charge.

Okay, so then we start the titration.

We slowly drip a base, like, say, sodium hydroxide, into the beaker.

The base neutralizes the acid, the pH starts to rise.

Right.

Raising the pH is the equivalent of slowly draining the protons out of the room.

But the protons on the alanine molecule don't just fall off randomly.

They don't.

No, they are pulled out of their seats one by one according to a strict chemical threshold called the pKa.

The pKa.

So that's the exact pH level where a specific chemical group has a 50% chance of holding the proton and a 50% chance of dropping it.

Perfectly stated.

So as the pH rises, the carboxyl group has a pretty weak grip.

It hits its pKa threshold around pH 2.4 and loses its proton, becoming negatively charged.
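
Study note: the "50% chance" threshold can be written down with the Henderson-Hasselbalch relationship. A minimal sketch, assuming the carboxyl threshold of 2.4 quoted above; the function name is made up for illustration.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a group still holding its proton at a given pH.

    From Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


CARBOXYL_PKA = 2.4  # alanine carboxyl group, value quoted in the discussion

for ph in (1.4, 2.4, 3.4):
    share = fraction_protonated(ph, CARBOXYL_PKA)
    print(f"pH {ph}: {share:.0%} of carboxyl groups still hold their proton")
```

One pH unit below the threshold, roughly nine seats in ten are still occupied; one unit above, roughly nine in ten are empty.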

But the amino group still has its proton, right?

So now we have a positive end and a negative end.

They cancel out.

Yes.

And this is a critical concept for any exam, the isoelectric point, or pI.

It is the exact pH where the entire molecule has a net electrical charge of zero.

For alanine, that happens at a pH of 6.15.
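
Study note: for an amino acid with no ionizable side chain, the isoelectric point is the midpoint of its two thresholds. The worked line below uses the carboxyl value of 2.4 from the discussion and assumes an amino-group value of about 9.9, which is inferred from the quoted isoelectric point rather than stated in the lecture.

```latex
\[
\mathrm{pI}
  = \frac{\mathrm{p}K_{a}(\text{carboxyl}) + \mathrm{p}K_{a}(\text{amino})}{2}
  = \frac{2.4 + 9.9}{2}
  = 6.15
\]
```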

But the game isn't over.

If we keep adding base, raising the pH up toward 10, the room is completely starved of protons.

So finally, the amino group loses its grip, drops its proton, and now the entire molecule has a net negative charge.

Exactly.

Now, if you do this exact same process with histidine, the graph looks like a staircase with three steps instead of two.

Oh, because histidine has that ionizable side chain we mentioned.

It has three seats in the game of musical chairs, so it has three different pKa thresholds to cross.
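
Study note: the two-step and three-step staircases can be sketched by adding up the average charge on each seat. This is a rough model, not the textbook's method. The carboxyl value of 2.4 for alanine and the side-chain value of 6.0 for histidine come from the discussion; the other thresholds (9.9, 1.8, 9.3) are assumed typical values.

```python
def net_charge(ph, basic_pkas, acidic_pkas):
    """Average net charge of a molecule at a given pH.

    basic_pkas:  groups that carry +1 while protonated (amino, imidazole).
    acidic_pkas: groups that carry -1 once deprotonated (carboxyl).
    """
    positive = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in basic_pkas)
    negative = sum(1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in acidic_pkas)
    return positive - negative


# Thresholds: 2.4 and 6.0 are quoted in the discussion, the rest are assumptions.
ALANINE = {"basic_pkas": [9.9], "acidic_pkas": [2.4]}
HISTIDINE = {"basic_pkas": [6.0, 9.3], "acidic_pkas": [1.8]}

print("pH  alanine  histidine")
for ph in range(1, 13):
    ala = net_charge(ph, **ALANINE)
    his = net_charge(ph, **HISTIDINE)
    print(f"{ph:>2}  {ala:+.2f}    {his:+.2f}")
```

In the printed table, alanine steps from about +1 through 0 to about -1, while histidine starts near +2 and takes one extra step on the way down.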

Okay, so we've mapped out the individual Lego bricks and how they behave in different chemical baths.

But a single amino acid doesn't do much on its own.

How does the body actually snap them together to build a functional machine?

It uses a condensation reaction to form what we call a peptide bond.

Condensation meaning water is involved.

It means water is removed.

The carboxyl group of the first amino acid lines up with the amino group of the second.

An enzyme facilitates a reaction where a molecule of water, H2O, is ejected from the two groups.

And what's left behind is a fiercely strong, extremely rigid, covalent bond holding the two amino acids together.

And there is a strict traffic law here that the textbook emphasizes.

You always, always read, draw, and synthesize these chains in one specific direction.

From the N-terminus, which is the free amino end, to the C-terminus, the free carboxyl end.

That directionality is non-negotiable.

It is the fundamental way genetic code is translated into physical reality.

And even tiny chains can have massive effects.

Oh, absolutely.

Think about aspartame, the artificial sweetener in diet soda.

It's just a simple dipeptide, just two amino acids linked together, aspartic acid and phenylalanine.

But when that tiny two-part chain hits the receptors on your tongue, it registers as 200 times sweeter than table sugar.

Which brings up a massive challenge for biochemists.

What's that?

Well, if nature is building thousands of different protein chains, from tiny sweeteners to massive enzymes, and they are all floating around in a cellular soup, how do you study just one of them?

Yeah.

If I want to study one specific protein, how do I isolate it from the thousands of others?

It's essentially a molecular obstacle course.

The purification process starts rough and gets increasingly precise.

You might start by salting out.

That's adding ammonium sulfate until your specific target protein crashes out of the solution.

Or you might use dialysis, placing the soup in a semipermeable membrane bag to let small impurities leak out while the massive proteins stay trapped inside.

But the real MVP of the lab is column chromatography.

You take a tall glass tube and fill it with a matrix of tiny synthetic beads.

You pour your protein soup in the top and let gravity and solvent wash it down to the bottom.

Right.

Because different proteins interact with the beads differently, they reach the bottom, or elute, at different speeds.

And you can customize the beads for different obstacle courses.

First, ion exchange chromatography separates proteins by charge.

How does that work?

If you pack the column with positively charged beads,

negatively charged proteins will stick to them like magnets, while positively charged proteins will just flow right through.

Got it.
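
Study note: whether a protein sticks to positively charged beads follows from the isoelectric point idea covered earlier, since a protein carries a net negative charge when the buffer pH is above its pI. A minimal sketch; the protein names, pI values, and buffer pH are hypothetical.

```python
def binds_positive_beads(protein_pi: float, buffer_ph: float) -> bool:
    """True if the protein is net negative at this pH and so sticks to positive beads."""
    return buffer_ph > protein_pi


# Hypothetical mixture: name -> isoelectric point
mixture = {"protein_A": 4.8, "protein_B": 7.0, "protein_C": 9.5}
buffer_ph = 7.5  # hypothetical running buffer

bound = [name for name, pi in mixture.items() if binds_positive_beads(pi, buffer_ph)]
flow_through = [name for name in mixture if name not in bound]

print("stuck to the beads:", bound)
print("flows right through:", flow_through)
```

Changing the buffer pH changes which proteins are net negative, which is one way the separation can be tuned.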

Then there's gel filtration chromatography, which separates by size.

But hold on, the physics of this always seemed entirely backward to me.

How so?

Well, the beads have microscopic pores in them, like a maze.

Shouldn't the tiniest proteins slip through the maze the fastest, while the massive, bulky proteins get stuck at the top?

It does seem counterintuitive, but think of it this way.

The massive proteins are entirely too big to enter the tiny pores in the beads.

Because they can't enter the maze at all, they just bypass the beads completely and wash straight down the empty space between them.

The small proteins, however, wander into the microscopic pores of every single bead they pass, taking a long winding detour.

Therefore, the largest proteins elute first, and the smallest ones elute last.

Big goes fast, small gets distracted.

I love that.

And finally, there's affinity chromatography, which is like a molecular sniper rifle.

It really is.

You coat the beads with a highly specific molecule, like a custom antibody.

When you pour the soup through, only your target protein binds to the antibody.

Everything else washes away.

So you've run the obstacle course, you have a test tube, and you think your target protein is in it.

How do you prove it?

How do you analyze the results?

You use an absolute staple of biochemistry, SDS-PAGE.

And I have an analogy for this, because the mechanism is just brilliant.

Go for it.

Let's hear it.

Imagine you want to test the pure running speed of a group of athletes.

But some athletes are aerodynamic.

Some are bulky.

Some are wearing heavy boots.

To make it a fair test of just one variable, you force every single runner to put on an identical, heavy, negatively charged weighted vest.

Oh, I like this.

In the lab, that vest is a strong detergent called SDS.

Right.

The SDS disrupts the noncovalent interactions holding the protein together, completely unfolding it from its 3D shape into a long linear string.

And it coats the entire string in a uniform negative charge.

Exactly.

So now every protein has the exact same charge-to-mass ratio.

You put them at the top of a dense polyacrylamide gel grid, and you turn on an electric current.

With the positive pole at the bottom.

Right.

Because they are all uniformly negatively charged, the electricity pulls every protein down with the exact same relative force.

So the only variable left to slow them down is physical friction.

Yes.

How easily can they squeeze through the microscopic gel grid?

The massive, long proteins get tangled up near the top.

The tiny proteins slip through the grid with ease and race down to the bottom.

And then you add a blue dye, and boom, you get distinct visual bands that tell you the approximate molecular weight of the proteins in your tube.

But if you want to get incredibly precise, like down to the mass of a single proton, you turn to MALDI-TOF mass spectrometry.

Time of flight.

Yes.

How does that work?

Well, MALDI stands for Matrix-Assisted Laser Desorption/Ionization.

That is a mouthful.

It is.

Basically, you embed your purified protein into a chemical matrix on a metal plate.

Then you fire a laser at it.

If you hit the protein directly, you'd just incinerate it.

Yikes.

Okay.

But the matrix absorbs the laser's energy, vaporizes, and gently lifts the intact protein into a gas cloud while donating a proton to give it a charge.

And once it's a charged gas.

You hit it with an electric field, accelerating it down a long vacuum tube.

That's the flight tube.

It flies toward a detector.

Because every ion with the same charge is given the same kinetic energy, the time it takes the molecule to fly down the tube and hit the detector depends only on its mass, scaling with the square root of the mass-to-charge ratio.

Light molecules fly fast, heavy ones fly slow.
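
Study note: the flight-tube idea can be sketched from basic physics. An ion of charge z accelerated through a voltage V gains kinetic energy zeV = (1/2)mv^2, so its flight time over a tube of length L is t = L x sqrt(m / (2zeV)). The 20 kV voltage and 1 m tube below are hypothetical instrument settings chosen only for illustration.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def flight_time(mass_da: float, charge: int = 1,
                voltage: float = 20_000.0, tube_length_m: float = 1.0) -> float:
    """Seconds for an ion to cross the field-free tube after acceleration.

    voltage and tube_length_m are illustrative assumptions, not real settings.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2.0 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return tube_length_m / velocity


for mass in (1_000.0, 4_000.0, 16_000.0):
    print(f"{mass:>8.0f} Da -> {flight_time(mass) * 1e6:.2f} microseconds")
```

Each fourfold increase in mass doubles the flight time, which is the square-root relationship in action.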

Okay, so now we have our isolated protein and we know its exact mass.

But how do we read its primary structure?

How do we determine the exact sequence of amino acid letters making up the chain?

Historically, scientists used Edman degradation.

They used a chemical called PITC that acts like molecular scissors.

It binds specifically to the N -terminal amino acid, the very first letter of the chain.

When you drop the pH, it snips off only that first amino acid, leaving the rest of the polypeptide chain completely intact.

So you identify that single letter you snipped off, and then you just run the chemical cycle again on the newly exposed N -terminus.

Yeah.

Snip, identify, repeat.

Exactly.

But there's a problem.

The chemistry isn't perfectly efficient.

After about 30 cycles of snipping, the chemical errors compound and the data becomes unreadable.

You max out at 30 residues.
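
Study note: a quick way to see why errors compound is to multiply the per-cycle efficiency by itself once per snip. The efficiencies below are hypothetical round numbers, not measured values.

```python
def in_phase_fraction(efficiency_per_cycle: float, cycles: int) -> float:
    """Fraction of chains still in step after a number of snip cycles."""
    return efficiency_per_cycle ** cycles


for efficiency in (0.99, 0.98, 0.95):  # hypothetical per-cycle efficiencies
    for cycles in (10, 30, 100):
        remaining = in_phase_fraction(efficiency, cycles)
        print(f"{efficiency:.0%} per cycle, {cycles:>3} cycles: "
              f"{remaining:.1%} of chains still in step")
```

Even a small loss per cycle leaves only a minority of chains reporting the correct residue after a few dozen rounds, so the signal drowns in out-of-step background.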

But most proteins are hundreds of amino acids long.

Exactly.

So you have a massive logic puzzle.

If you can only read 30 letters at a time, you have to chop the massive protein up into smaller pieces first.

And to do that, you use proteases.

Which are enzymes that act like highly specific scissors.

Precisely.

Trypsin is an enzyme that only cuts a chain right after lysine or arginine.

Chymotrypsin only cuts after the bulky aromatic rings we discussed.

And cyanogen bromide, which is a chemical reagent rather than an enzyme, only cuts after methionine.

It's like taking a highly classified document and running it through three different paper shredders that cut at different word lengths.

You sequence the trypsin shreds, and you get a bunch of disconnected 20-letter phrases.

Then you sequence the chymotrypsin shreds, and you look for where the letters overlap.

Right.

By sliding the overlapping segments together, you can deduce the entire original massive sentence.

It's a brilliant feat of deductive logic.
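
Study note: the shredder rules are simple enough to simulate. A minimal sketch using the cutting rules described above (trypsin after K or R, chymotrypsin after F, Y, or W, cyanogen bromide after M). The 14-letter peptide is hypothetical, and real-world exceptions to these rules are ignored.

```python
def cleave_after(sequence: str, residues: str) -> list[str]:
    """Cut a one-letter-code sequence immediately after each listed residue."""
    fragments, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in residues:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # trailing piece with no cut site
    return fragments


peptide = "MAFKGLWRSTYKVE"  # hypothetical peptide

tryptic = cleave_after(peptide, "KR")        # ['MAFK', 'GLWR', 'STYK', 'VE']
chymotryptic = cleave_after(peptide, "FYW")  # ['MAF', 'KGLW', 'RSTY', 'KVE']
cnbr = cleave_after(peptide, "M")            # ['M', 'AFKGLWRSTYKVE']

print(tryptic, chymotryptic, cnbr, sep="\n")

# The chymotryptic piece 'KGLW' ends with the start of one tryptic piece and
# begins with the end of another, so it fixes their order: MAFK then GLWR.
print("KGLW" in tryptic[0] + tryptic[1])
```

Each overlapping piece from the second digest pins down the order of two neighbors from the first, and repeating that across the chain rebuilds the full sequence.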

But frankly, today the landscape has entirely changed.

We rarely solve the whole puzzle manually anymore because we have the ultimate cheat code.

The mapped human genome.

Oh right.

Modern fingerprinting.

If you have an unknown protein in a lab today, you don't sequence every letter.

You just hit it with trypsin to chop it up into fragments, and you throw that soup of fragments straight into the mass spectrometer we just talked about.

And the mass spec gives you a graph with a handful of highly precise fragment weights.

That is your unique fingerprint.

Then you just feed those weights into a computer.

The computer looks at the entire database of the human genome, virtually chops up every known human protein with trypsin in its memory, and looks for a weight match.

The computer tells you instantly, based on these exact fragment weights, this can only be human serum albumin.

You identify the protein without ever sequencing it manually.

It is breathtakingly efficient.
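
Study note: the matching step can be sketched in a few lines. This is a toy version under stated assumptions, not how any particular search program works. The three database entries are hypothetical sequences, the residue masses are standard monoisotopic values, and the scoring simply counts observed weights that match a predicted tryptic fragment within a tolerance.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per free peptide


def tryptic_fragments(sequence: str) -> list[str]:
    """Virtual trypsin: cut after every K or R (real-world exceptions ignored)."""
    fragments, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in "KR":
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def peptide_mass(peptide: str) -> float:
    """Mass of a free peptide: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def fingerprint(sequence: str) -> list[float]:
    """Sorted list of tryptic fragment masses for one protein."""
    return sorted(peptide_mass(fragment) for fragment in tryptic_fragments(sequence))


def best_match(observed: list[float], database: dict[str, str],
               tolerance: float = 0.01) -> str:
    """Name of the database entry whose virtual digest explains the most peaks."""
    def score(name: str) -> int:
        predicted = fingerprint(database[name])
        return sum(any(abs(p - o) <= tolerance for p in predicted) for o in observed)
    return max(database, key=score)


# Hypothetical three-entry database standing in for a real proteome.
database = {
    "protein_A": "MAFKGLWRSTYKVE",
    "protein_B": "GASKPLLRMNQKDE",
    "protein_C": "TTHKWWCRAAGKLF",
}

observed_peaks = fingerprint(database["protein_B"])  # pretend instrument output
print(best_match(observed_peaks, database))
```

A real search weighs many more factors, but the core idea is the same: digest every candidate in memory and see which one predicts the measured weights.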

But you know, it raises a final crucial question.

Why do biochemists care so much about these exact primary sequences?

Why did we spend decades inventing these techniques?

Because reading these sequences is quite literally reading the evolutionary history of life on Earth.

Exactly.

To see the big picture, we look at cytochrome c.

It's a small protein crucial for cellular energy production, and it is found in almost all oxygen-breathing organisms, from bacteria to sunflowers to humans.

Scientists lined up the sequences of cytochrome c from dozens of different species.

When you look at that alignment, it's mind-blowing.

Out of over a hundred amino acids, some columns are completely identical across every single species.

Like every organism on Earth has a proline sitting exactly at position 30.

We call those invariant residues.

They are so critically essential for the protein's core function that evolution simply does not tolerate mutations there.

If a random DNA mutation changes that proline to something else, the protein breaks, the organism dies, and the mutation is eliminated from the gene pool.

But other spots on the chain are highly variable.

Over billions of years, neutral mutations have slowly drifted in, swapping out amino acids without breaking the machine.

And this is the ultimate aha moment for me.

You can literally draw a phylogenetic tree, the entire tree of life, purely by counting the number of amino acid differences between species.

It's true.

Human and chimpanzee cytochrome c sequences are exactly identical.

Not a single letter is different.

Humans and dogs have about 10 differences.

Humans and sunflowers have many more.
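
Study note: counting differences between aligned sequences is straightforward to sketch. The four short sequences below are hypothetical stand-ins, not real cytochrome c data, and the sketch assumes the sequences are already aligned to equal length.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of aligned positions where two sequences disagree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical aligned fragments, one per made-up species.
aligned = {
    "species_1": "GDVEKGKKIF",
    "species_2": "GDVEKGKKIF",
    "species_3": "GDVAKGKKIF",
    "species_4": "ASVEKGAKLF",
}

for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
    print(f"{name_a} vs {name_b}: {count_differences(seq_a, seq_b)} differences")
```

Pairs with fewer differences are grouped as closer relatives, and repeating that grouping step by step is the basic idea behind building a tree from sequence data.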

When you map out those molecular differences, the genetic tree you draw perfectly mathematically mirrors the evolutionary trees drawn by paleontologists studying physical fossils in the dirt.

The chemistry matches the fossils perfectly.

I mean, from the left-handed twist of a single carbon atom all the way up to proving the evolutionary origins of life on Earth.

That is the power of understanding structure.

And we are really only just beginning to play with this architecture.

Keep in mind, nature built all of this complexity using primarily just those 20 standard amino acids.

Which leaves you with this final, provocative thought to mull over.

If life could naturally evolve all the stunning complexity of the human body with just 20 molecular Lego bricks,

what would happen if synthetic biologists engineered a completely novel, entirely synthetic 21st amino acid, one with a chemical R group never before seen in nature?

What entirely new functions, unheard-of biological materials, or revolutionary medicines could proteins build if we handed them a brand new chemical tool?

The possibilities are quite literally endless.

Something to keep you awake before your exam.

You've got this.

A warm thank you from the Last Minute Lecture Team.

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Chapter Summary
What this audio overview covers
Amino acids are organic compounds that form the structural foundation of all proteins, with twenty standard variants serving as the universal building blocks across all living organisms. Each amino acid features a central chiral alpha carbon attached to an amino group, a carboxyl group, a hydrogen atom, and a distinctive side chain that determines its chemical properties and biological function, with glycine representing the sole exception as its side chain consists of only a hydrogen atom, making it achiral. Proteins universally incorporate only the L-stereoisomer of amino acids, a conserved trait dating back to the last universal common ancestor. At physiological pH, amino acids adopt a zwitterionic state in which the amino group becomes protonated and the carboxyl group becomes deprotonated, resulting in a neutral overall charge. Side chains vary dramatically in their chemical nature, enabling classification into functionally distinct groups: nonpolar aliphatic residues that avoid water, aromatic residues capable of absorbing ultraviolet radiation, sulfur-containing amino acids that form stabilizing disulfide bonds between cysteine residues, polar uncharged amino acids with hydroxyl or amide groups, positively charged basic amino acids essential for catalytic mechanisms, and negatively charged acidic amino acids. The ionization properties of individual amino acids are characterized by specific pKa values for each ionizable functional group, and the isoelectric point represents the pH at which the net charge of an amino acid reaches zero, a parameter critical for understanding how proteins distribute electrical charge and fold into three-dimensional configurations. Amino acids condense through peptide bond formation via dehydration reactions to assemble the primary structure of proteins, defined as the linear sequence of amino acid residues extending from the N-terminus to the C-terminus. 
Multiple experimental techniques enable researchers to determine and analyze protein primary structures. Purification exploits differences in solubility, molecular size, electrical charge, and binding specificity, while SDS-PAGE separates protein molecules based exclusively on molecular mass. Mass spectrometry identifies precise molecular weights and enables protein characterization through tryptic fingerprinting analysis. Protein sequencing methodologies include Edman degradation for sequential identification from the N-terminal end and selective cleavage by proteases such as trypsin and chymotrypsin or chemical agents like cyanogen bromide to generate fragments suitable for analysis. Comparative examination of primary sequences among related proteins from different species reveals evolutionary relationships, with sequence similarity metrics directly correlating to evolutionary distance and enabling construction of phylogenetic frameworks consistent with paleontological evidence.
