Chapter 4: Protein Primary Structure & Sequencing
Proteins serve as the functional workhorses of biological systems, acting as structural scaffolds in the cytoskeleton, molecular motors in muscle tissue, and catalysts in enzymatic reactions. This chapter follows the life cycle of these macromolecules, from synthesis and posttranslational maturation (for example, selective proteolysis or disulfide bond formation) to their eventual degradation into their constituent amino acids. Identifying protein biomarkers, meaning proteins or protein modifications associated with particular physiological states, remains a primary goal of molecular medicine because it can improve disease diagnosis.
Before a protein can be studied, it must be isolated. The main chromatographic methods are size-exclusion chromatography, which separates by effective volume (Stokes radius); ion-exchange chromatography, which separates by net surface charge; and affinity chromatography, which exploits selective interactions between a protein and a ligand. High-pressure liquid chromatography (HPLC, also called high-performance liquid chromatography) improves the resolution of these techniques by using small, high-density matrix particles.
The purity and physical properties of isolated proteins are commonly assessed by polyacrylamide gel electrophoresis (PAGE). In SDS-PAGE, the anionic detergent sodium dodecyl sulfate (SDS) denatures polypeptides so that they separate primarily by relative molecular mass. Isoelectric focusing (IEF) resolves proteins by their isoelectric point, the pH at which their net charge is zero, and is often combined with SDS-PAGE in two-dimensional electrophoresis for high-resolution separation. Historically, primary structure determination was pioneered by Frederick Sanger's sequencing of insulin and by Pehr Edman's development of the Edman reaction, which labels and removes amino acids one at a time from the N-terminus.
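
The isoelectric point used in IEF can be estimated from sequence alone. The sketch below is a rough illustration rather than a reference method: it applies the Henderson-Hasselbalch relationship to each ionizable group and bisects over pH until the net charge is approximately zero. The pKa values and the function names `net_charge` and `isoelectric_point` are assumptions chosen for this example; real pKa values shift with the local protein environment.

```python
# Approximate pKa values for ionizable groups (illustrative assumptions only;
# published tables differ, and values shift inside a folded protein).
POSITIVE_PKA = {"N_TERM": 8.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_TERM": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of an unmodified peptide at a given pH."""
    seq = sequence.upper()
    # One free amino terminus and one free carboxyl terminus per chain.
    positive = [POSITIVE_PKA["N_TERM"]] + [POSITIVE_PKA[r] for r in seq if r in POSITIVE_PKA]
    negative = [NEGATIVE_PKA["C_TERM"]] + [NEGATIVE_PKA[r] for r in seq if r in NEGATIVE_PKA]
    # Fraction protonated (charge +1) for basic groups.
    charge = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in positive)
    # Fraction deprotonated (charge -1) for acidic groups.
    charge -= sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in negative)
    return charge


def isoelectric_point(sequence: str, low: float = 0.0, high: float = 14.0,
                      iterations: int = 60) -> float:
    """Bisect for the pH where the estimated net charge crosses zero.

    Net charge falls monotonically as pH rises, so bisection applies.
    Assumes the crossing lies between `low` and `high`.
    """
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI is at a higher pH
        else:
            high = mid  # zero or negative: the pI is at a lower pH
    return (low + high) / 2.0


if __name__ == "__main__":
    for peptide in ("DDDD", "ACDEFGHIK", "KKKK"):
        print(peptide, round(isoelectric_point(peptide), 2))
```

Under these assumed pKa values, an aspartate-rich peptide focuses at an acidic pH and a lysine-rich peptide at a basic pH, which is the property IEF exploits to separate them.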
Today, mass spectrometry (MS) is the dominant method for protein sequencing because of its superior sensitivity and speed and its ability to detect covalent modifications, such as phosphorylation or glycosylation, that cannot be inferred from DNA-derived sequences. MS configurations include quadrupole systems for smaller molecules and time-of-flight (TOF) instruments for large proteins. Analyzing peptides and proteins often requires specialized vaporization techniques such as electrospray ionization or matrix-assisted laser desorption/ionization (MALDI). The field of proteomics aims to identify the entire protein complement of a cell under specific conditions. Genomics supplies a static blueprint, and bioinformatics algorithms compare sequences to known structural themes, such as the Rossmann fold, to infer physiological roles and mechanisms of action.
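
To illustrate why MS can reveal modifications that a DNA-derived sequence cannot, the following minimal sketch compares an observed peptide mass with the mass predicted from sequence and interprets the difference as a whole number of phosphate groups (about 79.966 Da each, the mass of HPO3). The function names, the 0.02 Da tolerance, and the example peptide are hypothetical choices for this example; real search software also handles isotopes, fragment ions, and many other modifications.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # added once per peptide for the free termini
PROTON = 1.00728      # mass of each charge-carrying proton
PHOSPHO = 79.96633    # HPO3 gained per phosphorylation


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[r] for r in sequence.upper()) + WATER


def mz(mass: float, charge: int) -> float:
    """Mass-to-charge ratio of a peptide carrying `charge` extra protons."""
    return (mass + charge * PROTON) / charge


def estimate_phosphate_count(sequence: str, observed_mass: float,
                             tolerance: float = 0.02):
    """Return the number of phosphates that explains the mass shift, or None.

    `tolerance` (in Da) is an arbitrary illustrative threshold.
    """
    delta = observed_mass - peptide_mass(sequence)
    count = round(delta / PHOSPHO)
    if count < 0 or abs(delta - count * PHOSPHO) > tolerance:
        return None  # shift is not a whole number of phosphates
    return count


if __name__ == "__main__":
    peptide = "GASK"  # hypothetical peptide, for illustration only
    observed = peptide_mass(peptide) + PHOSPHO  # simulate one phosphoserine
    print("phosphates:", estimate_phosphate_count(peptide, observed))
    print("m/z at 2+:", round(mz(observed, 2), 4))
```

The same comparison generalizes to other modifications by substituting the appropriate mass shift for `PHOSPHO`.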