Chapter 15: Genomics – Mapping & Sequencing the Genome
ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.
Genomics is the comprehensive analysis of the structure, function, and evolution of entire genomes. It builds on earlier genetic studies by pioneers such as Mendel and is subdivided into structural, functional, and comparative approaches. Structural genomics constructs physical maps, in which distances are measured in base pairs (bp), kilobases (kb), or megabases (Mb). Anchor markers correlate these physical maps with classical genetic maps (measured in centiMorgans, or cM) and cytological maps (based on stained banding patterns). Key mapping markers include highly variable polymorphisms such as Restriction Fragment-Length Polymorphisms (RFLPs), Variable Number Tandem Repeats (VNTRs), and Short Tandem Repeats (STRs), which are essential for constructing detailed contig maps and for positional cloning of genes. The massive amounts of resulting sequence data are archived in public resources such as GenBank, maintained by the NCBI, and analyzed with the computational tools of bioinformatics, including the sequence comparison software BLAST.

The Human Genome Project (HGP) sequenced the 3.2-billion-base-pair human genome using both hierarchical BAC clone mapping and whole-genome shotgun sequencing. It revealed that only a tiny fraction (1-2%) encodes proteins, while around 50% consists of repetitive elements derived primarily from retrotransposons such as LINEs and SINEs. The human genome is estimated to contain about 20,500 protein-coding genes, along with numerous functional noncoding RNAs (ncRNAs), such as lncRNAs and miRNAs, and nonfunctional gene remnants called pseudogenes. Genetic variation across human populations consists largely of frequent Single-Nucleotide Polymorphisms (SNPs). Neighboring SNPs are inherited together in segments called haplotypes, which initiatives such as the HapMap Project have mapped to help pinpoint disease susceptibility genes.
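The idea behind whole-genome shotgun sequencing and contig building mentioned above can be illustrated with a toy example: overlapping reads are merged wherever the end of one matches the start of another. The sketch below is a simplified greedy merge, not the method used by the HGP or by any real assembler. The function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the sample reads are all hypothetical, and the code ignores sequencing errors, reverse-complement reads, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap.

    Stops when no pair overlaps by at least `min_len` bases and
    returns the remaining sequences (the toy "contigs").
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads sampled from the 15-base sequence ATGGCGTACGTTAGC.
reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
print(greedy_assemble(reads))
```

With these three reads the merge recovers a single contig. In real genomes, repetitive elements such as LINEs and SINEs create ambiguous overlaps, which is one reason hierarchical clone maps were used alongside the shotgun approach.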
Functional genomics uses technologies such as microarrays (gene chips) to monitor the expression of thousands of genes simultaneously, and Green Fluorescent Protein (GFP) fusions to observe protein synthesis and localization in living cells. Comparative genomics analyzes diversity across species, from small prokaryotes (such as M. genitalium, which informs the minimal gene set concept) to complex eukaryotes, and highlights conserved blocks of genes known as shared synteny. Finally, paleogenomics sequences highly fragmented ancient DNA, providing detailed evidence about human evolution, including interbreeding between modern humans and archaic hominins such as Neanderthals and Denisovans.
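Shared synteny, as described above, can be pictured as runs of genes that appear next to each other, in the same order, in two genomes. The following is a minimal sketch under strong assumptions: each genome is given as a plain list of ortholog names, every name is unique within a genome, and only blocks in the same orientation are detected. The function `synteny_blocks` and the gene labels are hypothetical and do not come from any real annotation or tool.

```python
def synteny_blocks(order_a, order_b, min_genes=2):
    """Find runs of genes that are adjacent and in the same order in both lists.

    Assumes gene names are unique within each list. Inverted blocks
    are not detected in this simplified version.
    """
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    blocks, current = [], []
    for gene in order_a:
        extends = (
            gene in pos_b
            and current
            and pos_b[gene] == pos_b[current[-1]] + 1
        )
        if extends:
            current.append(gene)
        else:
            # Close the previous run if it is long enough, then start anew.
            if len(current) >= min_genes:
                blocks.append(current)
            current = [gene] if gene in pos_b else []
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Made-up gene orders for two species; "x" and "y" have no ortholog.
species_a = ["g1", "g2", "g3", "x", "g7", "g8"]
species_b = ["g7", "g8", "y", "g1", "g2", "g3"]
print(synteny_blocks(species_a, species_b))
```

Here the two blocks keep their internal gene order even though their positions relative to each other differ between the two lists, which is the pattern that comparative genomics looks for when chromosomes have been rearranged during evolution.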