The Evolution of CRISPR: From Science Fiction to Reality
As a child, I was captivated by the X-Men animated series, which depicted a world where "mutants" coexisted with ordinary people. These mutants possessed the "X-gene," a genetic anomaly that bestowed extraordinary abilities on them during adolescence. Despite their powers, they often faced discrimination and hostility from society. The X-Men, a group of these mutants, used their abilities to advocate for harmony and fairness between humans and mutants.
Like many children, I yearned for the day when science would turn some of those fantastical superpowers into reality. While abilities like Cyclops' optic blasts may remain in the realm of fiction, the last few decades have witnessed remarkable strides in genetic engineering, particularly with the advent of CRISPR. Commonly described as "word processing for DNA," CRISPR can modify the genetic makeup of living organisms, including humans, and its use could become commonplace in our lifetime.
CRISPR is not the first technique developed for genome manipulation, but it has generated significant excitement due to its precision, affordability, and ease of use compared to earlier methods. In this article, we will delve into the historical and scientific context of CRISPR, beginning with an overview of genetics and the human genome.
A Brief Overview of Genetics
In your school biology lessons, you likely learned that traits like eye color, height, and susceptibility to certain illnesses stem from the genetic information encoded in your genome. Your genome consists of a molecule known as deoxyribonucleic acid (DNA), found in each cell and largely uniform across them, except for the mutations that may occur. DNA guides your body's growth, maintenance, and the transfer of genetic information to offspring—this principle applies to all living organisms.
DNA is composed of adenine, cytosine, thymine, and guanine—often abbreviated as A, C, T, and G—which are chemical units called bases. These bases are arranged in long strands that twist into a double helix, resembling a spiral ladder. A sugar-phosphate backbone supports this structure, with the bases forming the rungs.
In this configuration, A pairs with T and G pairs with C; these pairings are known as base pairs. Before a cell divides, an enzyme unzips the DNA helix into two single strands, allowing other enzymes to synthesize a new “partner strand” for each separated strand by following the pairing rules. This process results in two identical copies of DNA for the daughter cells, ensuring the genetic information is passed on accurately.
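The pairing rules can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are made up, and the sketch pairs bases position by position while ignoring details such as strand direction.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand: str) -> str:
    """Build the partner strand for one DNA strand, base by base."""
    return "".join(PAIRS[base] for base in strand)


def replicate(strand_a: str, strand_b: str):
    """Split a double helix and rebuild a partner for each single strand."""
    copy_1 = (strand_a, partner_strand(strand_a))
    copy_2 = (partner_strand(strand_b), strand_b)
    return copy_1, copy_2


original = "ATGCCGTA"                # hypothetical example strand
partner = partner_strand(original)   # "TACGGCAT"
copy_1, copy_2 = replicate(original, partner)

print(partner)
print(copy_1 == copy_2 == (original, partner))  # both daughter copies match
```

Because each strand fully determines its partner, splitting the helix and rebuilding both halves yields two identical copies.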
How does DNA define your identity? Think of DNA as a "secret language" that contains instructions for synthesizing proteins. Proteins are intricate molecules that perform crucial functions in the body, from aiding digestion to responding to light. They are integral to cellular operations and are essential for the structure and regulation of tissues and organs.
To construct proteins based on DNA's directives, an intermediary molecule called ribonucleic acid (RNA) is required. RNA is synthesized from the DNA template through a process known as transcription, which closely resembles DNA synthesis except that uracil replaces thymine. RNA therefore uses the bases adenine, cytosine, uracil, and guanine, abbreviated as A, C, U, and G. RNA serves as a messenger that conveys information from the nucleus, where DNA resides, to the cellular sites where proteins are synthesized.
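Transcription can be sketched the same way as DNA copying, with U standing in for T. The function name and the template sequence below are hypothetical, and the sketch ignores everything about how a real cell starts and stops transcription.

```python
# Pairing rules when RNA is built on a DNA template:
# the same as DNA-to-DNA pairing, except that A on the template pairs with U.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Build the RNA strand that pairs with a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)


template = "TACGGCAT"        # hypothetical DNA template strand
print(transcribe(template))  # AUGCCGUA
```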
During transcription, segments of DNA (termed genes) are transcribed into long strands of RNA that are employed to produce specific proteins. Once the RNA exits the nucleus, it is read by a ribosome, which functions like a protein-manufacturing facility. In this second step, known as translation, the ribosome interprets the sequence of A, C, U, and G as instructions for assembling amino acids.
A three-letter segment of RNA, known as a codon, corresponds to one of the 20 amino acids used to build proteins. Because there are 64 possible codons but only 20 amino acids, several codons can represent the same amino acid; three codons act as stop signals to terminate protein synthesis once the protein is fully formed. Genes are distinguished solely by their sequence of base pairs, while their protein products vary based on the arrangement of amino acids.
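To make the codon arithmetic concrete, here is a small sketch of the standard genetic code as a lookup table, using one-letter amino acid abbreviations and `*` for a stop signal. The compact table layout and the `translate` helper are my own illustration, and the example RNA is invented.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each codon position.
# One-letter amino acid codes; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Read an RNA strand three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
print(len(CODON_TABLE))                       # 64 possible codons
print(len(set(CODON_TABLE.values()) - {"*"})) # 20 amino acids
print(stop_codons)                            # the three stop signals
print(translate("AUGGAGGUGUAA"))              # MEV
```

Here `M`, `E`, and `V` stand for methionine, glutamate, and valine; the final codon `UAA` is a stop signal, so it adds nothing to the protein.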
The entire process by which DNA generates RNA, leading to protein formation, is referred to as the central dogma of molecular biology. Thus, the genetic information encoded in DNA culminates in the proteins that constitute all living entities. This is why DNA is often referred to as the "secret language" through which life communicates and expresses itself.
The Human Genome
The human genome consists of approximately 3.2 billion base pairs and contains around 20,000 protein-coding genes. It is organized into 23 distinct units known as chromosomes, each ranging from roughly 50 to 250 million base pairs in length. Typically, human cells contain 23 pairs of chromosomes, with each parent contributing 23 chromosomes, resulting in a total of 46 in each cell. Nearly every cell in our body houses a complete set of chromosomes.
Genomes are susceptible to mutations, which are alterations in the base pair sequence of DNA. The simplest form of mutation is a substitution, where one letter in the DNA sequence is mistakenly replaced with another. While these mutations may seem minor, they can lead to severe consequences.
For instance, sickle cell anemia arises from a substitution in the beta-globin gene, where an A is replaced with a T at the 17th position. This mutation changes glutamate to valine in a crucial segment of the hemoglobin protein responsible for oxygen transport in red blood cells. The altered hemoglobin tends to clump together, distorting the shape of red blood cells, resulting in anemia and an elevated risk of stroke and infection.
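A single-letter substitution like this one is easy to sketch. The 18-letter sequence below is my assumption of how the beta-globin coding sequence begins, counted from just after the start codon, which is the counting convention under which the changed letter lands at position 17. Treat it as illustrative rather than as a reference sequence.

```python
# Assumed, illustrative start of the beta-globin coding sequence,
# counted from the first base after the start codon.
NORMAL = "GTGCATCTGACTCCTGAG"

# Only the two codons involved in this example are listed.
CODON_NAMES = {"GAG": "glutamate", "GTG": "valine"}


def substitute(sequence: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-based position with a new base."""
    index = position - 1
    return sequence[:index] + new_base + sequence[index + 1:]


sickle = substitute(NORMAL, 17, "T")

normal_codon = NORMAL[15:18]  # the codon containing position 17
sickle_codon = sickle[15:18]

print(NORMAL[16])  # A, the letter at position 17
print(normal_codon, CODON_NAMES[normal_codon])  # GAG glutamate
print(sickle_codon, CODON_NAMES[sickle_codon])  # GTG valine
```

One changed letter out of billions is enough to swap glutamate for valine in the finished protein.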
The sequence of base pairs in our DNA plays a pivotal role in determining our biology and health. By deciphering this sequence, we can gain insights into the origins of genetic disorders.
This understanding prompted a global initiative known as the Human Genome Project. In the 1970s, pioneering DNA sequencing technologies emerged, enabling researchers to identify the base pairs constituting a genome. Since then, advancements in sequencing technology have been remarkable. In 1990, scientists worldwide launched the Human Genome Project to map the entire human genome with what was then cutting-edge technology. The first draft was published in 2001, marking a significant milestone in our ability to understand the instructions that define our existence.
The Human Genome Project concluded in 2003, achieving approximately 92% sequencing coverage of the human genome. The remaining 8% proved challenging due to regions with highly repetitive DNA, where earlier technology fell short. However, sequencing capabilities have since improved, and a truly complete human genome sequence was released on April 1, 2022.
The initial cost of sequencing the human genome was roughly $3 billion. Since then, the price has plummeted, making sequencing accessible to far more consumers. Full genome sequencing can now be done for less than $1,000, with experts predicting that a price point around $100 may soon be attainable.
Thanks to the foundational research from the Human Genome Project and ongoing advancements in genome sequencing technology, scientists have significantly enhanced our comprehension of human genetics. Currently, over 4,000 types of DNA mutations leading to genetic disorders have been identified. This knowledge enables the assessment of an individual's risk for specific diseases, aiding in personalized treatment approaches. Moreover, it reveals meaningful connections between numerous gene variants and various physical and behavioral attributes. Despite the benefits of genome sequencing, it remains a diagnostic tool. What if we could precisely and easily modify DNA?
Introducing CRISPR
This is where CRISPR comes into play. The term CRISPR stands for "Clustered Regularly Interspaced Short Palindromic Repeats." I will clarify this further shortly. Historically, scientists have utilized various methods to manipulate DNA, but CRISPR has emerged as the latest and most revolutionary gene-editing tool.
While many recognize CRISPR as a gene-editing mechanism today, its early research focused on understanding how bacteria protect themselves from viruses. Researchers discovered that bacteria possess short DNA repeats regularly interrupted by unique sequences within their genomes, giving rise to the CRISPR acronym. These sequences cluster in a specific genomic region rather than being randomly dispersed, a phenomenon previously unknown to science.
So, how does this relate to bacteria and their defense mechanisms? It turns out that viruses that specifically infect bacteria, known as bacteriophages, are extraordinarily prevalent. Scientists estimate that there are approximately 10³¹ bacteriophages on Earth, translating to around ten times as many bacteriophages as there are bacterial cells. Furthermore, an estimated 40% of bacterial mortality is attributable to bacteriophages.
Bacteriophages invade bacterial cells, insert their genetic material, and commandeer the host's machinery. In severe infections, the viral DNA takes over the host's resources, causing it to produce viral proteins instead of its own. The infected bacterial cell repeatedly replicates the viral genome until it ultimately bursts, releasing newly formed bacteriophages to infect neighboring bacteria.
While these infections can be catastrophic, bacteria have developed CRISPR as a form of adaptive immunity. During an infection, the bacterial genome captures snippets of the invading bacteriophage's DNA, incorporating this information into its own genome. This allows the bacterium to recognize and eliminate the same bacteriophages in future encounters.
In subsequent infections, the bacterial cells transcribe the entire CRISPR array into long RNA strands. CRISPR-associated genes, or Cas genes, within the bacterial genome encode specific enzymes and proteins. These Cas enzymes cut the CRISPR-derived RNA into short strands of uniform length, each containing a sequence from a particular bacteriophage.
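Here is a rough sketch of that processing step: a CRISPR array made of one repeat alternating with unique captured sequences is transcribed into a single long RNA and then cut into short pieces. The repeat, the three captured sequences, and the function names are all invented for illustration, and the sketch simply discards the repeats rather than modeling how real Cas enzymes trim them.

```python
REPEAT = "GTTTTAGAGCTA"  # hypothetical repeat sequence
CAPTURED = [             # hypothetical snippets from three bacteriophages
    "ACGTTGCAAGGCTTAGCCAT",
    "TTGACCGATAGGCTAACGTC",
    "GGCATTCAGTCCAAGTACGA",
]


def build_array(repeat: str, snippets: list[str]) -> str:
    """Lay out a CRISPR array: repeat, snippet, repeat, snippet, ..., repeat."""
    return repeat + "".join(snippet + repeat for snippet in snippets)


def to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA letters (U instead of T)."""
    return dna.replace("T", "U")


def process_transcript(transcript: str, repeat_rna: str) -> list[str]:
    """Cut the long transcript at every repeat, keeping the unique pieces."""
    return [piece for piece in transcript.split(repeat_rna) if piece]


array = build_array(REPEAT, CAPTURED)
long_rna = to_rna(array)
short_rnas = process_transcript(long_rna, to_rna(REPEAT))

for rna in short_rnas:
    print(rna, len(rna))  # each piece is 20 letters and matches one phage
```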
The CRISPR-derived RNA strands associate with a group of Cas proteins, which use the RNA as a guide to locate the precise sequence of bacteriophage DNA during an infection. Once identified, additional Cas proteins come into play to cleave the target DNA, effectively neutralizing it.
This DNA cutting process demands a high level of accuracy to avoid inadvertently damaging the bacterium's own DNA. Different Cas proteins exhibit varying levels of precision, with the Cas9 protein being particularly discerning. When CRISPR is mentioned in the media, it often refers specifically to the CRISPR-Cas9 system.
To summarize how Cas9 operates: it initially binds to a DNA double helix, separating the strands. It then aligns a new partial helix formed between the CRISPR-derived RNA and the target DNA strand. Cas9 cuts through both strands of the target DNA, resulting in a double-strand break, guided by a 20-base sequence match indicated by the CRISPR-derived RNA.
We initially discovered that bacteria used Cas9 to excise particular viral DNA sequences, but it turns out this mechanism can also target and cut other DNA sequences—be they viral or otherwise. This breakthrough has profound implications, as it provides a programmable system capable of targeting and cutting DNA at any specified 20-letter sequence. Scientists have already begun applying this bacterial defense system to edit DNA in other organisms, including humans.
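The "programmable" part can be sketched as a search-and-cut routine: give it any 20-letter guide, and it cuts wherever that sequence appears. Everything here is a simplification under stated assumptions. The DNA, the guide, and the function names are hypothetical; the cut position inside the match (`CUT_OFFSET`) is an assumption chosen for illustration; and the sketch works on one strand only, ignoring the other requirements of real Cas9 targeting.

```python
GUIDE_LENGTH = 20
CUT_OFFSET = 17  # assumed position of the cut inside the 20-letter match


def find_targets(dna: str, guide_rna: str) -> list[int]:
    """Return every position where the guide's 20-letter sequence occurs."""
    if len(guide_rna) != GUIDE_LENGTH:
        raise ValueError("guide must be exactly 20 letters long")
    target = guide_rna.replace("U", "T")  # compare in DNA letters
    hits = []
    start = dna.find(target)
    while start != -1:
        hits.append(start)
        start = dna.find(target, start + 1)
    return hits


def cut(dna: str, guide_rna: str) -> list[str]:
    """Cut the DNA once at the first match; return it whole if none exists."""
    hits = find_targets(dna, guide_rna)
    if not hits:
        return [dna]
    site = hits[0] + CUT_OFFSET
    return [dna[:site], dna[site:]]


dna = "TTACGGA" + "GACCTTAGGCATCGATTGCA" + "CCGTAA"  # hypothetical DNA
guide = "GACCUUAGGCAUCGAUUGCA"                        # hypothetical guide RNA

print(find_targets(dna, guide))  # [7]
print(cut(dna, guide))
print(cut(dna, "A" * 20))        # no match, so the DNA is left intact
```

Changing the guide is the only thing needed to aim the cut somewhere else, which is what makes the system so flexible.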
Researchers have refined the Cas9 system as a gene-editing tool by simplifying the RNA components it depends on. TracrRNA, or trans-activating CRISPR RNA, works alongside the CRISPR RNA and helps maintain the catalytic activity of the Cas9 protein. Scientists later discovered a method to link tracrRNA with the CRISPR RNA that directs Cas9, creating a single guide RNA that significantly enhances the efficacy of the CRISPR-Cas9 system as a gene-editing tool.
Here’s a simplified overview of how the CRISPR-Cas9 system is employed for gene editing: First, the guide RNA sequence is designed to match that of a specific target gene. Next, Cas9 and the newly created guide RNA are introduced into the target cell. The CRISPR-Cas9 system then makes a precise cut in the cell's DNA wherever it finds a sequence that matches the guide RNA.
Cells possess their own repair mechanisms that respond to DNA cuts, akin to how a welder joins severed metal pipes. All eukaryotic organisms, including humans, have evolved such repair processes because DNA damage occurs constantly. For gene editing, a separate sequence of template DNA is included in the CRISPR mixture, ready for use during a process called homology-directed repair. In homology-directed repair, the template DNA serves as a guide for the repair, allowing a new desired sequence to be inserted where the cut has occurred.
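As a toy model of homology-directed repair, imagine the template as a new sequence sandwiched between two "arms" that match the DNA on either side of the cut. The function name, the arm length, and all sequences below are hypothetical, and real repair is far messier than a string operation.

```python
def homology_directed_repair(left_piece: str, right_piece: str,
                             template: str, arm: int = 8) -> str:
    """Rejoin two cut pieces, inserting the template's middle section.

    The first and last `arm` letters of the template must match the DNA
    on either side of the cut; the letters in between are the new sequence.
    """
    left_arm, right_arm = template[:arm], template[-arm:]
    new_sequence = template[arm:-arm]
    i = left_piece.rfind(left_arm)
    j = right_piece.find(right_arm)
    if i == -1 or j == -1:
        raise ValueError("template arms do not match the DNA around the cut")
    return left_piece[:i + arm] + new_sequence + right_piece[j:]


# Hypothetical pieces left behind by a cut.
left_piece = "TTACGGAGACCTTAGG"
right_piece = "CATCGATTGCACCGTAA"

# Template: matching arm + new sequence + matching arm.
template = "ACCTTAGG" + "AAATTT" + "CATCGATT"

repaired = homology_directed_repair(left_piece, right_piece, template)
print(repaired)
```

The arms tell the cell where the template belongs, and the middle section ends up written into the repaired DNA at the cut site.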
This technology enables the introduction of minor insertions or deletions in the DNA, as well as double cuts for gene sequence swaps, inversions, or deletions. Consequently, during protein synthesis, new proteins are produced based on the revised DNA instructions. Thus, alterations to DNA also influence how genes are expressed. This capability has numerous practical applications, ranging from curing genetic disorders to creating crops resistant to drought.
Humanity now possesses a straightforward, precise, and affordable means to edit the very code of life. This capability opens doors to diverse applications, from addressing genetic diseases to potentially controversial initiatives like genetically modifying humans and creating "designer babies." CRISPR could reshape not only our environment but also our very selves. What are your thoughts on this? Feel free to share in the comments!
References:
1. “X-Men: The Animated Series.” IMDb, IMDb.com, 31 Oct. 1992, https://www.imdb.com/title/tt0103584/.
2. “New CRISPR Editing System a ‘Genetic Word Processor.’” Analysis & Separations from Technology Networks, https://www.technologynetworks.com/analysis/news/new-crispr-editing-system-a-genetic-word-processor-326275.
3. Doudna, Jennifer A., and Samuel H. Sternberg. A Crack in Creation: Gene Editing and the Unthinkable Power To Control Evolution. Mariner, 2018.
4. Genetics Generation, 30 Jan. 2021, https://knowgenetics.org/.
5. Helmenstine, Anne Marie. “Do You Know the Differences between DNA and RNA?” ThoughtCo, 2 Feb. 2020, https://www.thoughtco.com/dna-versus-rna-608191.
6. “The Human Genome Project.” Genome.gov, https://www.genome.gov/human-genome-project.
7. Green, Eric. “The Human Genome Sequence Is Now Complete.” Genome.gov, 7 Apr. 2022, https://www.genome.gov/about-nhgri/Director/genomics-landscape/april-7-2022-the-human-genome-sequence-is-now-complete.
8. Nurk, Sergey, et al. “The Complete Sequence of a Human Genome.” Science, vol. 376, no. 6588, 2022, pp. 44–53, https://doi.org/10.1126/science.abj6987.