DNA sequencing is used in nearly every area of cancer research. It has led to many breakthroughs in understanding the origins and evolution of cancer, and it can be used to determine which genes in a tumour contain mutations, revealing the underlying molecular mechanisms driving the tumour's growth. Clinically, it can be used to identify patients who have an increased risk of developing a particular type of cancer. For example, women with a pathogenic mutation in BRCA1, a gene encoding a DNA repair protein, have a lifetime risk of developing breast cancer of up to 85%. Furthermore, pharmacogenomics uses DNA sequencing to investigate the relationship between different gene variants and a patient's tolerance or sensitivity to certain therapies, allowing for personalized, genome-specific therapy regimens. DNA sequencing is also key to finding novel genes involved in cancer, which can serve as the basis for new cancer therapies.
One of the first DNA sequencing techniques, Sanger sequencing, was developed in 1977 and is still commonly used today. It relies on irreversible chain termination: the elongation step in DNA synthesis is halted when a dideoxynucleotide, a nucleotide lacking a 3’-OH group, is randomly incorporated into the growing strand. Because the 3’-OH group is essential for the addition of subsequent nucleotides, dideoxynucleotide incorporation prevents DNA polymerase from further elongating the DNA strand.
There are many variations of Sanger sequencing, differing mostly in their radioactive or fluorescent tags. The most common procedure involves single-stranded template DNA, a DNA primer, DNA polymerase, normal deoxynucleotides (dNTPs), and labelled dideoxynucleotides (ddNTPs), with each base (A, G, T, C) carrying a different "tag". In the reaction, DNA polymerase initiates replication of the template DNA from the DNA primer and elongates the strand using dNTPs. When a ddNTP is randomly added instead of a dNTP, elongation can no longer continue. Because such incorporation events are random and ddNTPs of all four bases are present, in theory there are newly synthesized fragments terminating at every position of the template DNA. These fragments are then denatured and separated by size using a gel or capillary. Once the fragments are arranged by size, the detector can ‘read’ the labelled ddNTP at the final position of each strand, from the shortest fragment to the longest, thus reading the complete sequence of the template DNA.
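The read-out logic described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function names, the termination probability, and the number of molecules are hypothetical, strand orientation is ignored, and size separation is modelled as a simple sort by fragment length.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template, n_molecules=5000, dd_fraction=0.2, seed=0):
    """Simulate chain-terminated fragments for one template strand.

    At every position a ddNTP is incorporated with probability
    dd_fraction (an assumed value); the fragment ends at that position.
    Molecules that never incorporate a ddNTP are not recorded.
    """
    rng = random.Random(seed)
    new_strand = "".join(COMPLEMENT[base] for base in template)
    fragments = []
    for _ in range(n_molecules):
        for pos in range(len(new_strand)):
            if rng.random() < dd_fraction:
                fragments.append(new_strand[: pos + 1])
                break
    return fragments


def read_sequence(fragments):
    """Order fragments by size and read the terminal (labelled) base of each."""
    terminal_base_by_length = {}
    for fragment in fragments:
        terminal_base_by_length[len(fragment)] = fragment[-1]
    return "".join(
        terminal_base_by_length[n] for n in sorted(terminal_base_by_length)
    )


if __name__ == "__main__":
    fragments = sanger_fragments("TACGGATCCA")
    print(read_sequence(fragments))
```

With enough molecules, every fragment length is represented, so reading the terminal base from the shortest fragment to the longest recovers the newly synthesized strand.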
Although Sanger sequencing is highly accurate, it is also expensive and time-consuming because each read is limited to roughly 1000 bp, which makes large DNA regions laborious to sequence. The Human Genome Project, completed in 2003 using Sanger sequencing, took 10 years to produce a working draft, cost $3 billion, and required the collaboration of 23 laboratories around the world. While Sanger sequencing remains the technique of choice for highly accurate data, the introduction of next generation sequencers has allowed DNA sequencing to become diagnostically relevant.
Next Generation Sequencing
Next generation sequencers comprise a variety of DNA sequencing technologies that differ in sample preparation and in the method of sequencing, but all are based on the concept of whole genome shotgun sequencing, in which genomic DNA is randomly fragmented, sequenced, and reassembled using a reference genome and the overlapping regions of the fragments.
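The reassembly step can be sketched with a toy greedy overlap assembler. This is an illustration only, not how production assemblers work: it uses overlaps alone (no reference genome), assumes error-free reads from a single strand, and the function names and the minimum overlap length are hypothetical choices.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        # Discard contigs that are fully contained in another contig.
        contigs = [
            c for c in contigs
            if not any(c != other and c in other for other in contigs)
        ]
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ]
        contigs.append(merged)


if __name__ == "__main__":
    reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
    print(greedy_assemble(reads))
```

Greedy merging is easy to follow but can misassemble repetitive regions, which is one reason real pipelines also rely on a reference genome or on more elaborate graph-based methods.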
Illumina Genome Analyzer
Illumina is one of the major biotechnology companies providing sequencing and array-based solutions for genetic research. The company’s popular next-generation sequencer uses a technique similar to Sanger sequencing, except that chain termination is reversible: instead of ddNTPs, reversible terminator bases are used. The Illumina Genome Analyzer uses a microfabricated device, the flow cell, for bridge amplification of fragments bound to its surface. The first phase of the workflow is preparation of the genomic DNA sample to generate the DNA library, which serves as the template for sequencing. The DNA of interest is first fragmented to produce fragment sizes that are optimal for sequencing (13). Next, adapters are ligated onto both ends of the DNA fragments, allowing them to attach to the surface of the flow cell. The single-stranded DNA fragments are bound to the flow cell channel in preparation for sequencing-by-synthesis. At this point, the amount of fragmented DNA added to the flow cell is crucial: loading too many fragments (also known as over-clustering) results in poor sequencing and base calling. To amplify the DNA fragments within the flow cell, bridge amplification is performed, in which unlabelled nucleotides and enzymes are added to produce clusters of identical strands (13). With amplified DNA strands on the flow cell, sequencing begins: DNA polymerase incorporates fluorescently labelled reversible terminator nucleotides, one per cycle. As each nucleotide pairs with its complement on the DNA template, an optical imaging instrument records the emitted fluorescence; the terminator is then removed so that the next cycle can proceed, producing a base-by-base sequence of the template strand (13).
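The cycle-by-cycle logic of sequencing-by-synthesis with reversible terminators can be sketched as follows. This is a minimal illustration, not Illumina's software: the dye colours, cluster identifiers, and function name are hypothetical, and each cluster is assumed to extend by exactly one base per cycle with no errors.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical dye assignment: one distinguishable colour per base.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {colour: base for base, colour in DYE.items()}


def sequence_by_synthesis(cluster_templates, n_cycles):
    """Return the read for each cluster after n_cycles of one-base extension.

    cluster_templates maps a cluster id to the template sequence shared
    by all strands in that cluster.
    """
    reads = {cluster_id: [] for cluster_id in cluster_templates}
    for cycle in range(n_cycles):
        # Step 1: one terminator base is incorporated per cluster, and
        # the "image" records one colour per cluster.
        image = {}
        for cluster_id, template in cluster_templates.items():
            if cycle < len(template):
                incorporated = COMPLEMENT[template[cycle]]
                image[cluster_id] = DYE[incorporated]
        # Step 2: the colour at each cluster position is converted to a base.
        for cluster_id, colour in image.items():
            reads[cluster_id].append(BASE_FOR_DYE[colour])
        # Step 3: dye and terminator are cleaved (implicit here), so the
        # next cycle can extend each strand by one more base.
    return {cluster_id: "".join(bases) for cluster_id, bases in reads.items()}


if __name__ == "__main__":
    clusters = {"cluster_1": "TACG", "cluster_2": "GGAT"}
    print(sequence_by_synthesis(clusters, n_cycles=4))
```

Because termination is reversible, every cluster advances in lockstep, so one image per cycle yields one base for every cluster on the flow cell.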
Applied Biosystems SOLiD Sequencer
The ABI SOLiD (Sequencing by Oligonucleotide Ligation and Detection) platform uses an adapter-ligated fragment library similar to that of the other NGS platforms discussed in this section. The platform uses an emulsion-based PCR approach coupled with small magnetic beads to amplify the DNA fragments for sequencing (13). Before the flow cell stage, DNA fragments are amplified on the surfaces of 1-μm magnetic beads in order to generate enough copies for a sufficient signal during sequencing. Once amplification is complete, the beads are deposited onto the flow cell slide, where ligase-mediated sequencing begins. Unlike “sequencing-by-synthesis”, which is used by the NGS platforms of Illumina, Roche/454, and Life Technologies, sequencing by DNA ligase first involves annealing a primer to the adapter sequence on each amplified DNA fragment. From there, fluorescently labelled 8-mer probes, which are eight bases long with a free hydroxyl group at the 3’ end and a fluorescent dye at the 5’ end, hybridize to complementary nucleotides along the amplified fragment and are ligated in place. The identity of the fluorescent dye, which is then cleaved from the probe, indicates which complementary nucleotide it corresponds to (A, G, C, T). By repeating this process, the sequence of the amplified fragment can be decoded from the sequential annealing of the various 8-mers (13).
Life Technologies Ion Torrent Platform
The Ion Proton™ sequencer from Life Technologies is another NGS platform, one that uses a semiconductor chip to perform its sequencing (12). Unlike the HiSeq sequencers by Illumina, which use light as an intermediary in determining genomic sequences, the Ion Proton™ detects nucleotide incorporation directly on its semiconductor chip, following a “sequencing by synthesis” principle similar to pyrosequencing by 454 Life Sciences (Roche) (12). The “sequencing by synthesis” principle involves generating a complementary strand based on the sequence of a template strand. Each microwell of the chip contains a different DNA template. Sequencing begins by flooding the chip with a single type of dNTP, which is incorporated into the growing complementary strand if it is complementary to the next template nucleotide (12). Each incorporation releases a hydrogen ion, and the resulting change in pH triggers an ion sensor. In this way, the chemical reaction caused by strand synthesis is correlated with the electrical signals generated by the ion detector, which the Ion Proton™ software then uses to assemble the DNA sequence (12).
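The flow-based detection described above can be sketched for a single microwell. This is a simplified illustration, not the vendor's algorithm: the flow order, function names, and the idealised signal (one unit per released hydrogen ion, with a run of identical bases incorporated in a single flow) are assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def run_flows(template, flow_order="TACG", max_cycles=100):
    """Flood one well with one dNTP type at a time and record the signal.

    Returns a list of (nucleotide, signal) pairs, where signal is the
    number of bases incorporated (hydrogen ions released) in that flow.
    """
    signals = []
    pos = 0  # next template position to be copied
    for _ in range(max_cycles):
        for nucleotide in flow_order:
            incorporated = 0
            # The polymerase keeps incorporating while the flooded dNTP
            # is complementary to the next template base.
            while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
                incorporated += 1
                pos += 1
            signals.append((nucleotide, incorporated))
        if pos >= len(template):
            break
    return signals


def decode(signals):
    """Rebuild the synthesized strand from the per-flow signals."""
    return "".join(nucleotide * count for nucleotide, count in signals)


if __name__ == "__main__":
    flows = run_flows("AACGT")
    print(flows)
    print(decode(flows))
```

A flow with zero signal means the flooded nucleotide did not match the next template base, while a signal of two or more indicates a run of identical bases.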
Roche 454 Sequencer
The NGS platform used in Roche’s sequencer is based on an alternative sequencing technology called pyrosequencing. As with Life Technologies’ Ion Torrent platform, pyrosequencing follows the incorporation of nucleotides by DNA polymerase, but here it is the release of pyrophosphate that is detected (13). The amount of pyrophosphate released depends on the number of nucleotides incorporated, and it drives a reaction catalyzed by luciferase (a firefly enzyme) that produces light. The Roche/454 sequencer first prepares the fragment library of genomic DNA by attaching the DNA fragments to the surface of agarose beads. These beads carry oligonucleotides that are complementary to the 454-specific adapter sequences ligated onto the library fragments. A distinctive feature of this approach is that each bead carries only one DNA fragment, which is then amplified by emulsion PCR to produce approximately one million copies of that fragment. As with the other NGS technologies discussed so far, the amplified fragments on their beads are sequenced en masse (13). The beads are loaded onto a plate containing thousands of wells, so that each well holds a single bead that can be monitored individually. As mentioned above, the pyrosequencing reaction generates light through the action of luciferase; this light is converted to digital readings that represent the sequence of the amplified strands (13).
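Converting light readings into a sequence can be sketched as follows. This is a hypothetical illustration rather than the instrument's actual base caller: it assumes that light intensity scales linearly with the number of nucleotides incorporated in a flow, and the flow order, unit signal, and function name are invented for the example.

```python
def call_bases(intensities, flow_order="TACG", unit_signal=1.0):
    """Turn per-flow light intensities into a called sequence.

    intensities[i] is the light measured in flow i; flows cycle through
    flow_order. Each intensity is divided by unit_signal (the assumed
    intensity of a single incorporation) and rounded to a base count.
    """
    called = []
    for i, intensity in enumerate(intensities):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, round(intensity / unit_signal))
        called.append(nucleotide * count)
    return "".join(called)


if __name__ == "__main__":
    # Noisy readings for flows T, A, C, G.
    print(call_bases([2.1, 0.05, 0.9, 1.1]))
```

Rounding becomes less reliable as the true count grows, since the relative difference between consecutive counts shrinks; this sketch ignores that and other noise sources.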
Future of DNA Sequencing
The development of next generation sequencers means that whole-genome sequencing is now within reach of the average academic research group or diagnostic laboratory in terms of both cost and technical complexity. These machines can produce a genomic sequence a hundred times more quickly and at a lower cost than conventional Sanger sequencing, although they also require far more data analysis. Initially, bioinformatic tools were lacking and the error rates of these machines were extremely high, but both problems have been substantially reduced over the years.
A new set of sequencing techniques, occasionally referred to as "third generation sequencing", was first conceptualized as early as 1996 but is still undergoing testing for general use in sequencing applications. These methods include nanopore sequencing, which measures changes in electrical current as DNA passes through pores approximately 1-3 nanometers in diameter; one example is the bacterially derived alpha-hemolysin pore covalently bound to the molecule cyclodextrin. The pores are placed in a conducting fluid and an electrical current is applied, which is disrupted when DNA bases pass through. Because of the size of the pores, it is generally understood that bases pass through one by one, and since all four bases differ in size and chemical composition, each disrupts the current through the pore to a different degree, allowing the DNA sequence to be read directly from these differences in current. Nanopore sequencing is particularly attractive because it does not require modified DNA bases, unlike Sanger sequencing and most next generation sequencing methods. Unfortunately, because of the speed at which DNA currently passes through the constructed nanopores, single-base resolution is not yet available, and further advances in either slowing down the DNA or increasing the resolution of the electrical recording will be needed. The overall goal of third generation sequencing methods is to reduce both the cost and the time required by current sequencing techniques; however, more development and testing will be needed before they become widespread in practical sequencing applications.
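The idea of reading bases from current disruptions can be sketched with a nearest-level classifier. This is an idealised illustration: the reference current values are invented, and the sketch assumes one clean measurement per base, which is exactly the single-base resolution that the text notes is not yet available.

```python
# Hypothetical mean current (in picoamperes) while each base occupies the pore.
REFERENCE_CURRENT_PA = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}


def call_base(current_pa):
    """Return the base whose reference current is closest to the measurement."""
    return min(
        REFERENCE_CURRENT_PA,
        key=lambda base: abs(REFERENCE_CURRENT_PA[base] - current_pa),
    )


def call_sequence(current_trace_pa):
    """Call one base per measurement in an idealised current trace."""
    return "".join(call_base(current) for current in current_trace_pa)


if __name__ == "__main__":
    trace = [51.2, 29.0, 41.5, 58.9]
    print(call_sequence(trace))
```

In practice, the measured signal is noisy and DNA moves through the pore too quickly for a one-measurement-per-base model, which is why slowing translocation or improving recording resolution matters.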
DNA sequencing is a rapidly evolving field that is greatly increasing our understanding of cancer and cancer therapies. The development of new technologies and the refinement of existing ones have pushed sequencing to the forefront of science. However, there are still distinct gaps in our knowledge of the biological significance of the mutations found in an individual’s genome, and these must be addressed before DNA sequencing becomes a routine part of the diagnostic process. Because the average individual has 3-4 million inherited sequence variants, most of which are not disease-causing, and tumours accumulate additional mutations, it can be challenging to determine which mutations are clinically relevant and targetable. Nevertheless, DNA sequencing has clear potential applications in developing effective personalized therapy, such as tests for biomarkers and prediction of risk for various forms of cancer. This is shown by a 2013 study that used next generation sequencing to uncover molecular signals correlated with prostate cancer, and by work on colorectal carcinomas. These applications will likely make genome sequencing a very attractive option for developing future cancer therapeutics.
1. Voelkerding, K.V., Dames, S.A. and Durtschi, J.D. (2009). Next-generation sequencing: from basic research to diagnostics. Clin. Chem. 55: 641-658.
2. Hutchison, C.A. (2007). DNA sequencing: bench to bedside and beyond. Nucleic Acids Res. 35: 6227-6237.
3. Gullapalli, R.R., Desai, K.V., Santana-Santos, L., Kant, J.A., and Becich, M.J. (2012). Next generation sequencing in clinical medicine: challenges and lessons for pathology and biomedical informatics. J. Pathol. Inform. 3: 2153-3539.
4. Pareek, C.S., Smoczynski, R., and Tretyn, A. (2011). Sequencing technologies and genome sequencing. J. Appl. Genet. 52: 413-435.
5. Murray, A.J. and Davies, D.M. (2013). The genetics of breast cancer. Surgery 31: 1-3.
6. Saijo, N. (2012). The role of Pharmacoethnicity in the development of cytotoxic and molecular targeted drugs in oncology. Yonsei Med. J. 54: 1-14.
7. Welch, J.S., and Link, D.C. (2011). Genomics of AML: Clinical applications of Next-generation sequencing. Hematology Am. Soc. Hematol. Educ. Program 1: 30-35.
8. Schadt, E.E.; S. Turner, A. Kasarskis (2010). "A window into third-generation sequencing". Human Molecular Genetics 19 (R2): R227–40. PMID: 20858600
9. Kasianowicz JJ, Brandin E, Branton D, Deamer DW. (1996). Characterization of individual polynucleotide molecules using a membrane channel. Proceedings of the National Academy of Sciences of the United States of America. 93(24):13770-3. PMID: 8943010
10. Huang J, Wang JK, Sun Y. (2013). Molecular pathology of prostate cancer revealed by next-generation sequencing: opportunities for genome-based personalized therapy. Current Opinion in Urology. 23(3):189-93. PMID: 23385974
11. Hekeler E, Zoller WG, Wiedorn KH, Bosse A. (2012). The change of pathology in the era of personalized medicine using the example of discordant KRAS mutational status in metastasized colorectal carcinoma. Deutsche Medizinische Wochenschrift. 137(45):2327-31. PMID: 23111797
12. Life Technologies. (2014). Ion Torrent Next-Generation Sequencing Technology. Retrieved from http://www.lifetechnologies.com/ca/en/home/life-science/sequencing/next-generation-sequencing/ion-torrent-next-generation-sequencing-technology.html
13. Mardis, E.R. (2008). Next-Generation DNA Sequencing Methods. Annual Review of Genomics and Human Genetics, 9: 387-402.