10.1 Introduction: The Cancer Genome Project



The Cancer Genome


All cancers arise from mutations within the human genome sequence (4). Millions of dollars of funding and countless hours of work have gone into research to determine which features of these mutations cause cancer cells to become malignant. With the advent of new sequencing technologies, the ability to sequence the entire genomes of cancer cells has greatly increased our understanding of the nature of cancer and has led, and will continue to lead, to improvements in therapeutics. This technology has not only increased our knowledge of cancer and how it arises, but has also provided a comprehensive perspective on the immense variability among individual cancers (1). After all, cancer is just a name given to an enormous assortment of distinct diseases, each similar in nature but inherently unique.


The concept of the cancer genome first arose in the late nineteenth century, when the scientists David von Hansemann and Theodor Boveri noticed unusual chromosomal abnormalities while examining cancer cells under a microscope (1). The idea soon emerged that cancer cells were clones of normal human cells with irregularities in their hereditary material, and it was later supported by the discovery of the structure of DNA and its role in inheritance. The theory gained further support from the discovery that agents that damage DNA also cause cancer (2). Specific and recurrent chromosomal aberrations, such as the Philadelphia chromosome in chronic myeloid leukemia (CML), strengthened the case further. Later, it was demonstrated that inserting tumor DNA into non-cancerous cells could transform them. The first transformation-causing sequence change to be identified was a single nucleotide change in the HRAS gene, reported in 1982. Its discovery began a long era of searching for cancer-associated genetic mutations, which continues today (1).


Cancer progresses through a combination of two processes: the continual acquisition of heritable somatic mutations and natural selection acting on those mutations through Darwinian evolution (1). The human body has evolved numerous defense mechanisms to inhibit cancer progression, and for that reason cancerous tissues contain an astounding number of mutations, often many within the same pathways (3). Because newly arising mutations are random and unique, the result is tumor heterogeneity and the snowflake phenomenon, in which no two cancers are alike.


Cancer Genome Sequencing


Cancer genome sequencing is a laboratory technique that determines the DNA sequence of cancer cells (4). It involves whole genome, exome, or transcriptome sequencing of a heterogeneous or homogeneous group of cancer cells, and the results can be analyzed with reference to normal adjacent tissue. Unlike other genome sequencing applications, which typically use a blood sample, cancer genome sequencing can incorporate multiple tissue types and can provide a variety of information depending on the experimental design (3). Not only does this method directly reveal the mutations present in the primary tumor, but it can also be used to assess heterogeneity within tumor cell populations and to analyze the tumor microenvironment, adjacent tissue types and even metastatic sites. It can also provide more specialized information about the genes being expressed and the epigenetic modifications within the tumor, the microenvironment and normal tissues. Exome and transcriptome sequencing are cheaper and more efficient than whole genome sequencing because they analyze a smaller but more informative portion of the genome; the occasions and reasons for choosing each approach are outlined later in this chapter.


Sequencing is used to identify the differences between a DNA sequence and a reference sequence (7). These differences can take the form of nucleotide base changes, chromosomal translocations and fusion genes, copy number or sequence variants, and irregular gene or miRNA expression (1). Depending on the gene or genes in which a mutation is found, these differences can reveal a great deal about the nature of the tumor and suggest potential therapeutic targets (see Chapter 6: Personalized Cancer Therapy).
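
The comparison against a reference can be illustrated with a minimal sketch. It assumes the sequences are already aligned and of equal length, and it detects only single nucleotide changes; real pipelines work from aligned sequencing reads and also handle insertions, deletions and structural variants. The function names and the toy sequences are hypothetical and chosen only for illustration. The second function shows how variants also present in matched normal tissue can be set aside, leaving the candidate somatic changes.

```python
from typing import List, Tuple

# (1-based position, reference base, observed base)
Variant = Tuple[int, str, str]


def find_substitutions(reference: str, sample: str) -> List[Variant]:
    """List single-base differences between a sample and a reference.

    Assumes both sequences are pre-aligned and equal in length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned and equal in length")
    return [
        (i + 1, ref_base, obs_base)
        for i, (ref_base, obs_base) in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != obs_base
    ]


def somatic_only(tumor_variants: List[Variant], normal_variants: List[Variant]) -> List[Variant]:
    """Keep tumor variants that are absent from the matched normal tissue."""
    normal = set(normal_variants)
    return [variant for variant in tumor_variants if variant not in normal]


# Toy sequences invented for illustration; they are not real gene sequences.
reference = "ATGACGGAATATAAG"
normal = "ATGACGGAATACAAG"  # one inherited difference at position 12
tumor = "ATGACGGTATACAAG"   # the same inherited difference plus a new change at position 8

somatic = somatic_only(
    find_substitutions(reference, tumor),
    find_substitutions(reference, normal),
)
print(somatic)  # [(8, 'A', 'T')]
```

In this toy case the change at position 12 appears in both tissues and is filtered out, while the change at position 8 is found only in the tumor.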


In addition to mutations, DNA changes can take the form of entirely new DNA acquired from carcinogenic viruses (for example, human papillomavirus or Epstein-Barr virus), or of epigenetic changes, in which modifications (DNA methylation, acetylation and histone modification) do not alter the DNA sequence but greatly affect gene expression. These epigenetic changes are also subject to Darwinian evolution under selection pressures, which can affect cancer progression. Although often overlooked, distinct changes in the mitochondrial genome have also been associated with cancer progression; however, their exact roles are still unclear (1).


Large Scale Sequencing Projects


There have been many attempts in the past few decades to create a database of the cancer genome: a unified and organized resource for cancer research. These include the Cancer Genome Project based at the Wellcome Trust Sanger Institute, the Cancer Genome Atlas, and the Cancer Genome Anatomy Project and the Cancer Genome Characterization Initiative at the National Cancer Institute (8). All of these projects use bioinformatics and high throughput genome analysis techniques to gain a better understanding of the genetic basis of cancer, so that treatment, diagnosis and prevention can be improved. Many of these projects have contributed to COSMIC, the Catalogue of Somatic Mutations in Cancer, an online database of non-inherited cancer mutations compiled from the scientific literature, including the large projects mentioned above (9). Launched in 2004, the catalogue began with just four genes, HRAS, KRAS2, NRAS and BRAF, but expanded rapidly: within a year it contained 529 genes and described over 20 000 mutations. Five years later it contained close to 90 000 mutations in over 13 000 genes. To date, the database contains over 140 000 mutations (9). From these data it is now known that more than 1% of all human genes are mutated in some form of cancer, and that in the vast majority of cases (~90%) the mutations are somatic (9).


Among these, the Cancer Genome Atlas (TCGA) is a project that began in 2005 and uses high throughput genome analysis techniques to catalogue cancer mutations. It is affiliated with the National Cancer Institute and the National Human Genome Research Institute, both of which are funded by the US government. The International Cancer Genome Consortium (ICGC) exists to coordinate large scale sequencing projects from institutions around the world into one synchronized effort to catalogue somatic genetic mutations in cancer (10). It aims to provide complete genetic and epigenetic descriptions of 50 different clinically and socially relevant tumor types (10).


Next Steps: Implications of Cancer Genomics 


Understanding a primary tumor is essential for effective treatment. Currently, many therapeutics target cancer in general, with little specificity for tumor type or heterogeneity. Advances in cancer genomics allow several improvements to this inefficient system. First, they make it possible to build a database of mutation frequencies by tumor type, which allows common mutations to be quantified. For example, it may be found that at least 1 of 3 key genes is mutated in 96% of pancreatic cancers; sequencing pancreatic tumors at a clinical level would enable this type of large scale analysis and database creation. This knowledge is useful for improving diagnosis, prognosis and treatment. Second, sequencing a tumor upon its discovery in the clinic and comparing it to known tumors at various stages of progression would allow for a more complete prognosis. Finally, knowing which key genetic mutations occur frequently in specific cancers allows more effective therapeutics to be developed, and clinical cancer genome sequencing would allow those therapeutics to be allocated properly, so that each is used only when it will be effective. Unfortunately, despite these promising implications, cancer genome sequencing has several limitations, which are discussed later in the chapter.
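
The kind of mutation-frequency summary described above can be sketched in a few lines. The example below is illustrative only: the tumor identifiers, the gene names (GENE_A to GENE_D) and the data are invented placeholders, not results from any real cohort, and the function names are hypothetical. It computes the fraction of tumors carrying a mutation in at least one gene from a chosen set, along with the per-gene mutation frequency.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def fraction_with_any_mutation(tumors: Dict[str, Set[str]], key_genes: Iterable[str]) -> float:
    """Fraction of tumors with a mutation in at least one of the key genes."""
    if not tumors:
        raise ValueError("no tumors supplied")
    key = set(key_genes)
    hits = sum(1 for mutated in tumors.values() if mutated & key)
    return hits / len(tumors)


def per_gene_frequency(tumors: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of tumors in which each gene is mutated."""
    if not tumors:
        raise ValueError("no tumors supplied")
    counts = Counter(gene for mutated in tumors.values() for gene in mutated)
    return {gene: count / len(tumors) for gene, count in counts.items()}


# Placeholder data: each tumor maps to the set of genes found mutated in it.
toy_tumors = {
    "tumor_1": {"GENE_A", "GENE_D"},
    "tumor_2": {"GENE_B"},
    "tumor_3": {"GENE_A", "GENE_C"},
    "tumor_4": {"GENE_D"},
    "tumor_5": {"GENE_C"},
}
key_genes = ["GENE_A", "GENE_B", "GENE_C"]

print(fraction_with_any_mutation(toy_tumors, key_genes))  # 0.8
print(per_gene_frequency(toy_tumors)["GENE_A"])           # 0.4
```

Here four of the five placeholder tumors carry a mutation in at least one of the three key genes, which gives the fraction 0.8.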


This chapter explores some of the key components of Cancer Genomics. Beginning with General Genetics, it discusses the nature of DNA and then delves into the Genome Landscape of cancer genetics. It then discusses the relationship between Epigenetics and cancer progression, and the Techniques used for cancer sequencing. Finally, it discusses the Implications for Cancer Genomics in detail and concludes with the Limitations that currently hinder the progress of Cancer Genomics.





  1. Campbell P, Futreal P, Stratton M (2009). The cancer genome. Nature, 458(7239), 719-724.
  2. Loeb LA, Harris CC (2008). Advances in chemical carcinogenesis: a historical review and prospective. Cancer Research, 68, 6863-6872.
  3. Schubert C (2008). Cancer genome. Nature Medicine, 14(10), 1026.
  4. Chin L, Anderson J, Futreal A (2011). Cancer genomics: from discovery science to personalized medicine. Nature Medicine, 17, 297-303.
  5. Diamandis E, Hudson T, Kallioniemi O, Liu E, Lopez-Otin C (2010). Cancer genomes. Clinical Chemistry, 56(11), 1660-1664.
  6. Ley T, et al. (2008). DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature, 456(7218).
  7. Pleasance E, et al. (2009). A comprehensive catalogue of somatic mutations from a human cancer genome. Nature, 463(7278).
  8. National Cancer Institute: CGAP. http://cgap.nci.nih.gov/ Accessed: March 14, 2014.
  9. COSMIC: Catalogue of Somatic Mutations in Cancer. http://cancer.sanger.ac.uk/cancergenome/projects/census/ Accessed: March 14, 2014.
  10. Hudson TJ, Gerhard DS, Jiang T, Guttmacher A, Guyer M, Hemsley FM, et al. (2010). International network of cancer genome projects. Nature, 464(7291), 993-998.