Cancer Genomics and Precision Oncology

Advanced Cell Biology ~40 min

Explore how genomic profiling reveals cancer mechanisms and guides precision treatment — from oncogenic pathway analysis and targeted therapy to immunotherapy and liquid biopsy.

Introduction

The previous lesson established that cancer is a genetic disease driven by the accumulation of mutations in oncogenes and tumor suppressors through clonal evolution. Here we go deeper into the molecular mechanisms by which specific mutations rewire cellular signaling to drive malignancy, and then turn to the translational question: how can our understanding of cancer biology guide treatment? The answers are reshaping medicine. Targeted drugs that exploit specific molecular vulnerabilities, immunotherapies that unleash the patient’s own immune system, and liquid biopsies that monitor tumor evolution in real time are all products of the genomic revolution in oncology. This lesson covers the molecular mechanisms of cancer (section 20.4) and cancer treatment (section 20.5), with emphasis on the bioinformatics tools that underpin modern precision oncology, from pathway activity scoring and neoantigen prediction to tumor immune deconvolution and ctDNA analysis.


20.4 — Molecular Mechanisms of Cancer

How Proto-Oncogenes Become Oncogenes

A proto-oncogene is a normal gene that encodes a protein promoting cell growth or survival. Gain-of-function mutations can convert a proto-oncogene into an oncogene — a gene whose product drives uncontrolled proliferation. This activation can occur through several distinct mechanisms:

Mechanism | Effect | Example
Point mutation | Produces a hyperactive or constitutively active protein | KRAS G12V: locks Ras in the GTP-bound state
Gene amplification | Multiple gene copies → protein overexpression | HER2 amplification in breast cancer
Chromosomal translocation | Creates a fusion protein with aberrant activity, or places a gene under a strong promoter | BCR-ABL fusion in chronic myeloid leukemia; MYC under the immunoglobulin promoter in Burkitt lymphoma
Regulatory mutation | Increases transcription or mRNA stability | TERT promoter mutations in melanoma and glioblastoma
Viral insertion | Viral promoter activates a nearby proto-oncogene | Retroviral activation of MYC in avian leukosis

A single activated oncogene allele is sufficient to promote cancer (dominant gain of function), in contrast to tumor suppressor genes, which typically require loss of both alleles (recessive loss of function following Knudson’s two-hit model).

Studies of Rare Cancers Identified the First Oncogenes

The first oncogenes were discovered through the study of rare, virus-induced cancers. Peyton Rous described the chicken tumor virus now known as Rous sarcoma virus in 1911, and later work on it led to the identification of v-Src, a viral oncogene. The pivotal discovery, made in the 1970s by Harold Varmus, J. Michael Bishop, and colleagues, was that v-Src is an altered version of a normal cellular gene, c-Src, demonstrating that oncogenes originate from the cell’s own genome. Similarly, studies of rare cancers such as the childhood tumor retinoblastoma (caused by loss of the Rb tumor suppressor) and Burkitt lymphoma (driven by MYC translocation) provided the conceptual foundation for understanding how genetic alterations drive malignancy. The principles established in these rare cancers turned out to apply to cancer in general.

Grouping Cancer Genes by Normal Function

Genes mutated in cancer can be grouped according to the normal function of their protein products. This functional classification reveals that cancer mutations converge on a limited number of critical cellular processes:

Functional class | Examples | Normal role
Growth factors | PDGF, EGF | Extracellular mitogenic signals
Growth factor receptors | EGFR, HER2, FGFR | Signal reception at the cell surface
Intracellular signal transducers | RAS, RAF, PI3K, ABL | Relay signals from receptors to effectors
Transcription factors | MYC, JUN, FOS, β-catenin | Activate gene expression programs
Cell cycle regulators | Cyclin D, CDK4, Rb, p16 | Control progression through the cell cycle
Apoptosis regulators | p53, BCL-2, BAX | Decide between cell survival and death
DNA repair proteins | BRCA1/2, MLH1, MSH2 | Maintain genome integrity
Chromatin regulators | SWI/SNF complex, EZH2, IDH1/2 | Control epigenetic state and gene expression

This classification makes a critical point: cancer is not the result of any single pathway going wrong but of disruptions at multiple levels of cellular control, from the signals a cell receives to how it interprets and executes growth decisions.

Mutations in Growth-Factor Signaling Pathways Are Very Common in Cancer

The RTK–Ras–MAPK pathway is mutated in the majority of human cancers. Growth factors bind receptor tyrosine kinases (RTKs), which activate the GTPase Ras, triggering the MAP kinase cascade (Raf → MEK → ERK) and ultimately driving transcription of genes that promote proliferation. Oncogenic mutations at virtually every step of this cascade have been found in cancer:

  • Receptor overexpression or activating mutation (EGFR in lung cancer, HER2 amplification in breast cancer)
  • Constitutive Ras activation (KRAS mutations in ~25% of all cancers, especially pancreatic, colorectal, and lung)
  • Constitutive Raf activation (BRAF V600E in melanoma and colorectal cancer)

These mutations render the signaling pathway constitutively active, independent of external growth factors, so the cell proliferates without restraint.

Let’s compare normal and oncogenic KRAS at the DNA level. The G12V mutation is a single nucleotide change in codon 12 (GGT → GTT, Gly → Val) that abolishes GTPase activity:

let normal_KRAS = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGT"
let mutant_KRAS = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGTT"
let result = Align.global(normal_KRAS, mutant_KRAS)
print("Normal vs G12V mutant KRAS:")
print(result.alignment)
print("Normal protein: " + Seq.translate(normal_KRAS))
print("G12V protein:  " + Seq.translate(mutant_KRAS))

A single base change (G→T at the second position of codon 12) converts glycine to valine at position 12. This seemingly minor substitution introduces a bulky side chain that prevents GAP-assisted GTP hydrolysis, leaving Ras locked in its active, GTP-bound state.
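
The same comparison can be sketched in plain Python, outside the lesson's own scripting environment. This is a minimal illustration rather than the platform's implementation: the helper names (`translate`, `first_codon_change`) are our own, the codon table is deliberately partial and covers only the codons in this fragment, and the sequence is assumed to be the first 12 codons of the KRAS coding region, ending in the GGT codon described above.

```python
# Partial codon table: only the codons present in this fragment (an illustrative shortcut).
CODON_TABLE = {
    "ATG": "M", "ACT": "T", "GAA": "E", "TAT": "Y", "AAA": "K",
    "CTT": "L", "GTG": "V", "GTA": "V", "GTT": "V", "GGA": "G",
    "GCT": "A", "GGT": "G",
}

def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def first_codon_change(ref, alt):
    """Return (codon_number, ref_codon, alt_codon) for the first differing codon, or None."""
    for i in range(0, min(len(ref), len(alt)) - 2, 3):
        if ref[i:i + 3] != alt[i:i + 3]:
            return i // 3 + 1, ref[i:i + 3], alt[i:i + 3]
    return None

normal_kras = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGT"  # codons 1-12; codon 12 is GGT (Gly)
mutant_kras = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGTT"  # codon 12 changed to GTT (Val)

position, ref_codon, alt_codon = first_codon_change(normal_kras, mutant_kras)
print(f"Codon {position}: {ref_codon} ({CODON_TABLE[ref_codon]}) -> "
      f"{alt_codon} ({CODON_TABLE[alt_codon]})")
print("Normal protein:", translate(normal_kras))
print("Mutant protein:", translate(mutant_kras))
```

Working codon by codon, rather than base by base, makes the amino acid consequence of the change explicit.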

Mutations in the PI3K/Akt/mTOR Pathway Drive Cancer-Cell Growth

The PI3K–Akt–mTOR pathway is the second most frequently mutated signaling pathway in cancer. Normally, growth factor signaling activates PI3K (phosphoinositide 3-kinase), which produces the lipid second messenger PIP₃. PIP₃ recruits Akt (protein kinase B) to the membrane, where it is activated and phosphorylates targets that promote cell survival, growth, and metabolism. Akt activates mTORC1, a master regulator of protein synthesis and cell growth. The lipid phosphatase PTEN opposes PI3K by dephosphorylating PIP₃, acting as a critical brake on the pathway.

Cancer mutations in this pathway include:

Gene | Alteration | Representative tumor types or conditions
PIK3CA | Activating mutations (E545K, H1047R) | Breast, endometrial, colorectal
PTEN | Loss-of-function, homozygous deletion | Prostate, glioblastoma, endometrial
AKT1 | Activating mutation (E17K), amplification | Breast, ovarian
mTOR | Activating mutations | Renal cell carcinoma
TSC1/TSC2 | Loss-of-function (removes mTOR inhibition) | Tuberous sclerosis, renal angiomyolipoma

The convergence of so many cancer mutations on this single pathway — at the level of the kinase (PIK3CA), the phosphatase (PTEN), the effector (AKT), and the downstream target (mTOR) — underscores its central role in promoting cancer-cell growth and survival.

Mutations in the p53 Pathway Are Very Common in Cancer

TP53, the gene encoding p53, is the most frequently mutated gene in human cancer, altered in roughly 50% of all tumors. p53 acts as a “guardian of the genome”: in response to DNA damage, oncogene activation, or other stress signals, it activates transcription of genes that arrest the cell cycle (via the CDK inhibitor p21), promote DNA repair, or trigger apoptosis if the damage is irreparable.

The p53 pathway integrates upstream stress signals through the kinases ATM, ATR, Chk1, and Chk2, which phosphorylate and stabilize p53. MDM2, an E3 ubiquitin ligase, normally keeps p53 levels low by targeting it for proteasomal degradation. In response to stress, MDM2 is inhibited, allowing p53 to accumulate.

Cancer mutations disable the p53 pathway at multiple points:

  • TP53 missense mutations (especially in the DNA-binding domain) — produce a dominant-negative protein that cannot activate target genes but can oligomerize with and inhibit wild-type p53
  • TP53 deletion — complete loss of p53 function
  • MDM2 amplification — excessive p53 degradation (sarcomas, glioblastomas)
  • p14ARF deletion — loss of the protein that normally inhibits MDM2 (melanoma, glioblastoma)
  • ATM loss — impaired upstream activation of p53 (lymphomas)

When p53 is lost, cells with DNA damage continue to divide and accumulate further mutations, accelerating tumor evolution and genomic instability.

Genome Instability Takes Different Forms in Different Cancers

Cancer genomes are unstable, but the form of instability varies:

Chromosomal instability (CIN) is the most common form, present in ~85% of solid tumors. CIN tumors have abnormal chromosome numbers (aneuploidy), large-scale chromosomal rearrangements, and frequent gains and losses of chromosome arms. CIN typically results from defects in the mitotic spindle checkpoint, centrosome duplication, or sister-chromatid cohesion. TP53 loss often accompanies CIN because p53 normally triggers apoptosis in aneuploid cells.

Microsatellite instability (MSI) occurs in ~15% of colorectal cancers and a subset of endometrial, gastric, and other cancers. MSI results from defective DNA mismatch repair (mutations in MLH1, MSH2, MSH6, or PMS2, or epigenetic silencing of MLH1 by promoter methylation). Mismatch repair deficiency leads to accumulation of insertions and deletions at microsatellite repeats throughout the genome, producing a hypermutator phenotype with very high tumor mutational burden (TMB).

Instability type | Mechanism | TMB | Clinical significance
CIN | Mitotic errors, p53 loss | Moderate | Aneuploidy correlates with poor prognosis
MSI-high | Mismatch repair deficiency | Very high | Predicts response to immune checkpoint inhibitors
BRCA-deficient | Homologous recombination deficiency | Moderate | Predicts response to PARP inhibitors and platinum chemotherapy

These distinct forms of genome instability have different therapeutic implications — a theme we will return to in the treatment section.

Cancers of Specialized Tissues Use Distinct Routes to Target Common Pathways

Although cancers from different tissue types often disrupt the same core pathways (Ras-MAPK, PI3K-Akt, p53, Rb, Wnt), the specific genes mutated and the order in which mutations accumulate are highly tissue-specific. Colorectal cancer typically follows the adenoma-carcinoma sequence (APC → KRAS → SMAD4 → TP53). Pancreatic ductal adenocarcinoma almost invariably carries an activating KRAS mutation, frequently together with loss of TP53, SMAD4, and CDKN2A. Melanoma frequently involves BRAF V600E plus CDKN2A loss. Breast cancer subtypes are defined by distinct driver combinations: HER2 amplification, PIK3CA mutation, or BRCA1/2 loss with homologous recombination deficiency.

This tissue specificity reflects the fact that each cell type has a unique epigenetic and transcriptional context that determines which pathways are active and which mutations confer a selective advantage. The implication for therapy is that treatment must be guided not only by the molecular alterations present but also by the tissue context in which they occur.

Bioinformatics: Cancer Pathway and Network Analysis

Computational tools translate raw genomic data into biological understanding at the pathway and network level:

Oncogenic pathway activity scoring from expression data: Tools such as PROGENy and GSVA (Gene Set Variation Analysis) score the activity of oncogenic pathways (Ras-MAPK, PI3K-Akt, p53, Wnt, Notch, TGF-β, and others) from bulk or single-cell RNA-seq data. Rather than looking at mutations alone, these methods capture pathway output — whether the pathway is functionally active based on its transcriptional footprint. This is valuable because a pathway can be activated by mutations in any of its components, or even by non-genetic mechanisms such as autocrine signaling.

When pathway activity is scored from expression data, correlating pathway scores across samples reveals co-activation patterns. Here we compute the Pearson correlation between two simulated pathway activity vectors (Ras-MAPK and PI3K-Akt) across a set of tumor samples:

let ras_scores  = [0.82, 0.91, 0.45, 0.38, 0.73, 0.95, 0.61, 0.29]
let pi3k_scores = [0.78, 0.88, 0.51, 0.42, 0.69, 0.90, 0.55, 0.35]
let r = Stats.pearson(ras_scores, pi3k_scores)
print("Pearson r (Ras-MAPK vs PI3K-Akt): " + r)

The high positive correlation in this simulated example mirrors what is often seen in tumors: the two pathways are frequently co-activated, often driven by shared upstream receptor signaling.
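
For readers who want to see what a Pearson correlation actually computes, here is a minimal Python sketch of the standard formula (covariance divided by the product of the standard deviations), applied to the same simulated scores. The function name `pearson` is our own, and the sketch is not meant to describe the internals of `Stats.pearson`.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

ras_scores = [0.82, 0.91, 0.45, 0.38, 0.73, 0.95, 0.61, 0.29]
pi3k_scores = [0.78, 0.88, 0.51, 0.42, 0.69, 0.90, 0.55, 0.35]

r = pearson(ras_scores, pi3k_scores)
print(f"Pearson r (Ras-MAPK vs PI3K-Akt): {r:.3f}")
```

With only eight samples, a correlation like this is illustrative; in a real cohort it would be reported with a confidence interval or p-value.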

Cancer-specific protein interaction network analysis: Databases like STRING, BioGRID, and IntAct provide protein-protein interaction networks. Cancer-specific analyses overlay mutation and expression data onto these networks to identify network hotspots — densely connected subnetworks enriched for cancer driver genes. Tools like HotNet2 and NetSig use network diffusion algorithms to identify significantly mutated subnetworks, revealing cancer-relevant pathways that would be missed by single-gene analysis.

Drug sensitivity prediction from genomic features: The Genomics of Drug Sensitivity in Cancer (GDSC) project and the Cancer Cell Line Encyclopedia (CCLE) provide drug response data for hundreds of cell lines with comprehensive genomic characterization. Machine learning models trained on these datasets predict which drugs a tumor is likely to respond to based on its mutational and expression profile. Features such as specific mutations (BRAF V600E predicts sensitivity to vemurafenib), copy number alterations, gene expression signatures, and methylation patterns all contribute to prediction accuracy.

Machine learning for cancer classification and outcome prediction: Supervised learning algorithms (random forests, support vector machines, deep neural networks) are increasingly used to classify tumors into molecular subtypes, predict treatment response, and estimate prognosis. The PAM50 classifier for breast cancer subtypes (Luminal A, Luminal B, HER2-enriched, Basal-like) is a widely adopted example. More recent deep learning approaches integrate multi-omics data — mutations, copy number, expression, methylation — for improved outcome prediction.


20.5 — Cancer Treatment: Present and Future

Biology Guides the Search for Cancer Treatments

Every modern cancer therapy is rooted in biological understanding. The realization that cancer cells differ from normal cells — in their mutations, their proliferative rate, their dependence on specific signaling pathways, and their relationship to the immune system — provides the foundation for therapeutic strategies. The progression from nonspecific cytotoxic chemotherapy to molecularly targeted therapies and immunotherapies reflects the deepening of our understanding of cancer biology over the past half-century.

Traditional Therapies Exploit Genetic Instability and Rapid Division

Conventional cancer treatments — surgery, radiation therapy, and cytotoxic chemotherapy — predate the molecular era but remain cornerstones of treatment.

Cytotoxic chemotherapies exploit the rapid division and defective DNA repair of cancer cells. Classes include:

Drug class | Mechanism | Examples
Alkylating agents | Crosslink DNA, preventing replication | Cyclophosphamide, temozolomide
Antimetabolites | Mimic nucleotide precursors, block DNA synthesis | 5-fluorouracil, methotrexate, gemcitabine
Topoisomerase inhibitors | Trap topoisomerase-DNA complexes, causing DNA breaks | Etoposide, irinotecan
Mitotic inhibitors | Disrupt microtubule dynamics, arrest mitosis | Paclitaxel (taxol), vincristine
Platinum compounds | Crosslink DNA | Cisplatin, carboplatin

These drugs are effective but nonspecific: they damage all rapidly dividing cells, causing side effects in the gut epithelium, bone marrow, and hair follicles. Cancer cells with defective DNA repair (e.g., BRCA-mutant tumors) are particularly sensitive to DNA-damaging agents, which is the basis for using platinum chemotherapy and PARP inhibitors in BRCA-deficient cancers. PARP inhibitors block a backup DNA repair pathway, creating synthetic lethality in cells that have already lost homologous recombination repair.

Targeted Therapies Kill Cancer Cells Selectively

Targeted therapies represent a paradigm shift: rather than poisoning all dividing cells, they exploit the specific molecular vulnerabilities of cancer cells. The archetype is imatinib (Gleevec), a small-molecule inhibitor of the BCR-ABL fusion kinase in chronic myeloid leukemia (CML). CML cells depend on BCR-ABL for survival (a phenomenon called oncogene addiction), so imatinib selectively kills the cancer while sparing normal cells. Imatinib transformed CML from a fatal disease into a manageable chronic condition, establishing proof of concept for targeted therapy.

Key targeted therapies include:

Target | Drug(s) | Cancer | Mechanism
BCR-ABL | Imatinib, dasatinib | CML | Kinase inhibition
HER2 | Trastuzumab (Herceptin), T-DM1 | HER2+ breast cancer | Antibody blocks receptor; antibody-drug conjugate delivers cytotoxic payload
BRAF V600E | Vemurafenib, dabrafenib | Melanoma | Selective BRAF kinase inhibition
EGFR | Erlotinib, osimertinib | EGFR-mutant lung cancer | Kinase inhibition
ALK fusions | Crizotinib, alectinib | ALK+ lung cancer | Kinase inhibition
PARP | Olaparib, rucaparib | BRCA-mutant breast/ovarian | Synthetic lethality with homologous recombination deficiency
CDK4/6 | Palbociclib, ribociclib | Hormone receptor-positive (HR+) breast cancer | Cell cycle arrest
PI3Kα | Alpelisib | PIK3CA-mutant breast cancer | PI3K inhibition

The success of targeted therapy depends on identifying the specific alteration that drives each patient’s tumor — the essence of precision oncology.

Let’s examine how a point mutation in the EGFR kinase domain creates a drug-targetable vulnerability. The L858R mutation in exon 21 replaces leucine with arginine, activating the kinase constitutively:

let EGFR_normal = "ATGCGCTTCCTGCCCGGCGCCTACAACCTGCTGCTGGAGCTG"
let EGFR_L858R = "ATGCGCTTCCTGCCCGGCGCCTACAACCTGCGGCTGGAGCTG"
print("Normal EGFR: " + Seq.translate(EGFR_normal))
print("L858R EGFR:  " + Seq.translate(EGFR_L858R))
let result = Align.global(EGFR_normal, EGFR_L858R)
print(result.alignment)

A single nucleotide change (CTG → CGG) produces the L858R substitution. This mutation destabilizes the autoinhibited conformation of the EGFR kinase domain, locking it in an active state. Importantly, it also reshapes the ATP-binding pocket in a way that makes the mutant kinase exquisitely sensitive to specific inhibitors like erlotinib and osimertinib — more sensitive, in fact, than the wild-type kinase.

Immunotherapy Unleashes the Immune System Against Tumors

The immune system is capable of recognizing and destroying cancer cells. Tumor-specific mutations generate novel peptides (neoantigens) that are presented on MHC class I molecules. Cytotoxic CD8⁺ T cells can recognize these neoantigens as foreign and kill the tumor cell. However, tumors evolve mechanisms to evade immune destruction — most notably by expressing immune checkpoint molecules that suppress T cell activity.

Immune checkpoint inhibitors block these inhibitory signals:

  • Anti-PD-1 (pembrolizumab, nivolumab) and anti-PD-L1 (atezolizumab) — PD-L1 on tumor cells binds PD-1 on T cells, delivering an inhibitory “don’t kill me” signal. Blocking this interaction reactivates T cell-mediated tumor killing
  • Anti-CTLA-4 (ipilimumab) — CTLA-4 competes with the co-stimulatory receptor CD28 for binding to B7 ligands on antigen-presenting cells, dampening T cell activation. Blocking CTLA-4 releases this brake and amplifies the anti-tumor immune response

Checkpoint inhibitors have produced durable responses in melanoma, lung cancer, renal cell carcinoma, bladder cancer, and many other tumor types. Their efficacy correlates with tumor mutational burden (TMB) and microsatellite instability (MSI), because more mutations generate more neoantigens for T cell recognition. In 2017, the FDA granted the first tissue-agnostic cancer drug approval to pembrolizumab for any MSI-high solid tumor — treatment based on a molecular biomarker rather than the tissue of origin.

Other immunotherapy approaches include CAR-T cell therapy (engineering a patient’s T cells to express chimeric antigen receptors targeting tumor surface antigens like CD19 in B-cell lymphomas) and cancer vaccines (immunizing with tumor neoantigens to amplify anti-tumor T cell responses).

Combination Therapies Hold Promise for Cancer Treatment

Single-agent targeted therapies often achieve dramatic initial responses, but resistance almost invariably develops. Resistance mechanisms include:

  • Secondary mutations in the drug target (e.g., EGFR T790M after erlotinib, BCR-ABL T315I after imatinib)
  • Bypass pathway activation (e.g., MET amplification bypasses EGFR inhibition)
  • Pre-existing resistant subclones that expand under selection pressure

Combination therapies attack the tumor from multiple angles simultaneously, making it much harder for resistant clones to emerge. Successful combinations include:

Combination | Rationale | Cancer
BRAF + MEK inhibitors (dabrafenib + trametinib) | Block pathway reactivation through MEK | BRAF-mutant melanoma
Anti-PD-1 + anti-CTLA-4 (nivolumab + ipilimumab) | Complementary immune checkpoint blockade | Melanoma, renal cell carcinoma
Targeted therapy + immunotherapy | Kill tumor cells and enhance immune recognition | Multiple cancers (clinical trials)
Chemotherapy + immunotherapy | Chemotherapy-induced cell death releases neoantigens; combine with checkpoint inhibition | Lung cancer, triple-negative breast cancer
CDK4/6 inhibitor + endocrine therapy | Block cell cycle entry while suppressing estrogen signaling | HR+ breast cancer

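The mathematical logic is compelling: if resistance to drug A arises at a frequency of 10⁻⁶ and resistance to drug B at 10⁻⁶, and the two mechanisms are independent, the probability that a given cell is resistant to both is approximately 10⁻¹². In a tumor of 10⁹ cells, roughly a thousand cells would be expected to resist either drug alone, whereas a doubly resistant cell is expected in only about one such tumor in a thousand. Only at the upper end of tumor size (around 10¹² cells) does a doubly resistant cell become likely.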
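
The arithmetic can be made explicit with a short Python sketch. It assumes the resistance mechanisms arise independently and treats the quoted frequencies as per-cell probabilities, which is a deliberate simplification of real tumor evolution; the function name `expected_resistant_cells` is our own.

```python
def expected_resistant_cells(tumor_cells, *resistance_freqs):
    """Expected number of cells resistant to every listed drug, assuming independence."""
    joint_probability = 1.0
    for freq in resistance_freqs:
        joint_probability *= freq  # independent events: probabilities multiply
    return tumor_cells * joint_probability

results = {}
for tumor_cells in (1e9, 1e12):
    single = expected_resistant_cells(tumor_cells, 1e-6)
    double = expected_resistant_cells(tumor_cells, 1e-6, 1e-6)
    results[tumor_cells] = (single, double)
    print(f"{tumor_cells:.0e} cells: ~{single:.3g} resistant to one drug, "
          f"~{double:.3g} resistant to both")
```

Under these assumptions a tumor of 10⁹ cells is expected to hold about a thousand singly resistant cells but only about 0.001 doubly resistant cells, while a tumor of 10¹² cells is expected to hold about one doubly resistant cell. This is one reason combinations work best when tumor burden is low.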

Molecular Profiling Guides Cancer Treatment

Precision oncology uses comprehensive molecular profiling of each patient’s tumor to guide treatment decisions. Modern profiling typically includes:

  • Targeted gene panels (200–600 cancer genes) or whole-exome/genome sequencing to identify driver mutations
  • RNA sequencing to determine expression-based molecular subtype and pathway activity
  • TMB and MSI status to predict immunotherapy response
  • Homologous recombination deficiency (HRD) scores to predict PARP inhibitor and platinum sensitivity
  • PD-L1 immunohistochemistry to assess checkpoint inhibitor eligibility
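
TMB in the list above is usually reported as somatic mutations per megabase of sequenced territory. The Python sketch below shows that calculation with hypothetical counts and a hypothetical panel size. The cutoff of 10 mutations/Mb is the one used in the FDA's 2020 tissue-agnostic pembrolizumab approval; other assays and clinical settings use different thresholds, so treat it as an example rather than a universal rule.

```python
TMB_HIGH_CUTOFF = 10.0  # mutations per megabase; assay- and context-dependent

def tumor_mutational_burden(somatic_mutations, megabases_sequenced):
    """Somatic mutations per megabase of sequenced territory."""
    if megabases_sequenced <= 0:
        raise ValueError("megabases_sequenced must be positive")
    return somatic_mutations / megabases_sequenced

# Hypothetical panel result: 18 somatic mutations across a 1.2 Mb targeted panel.
tmb = tumor_mutational_burden(somatic_mutations=18, megabases_sequenced=1.2)
is_tmb_high = tmb >= TMB_HIGH_CUTOFF
print(f"TMB = {tmb:.1f} mutations/Mb, TMB-high: {is_tmb_high}")
```

Because the denominator is the territory actually sequenced, TMB values from small panels are noisier than those from whole-exome data.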

A mutation matrix summarizes which genes are mutated in which patients. Visualizing this as a heatmap reveals co-occurrence and mutual exclusivity patterns:

let mutation_matrix = [
  {"gene": "KRAS",  "Sample1": 1, "Sample2": 0, "Sample3": 0, "Sample4": 0, "Sample5": 1},
  {"gene": "TP53",  "Sample1": 1, "Sample2": 0, "Sample3": 1, "Sample4": 1, "Sample5": 1},
  {"gene": "PIK3CA","Sample1": 0, "Sample2": 0, "Sample3": 1, "Sample4": 1, "Sample5": 0},
  {"gene": "BRAF",  "Sample1": 0, "Sample2": 1, "Sample3": 0, "Sample4": 0, "Sample5": 0},
  {"gene": "PTEN",  "Sample1": 0, "Sample2": 0, "Sample3": 1, "Sample4": 1, "Sample5": 0}
]
print("Mutation matrix:")
print(mutation_matrix)
Viz.heatmap(mutation_matrix, "gene")

Notice that KRAS and BRAF mutations tend toward mutual exclusivity (both activate the MAPK pathway, so mutating both provides no additional selective advantage), while PIK3CA and PTEN alterations co-occur with TP53 loss.
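
Co-occurrence and mutual exclusivity can also be tallied directly. The Python sketch below uses a small hypothetical binary matrix in the same gene-by-sample layout (values invented for illustration) and counts, for each gene pair, the samples where both genes are mutated and the samples where exactly one is. The helper name `pair_counts` is our own.

```python
from itertools import combinations

# Hypothetical gene-by-sample matrix: 1 = mutated, 0 = wild-type (five samples).
mutations = {
    "KRAS":   [1, 0, 0, 0, 1],
    "TP53":   [1, 0, 1, 1, 1],
    "PIK3CA": [0, 0, 1, 1, 0],
    "BRAF":   [0, 1, 0, 0, 0],
    "PTEN":   [0, 0, 1, 1, 0],
}

def pair_counts(a, b):
    """Return (samples with both mutated, samples with exactly one mutated)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    only_one = sum(1 for x, y in zip(a, b) if bool(x) != bool(y))
    return both, only_one

for gene_1, gene_2 in combinations(mutations, 2):
    both, only_one = pair_counts(mutations[gene_1], mutations[gene_2])
    print(f"{gene_1:>6} / {gene_2:<6} both: {both}  exactly one: {only_one}")
```

With five samples these counts are only descriptive. In a real cohort, each pair would typically be tested with Fisher's exact test on the 2×2 table and corrected for multiple comparisons.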

Databases such as OncoKB (Memorial Sloan Kettering) and CIViC (Clinical Interpretations of Variants in Cancer) annotate the clinical significance of cancer mutations — classifying each as oncogenic, likely oncogenic, or of uncertain significance — and link specific alterations to FDA-approved therapies, clinical trial eligibility, or prognostic implications. This annotation pipeline transforms raw sequencing data into actionable clinical information.

Let’s examine how a tumor suppressor mutation (PTEN) affects the protein product. PTEN is a lipid phosphatase that opposes PI3K signaling; its loss unleashes constitutive PI3K-Akt-mTOR activation:

let pten_wt = "ATGACAGCCATCATCAAAGAGATCGTTAGCAGA"
let pten_mut = "ATGACAGCCATCATCAAATAGATCGTTAGCAGA"
print("PTEN wild-type protein: " + Seq.translate(pten_wt))
print("PTEN mutant protein:   " + Seq.translate(pten_mut))
print("WT sequence length: " + Seq.length(pten_wt))
print("Codon usage (WT):")
print(Seq.codon_usage(pten_wt))

Even a single nucleotide change in PTEN can produce a truncated or nonfunctional protein. In this illustrative fragment, a G→T change converts codon 7 (GAG, glutamate) into a TAG stop codon, so translation terminates after six residues. Because PTEN is a tumor suppressor, loss of both copies (through mutation of one allele and deletion of the other) removes the PI3K pathway brake entirely.

Bioinformatics: Precision Oncology

Precision oncology depends on a growing suite of computational tools that connect genomic data to clinical decisions:

Tumor molecular profiling and clinical interpretation (OncoKB, CIViC): After sequencing identifies variants, automated pipelines annotate each mutation with clinical evidence. OncoKB classifies variants by their level of evidence for therapeutic actionability (Level 1: FDA-approved therapy for the specific alteration; Level 2: standard-of-care biomarker; Level 3: clinical trial evidence). CIViC is a community-curated, open-access database that links specific variants to clinical assertions with supporting evidence.

Neoantigen prediction for immunotherapy (pVACtools, NetMHCpan): The neoantigen prediction pipeline proceeds as follows: (1) identify somatic mutations from tumor-normal sequencing; (2) determine the patient’s HLA type from sequencing data (using OptiType or HLA-LA); (3) generate all possible mutant peptides (8–11 amino acids for MHC class I); (4) predict binding affinity of each peptide to the patient’s specific HLA alleles using NetMHCpan or MHCflurry; (5) filter for peptides that bind strongly to MHC but whose wild-type counterpart does not, identifying true neoantigens. The pVACtools suite integrates these steps into a single workflow. Predicted neoantigens guide personalized cancer vaccine design and help predict response to checkpoint immunotherapy.
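
Step 3 of this pipeline, enumerating candidate peptides, is easy to sketch in Python. The function below is our own illustration and not how pVACtools implements it: it lists every 8–11-residue window that contains the mutated position, using the N-terminus of KRAS with the G12V substitution as an example protein. Binding prediction (step 4) requires a tool such as NetMHCpan or MHCflurry and is not reproduced here.

```python
def mutant_peptides(protein, mut_index, lengths=range(8, 12)):
    """All peptides of the given lengths that contain position mut_index (0-based)."""
    peptides = []
    for k in lengths:
        first_start = max(0, mut_index - k + 1)        # leftmost window still covering the mutation
        last_start = min(mut_index, len(protein) - k)  # rightmost window that fits in the protein
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + k])
    return peptides

# N-terminal KRAS fragment carrying G12V (valine at 0-based index 11).
kras_g12v = "MTEYKLVVVGAVGVGKSALTIQ"
mut_index = 11

candidates = mutant_peptides(kras_g12v, mut_index)
print(f"{len(candidates)} candidate peptides, for example: {candidates[:3]}")
```

Each candidate would then be scored against the patient's HLA alleles, alongside the matching wild-type peptide, to apply the filter described in step 5.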

Tumor immune microenvironment deconvolution (CIBERSORT, xCell, TIMER): A tumor is not just cancer cells — it contains infiltrating immune cells, fibroblasts, endothelial cells, and other stromal components. CIBERSORT uses a machine learning approach (support vector regression) to deconvolve bulk RNA-seq data and estimate the proportions of 22 immune cell types in the tumor microenvironment. xCell extends this to 64 cell types. TIMER (Tumor Immune Estimation Resource) provides web-based tools for analyzing immune infiltration across TCGA cancer types. High infiltration by cytotoxic CD8⁺ T cells generally predicts better prognosis and response to immunotherapy, while high levels of regulatory T cells or tumor-associated macrophages are associated with immune suppression.

Biomarker discovery from multi-omics data: Integrating genomic, transcriptomic, proteomic, and epigenomic data reveals biomarkers that predict drug response or prognosis. Approaches include survival-associated gene expression signatures, multi-omics factor analysis (MOFA), and pathway-level integration.

Clinical trial matching from genomic profiles: Platforms such as MatchMiner algorithmically match a patient’s molecular profile to open clinical trials, and genomically driven trials such as NCI-MATCH assign patients to treatment arms based on the alterations found in their tumors. Given the thousands of ongoing oncology trials and the complexity of eligibility criteria, computational matching is essential for connecting patients with appropriate experimental therapies.

Liquid biopsy and circulating tumor DNA (ctDNA) analysis: Circulating tumor DNA (ctDNA) consists of tumor-derived DNA fragments released into the bloodstream by dying cancer cells. ctDNA analysis enables:

  • Early detection: identifying cancer-associated signals in cell-free DNA from asymptomatic individuals (e.g., the GRAIL Galleri multi-cancer early detection test, which is based on methylation patterns rather than mutations)
  • Treatment monitoring: tracking variant allele frequency (VAF) of known driver mutations over time; a rising VAF indicates disease progression or resistance
  • Minimal residual disease (MRD) detection: detecting microscopic residual cancer after surgery, guiding decisions about adjuvant therapy
  • Resistance monitoring: identifying new resistance mutations (e.g., EGFR T790M) without an invasive tissue biopsy

ctDNA analysis typically requires ultra-sensitive sequencing methods (error-corrected deep sequencing, digital PCR) because tumor DNA constitutes a tiny fraction (often <1%) of total cell-free DNA in the blood.
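
Treatment monitoring by VAF comes down to a simple ratio tracked over time. The Python sketch below computes VAF as variant-supporting reads divided by total reads at the locus for three serial blood draws. The read counts and timepoints are hypothetical, and a real pipeline would also model sequencing error and apply a limit of detection before calling a change.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a locus that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

# Hypothetical serial draws: (label, reads supporting the variant, total reads at the locus).
timepoints = [
    ("baseline", 120, 20000),
    ("week 8", 30, 21000),
    ("week 24", 260, 19500),
]

vafs = [(label, variant_allele_frequency(alt, total)) for label, alt, total in timepoints]
for label, vaf in vafs:
    print(f"{label}: VAF = {vaf:.3%}")

# Flag a rise between the two most recent draws.
rising = vafs[-1][1] > vafs[-2][1]
print("VAF rising since previous draw:", rising)
```

In this made-up series the VAF falls on treatment and then climbs above baseline, the pattern that would prompt a closer look for progression or an emerging resistance mutation.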


Exercises

Exercise: Identify a Resistance Mutation

A patient with EGFR-mutant lung cancer initially responds to erlotinib but then progresses. Sequencing of the resistant tumor reveals a secondary mutation. Compare the drug-sensitive and drug-resistant EGFR sequences to identify the amino acid change:

let sensitive = "ATGCAGCTCATGCCCTTCGGCTGCCTCCTG"
let resistant = "ATGCAGCTCATGCCCTTCGGCTGCCTCATG"
let prot_s = Seq.translate(sensitive)
let prot_r = Seq.translate(resistant)
print("Sensitive: " + prot_s)
print("Resistant: " + prot_r)
print(prot_r)

Exercise: Analyze a Tumor Suppressor for Codon Usage

A cancer genome study finds a TP53 mutation. Examine the codon usage of a p53 fragment to understand which codons are vulnerable to specific mutational processes (e.g., UV-induced C→T transitions at dipyrimidine sites):

let p53_fragment = "ATGGATGATTTTGATGCTGTCCCCGGACGATATTGA"
print("p53 protein: " + Seq.translate(p53_fragment))
print(Seq.codon_usage(p53_fragment))
let gc = Seq.gc_content(p53_fragment)
print("GC content: " + gc)
print(gc)

Exercise: Correlate Pathway Activity Across Tumor Samples

Given expression-based activity scores for the Ras-MAPK and p53 pathways across a cohort of tumor samples, compute their Pearson correlation. A negative correlation would suggest that high Ras-MAPK activity is associated with p53 pathway suppression:

let ras_activity = [0.92, 0.85, 0.78, 0.41, 0.33, 0.25]
let p53_activity = [0.15, 0.22, 0.30, 0.72, 0.81, 0.88]
let r = Stats.pearson(ras_activity, p53_activity)
print("Pearson r (Ras vs p53): " + r)
print(r)

Summary

In this lesson you learned:

  • Gain-of-function mutations convert proto-oncogenes into oncogenes through point mutations, gene amplification, chromosomal translocation, regulatory mutations, or viral insertion
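  • The first oncogene, v-Src, was discovered through the study of a rare virus-induced cancer and traced to the cellular gene c-Src; rare cancers such as retinoblastoma (Rb loss) and Burkitt lymphoma (MYC translocation) revealed further principles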
  • Cancer genes are classified by their normal function: growth factors, receptors, signal transducers, transcription factors, cell cycle regulators, apoptosis regulators, DNA repair proteins, and chromatin regulators
  • The RTK–Ras–MAPK pathway is the most frequently mutated signaling pathway in cancer, with oncogenic alterations at every level from receptor to kinase cascade
  • PI3K–Akt–mTOR pathway mutations (PIK3CA activation, PTEN loss) drive cancer-cell growth and survival
  • p53 is mutated in ~50% of all cancers; its loss eliminates the DNA damage checkpoint and allows accumulation of further mutations
  • Genome instability manifests as chromosomal instability (CIN, aneuploidy), microsatellite instability (MSI, mismatch repair deficiency), or homologous recombination deficiency (HRD)
  • Different tissue types use distinct mutational routes to converge on common cancer pathways
  • Pathway activity scoring (PROGENy, GSVA), network analysis (HotNet2, NetSig), and machine learning enable cancer classification, drug sensitivity prediction, and outcome modeling
  • Traditional therapies (chemotherapy, radiation) exploit the genetic instability and rapid division of cancer cells; PARP inhibitors create synthetic lethality in BRCA-deficient tumors
  • Targeted therapies (imatinib, vemurafenib, osimertinib, trastuzumab) selectively inhibit specific oncogenic drivers, exploiting oncogene addiction
  • Immune checkpoint inhibitors (anti-PD-1, anti-CTLA-4) reactivate anti-tumor T cell responses; high TMB and MSI predict response
  • Combination therapies attack multiple pathways simultaneously to prevent resistance
  • Molecular profiling with clinical databases (OncoKB, CIViC) translates genomic data into actionable treatment decisions
  • Neoantigen prediction (NetMHCpan, pVACtools) guides personalized immunotherapy and vaccine design
  • Tumor immune deconvolution (CIBERSORT, xCell, TIMER) characterizes the immune microenvironment from expression data
  • Liquid biopsy (ctDNA analysis) enables non-invasive detection of resistance mutations, treatment monitoring, and minimal residual disease assessment

References

  1. Alberts B, Heald R, Johnson A, Morgan D, Raff M, Roberts K, Walter P. Molecular Biology of the Cell, 7th ed. New York: W.W. Norton; 2022. Chapter 20: Cancer.
  2. The Cancer Genome Atlas Research Network. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45(10):1113–1120.
  3. Alexandrov LB, Nik-Zainal S, Wedge DC, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415–421.
  4. Nik-Zainal S, Davies H, Staaf J, et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature. 2016;534(7605):47–54.
  5. Newman AM, Bratman SV, To J, et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat Med. 2014;20(5):548–554.
  6. Mardis ER. Neoantigens and genome instability: impact on immunogenomic phenotypes and immunotherapy response. Genome Med. 2019;11:71.
  7. Bailey MH, Tokheim C, Porta-Pardo E, et al. Comprehensive characterization of cancer driver genes and mutations. Cell. 2018;173(2):371–385.
  8. Crowley E, Di Nicolantonio F, Loupakis F, Bardelli A. Liquid biopsy: monitoring cancer-genetics in the blood. Nat Rev Clin Oncol. 2013;10(8):472–484.
