Sunday, June 7, 2015

The Future of Molecular Medicine: Biomarkers, BATTLEs, and Big Data | 2015 Educational Book | Meeting Library

ASCO University

The Future of Molecular Medicine: Biomarkers, BATTLEs, and Big Data

When I first entered the field of lung cancer, the seminal study published in the New England Journal of Medicine was a four-arm chemotherapy doublet study that showed no difference in effectiveness. Conclusions: All the drugs are the same, so use what you like for patients with advanced non–small cell lung cancer (NSCLC).1
Times have changed, and personalized medicine has become a reality. The era of “molecular medicine” has truly blossomed. Over the last decade, approximately 40 drug approvals were based on specific tumor biomarkers.2 Options for patients have changed dramatically, from cytotoxic chemotherapy to molecularly targeted therapy. This concept has been standard for almost 2 decades for those treating patients with breast cancer (targeting HER2 and/or hormone receptors), but now patients with melanoma and lung cancer are included in this discussion. It is astounding how far treatments for each of these tumor types have progressed in the past 10 years. Melanoma treatment once consisted of chemotherapy or interleukin-2 and now consists of RAF inhibitors and targeted immunotherapy, both of which have improved patient outcomes substantially: response rates and progression-free survival have more than doubled. In fact, progression-free survival with molecularly targeted therapies exceeds the overall survival reported for chemotherapies. Lung cancer treatment has evolved from chemotherapy that yielded poor outcomes to therapy guided by biomarker assessments such as EGFR, ALK, and ROS1. Treatment with EGFR inhibitors (gefitinib, erlotinib, afatinib)3-5 and ALK-based therapies (crizotinib, ceritinib)6,7 has dramatically improved efficacy and quality-of-life outcomes. The notion that a patient with advanced lung cancer could be treated with oral biologic therapy for over 2 years without undergoing intravenous chemotherapy is astounding. For example, a patient with an EGFR mutation present in their tumor is treated with an EGFR tyrosine kinase inhibitor (TKI). Then the patient develops a T790M mutation in the tumor and is treated with a T790M-targeted TKI.8
As our desire for continued discoveries in tumor genetics increases, more and more data are generated and captured. Larger organized trials that prospectively capture biomarker and treatment data have emerged, and data analysis of these trials has become more complex. Adaptive trial designs, such as umbrella and basket trials, have become commonplace across academic cancer institutes and are emerging at community-based cancer institutes. As biopsies to obtain adequate amounts of tissue for molecular testing are now the norm, repeat biopsies have become part of the armamentarium of procedures to best assess the tumor in real time following treatment. With electronic data capture processes (including electronic health records [EHRs] and national databases) also evolving, medicine is now faced with the concept of “Big Data”: how to collect it, maintain it, integrate EHR big data with genomic big data, and thus, utilize it.


The principle of identifying biomarkers to help guide patient treatment has been a longstanding research initiative. Early biomarkers and panels were largely classified as prognostic or predictive. They were not associated with a specific biologic therapy but rather defined the tumor characteristics, including response to therapy. Examples include Ki-67, carcinoembryonic antigen (CEA), and prostate-specific antigen (PSA). Breast cancer markers, such as hormone receptor status and HER2, were some of the earliest markers related directly to therapy. In lung cancer, validated markers associated with therapy began with elucidating that EGFR mutations were more prevalent in patients with adenocarcinoma, Asian ethnicity, and nonsmoking status and were correlated with high activity of EGFR TKIs (gefitinib, erlotinib, afatinib). However, more information was desired by researchers and treating physicians, and, more importantly, correlation with specific treatments was sought. Tumor registries were created to collect tissue and test utilizing evolving molecular panels. Such strategies have been developed across academic institutions, private enterprises, and government. Numerous cancer consortiums were developed in the 2000s that required submission of patient tissue along with clinical data.
The Lung Cancer Mutation Consortium (LCMC) is one of these efforts. The LCMC, an initiative of the National Cancer Institute (NCI), comprises 16 leading cancer centers across the country, with the goal of providing clinicians with information on the types and frequency of mutations. Once identified, the mutations are matched with associated treatments and/or clinical trials, giving physicians more options to better care for patients with advanced NSCLC. Testing these mutations with various treatments could lead to identification of actionable mutations (tumors with a mutation that responds to specific treatments), which would expedite potential drug development. Over 1,000 patients (stage IIIB/IV, performance status 0 to 2) have been enrolled in the LCMC. The group has detected an actionable mutation in 64% of tumors from prospectively studied patients with lung adenocarcinoma.9


The Director's Challenge Lung Study comprises gene expression profiles acquired on Affymetrix microarray chips from more than 400 specimens of early-stage lung cancer, with associated clinical and pathologic annotation available to the public for analysis.10 The investigators reported a large, training-testing, multisite, blinded validation study to characterize the performance of several prognostic models based on gene expression for 442 lung adenocarcinomas. The hypotheses proposed examined whether microarray measurements of gene expression, either alone or combined with basic clinical covariates (stage, age, sex), could be used to predict overall survival in lung cancer subjects. Several models examined produced risk scores that substantially correlated with actual subject outcome. Most methods performed better with clinical data, supporting the combined use of clinical and molecular information when building prognostic models for early-stage lung cancer.10 This has not yet translated into clinical practice, but highlighted that clinical information is still very important. This study also provides the largest available set of microarray data with extensive pathologic and clinical annotation for lung adenocarcinomas.


Early efforts to collect large amounts of comprehensive data included genome-wide association studies (GWA study, or GWAS), also known as whole genome association study (WGA study, or WGAS). These studies involve rapidly scanning markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. The magnitude of data involved with GWAS requires thousands to tens of thousands of markers, compared to genomics that could require few to hundreds. Other limitations of GWAS include the potential for high false-positive rates, which next-generation sequencing (NGS) may be able to overcome, and perhaps at a more economical scale. However, this is debatable until sufficiently larger data sets can be analyzed.
This method searches the genome for small variations, called single nucleotide polymorphisms (SNPs), which occur more frequently in people with a particular disease than in people without the disease. Information generated from these analyses can pinpoint genes that may contribute to a person's risk of developing a certain disease (including cardiovascular, pulmonary, or neurologic ailments, or cancer) or even determine an individual's sensitivity to drug metabolism. The National Institutes of Health (NIH) has developed a Database of Genotypes and Phenotypes (dbGaP) to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype.
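The comparison described above, whether a SNP allele occurs more often in people with a disease than in people without it, can be illustrated with a simple allele-count test. The following is a minimal sketch, not the method of any particular GWAS: it assumes a single biallelic SNP, a Pearson chi-square test with 1 degree of freedom, and a Bonferroni correction as one common way to limit false positives across many markers. The function names and the example counts are invented for illustration.

```python
import math


def allele_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def p_value_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni_threshold(alpha, n_markers):
    """Per-marker significance threshold when n_markers tests are performed."""
    return alpha / n_markers


if __name__ == "__main__":
    # Invented counts: the alternate allele is seen in 60 of 100 case
    # chromosomes and 40 of 100 control chromosomes.
    stat = allele_chi_square(60, 40, 40, 60)
    p = p_value_1df(stat)
    threshold = bonferroni_threshold(0.05, 10_000)
    print(f"chi-square={stat:.2f}, p={p:.4g}, significant={p < threshold}")
```

With thousands of markers tested at once, a nominally small p-value for one SNP may still fail the corrected threshold, which is one way the false-positive concern mentioned above shows up in practice.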


Identifying a patient population that has the best chance for improved outcomes with therapy allows better assessment of safety and risk. Realizing phenotypic characteristics are limited in nature, clinical trials have focused more than ever on acquiring blood, tissue, and other biospecimens to identify biomarkers of response. However, the idea of finding the appropriate molecular phenotype/genotype of a patient and matching it with a molecularly tailored drug that will in turn lead to an overall survival advantage is still in the research and testing stage for the majority of patients with cancer. Clinicians have patients in whom impressive results have occurred on a mostly case-by-case basis. The best drugs or combinations of drugs for individual patients are still speculative.


My colleagues and I initiated the BATTLE trial (Biomarker-Based Approaches of Targeted Therapy for Lung Cancer Elimination) in 2005. In the study, patients with previously diagnosed and treated lung cancer underwent a new real-time biopsy to assess biomarkers. These biomarkers were utilized in an adaptive manner to select treatment recommendations with one of several biologic agents. BATTLE was conducted before any drugs were approved with companion diagnostic biomarkers. The trial provided an important proof of principle that repeat biopsies could be performed safely and expeditiously in patients who were previously treated for lung cancer.11 Subsequent BATTLE trials have been initiated based on our initial trial.
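To make the idea of using biomarkers "in an adaptive manner" more concrete, here is a minimal sketch of outcome-adaptive randomization. It is not the statistical model used in BATTLE, which is not described in this text. It assumes a simple Beta-Bernoulli model per biomarker group and treatment arm, with assignment probabilities proportional to the posterior mean success rate. The class name, arm labels, and group labels are hypothetical.

```python
import random
from collections import defaultdict


class AdaptiveRandomizer:
    """Toy outcome-adaptive randomizer (illustrative only, not the BATTLE model)."""

    def __init__(self, arms, seed=None):
        self.arms = list(arms)
        self.rng = random.Random(seed)
        # (biomarker_group, arm) -> [successes, failures]
        self.counts = defaultdict(lambda: [0, 0])

    def posterior_mean(self, group, arm):
        """Posterior mean success rate under a uniform Beta(1, 1) prior."""
        successes, failures = self.counts[(group, arm)]
        return (successes + 1) / (successes + failures + 2)

    def assignment_probabilities(self, group):
        """Assignment probabilities proportional to posterior mean success."""
        means = [self.posterior_mean(group, arm) for arm in self.arms]
        total = sum(means)
        return {arm: mean / total for arm, mean in zip(self.arms, means)}

    def assign(self, group):
        """Randomly pick an arm for a new patient in this biomarker group."""
        probs = self.assignment_probabilities(group)
        weights = [probs[arm] for arm in self.arms]
        return self.rng.choices(self.arms, weights=weights, k=1)[0]

    def record(self, group, arm, success):
        """Update the counts once a patient's outcome is known."""
        self.counts[(group, arm)][0 if success else 1] += 1


if __name__ == "__main__":
    randomizer = AdaptiveRandomizer(["arm_A", "arm_B"], seed=0)
    for _ in range(3):
        randomizer.record("marker_positive", "arm_A", success=True)
        randomizer.record("marker_positive", "arm_B", success=False)
    print(randomizer.assignment_probabilities("marker_positive"))
    print(randomizer.assign("marker_positive"))
```

In this sketch, patients enrolled later in a biomarker group are more likely to receive the arm that has performed better so far in that group, while other biomarker groups keep their own, separate probabilities.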

I-SPY 1 and I-SPY 2

Other trials have utilized similar concepts across a variety of disease types (Table 1). The I-SPY breast cancer studies were sponsored by a combination of public and private organizations, including the Foundation for the National Institutes of Health, NCI, U.S. Food and Drug Administration (FDA), Safeway Foundation, Biomarkers Consortium, Quintiles, and QuantumLeap Health Care Collaborative. The goal of I-SPY 1 was to identify predictors of response to neoadjuvant chemotherapy for women with stage II/III cancer. A total of 356 women were enrolled; biomarker information and MRI results were obtained. Researchers found that certain tumor profiles responded better than others to particular drugs. Results from I-SPY 1 guided the design of I-SPY 2, an adaptive, randomized trial that uses early data to guide treatment decisions for those enrolled later in the trial. Approximately 800 women will be enrolled in this trial of multiple drugs.12
Table 1. Selected Umbrella and Master Trials


The NCI has launched initiatives to study patients with unique biomarkers and targeted therapy in both early- and advanced-stage cancer settings. The Adjuvant Lung Cancer Enrichment Marker Identification and Sequencing Trial (ALCHEMIST) series is focused on patients diagnosed with early-stage lung cancer. The series of trials consists of a screening trial and two treatment trials (EGFR-targeted therapies and ALK-targeted therapies). Approximately 6,000 to 8,000 patients will be enrolled in the screening trial over the next 5 to 6 years with the goal of determining if selection of therapy based on these molecular markers leads to improved survival. Tumor biopsies from these patients will be evaluated for EGFR and ALK abnormalities. Patients whose tumors harbor one of these abnormalities will be directed into the appropriate biologic treatment trial (erlotinib vs. placebo if EGFR is mutated; crizotinib vs. placebo if ALK is abnormal). Each treatment trial is expected to enroll approximately 400 patients.13



Another NCI-sponsored study, NCI MATCH (The NCI Molecular Analysis for Therapy Choice Trial), will enroll patients with advanced solid tumors (gastrointestinal stromal tumor, NSCLC, breast, gastric, melanoma, thyroid) and lymphomas. Biopsies from approximately 3,000 patients will undergo testing for abnormalities that may respond to targeted drug therapies. Of these patients, up to 1,000 will then participate in phase II clinical trials of targeted drug therapies. The patients will be matched to the drug based solely on the genetic abnormality, not on the type of cancer. NCI MATCH is a master trial, meaning that new drugs can be added to the trial at any time. The primary endpoint for this trial is response; however, progression-free survival will also be assessed.14
A new public/private cooperative effort is Lung-MAP (Lung Master Protocol). This study will enroll 500 to 1,000 patients diagnosed with advanced, previously treated squamous cell lung cancer each year. The molecular profiling is done with a central commercial panel and places patients into different arms with biologic therapies. One feature of this study is that patients who do not meet the criteria for treatment with a targeted therapy are placed into a trial involving a nontargeted investigational treatment. The primary objective of this trial is to determine if the efficacy of targeted therapy is better than that of standard therapy.15


In the NCI-sponsored M-PACT (Molecular Profiling–Based Assignment of Cancer Therapeutics trial), patients with advanced tumors that have progressed on at least one line of standard therapy undergo tumor biopsy to determine if a mutation is present. Those who do not have an identifiable mutation are removed from the trial. Those who do have a mutation are randomly assigned to receive either treatment with a drug known to target their mutation or treatment with a drug not known to target their mutation. Cross-over is allowed for those who experience disease progression after receiving treatment with a drug not known to target their mutation. Tumor response and 4-month progression-free survival are the endpoints in this trial. Accrual to this trial began in 2014 and approximately 700 patients are expected to be screened, with over 150 to be enrolled in a treatment arm. The goal is to determine whether therapies targeting a mutation can work in the metastatic setting.16
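The patient flow described for M-PACT (screen, exclude those without a mutation, randomize, allow cross-over on progression) can be sketched as a small decision routine. This is only an illustration of the flow as summarized above, not the trial's actual protocol logic. The function names and status labels are hypothetical, and the randomization ratio is not stated in this text, so it is left as a required parameter (`p_targeted`) rather than assumed.

```python
import random


def mpact_assign(has_mutation, p_targeted, rng):
    """Initial disposition after the screening biopsy.

    has_mutation: whether an identifiable mutation was found.
    p_targeted: probability of assignment to the targeted arm (an assumption;
        the ratio is not given in the text).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    if not has_mutation:
        return "removed_from_trial"
    return "targeted" if rng.random() < p_targeted else "not_targeted"


def after_progression(arm):
    """Cross-over is described only for the arm not targeting the mutation."""
    if arm == "not_targeted":
        return "cross_over_to_targeted"
    return None  # next step for the targeted arm is not described in the text


if __name__ == "__main__":
    rng = random.Random(2014)
    for has_mutation in (False, True, True):
        arm = mpact_assign(has_mutation, p_targeted=0.5, rng=rng)
        print(has_mutation, arm, after_progression(arm))
```

Keeping a randomized comparison arm, rather than giving every patient the matched drug, is what lets the trial ask whether targeting the mutation itself improves response and 4-month progression-free survival.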
There is a need to test the approach in different settings, leading to many umbrella and master studies required to answer the overarching question of whether therapies targeting molecular aberrations lead to better outcomes for patients over standard chemotherapy, and in what types of patients, tumors, or aberrations these treatments work (or do not work). The importance of these efforts is not only the testing of the specific targeted therapies, but the collection and storage of centralized tissue and molecular data. These collections will allow large amounts of data to be analyzed with the hope of someday developing predictive treatment models for patients.


Big data is defined as any voluminous amount of structured, semi-structured, and unstructured data that has the potential to be mined for information. More specifically, big data is any data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it as well as to extract value and hidden knowledge from it. Big data expands across four fronts: velocity, variety, volume, and veracity (Table 2).17,18
Table 2. The Four Fronts of Big Data
Capturing big data in databases may help formulate hypotheses for testing. Statistical testing can be performed on pre-existing data to facilitate this process. However, big data collection can be compromised by bias in medical records, lack of data validity and reliability, and technology challenges. The potential for misinterpretation of the data is considerable. Additionally, the structure of electronic medical records may be poorly suited for adequate data abstraction and is certainly not suited for numerous secondary analyses. Many institutions also have different platforms, which can impede data integration. Privacy, sharing, transparency, and stewardship are all guiding principles for big data collection and analysis.
Medicine, specifically cancer medicine, is encountering this dilemma. As the amount of information exponentially increases (patient clinical information, pathology, biomarkers, treatment outcomes, and patient questionnaires), the prospect of harnessing and processing this data is daunting. However, the potential rewards make it worth the effort. Numerous companies and researchers are working to integrate and interrogate these data sets with expectations that they will identify new opportunities for treatment, diagnosis, prognosis, and prevention. Nontraditional groups are joining the foray into this area, including Google and financial institutions, because of their ability to collect a variety of data on inordinately large numbers of individuals and variables (e.g., search, purchasing, and other behaviors).


The American Society of Clinical Oncology (ASCO) has established an important initiative called CancerLinQ™, a health information technology platform that plans to collect information from the experiences of patients with cancer. The goal of this big data collection will be to improve the quality and value of cancer care. Data will be collected from the electronic health records of patients across the United States and will be restructured and stored in a single database that will be able to provide clinical decision-making support for health care providers. The information from patients and providers will be continuously collected, thus enabling further refinement of the decision-making algorithm. ASCO has created a prototype platform and has initiated 15 practice sites to collect data from over 500,000 patient experiences.19
Big data projects like CancerLinQ include strategic, technical, and regulatory challenges, but ASCO has established a robust external advisory structure to guide its development. An example of clinical utility could be in immunotherapy treatment. As melanoma, renal cell, and lung cancer have immunotherapy as treatment options, harnessing numerous patient encounters and clinical data could create guidance for clinicians in managing not only the disease process, but also side effects from the treatment. This type of data would be valuable in helping predict responders and nonresponders, perhaps predict who will experience side effects, and more importantly, help keep patients on treatment longer, which may translate into better survival and quality of life.


In his 2015 State of the Union Address, President Barack Obama proposed a new initiative aimed at increasing our capacity to research, and ultimately deliver, precision medicine across disease types. President Obama has designated over $215 million to support the collaborative efforts of the NIH, FDA, and Office of the National Coordinator for Health Information Technology on this Precision Medicine Initiative. With the initial focus on cancer, the President has proposed to create a research cohort (partially through the use of existing resources and partially through upwards of 1 million new accruals) in which patient tissues and clinical data will be collected, stored, and analyzed. This big data will be used to enhance our existing knowledge of molecular medicine and develop individualized, molecular approaches to cancer treatment. The President's new initiative is clearly at the intersection of biomarkers and big data.20,21


The era of personalized medicine continues at a rapid pace. The practice of medicine now requires not only a clinical examination, but also molecular testing, novel therapeutics, research trial awareness, and perhaps utilization of information obtained from big data in the near future. Integration of these tools will enable providers to deliver better, more comprehensive care. Much work still needs to be performed, and more tools need to be developed and refined, to help providers and clinicians achieve the best results.
