In the LHC Molecular Genetics & Carcinogenesis Section, we have been building a Knowledge Network of lung cancer to support our ongoing efforts to enhance disease subtyping and contribute to a more detailed taxonomy that will lead to more precise clinical management of this complex disease. Our strategy is to begin by analyzing resources from our NCI-MD cohort and then validate those findings in additional cohorts throughout the world. We acquire multiple levels of genomic data from patients and controls using different types of biospecimens. After results are validated in multiple cohorts, we integrate them as part of our Information Commons. When possible, we use the same patients for different studies, as this allows full integration of the different levels of data to determine whether it improves cancer taxonomy. The multiple layers of information gathered provide a view of each component of the
1. Exposome
The identification and precise measurement of the collective contributions of exogenous (e.g., smoking, radon, environmental pollutants) and endogenous (e.g., hormones, inflammation) exposures to an individual's disease predisposition are key components of the Information Commons and provide insights into disease biology, health disparities and opportunities for intervention.
External Exposome
We identified a germline single nucleotide polymorphism (SNP) in the dopamine D1 receptor (DRD1) that modulates risk of lung cancer among individuals exposed to secondhand smoke during childhood. Interestingly, this polymorphism modulates risk in both ever smokers and never smokers. The relationship is also evident in both African Americans and European Americans.
Internal Exposome: Inflammation
We have conducted numerous studies on the role of inflammation in lung carcinogenesis. Recently, we have studied whether markers of inflammation, such as pro-inflammatory cytokines and C-reactive protein (CRP), are predictors of lung cancer diagnosis and prognosis.
In the NCI-MD study, with validation in the PLCO cohort, we demonstrated that increased levels of IL-6, CRP and IL-8 are associated with lung cancer diagnosis and that IL-8 is elevated up to 5 years before diagnosis. We have also demonstrated that a combined IL-6 and IL-8 signature is associated with poor outcome in stage I lung cancer patients, a population for which accurate predictors of outcome are both needed and lacking (Ryan BM, et al., J Thorac Oncol, In Press, 2014). In addition, our health disparity research has identified specific inflammatory profiles associated with risk in African Americans and in European Americans. Recent efforts have also integrated our biomarker data with genetic data, identifying an interaction between IL-8 and a 3'UTR SNP in the IL-8 receptor that is a putative target for miR-516a-3p binding. We also continue to study the mechanisms of this circulating inflammatory signature in cancer.
2. Genome
In collaboration with Takashi Kohno, we are investigating oncogenic fusions, which act as driver mutations in lung cancers without KRAS mutations and thus represent promising therapeutic targets for the treatment of such cancers (Nakaoku T, et al., Clin Cancer Res 20: 3087-3093, 2014).
3. Transcriptome
We have developed a cancer-related gene expression signature that is a robust prognostic classifier for stage I lung cancer. Our goal was to evaluate the expression of genes with a mechanistic role in lung cancer to increase the odds of finding a robust classifier for stage I lung cancer. We developed a classifier based on the expression of BRCA1, HIF1A, DLC1, and XPO1 that could classify patients into risk groups in 5 independent cohorts (patent pending). This classifier was significantly associated with prognosis in separate subgroup analyses of stage I, stage IA and stage IB patients, demonstrating its potential clinical utility.
We have now gone on to validate this classifier in 12 publicly available cohorts. We included every publicly available cohort with more than 30 stage I patients that we could identify. This 4-gene classifier is robust and can assign patients to different risk groups. Our mechanistic studies of RecQ helicases have also shown, for example, that regulation of gene expression by the BLM helicase correlates with the presence of G-quadruplex DNA motifs (Nguyen GH, et al., Proc Natl Acad Sci U S A 111: 9905-9910, 2014).
4. Metabolome
In collaboration with Frank Gonzalez (Laboratory of Metabolism), we have discovered that the urine metabolome is associated with lung cancer diagnosis and prognosis, and we have identified biomarkers of coordinate metabolic reprogramming in colorectal tumors in mice and humans (Manna, et al., Gastroenterology, 2014). In lung cancer, we identified four diagnostic and prognostic biomarkers for which high levels are associated with diagnosis and poor prognosis (patent pending). These four metabolites (the novel, previously unannotated creatine riboside; N-acetylneuraminic acid (NANA); cortisol sulfate; and an unidentified metabolite referred to as 561+) were uncovered as top predictors of lung cancer status independent of race (African American and European American subjects), gender, and smoking status (self-reported never, former and current smokers) (Mathe et al., Cancer Res., 2014). These results were validated in an independent sample set comprising more recently diagnosed cases (validation set) and further confirmed by quantitation (quantitation set).
MOLECULAR TAXONOMY OF LUNG CANCER
We envision that our molecular studies will lead to valuable and clinically useful tests for the assessment of an individual's risk for disease development or recurrence. One of the strengths of our program is the emphasis we place on integrating multiple parameters of molecular, clinical, and environmental data for single patients.
We devote substantial resources to this effort with the vision that complementary integration of multiple parameters will improve the clinical value of our taxonomic system. Using this multilevel platform, we can construct a rich knowledge network from which we will be able to identify biomarkers for a new, more accurate taxonomic classification that will benefit patients through improved outcomes in cancer diagnosis and treatment. Combinations of signatures across different data types within a given disease will also be hypothesis generating in terms of biology, potentially illuminating novel mechanisms and informing therapeutic strategies. We have started to integrate the diverse molecular markers with the goal of finding the most predictive classifiers of disease outcome that can potentially inform therapeutic strategies in early-stage lung cancer. We are evaluating whether combining the 4-gene classifier with other mRNA, miRNA, metabolomics and methylation signatures will result in a more robust prediction of patient prognosis. This is the case for the combination of the 4-protein-coding-gene signature and miR-21 in stage I lung cancer.
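As an illustration only, the idea of combining the 4-gene classifier with miR-21 to assign risk groups can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the classifier described above: the gene weights, the median cut-offs, the three-level grouping rule, and the function names (gene_score, median_split, combined_risk_groups) are all hypothetical, since this summary does not give the model coefficients or thresholds.

```python
from statistics import median

# HYPOTHETICAL placeholder weights: the summary does not report the
# classifier's coefficients, so these values (and their signs) are
# illustrative assumptions only.
GENE_WEIGHTS = {"BRCA1": 1.0, "HIF1A": 1.0, "DLC1": -1.0, "XPO1": 1.0}


def gene_score(expr):
    """Weighted sum of the four gene expression values for one patient."""
    return sum(weight * expr[gene] for gene, weight in GENE_WEIGHTS.items())


def median_split(values):
    """Flag each value as high (True) if it lies strictly above the cohort median."""
    cut = median(values)
    return [v > cut for v in values]


def combined_risk_groups(patients):
    """Combine the 4-gene score and miR-21 into low/intermediate/high groups.

    Assumed rule: count how many of the two markers are 'high' by median split.
    """
    high_gene = median_split([gene_score(p) for p in patients])
    high_mir21 = median_split([p["miR-21"] for p in patients])
    labels = {0: "low", 1: "intermediate", 2: "high"}
    return [labels[int(g) + int(m)] for g, m in zip(high_gene, high_mir21)]
```

Each patient is passed as a dictionary with the keys BRCA1, HIF1A, DLC1, XPO1 and miR-21. In practice, cut-offs and weights would be fixed in a training cohort and applied unchanged to validation cohorts, rather than re-derived from each cohort's median as in this sketch.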