Estrogen receptor-negative (ER-), early-stage breast cancer (ESBC) patients show marked clinical heterogeneity in outcomes. Further, there have been no major advances in prognostication or prediction over the last decade. We have completed an extensive analysis of copy number imbalances (CNI) in ER-/ESBC and have developed the first practical, robust prognostic model applicable to ER-/ESBC. The primary goal of this project is to validate and, if necessary, refine our prognostic CNI model for ER-/ESBC. Overall, the project is complementary to the TCGA in that patient follow-up is much longer, a requirement for breast cancer studies, and the samples are solely from ESBC, whereas many of the samples in the TCGA are from large, advanced tumors due to the study design. The overarching hypothesis of our study is that including information on somatic events, or tumor 'genotype', will improve risk discrimination and prediction model calibration for individual ER-/ESBC patients with respect to recurrence, distant metastasis, treatment response, and overall survival. Secondarily, we hypothesize that the pattern of somatic events in ER-/ESBC will differ by epidemiological factors (race/ethnicity, age of onset, screening behaviors), providing important public health information. Three specific aims encompass the validation and refinement of prognostic/predictive models based on somatic events for ER-/ESBC, taking population structure into account. In aim 1, we will validate our current model as a fixed model in three independent sample sets for prognostication. In aim 2, we will take advantage of advanced methods for variable selection to evaluate whether we can improve model accuracy by considering interactions between somatic events and clinical factors. In aim 3, we will conduct comparative analyses of the models to assess overlap in information content and prognostic accuracy. 
We will explore the ability of the models to predict response to contemporary treatment, including taxanes and HER2-targeted therapy, with and without inclusion of HER2+ cancers. The primary translational goal of this project is to validate and refine our prognostic CNI model for ER-/ESBC to reflect current therapeutic protocols. A second translational goal is to assess the performance of our CNI prognostic model(s) in predicting treatment response. Importantly, we propose novel methods for variable selection that allow consideration of the joint effects of somatic events, epidemiologic factors, and treatment on patient outcomes, and that can be generalized to other marker discovery efforts.