Risk of colorectal cancer (CRC) varies widely across the population. Statistical models have shown that wide variation in the underlying 'familial risk profile' is needed to explain the well-established familial risks of CRC. In addition, CRC risk is determined by personal characteristics and by exposure to many environmental factors. Current screening guidelines use only age and rudimentary measures of family history to stratify individuals into different screening regimens. To date, no CRC risk prediction model integrates data on family history (including ages, ages at diagnosis, and relationships across multiple generations) with all known genetic factors, let alone with the environmental factors and personal characteristics associated with CRC risk. Our goal is to develop a comprehensive risk prediction model that delivers accurate individual risk predictions across the full spectrum of risk by using explicit family history data together with data on the presence or absence of all known genetic and environmental risk factors. This model will support more appropriate screening recommendations and so reduce the burden of disease. We will build it using data from a fully characterized, population-based study in which known and suspected risk factors have been comprehensively measured, including personal characteristics, environmental risk factors, complex family history, and measured genetic variants. We have at our disposal the Colon Cancer Family Registry, a large, prospective, international resource of families and individuals who span the continuum of risk for CRC, recruited from the USA, Canada, Australia and New Zealand since 1998.
Participants have: (i) been tested for mutations in the mismatch repair genes (MLH1, MSH2, MSH6 and PMS2) and MUTYH, and genotyped for the single nucleotide polymorphisms (SNPs) known to be associated with CRC; (ii) provided a multi-generational pedigree and family cancer history, which has been validated where possible; and (iii) completed a risk factor questionnaire at baseline and every five years. We will accomplish our goal by addressing two aims. The first is to develop the CRC risk prediction model using the measured genetic information, the multi-generational cancer family history data, and the information on all known risk factors. We will apply segregation analysis and logistic regression to data from 29,701 individuals from 12,902 population-based families. The second aim is to test the validity of the CRC risk prediction model and to determine its incremental value over existing models using prospective data. We will determine how well, in terms of calibration and predictive capacity, the risks the model predicts from participants' baseline data agree with the CRC diagnoses that occur during follow-up of the sub-cohort of individuals with no history of CRC at baseline (19,390 with five-year follow-up, of whom 8,657 have 10-year follow-up). This comprehensive CRC risk prediction model will have many uses, including supporting the development of targeted, population-wide screening strategies in which recommendations on initiation age, frequency and modality are tailored to predicted risks so as to increase the value and cost-effectiveness of screening.
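As an illustration of the kind of check described in the second aim, the sketch below computes two commonly used summary measures on toy data: the expected-to-observed (E/O) ratio as a calibration measure, and the C-statistic as one possible reading of "predictive capacity". The function names, the toy risks and outcomes, and the choice of these two measures are assumptions made for illustration only; they are not taken from the study protocol. The sketch also ignores censoring and varying follow-up time, which a real prospective analysis would need to handle.

```python
from typing import Sequence


def expected_observed_ratio(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Calibration summary: expected number of cases divided by observed cases.

    A value near 1 suggests good overall calibration; above 1 means the
    model over-predicts, below 1 means it under-predicts.
    """
    expected = sum(predicted)   # sum of individual predicted risks
    cases = sum(observed)       # number of diagnoses during follow-up
    if cases == 0:
        raise ValueError("E/O ratio is undefined when no cases are observed")
    return expected / cases


def c_statistic(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Discrimination summary: probability that a randomly chosen case has a
    higher predicted risk than a randomly chosen non-case (ties count 1/2)."""
    case_risks = [p for p, y in zip(predicted, observed) if y == 1]
    noncase_risks = [p for p, y in zip(predicted, observed) if y == 0]
    if not case_risks or not noncase_risks:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in case_risks:
        for n in noncase_risks:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(case_risks) * len(noncase_risks))


# Hypothetical five-year predicted risks and outcomes (1 = CRC diagnosed).
toy_predicted = [0.01, 0.02, 0.05, 0.10, 0.20, 0.40]
toy_observed = [0, 0, 0, 1, 0, 1]

if __name__ == "__main__":
    print("E/O ratio:", expected_observed_ratio(toy_predicted, toy_observed))
    print("C-statistic:", c_statistic(toy_predicted, toy_observed))
```

The incremental value over an existing model could be examined in the same spirit by computing both measures for each model's predictions on the same individuals and comparing them.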