
New Statistical Methods to Handle Spatial Uncertainty in Cancer Risk Estimation

Yongtao Guan

2 Collaborator(s)

Funding source

National Cancer Institute (NIH)
The primary goal of this project is to develop novel statistical methods to handle spatial uncertainty in event locations when conducting cancer risk estimation. We consider two types of spatial uncertainty: 1) coarsening, due to the practice of releasing location information at the area level rather than the point level, and 2) geocoding error, resulting from the use of geographic information systems software to convert residential addresses to geographic coordinates (i.e., longitudes and latitudes). Cancer epidemiologists can extract data from many different sources, such as the census, statewide health surveys, tumor registries, and population-based case-control studies, and each source may yield data with different types of spatial uncertainty. Analytic methods are usually adversely affected by spatial uncertainty, resulting in biased parameter estimates, inflated standard errors, and reduced statistical power to detect spatial clustering and trends. To address these challenges, we propose a set of highly versatile estimation procedures that account for spatial uncertainty and efficiently combine data obtained from multiple sources. These procedures are based on established theory for estimating equations, and as such they can be easily implemented in practice. Compared with existing methods, the proposed methods are novel because 1) they permit the inclusion of individual-level risk factors for subjects with spatially uncertain locations, 2) the proposed intensity model admits a flexible semiparametric form and hence removes potentially restrictive assumptions, such as the population density being constant over small geographic areas, and 3) they explicitly account for spatial correlation in the disease locations in both parameter estimation and statistical inference.
In the substantive applications, we propose to supplement population-based case-control data with tumor registry data, census data, and statewide health survey data. To the best of our knowledge, such an approach would be the first of its kind in the field. We will implement our proposed methods in a free, user-friendly R package. The package will provide much-needed tools for more objective investigations of cancer risk factors by accounting for spatial uncertainty in the event locations. It will allow researchers to take advantage of the full spectrum of available data and to use those data more effectively to reduce the burden of disease.
