Using Big Data to Identify Possible External Risk Factors for Poorly Understood Cancers


Julie Jaddoo

Worldwide, cancer is the second leading cause of death (Cancer, 2012). There were 17 million new cases and 9.6 million cancer deaths worldwide in 2018, including approximately 1.7 million new U.S. cases and 600,000 U.S. cancer deaths (Cancer Facts & Figures 2018 | American Cancer Society, 2018; Worldwide Cancer Statistics, 2019). The worldwide incidence of cancer is expected to increase to 27.5 million per year by 2040 (Worldwide Cancer Statistics, 2019). The U.S. expects an increase to over 1.9 million new cases per year by 2020 due to an aging Caucasian population and a growing African American population (CDC – Expected New Cancer Cases and Deaths in 2020, 2019).
There have been significant reductions in cancer mortality, thanks to improved screening, early detection, and better treatment (Cancer, 2012). However, the increase in cancer incidence will cause more suffering for patients and their families, and it will impose both an emotional and an economic burden on individuals and society. More patients will need to undergo treatment and deal with the resulting side effects, which can negatively impact their quality of life. The medical cost of cancer in the U.S. could rise to $207 billion by 2020 (Cancer Costs Projected to Reach at Least $158 Billion in 2020, 2011). The increasing burden of cancer will have a greater impact on low- and middle-income countries. These countries already bear 70% of cancer deaths, are at a financial disadvantage due to the significant cost of cancer, and lack the resources to detect and adequately treat cancer (Cancer, 2012; Torre et al., 2016).
Disease can be fought by preventing it, curing those affected by it, or caring for those who cannot be cured. The World Health Organization states that “30-50% of all cancer cases are preventable. Prevention offers the most cost-effective long-term strategy for the control of cancer” (WHO | Cancer Prevention, 2020). Cancer can be prevented by reducing exposure to environmental risk factors, modifying lifestyle factors linked to cancers, and protecting against the effects of risk-factor exposures. Because preventability has been inadequately studied for many cancer types, the objective of this study is to systematically review and identify the cancer types that pose the greatest threats but whose environmental and lifestyle risk factors are poorly understood, and to identify possible environmental and lifestyle risk factors for these cancers.
Specific Aims:
1.     Identify poorly understood cancers.
2.     Use association mining, cluster analysis, and contrast mining to identify relationships between the cancers and variables, similarities among the cancers, and differences among the cancers.
3.     Use GIS analyses to determine any relationships between cancer incidence and regulated environmental activities.

Please contact Robert Sanders for Zoom information.