Improving interpretation of Genome-Wide Association Studies (GWAS) by quantifying Marker Recurrence
To understand biological differences within groups of people or animals, we often turn to DNA. A genome-wide association study (GWAS) can assess the genetic contribution to biological differences between individuals. However, the scale of input data continues to expand in three ways: the sequence coverage of genomes, the number of individuals sequenced, and the number of phenotype records per individual. High-throughput workflows are computationally intensive and require laborious interpretation of results. These barriers inhibit the systematic investigation of hypotheses and limit the effective translation of genetics into biomedical and agricultural solutions. The expansion of data analyzed, compounded by numerous analysis approaches,…
Implementing GeoARK: The Geospatial Analytical Research Knowledgebase
This research focuses on the development and implementation of an interface to the Geospatial Analytical Research Knowledgebase (GeoARK), a spatially enabled big data informatics approach assembled around applications in health research and analytics. Example applications in telehealth reach, COVID-19 risk in rural settings, pathways for zoonotic disease spread, and contextual leukemia research will be provided. GeoARK was designed and created within the University of Missouri’s Institute for Data Science and Informatics. Being spatially engendered, its core is data that is pre-processed, cleaned, integrated, and represented in its spatial context as millions of point locations. To this core, additional…
An R-based platform for the visualization and analysis of single molecule tracking experiments
Single molecule tracking (SMT) is a single-molecule fluorescence imaging technique that allows the exploration of molecular motion at high spatiotemporal resolution on living cells. It is widely used to define the dynamics of individual tumor cell-surface receptors. Spatiotemporal regulation of many of these receptors varies across cancer types, playing a key role in tumor progression and drug resistance. Many tools can be used to identify trajectories and calculate their features from these experiments. However, there are relatively few tools to analyze these data. Thus, the present study uses a set of live-cell single-molecule imaging experiments with a model…
Artificial Intelligence Driven Framework for the Structurization of Free-Text Diagnostic Reports
Diagnosticians record, share, and store a wealth of data on patients, diseases, and biomedical processes in free-text diagnostic reports. To continue providing advanced biomedical services, healthcare organizations should efficiently and effectively perform complex data management, aggregate data resources, and ensure the interconnectivity and interoperability of biomedical data sets. However, free text is a poor starting point for the computational analysis of complex biomedical information. For data management applications, diagnostic reports, biomedical test results, and diagnostic images must be in a structured and machine-readable format. Free-text diagnostic reports lack data structure, making it challenging to extract information and use it for medical care…
Cancer Research Funding: Where is the Money?
Cancer is one of the most common and deadly diseases, and its incidence is increasing. There are over one hundred types of cancer, and they vary in their impact on society and on those affected. Some have known, preventable causes and some are poorly understood. Some can be detected early and some are only detected at an advanced stage. Some are very treatable and some have a very high mortality rate. To level the playing field among cancers, more research is needed on the poorly understood ones. What is the present state of cancer research funding?…
Quantifying the Predictive Value of Categories of Neighborhood-Level Risk Factors to Predict Health Outcomes
A person’s environmental context is well known to impact health outcomes. However, this information is rarely available in an acute clinical care setting. Although a growing body of literature combines the information available in the Electronic Medical Record (EMR) with environmental and place-based data to better examine the environmental impact on health, these studies generally focus on a single index or category of environmental data, thus failing to take into account the richness of geo-spatial data that are available, as well as the underlying interactions of multiple community risk factors. Recent improvements in access to geo-spatial data at multiple layers…
A Case-Control based Genomic Analysis of Chronic Obstructive Pulmonary Disease
Chronic Obstructive Pulmonary Disease (COPD) is a respiratory illness that affects millions of people all over the world. It is a major cause of chronic morbidity and mortality and a serious global public health problem. COPD is the fourth leading cause of death worldwide. Although the environmental causes of COPD, which predominantly include cigarette smoking, are well documented, the genetic underpinnings of COPD remain largely unknown to this date. Furthermore, in the current landscape of a respiratory pandemic, COPD patients are at a much higher risk of developing other respiratory illnesses and co-morbidities. In this study, we use genomic data from…
Proof tree contrast mining for automatic hypothesis generation
The probability of developing almost any given disease is affected by multiple risk factors. These risk factors often do not behave independently, instead interacting in specific ways which affect the probability of developing the disease. To better understand the root causes of many diseases, it is necessary to study these interactions as they may provide clues about the underlying mechanisms responsible for the development of the disease. We have developed a method for studying these interactions by applying contrast mining to extract patterns of nested logical interactions associated with specific medical outcomes. We demonstrate the effectiveness of this method in…
Bioinformatic Prediction of the Potential SARS-CoV-2 Receptors in Human
Unraveling the receptors used by SARS-CoV-2 for entry and the exact positively selected sites on the Spike (S) protein associated with this process can provide insights into viral transmission and reveal therapeutic targets. In addition to ACE2, accumulating evidence indicates that the S protein potentially recognizes other receptors such as CD147. Therefore, methods for identifying new receptors are urgently needed. To address this need, the following three aspects will be explored in this proposal. First, using the increasing volume of genome sequence data to investigate evolution and selection patterns and to assess their influence on the structure and function of the S protein. Second, integrating…