Artificial Intelligence Driven Framework for the Structurization of Free-Text Diagnostic Reports
Diagnosticians record, share, and store a wealth of data on patients, diseases, and biomedical processes in free-text diagnostic reports. To continue providing advanced biomedical services, healthcare organizations should efficiently and effectively perform complex data management, aggregate data resources, and ensure the interconnectivity and interoperability of biomedical data sets. However, free text is a poor starting point for the computational analysis of complex biomedical information. For data management applications, diagnostic reports, biomedical test results, and diagnostic images must be in a structured and machine-readable format. Free-text diagnostic reports lack data structure, making it challenging to extract information and use it for medical care…
CHALLENGES AND OPPORTUNITIES IN DIABETES SELF-MANAGEMENT EDUCATION AND SUPPORT: THE ANALYSES OF DIABETES MOBILE APPLICATIONS AND PROVIDER DOCUMENTATION PATTERNS
Diabetes mellitus is one of the most prevalent chronic diseases in the United States. As a disease with long-term complications requiring changes in management, diabetes requires not only education at the time of diagnosis, but ongoing diabetes self-management education. The goal of this dissertation is to identify challenges and opportunities in diabetes self-management education and support through the analyses of diabetes mobile applications and provider documentation patterns. This dissertation includes three specific areas. First, we compared features of current diabetes mobile apps to the American Association of Diabetes Educators Self-Care Behaviors™ guidelines. A multidisciplinary team analyzed and classified the features of…
LARGE-SCALE SOYBEAN GENOME-WIDE VARIATION WORKFLOW AND ASSOCIATION ANALYSIS USING DEEP LEARNING
With advances in next-generation sequencing (NGS) technology and a significant reduction in sequencing costs, it is now possible to sequence large collections of crop germplasm to detect genome-scale genetic variations and to apply that knowledge towards trait improvement. To facilitate efficient large-scale analysis of genomic variations in NGS resequencing data, we developed a systematic solution using a high-performance computing environment, cloud data storage resources, and graphics processing unit computing with a cutting-edge deep learning approach. The solution contains an integrated and optimized variant calling workflow called ‘PGen’, a quantitative phenotype prediction model using a convolutional neural network, and an algorithm to study genome-wide association…
DATA MINING FOR GENETIC CONTRIBUTIONS TO THE ETIOLOGY OF AUTISM SUBGROUPS
Autism is a collection of complex neurological disorders characterized by behavioral, social, and cognitive deficits. Previous investigation of the etiology of autism reveals it to be a complex disorder with no simple way to identify its root cause in most affected individuals. The difficulty of determining causal variation leads to the hypothesis that multiple genetic risk factors are necessary in combination to produce the autistic phenotype. Furthermore, the immense phenotypic heterogeneity seen in autism patients leads to a second hypothesis that there exist multiple subtypes of autism with distinct genetic etiologies. We developed new methods combining strategies from bioinformatics, data science, and…
PROTEIN TRANSPORT: BIOINFORMATICS METHODS FOR UNDERSTANDING PROTEIN SUBCELLULAR LOCALIZATION
Eukaryotic cells contain diverse subcellular organelles. These organelles form distinct functional cellular compartments where different biological processes and functions are carried out. The accurate translocation of a protein is crucial to establish and maintain cellular organization and function. Newly synthesized proteins are transported to different cellular components with the assistance of protein transport machineries and complex targeting signals. Mis-localization of proteins is often associated with metabolic disorders and diseases. Compared with experimental methods, computational prediction of protein localization, utilizing different machine learning methods, provides an efficient and effective way for studying the protein subcellular localization on the whole-proteome level. Here,…
HOMOLOGY SEQUENCE ANALYSIS USING GPU ACCELERATION
A number of problems in the fields of bioinformatics, systems biology, and computational biology require abstracting physical entities into mathematical and computational models. In such studies, the computational paradigms often involve algorithms that can be solved by the Central Processing Unit (CPU). Historically, those algorithms have benefited from advances in the serial processing capabilities of individual CPU cores. However, this growth has slowed in recent years, and scaling out CPUs has been shown to be both cost-prohibitive and insecure. To overcome this problem, parallel computing approaches that employ the Graphics Processing Unit (GPU) have gained attention as complementing or replacing…
MODELING THE HIPPOCAMPUS: FINELY CONTROLLED MEMORY STORAGE USING SPIKING NEURONS
The hippocampus, an area in the temporal lobe of the mammalian brain, participates in the storage of personal memories and life events, including traumatic memories and the consequent symptoms of post-traumatic stress, giving importance to the study of the machinery of hippocampal memory storage and retrieval. The circuit is known to be controlled by the neuromodulator Acetylcholine, which switches the circuit between the memory storage state and the memory retrieval state. We built a computational model of the hippocampus with the ability to perform both memory storage and retrieval functions, controlled by the level of Acetylcholine. This functional separation decrease…
An Interventional Informatics Approach to Development and Evaluation of Population-based Health and Web Technologies
Interventional informatics is the use of health information technology (HIT) which drives evidence-based and evidence-generating practices to inform an improved health delivery system. Current HIT lacks movement towards data-driven infrastructures designed to promote information gathering, sharing, and new knowledge discovery in several areas. This thesis undertakes three specific areas where gaps exist. First, in undertaking quality improvement initiatives aligned with fidelity to program models, a web-based practice exchange was designed, built, tested and launched. Second, a systematic review of eHealth technology instruments for outcomes and evaluation components geared towards patient outcomes was conducted, uncovering gaps in the availability of psychometrically…
MUII Dissertation Defense-Awatef Ben Ramadan
Steps in Transforming the Missouri Cancer Registry (MCR) from an Incidence Registry to a Survival Registry
Female breast cancer (FBC) is the most common invasive cancer among women of all races and ethnicities in the United States (US). We aimed to estimate the FBC burden in Missouri in terms of FBC incidence, mortality, and survival rates; to visualize these results; and to assess the usability of the Missouri Cancer Registry and Research Center’s (MCR-ARC) interactive maps. FBC survival data were calculated from 2004 to 2010 after matching MCR’s FBC cases with Missouri death records, Social Security Death Index (SSDI),…