Evaluating the effectiveness of transfer-learning with DeepVariant
Genomic data has become ubiquitous for bioinformaticians; however, successfully inferring biological meaning depends upon the sensitive prediction of differences between genomes. The most popular method to infer short sequence variants is the Genome Analysis Toolkit (GATK). While GATK provides rigorous guidelines, the methods require knowledge-intensive refinement as software and sequencing technologies advance. A recent advancement from Google Health Genomics called DeepVariant uses a deep neural network to call variants in human whole-genome sequence (WGS) data. In comparison to GATKv4, after training, the human genome DeepVariant model achieved a significant drop in Mendelian Inheritance Errors (MIE). MIE variants are not passed…
Tool Development for Analyzing Arrhythmias in Fast Cardiac Magnetic Resonance Scans
Cardiac magnetic resonance (CMR) scanning provides a method to diagnose cardiac disease. To obtain an effective image, the standard CMR procedure requires patients to hold their breath during scanning, which is difficult for frail patients in the clinic. Furthermore, standard CMR imaging depends upon averaging together regular cardiac cycles; irregular heartbeats disrupt this averaging, which prevents visualization of arrhythmias. HeartSpeed software will introduce a new strategy to help frail patients, magnetic resonance technicians, and physicians by enabling free-breathing CMR image post-processing that corrects for breathing motion. Closely related algorithms provide a new approach to visualize and…
Using Big Data to Identify Possible External Risk Factors for Poorly Understood Cancers
Worldwide, cancer is the second leading cause of death (Cancer, 2012). There were 17 million new cases and 9.6 million cancer deaths worldwide in 2018, including approximately 1.7 million new U.S. cases and 600,000 U.S. cancer deaths (Cancer Facts & Figures 2018 | American Cancer Society, 2018; Worldwide Cancer Statistics, 2019). The worldwide incidence of cancer is expected to increase to 27.5 million per year by 2040 (Worldwide Cancer Statistics, 2019). The U.S. expects an increase to over 1.9 million new cases per year by 2020 due to an aging Caucasian population and a growing African American population (CDC – Expected New Cancer Cases and Deaths in…
An Evaluation of Physician Burnout by EMR Use Characterization and Correlation
Burnout disproportionately affects healthcare workers and continues to rise. This condition potentially contributes to cost, quality, and patient safety risks in an already overburdened United States healthcare system. While the causes of burnout are complex, evidence points to Electronic Medical Record (EMR) use as one major contributor, because the increased clerical burden decreases patient contact time and contributes to disruption for the provider. The growth and consolidation of large-scale EMR vendors have given rise to enterprise-scale electronic medical records with workflows applied across disparate venues and specialties, further complicating the ability to optimize the physician EMR experience and leading…
Early Detection of Glaucoma Using Electronic Health Records
Glaucoma is the second leading cause of irreversible blindness worldwide. About 70 million people have glaucoma, and nearly 4.4 million people are blind from optic nerve damage due to undiagnosed glaucoma. Moreover, the current growth rate of glaucoma and its economic burden are unsustainable and warrant a systematic evaluation of glaucoma risk assessment and early prediction for better glaucoma care management. Effective use of temporal information across electronic health records (EHR) provides data-driven, evidence-based risk factors linked to glaucoma development and supports an early predictive model. In the present study, we used 830,125 unique patient records from the…
Evaluation of chronic disease self-management information on social media using evidence-based frameworks
The management of chronic diseases requires considerable patient education and self-management. Diabetes, cancer and mental illness are among the top ten searched chronic diseases on social media, a platform where people increasingly seek and disseminate information. Social media platforms such as Twitter can potentially shape online conversations and perceptions about chronic disease management. In this study, we analyze diabetes self-management (DSM) information on Twitter using AADE7™ behavioral guidelines. This study aims to illustrate that social network analysis based on such evidence-based behavioral frameworks can be used to inform the analysis of chronic disease information shared on social media. This approach…
Elucidate the Genome Evolution of Eusocial Corbiculate Bees Using Parent-progeny Sequencing Approach
Understanding the evolution of eusociality, defined by distinct reproductive and nonreproductive castes, at the molecular level has always been an essential and highly challenging topic in biology. Eusociality has evolved multiple times independently and involved many incremental steps, resulting in intermediate levels of social complexity. The Apinae (corbiculate bees) consist of four tribes with a wide range of social complexity: orchid bees (Euglossini), bumble bees (Bombini), stingless bees (Meliponini), and honey bees (Apini). This makes them an ideal group for comparative studies of eusocial evolution in Hymenoptera. The first sequenced genome of the honey bee Apis mellifera in 2006 has become a gateway for…
Analysis of snRNA-seq from CDX Models of Non-Small Cell Lung Cancer Identified Subpopulation of Cells Potentially Responsible for Tumor Progression
Circulating tumor cells (CTCs) are considered the seeds of metastasis and have the potential to be used as biomarkers in cancer. Understanding the biology of CTCs is critical for evaluating tumor progression and response to treatment. Additionally, studying the transcriptome of CTC-derived tumors aids in deciphering the causes underlying metastasis. Single-nucleus RNA sequencing (snRNA-seq) is an emerging technology that allows investigators to study individual cells with molecular typing to identify those that drive tumor growth and resistance to therapy. In this study, we use novel human CTC-derived xenograft (CDX) mouse models of non-small cell lung cancer (NSCLC) and snRNA-seq to determine genetic and cellular drivers of…
What to Learn and What to Avoid from ClinicalTrials.gov for New Trial Design When Repurposing Drugs for Precision Medicine?
Clinical trials are essential in the process of new drug development and repositioning. Because clinical trials involve significant investments of time and money, it is crucial for trial designers to carefully investigate trial settings before launching a trial. Among the 356,282 trials registered on ClinicalTrials.gov, one can search for settings similar to those of the current trial of interest and identify prior or ongoing trials that share similar patient populations, genetic characteristics, intervention means, etc. It is a wise strategy to learn from successful trials and to avoid repeating the mistakes of failed trials. For example, in our computational drug repositioning project…