News & Announcements

Sep. 14, 2015

Operational Taxonomic Units classification: Diving into Phenetics Approach with the 16S Subunit

Operational Taxonomic Units (or OTUs) are useful approximations for taxonomic species in groups where classification is difficult. As such, OTU classifications based on DNA sequences are commonly used in metagenomics studies to describe sample diversity. Since there are no a priori definitions of what constitutes an OTU, a number of different methods have been applied for defining them. We analyze 20,229 16S rDNA subunit sequences to explore the nature of several OTU classification approaches. To do so, we first perform all possible pairwise comparisons with the Needleman-Wunsch alignment algorithm. We then construct OTU clusters using several different sampling…

Aug. 31, 2015

Researching the communication structure of online health communities with social network analysis and computational linguistics, a group informatics approach

Online communities are virtual social structures that promote communication among Internet users on various discussion subjects. Research has found that online communities make communication accessible to everyone and are highly active, with almost every Web user being a member of a forum. Online health communities connect people facing health concerns, support the exchange of health information, and offer emotional support. In health care, online support fora have been shown to enable emotional support and information sharing. Objective: This research analyzes the interactions of an online health community and studies its participants’ interests and level of engagement. The objective is to develop an informatics…

Nov. 17, 2014

Large-scale biomedical image analysis using Big Data Infrastructure

Biomedical imaging informatics involves the analysis, manipulation, and computational processing of digitally acquired biomedical images to gain knowledge and insights. Informatics technologies are being developed to help biomedical researchers identify meaningful objects from raw images, extract content, process information, discover relationships, and share knowledge. However, as the ‘Big Data’ era arrives, the ever-growing image quantity, resolution, and number of imaging modalities are challenging the already computationally intensive methods. The Big Data ecosystem is expected to accelerate computation and therefore leave more room to improve the efficiency and accuracy of image analysis, storage, retrieval, and sharing. Last but not least, researchers…

Oct. 20, 2014

Developing a Decision Support Software for CINV Prevention

The US National Center for Health Statistics estimated that more than 19 million adults in the US have been diagnosed with cancer at some point in their lives. Chemotherapy is one of the important modalities of cancer treatment. Chemotherapy-Induced Nausea and Vomiting (CINV) are the two most dreaded and unpleasant side effects of chemotherapy. CINV substantially degrades patients’ quality of life (due to dehydration, nutritional deficits, electrolyte imbalance, etc.) and increases healthcare costs (by requiring further management of CINV, including outpatient visits, drugs, hospitalization, etc.). In addition, cancer patients sometimes discontinue chemotherapy due to intolerable CINV. Thus, it is imperative to identify and treat…

Oct. 13, 2014

Knowledge Discovery System for Research Hypothesis Generation from Serendipitous Findings

From the discovery of penicillin and X-rays to the development of many of today’s chemotherapy agents, serendipitous findings tangential to the researcher’s intended purpose, the “That’s funny…” moments, have greatly impacted the health and well-being of society. As an information behavior, these unexpected findings are an example of the Opportunistic Discovery of Information (ODI). ODI has been described in many contexts, from information behavior in virtual worlds to the impact of information encountering on health behaviors. Yet little is known about instances of ODI within the context of scientific research. A major difficulty in the study of ODI is…

Oct. 6, 2014

Model-, structure-, and sequence-based methods for prediction of protein binding sites

Identification of protein-protein binding sites is important in understanding protein function. Binding site prediction methods that rely on structure are generally more accurate than those relying on sequence. However, the coverage of structure-based methods is significantly lower than that of sequence-based methods due to the lack of experimental structures. Here, we propose a sequence-based protein binding site prediction approach that draws on the benefits of structure-based methods. We utilize L1-regularized logistic regression to integrate sequence- and structure-based predictions for comparative models. The method relies on a series of features, including evaluation of comparative models, geometric features, solvent accessibility, hydrophobicity, secondary…

Sep. 15, 2014

Sequence Identity Study for Operational Taxonomic Unit Classification

In metagenomics studies of microorganisms, the Operational Taxonomic Unit (OTU) is often used as a replacement for species distinction. This pseudo-species definition is helpful when scientists would like to understand the composition and diversity of the culture in different environments. The traditional numerical taxonomy method typically defines an OTU as a cluster in a graph resulting from sequence alignment. According to this method, organisms whose 16S rDNA sequences have more than 97% sequence similarity are connected together to form a cluster. In this study, we investigate whether the traditional numerical method results in OTUs that behave as…

Sep. 8, 2014

Predictive Analytics On Medicare/Medicaid Cost Outcomes

LIGHT2 (Leveraging Information Technology for Hi-Tech and Hi-Touch Care) is a federally funded project using 24 “Nurse Care Managers” to manage the health of 10,000 Medicare and Medicaid patients. Its goal is to reduce exacerbations of chronic diseases, which would improve health outcomes while lowering healthcare costs. Analytics (“Hi-Tech”) support for the Nurse Care Managers (“Hi-Touch”) has been used to classify patients by past utilization and costs, but these are imperfect predictors of future exacerbations and increasing utilization. Mining the extensive health histories available for these patients, along with demographic and other data, reveals some expected and some surprising…

Feb. 17, 2014

A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction

The need for reliable ab initio protein secondary structure prediction is growing along with the demand for accurate tertiary structure prediction. Although recent developments have slightly outperformed previous methods of secondary structure prediction, these methods rarely surpass 80% accuracy. Developing new tools and methods to improve secondary structure prediction is essential to the improvement of tertiary structure prediction in proteins. Here we present DNSS, a secondary structure predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures. Graphics processing units and CUDA software are used to optimize the deep network architecture and efficiently…