Utilizing Real-World EHR Data: Early Predictive and Prescriptive Analysis for Glaucoma Using Machine Learning Methods

Electronic health records (EHRs) have emerged as a crucial source of data for translational clinical research. Meanwhile, machine learning (ML) models have advanced rapidly and are now a potent tool for analyzing EHR data, allowing researchers to harness large amounts of diverse clinical information. Various ML techniques have been developed and adapted to predict outcomes from EHR data. However, developing a reliable model depends on choosing appropriate observation and prediction windows within the sequential EHR data. This investigation applies advanced ML methods for predictive and prescriptive analysis across multiple clinical encounters to provide decision-making recommendations for personalized glaucoma care. Glaucoma is classified into two primary types: primary open-angle glaucoma (POAG) and angle-closure glaucoma. Glaucoma is the second leading cause of blindness globally, and POAG is its most common form in the United States. The disease is typically identified by a pattern of peripheral vision loss that distinguishes it from other types of vision impairment. Although the specific cause of glaucoma is not yet clear, most cases are associated with increased intraocular pressure. Studies have shown that age, race, gender, and family history are common risk factors for glaucoma. Other factors that raise the risk include medical conditions such as diabetes, high blood pressure, and heart disease, as well as prolonged corticosteroid use and other comorbidities. A better understanding of how combinations of risk factors accelerate glaucoma development is essential. In this project, we propose an evidence-based, data-driven approach that uses exploratory analysis and subgroup contrast data mining to compare glaucoma and non-glaucoma groups and to understand the comorbidities associated with glaucoma.
We will employ ML methods to build a model that classifies patients into groups based on the progressive development of glaucoma and supports early detection of its onset. We will also apply ML methods to recommend suitable medications for better treatment. We have three specific aims. Aim 1: Use data mining methods to investigate patterns and trends in risk factors associated with glaucoma. Aim 2: Develop an ML model for early prediction of glaucoma from a patient’s EHR encounters. Aim 3: Provide prescriptive analysis that yields actionable drug recommendations for the glaucoma care team. The research outcomes will help researchers and clinicians identify the most prominent contributors to glaucoma, leading to new strategies for preventing glaucoma and delivering better care.
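
As an illustration only, the observation and prediction windows referred to above could be set up roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the `Encounter` record, the `make_example` helper, the window lengths (365 and 180 days), and the code labels such as "steroid_rx" are all hypothetical and chosen only to show the idea. Encounters inside the observation window become features, and the label records whether glaucoma onset falls inside the prediction window that follows the cutoff date.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class Encounter:
    """One clinical encounter (hypothetical, simplified record)."""
    visit_date: date
    codes: Tuple[str, ...]  # placeholder labels, not real terminology codes


def make_example(
    encounters: List[Encounter],
    cutoff: date,
    onset_date: Optional[date],
    observation_days: int = 365,   # assumed window length
    prediction_days: int = 180,    # assumed window length
) -> Optional[Tuple[List[str], int]]:
    """Build (features, label) for one patient at a given cutoff date.

    Observation window: [cutoff - observation_days, cutoff)
    Prediction window:  [cutoff, cutoff + prediction_days)
    Returns None when onset precedes the cutoff, because that patient
    is no longer a candidate for early prediction.
    """
    if onset_date is not None and onset_date < cutoff:
        return None

    obs_start = cutoff - timedelta(days=observation_days)
    pred_end = cutoff + timedelta(days=prediction_days)

    # Only encounters inside the observation window contribute features,
    # which keeps information from the prediction window out of the inputs.
    observed = [e for e in encounters if obs_start <= e.visit_date < cutoff]
    features = sorted({code for e in observed for code in e.codes})

    # Positive label only if onset falls inside the prediction window.
    label = int(onset_date is not None and cutoff <= onset_date < pred_end)
    return features, label


# Toy patient history with made-up values.
visits = [
    Encounter(date(2020, 1, 10), ("diabetes",)),
    Encounter(date(2020, 8, 1), ("hypertension", "steroid_rx")),
    Encounter(date(2021, 2, 1), ("glaucoma",)),
]
example = make_example(visits, cutoff=date(2021, 1, 1), onset_date=date(2021, 2, 1))
print(example)
```

Keeping the two windows disjoint is the design point of the sketch: features are drawn strictly from before the cutoff, so the diagnosis encounter itself can never appear among the model inputs.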