Artificial Intelligence Driven Framework for the Structurization of Free-Text Diagnostic Reports

Diagnosticians record, share, and store a wealth of data on patients, diseases, and biomedical processes in free-text diagnostic reports. To continue providing advanced biomedical services, healthcare organizations must manage complex data efficiently and effectively, aggregate data resources, and ensure the interconnectivity and interoperability of biomedical data sets. However, free text is a poor starting point for the computational analysis of complex biomedical information. For data management applications, diagnostic reports, biomedical test results, and diagnostic images must be in a structured, machine-readable format. Free-text diagnostic reports lack such structure, which makes it difficult to extract information from them and to use it for medical care and biomedical research.

To address the computational challenges of analyzing free-text biomedical reports, a novel informatics framework is introduced that transforms these reports into a machine-readable format. The utility of the framework is demonstrated in three projects. In the first project, the framework transforms free-text pathology reports into a graph representation known as a knowledge graph. In the second project, the framework is extended to model contextual information in order to recover implicit relationships among structurized diagnostic information. In the third project, an information entropy-based data mining technique is introduced to facilitate the analysis of diagnostic information across pathology reports. This informatics framework has the potential to broadly impact diagnostic medicine and to be extended to other biomedical domains.