{"title":"Identifying patterns of associated-conditions through topic models of Electronic Medical Records","authors":"Moumita Bhattacharya, C. Jurkovitz, H. Shatkay","doi":"10.1109/BIBM.2016.7822561","DOIUrl":null,"url":null,"abstract":"Multiple adverse health conditions co-occurring in a patient are typically associated with poor prognosis and increased office or hospital visits. Developing methods to identify patterns of co-occurring conditions can assist in diagnosis. Thus, identifying patterns of association among co-occurring conditions is of growing interest. In this paper, we report preliminary results from a data-driven study, in which we apply a machine learning method, namely, topic modeling, to Electronic Medical Records (EMRs), aiming to identify patterns of associated conditions. Specifically, we use the well-established Latent Dirichlet Allocation (LDA), a method based on the idea that documents can be modeled as a mixture of latent topics, where each topic is a distribution over words. In our study, we adapt the LDA model to identify latent topics in patients' EMRs. We evaluate the performance of our method both qualitatively and quantitatively, and show that the obtained topics indeed align well with distinct medical phenomena characterized by co-occurring conditions.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBM.2016.7822561","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
Multiple adverse health conditions co-occurring in a patient are typically associated with poor prognosis and increased office or hospital visits. Developing methods to identify patterns of co-occurring conditions can assist in diagnosis. Thus, identifying patterns of association among co-occurring conditions is of growing interest. In this paper, we report preliminary results from a data-driven study, in which we apply a machine learning method, namely, topic modeling, to Electronic Medical Records (EMRs), aiming to identify patterns of associated conditions. Specifically, we use the well-established Latent Dirichlet Allocation (LDA), a method based on the idea that documents can be modeled as a mixture of latent topics, where each topic is a distribution over words. In our study, we adapt the LDA model to identify latent topics in patients' EMRs. We evaluate the performance of our method both qualitatively and quantitatively, and show that the obtained topics indeed align well with distinct medical phenomena characterized by co-occurring conditions.