Nidhin Nandhakumar, Ehsan Sherkat, E. Milios, Hong Gu, Michael Butler
Clinically Significant Information Extraction from Radiology Reports
Proceedings of the 2017 ACM Symposium on Document Engineering
Publication date: 2017-08-31
DOI: https://doi.org/10.1145/3103010.3103023
Citation count: 9
Abstract
Radiology reports are among the most important medical documents that a diagnostician consults, especially in the emergency context. They give emergency physicians critical information about the patient's condition and help them act immediately on urgent conditions. However, the reports are unstructured text, which makes them time-consuming for humans to interpret. We have developed a machine learning system that (a) efficiently extracts the clinically significant parts of radiology reports together with their level of importance, and (b) classifies the overall report as critical or non-critical, which helps doctors identify potential high-priority reports. As a starting point, the system uses anonymized chest X-ray reports of adults and provides three levels of importance for medical phrases. We used a Conditional Random Field (CRF) model to identify clinically significant phrases, with an average F1-score of 0.75. The proposed system includes a web-based interface that highlights the medical phrases and their level of importance for the emergency physician. The overall classification of the report uses the phrases extracted by the CRF model as features for the classifier, which achieves an average accuracy of 85%.
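The two-stage idea in the abstract (phrase tagging, then report classification on the extracted phrases) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the BIO-style tag names (`B-L1` to `I-L3` for the three importance levels), the helper names `extract_phrases` and `phrase_features`, and the made-up example sentence. The tags are written by hand in place of real CRF output, and the final classifier is omitted.

```python
from collections import Counter


def extract_phrases(tokens, tags):
    """Group BIO-tagged tokens into (phrase, level) pairs.

    Tags are assumed to look like "B-L3", "I-L3", or "O". This label
    scheme is hypothetical; the paper's actual tag set is not given
    in the abstract.
    """
    phrases = []
    current, level = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new phrase begins; close any phrase still open.
            if current:
                phrases.append((" ".join(current), level))
            current, level = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == level:
            # Continue the open phrase at the same importance level.
            current.append(token)
        else:
            # "O" or an inconsistent "I-" tag ends the open phrase.
            if current:
                phrases.append((" ".join(current), level))
            current, level = [], None
    if current:
        phrases.append((" ".join(current), level))
    return phrases


def phrase_features(phrases):
    """Count each (level, phrase) pair as a sparse feature for a report classifier."""
    return Counter(f"{level}:{phrase.lower()}" for phrase, level in phrases)


# Hand-written tags standing in for the output of a trained CRF.
tokens = ["Large", "right", "pneumothorax", "with", "no", "pleural", "effusion", "."]
tags = ["B-L3", "I-L3", "I-L3", "O", "O", "B-L1", "I-L1", "O"]

phrases = extract_phrases(tokens, tags)
features = phrase_features(phrases)
print(phrases)
print(dict(features))
```

A feature dictionary of this shape could be passed to any standard classifier to produce the critical or non-critical label; which classifier the authors used is not stated in the abstract.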