Rima Hatoum, Ali Alkhazraji, Z. Ibrahim, Houssein Dhayni, Ihab Sbeity
Towards a disease prediction system: BioBERT-based medical profile representation
IAES International Journal of Artificial Intelligence (IJ-AI), published 2024-06-01
DOI: https://doi.org/10.11591/ijai.v13.i2.pp2314-2322
Citation count: 0
Abstract
Healthcare professionals are increasingly interested in predicting diseases before they manifest, since early prediction can prevent more serious health conditions and even save lives. Machine learning techniques now play an important role in healthcare, including the early prediction of diseases from prior medical knowledge. However, one of the biggest challenges is representing medical information in a form that machine learning algorithms can process. Medical histories are often stored in formats that computers cannot readily interpret, so filtering this information and converting it into numerical representations is a crucial step. Advances in natural language processing have made this process easier. In this paper, we propose three representations of medical information, two of which are based on BioBERT, a recent text representation technique for the biomedical domain. The efficiency of these representations is tested on the MIMIC-III database, which contains information on 46,520 patients. The study focuses on predicting coronary artery disease, and the results demonstrate the effectiveness of the proposed approach. The study highlights the importance of medical history in disease prediction and demonstrates the potential of machine learning techniques to advance healthcare.
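The abstract does not describe how the three representations are constructed, so the following is only a minimal sketch of the general idea of converting a textual medical history into a fixed-length numerical vector. It assumes a mean-pooling scheme, which may differ from what the paper does. The function `toy_embed` is a hypothetical stand-in for a real encoder such as BioBERT and carries no medical meaning; `profile_vector` and the sample history are likewise illustrative names and data, not taken from the paper or from MIMIC-III.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def toy_embed(text: str, dim: int = 4) -> Vector:
    """Hypothetical stand-in for a text encoder such as BioBERT.

    Maps text to a fixed-length vector of character-code statistics.
    It exists only to keep the sketch self-contained and runnable.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch)
    length = max(len(text), 1)
    return [v / length for v in vec]


def profile_vector(history: Sequence[str],
                   embed: Callable[[str], Vector]) -> Vector:
    """Mean-pool per-entry embeddings into one fixed-length profile vector."""
    if not history:
        raise ValueError("history must contain at least one entry")
    vectors = [embed(entry) for entry in history]
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


# Illustrative (made-up) history entries for a single patient.
history = ["hypertension", "type 2 diabetes mellitus", "hyperlipidemia"]
profile = profile_vector(history, toy_embed)
print(len(profile))  # length equals the embedding dimension
```

In a real pipeline, `toy_embed` would be replaced by a call to a pretrained biomedical encoder, and the resulting profile vector would be passed to a downstream classifier. Mean pooling is chosen here only because it yields a vector whose length does not depend on the number of history entries.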