Towards a disease prediction system: BioBERT-based medical profile representation

Rima Hatoum, Ali Alkhazraji, Z. Ibrahim, Houssein Dhayni, Ihab Sbeity
IAES International Journal of Artificial Intelligence (IJ-AI), published 2024-06-01
DOI: 10.11591/ijai.v13.i2.pp2314-2322 (https://doi.org/10.11591/ijai.v13.i2.pp2314-2322)

Abstract

Healthcare professionals are increasingly interested in predicting diseases before they manifest, since early prediction can prevent more serious health conditions and even save lives. Machine learning techniques now play an important role in healthcare, including the early prediction of diseases from prior medical knowledge. One of the biggest challenges, however, is representing medical information in a form that machine learning algorithms can process. Medical histories are often stored in formats that computers cannot read directly, so filtering this information and converting it into numerical representations is a crucial step. Advances in natural language processing have made this step easier. In this paper, we propose three representations of medical information, two of which are based on BioBERT, a recent text representation technique for the biomedical domain. We test the efficiency of these representations on the MIMIC-III database, which contains information on 46,520 patients. The study focuses on predicting coronary artery disease, and the results demonstrate the effectiveness of the proposed approach. The study highlights the importance of medical history in disease prediction and demonstrates the potential of machine learning techniques to advance healthcare.
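The abstract does not say how the BioBERT outputs are turned into a single numerical profile, so the following is only a minimal sketch of one common option: masked mean pooling of token vectors into a note vector, followed by averaging note vectors into a patient-level vector. The function names `masked_mean_pool` and `profile_vector` are hypothetical, the pooling strategy is an assumption rather than the authors' method, and small hand-written vectors stand in for real BioBERT embeddings.

```python
from typing import List, Sequence


def masked_mean_pool(token_vectors: Sequence[Sequence[float]],
                     attention_mask: Sequence[int]) -> List[float]:
    """Average the vectors of real tokens (mask == 1), ignoring padding."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have equal length")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def profile_vector(note_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-note vectors into one patient-level vector by averaging.

    Assumption: plain averaging; the paper may aggregate differently.
    """
    return masked_mean_pool(note_vectors, [1] * len(note_vectors))


# Toy 3-dimensional "token embeddings" standing in for BioBERT outputs
# (real BioBERT base vectors have 768 dimensions).
# The third token of the first note is padding and is masked out.
note_a = masked_mean_pool(
    [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0], [9.0, 9.0, 9.0]],
    [1, 1, 0],
)
note_b = masked_mean_pool([[0.0, 4.0, 2.0]], [1])

patient = profile_vector([note_a, note_b])
print(patient)  # [1.0, 2.5, 1.5]
```

The resulting fixed-length vector is the kind of numerical representation a downstream classifier could take as input; in practice the token vectors would come from a pretrained BioBERT model rather than from literals.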