Enhancing rare disease detection with deep phenotyping from EHR narratives: evaluation on Jeune syndrome
Carole Faviez, Xiaomeng Wang, Marc Vincent, Nicolas Garcelon, Sophie Saunier, Valérie Cormier-Daire, Xiaoyi Chen, Anita Burgun
International Journal of Medical Informatics, volume 203, Article 106021. Published 2025-06-21. DOI: 10.1016/j.ijmedinf.2025.106021
https://www.sciencedirect.com/science/article/pii/S1386505625002382
Abstract
Background
Patients with rare diseases frequently experience misdiagnoses and long diagnostic delays. Accelerating their diagnosis is essential to ensure timely access to appropriate care. Given the increasing availability of EHRs, combining artificial intelligence and deep phenotyping from large-scale clinical databases offers a promising approach to identify undiagnosed patients. This study assesses the impact of improved phenotype extraction on a screening algorithm for Jeune syndrome, a rare ciliopathy characterized by skeletal abnormalities.
Methods
Phenotypes of Jeune syndrome patients and controls were automatically extracted from the patients' unstructured EHR narratives using two thesauri separately: the standard UMLS Metathesaurus and UMLS+, an enhanced version incorporating additional terms identified through deep learning. The machine learning pipeline that we designed for classifying patients with renal ciliopathy was adapted for Jeune syndrome detection. The model was trained and tested on each of the two datasets produced by these phenotyping strategies.
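To make the comparison of the two thesauri concrete, the following is a minimal sketch of dictionary-based phenotype extraction. It is not the authors' pipeline: the concept identifiers (`TOY:…`), the term lists, the sample note, and the function `extract_phenotypes` are all hypothetical, and real extraction would also need to handle negation, family history, and spelling variants.

```python
import re

# Toy thesauri mapping a concept id to its surface terms.
# Ids and terms are illustrative placeholders, not real UMLS content.
BASE_THESAURUS = {
    "TOY:0001": ["narrow thorax"],
    "TOY:0002": ["short ribs"],
}

# Hypothetical enriched thesaurus: the same concepts with extra synonyms,
# plus one concept that the base thesaurus lacks.
EXTENDED_THESAURUS = {
    "TOY:0001": ["narrow thorax", "narrow chest", "small thoracic cage"],
    "TOY:0002": ["short ribs", "shortened ribs"],
    "TOY:0003": ["polydactyly"],
}


def extract_phenotypes(text, thesaurus):
    """Return the set of concept ids with at least one term found in text."""
    # Lowercase and collapse whitespace so multi-word terms match reliably.
    normalized = re.sub(r"\s+", " ", text.lower())
    found = set()
    for concept_id, terms in thesaurus.items():
        for term in terms:
            # Word boundaries avoid matching a term inside a longer word.
            if re.search(r"\b" + re.escape(term) + r"\b", normalized):
                found.add(concept_id)
                break
    return found


# Invented example note, written only to exercise the two thesauri.
note = ("Chest X-ray: narrow chest with shortened ribs. "
        "Postaxial polydactyly of both hands.")

base_hits = extract_phenotypes(note, BASE_THESAURUS)
extended_hits = extract_phenotypes(note, EXTENDED_THESAURUS)
print("base:", sorted(base_hits))
print("extended:", sorted(extended_hits))
```

In this toy case the base thesaurus finds nothing because the note uses synonyms it does not list, while the enriched one recovers all three concepts. A richer term inventory can change the features available to a downstream classifier in the same way.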
Results
Using UMLS+ strongly improved the classification of patients with Jeune syndrome, increasing sensitivity from 49% to 95% while maintaining 90% specificity. A review of a subset of misclassified controls showed that most of them (69%) had other genetic skeletal disorders, indicating that the model also captured patients who would benefit from referral to a bone disease geneticist.
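For readers less familiar with the two metrics reported above, here is a small sketch of how sensitivity and specificity are computed from binary labels. The labels below are invented toy data, not the study's results, and the function name `sensitivity_specificity` is an assumption made for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    sensitivity = tp / (tp + fn)  # share of true cases detected
    specificity = tn / (tn + fp)  # share of controls correctly not flagged
    return sensitivity, specificity


# Toy data: 4 cases followed by 6 controls (not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Here a "misclassified control" is a false positive (`fp`). These are the patients whose records were reviewed in the study.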
Conclusion
AI-based screening combined with high-quality deep phenotyping can help reduce diagnostic delay in rare diseases. The completeness and accuracy of phenotyping from EHRs have a strong impact on screening performance.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.