Cameron C. Young, Ellie Enichen, Christian Rivera, Corinne A. Auger, Nathan Grant, Arya Rao, Marc D. Succi
{"title":"定制大语言模型对罕见儿科疾病病例报告的诊断准确性","authors":"Cameron C. Young, Ellie Enichen, Christian Rivera, Corinne A. Auger, Nathan Grant, Arya Rao, Marc D. Succi","doi":"10.1002/ajmg.a.63878","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Accurately diagnosing rare pediatric diseases frequently represent a clinical challenge due to their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT-4, Gemini Pro, and a custom-built LLM (GPT-4 integrated with the Human Phenotype Ontology [GPT-4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs were assessed for accuracy in identifying specific diagnoses, listing the correct diagnosis among a differential list, and broad disease categories. In addition, GPT-4 HPO was tested on 100 general pediatrics case reports previously assessed on other LLMs to further validate its performance. The results indicated that GPT-4 was able to predict the correct diagnosis with a diagnostic accuracy of 13.1%, whereas both GPT-4 HPO and Gemini Pro had diagnostic accuracies of 8.2%. Further, GPT-4 HPO showed an improved performance compared with the other two LLMs in identifying the correct diagnosis among its differential list and the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain-specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.</p>\n </div>","PeriodicalId":7507,"journal":{"name":"American Journal of Medical Genetics Part A","volume":"197 2","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports\",\"authors\":\"Cameron C. Young, Ellie Enichen, Christian Rivera, Corinne A. Auger, Nathan Grant, Arya Rao, Marc D. Succi\",\"doi\":\"10.1002/ajmg.a.63878\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n <p>Accurately diagnosing rare pediatric diseases frequently represent a clinical challenge due to their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT-4, Gemini Pro, and a custom-built LLM (GPT-4 integrated with the Human Phenotype Ontology [GPT-4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs were assessed for accuracy in identifying specific diagnoses, listing the correct diagnosis among a differential list, and broad disease categories. In addition, GPT-4 HPO was tested on 100 general pediatrics case reports previously assessed on other LLMs to further validate its performance. The results indicated that GPT-4 was able to predict the correct diagnosis with a diagnostic accuracy of 13.1%, whereas both GPT-4 HPO and Gemini Pro had diagnostic accuracies of 8.2%. Further, GPT-4 HPO showed an improved performance compared with the other two LLMs in identifying the correct diagnosis among its differential list and the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain-specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.</p>\\n </div>\",\"PeriodicalId\":7507,\"journal\":{\"name\":\"American Journal of Medical Genetics Part A\",\"volume\":\"197 2\",\"pages\":\"\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2024-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"American Journal of Medical Genetics Part A\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/ajmg.a.63878\",\"RegionNum\":4,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"GENETICS & HEREDITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"American Journal of Medical Genetics Part A","FirstCategoryId":"99","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ajmg.a.63878","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports
Accurately diagnosing rare pediatric diseases frequently represent a clinical challenge due to their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT-4, Gemini Pro, and a custom-built LLM (GPT-4 integrated with the Human Phenotype Ontology [GPT-4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs were assessed for accuracy in identifying specific diagnoses, listing the correct diagnosis among a differential list, and broad disease categories. In addition, GPT-4 HPO was tested on 100 general pediatrics case reports previously assessed on other LLMs to further validate its performance. The results indicated that GPT-4 was able to predict the correct diagnosis with a diagnostic accuracy of 13.1%, whereas both GPT-4 HPO and Gemini Pro had diagnostic accuracies of 8.2%. Further, GPT-4 HPO showed an improved performance compared with the other two LLMs in identifying the correct diagnosis among its differential list and the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain-specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.
期刊介绍:
The American Journal of Medical Genetics - Part A (AJMG) gives you continuous coverage of all biological and medical aspects of genetic disorders and birth defects, as well as in-depth documentation of phenotype analysis within the current context of genotype/phenotype correlations. In addition to Part A , AJMG also publishes two other parts:
Part B: Neuropsychiatric Genetics , covering experimental and clinical investigations of the genetic mechanisms underlying neurologic and psychiatric disorders.
Part C: Seminars in Medical Genetics , guest-edited collections of thematic reviews of topical interest to the readership of AJMG .