Expertise or Hallucination? A Comprehensive Evaluation of ChatGPT's Aptitude in Clinical Genetics

IF 5.7 · CAS Zone 3 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

IEEE Transactions on Big Data · Pub Date: 2025-01-30 · DOI: 10.1109/TBDATA.2025.3536939

Yingbo Zhang; Shumin Ren; Jiao Wang; Chaoying Zhan; Mengqiao He; Xingyun Liu; Rongrong Wu; Jing Zhao; Cong Wu; Chuanzhu Fan; Bairong Shen

IEEE Transactions on Big Data, vol. 11, no. 3, pp. 919-932. Journal article; not open access. https://ieeexplore.ieee.org/document/10858419/

Citations: 0

Abstract

Whether viewed as an expert or as a source of ‘knowledge hallucination’, the use of ChatGPT in medical practice has stirred ongoing debate. This study sought to evaluate ChatGPT's capabilities in the field of clinical genetics, focusing on tasks such as ‘Clinical genetics exams’, ‘Associations between genetic diseases and pathogenic genes’, and ‘Limitations and trends in clinical genetics’. Results indicated that ChatGPT performed exceptionally well in question-answering tasks, particularly in clinical genetics exams and diagnosing single-gene diseases. It also effectively outlined the current limitations and prospective trends in clinical genetics. However, ChatGPT struggled to provide comprehensive answers regarding multi-gene or epigenetic diseases, particularly with respect to genetic variations or chromosomal abnormalities. In terms of systematic summarization and inference, some randomness was evident in ChatGPT's responses. In summary, while ChatGPT possesses a foundational understanding of general knowledge in clinical genetics due to hyperparameter learning, it encounters significant challenges when delving into specialized knowledge and navigating the complexities of clinical genetics, particularly in mitigating ‘Knowledge Hallucination’. To optimize its performance and depth of expertise in clinical genetics, integration with specialized knowledge databases and knowledge graphs is imperative.



Source journal

IEEE Transactions on Big Data Multiple-

CiteScore: 11.80

Self-citation rate: 2.80%

Articles published: 114

About the journal: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.