Gabriel Katz, Ofira Zloto, Avner Hostovsky, Ruth Huna-Baron, Iris Ben-Bassat Mizrachi, Zvia Burgansky, Alon Skaat, Vicktoria Vishnevskia-Dai, Ido Didi Fabian, Oded Sagiv, Ayelet Priel, Benjamin S Glicksberg, Eyal Klang
{"title":"聊天GPT与经验丰富的眼科医生:评估聊天机器人在眼科中的写作表现。","authors":"Gabriel Katz, Ofira Zloto, Avner Hostovsky, Ruth Huna-Baron, Iris Ben-Bassat Mizrachi, Zvia Burgansky, Alon Skaat, Vicktoria Vishnevskia-Dai, Ido Didi Fabian, Oded Sagiv, Ayelet Priel, Benjamin S Glicksberg, Eyal Klang","doi":"10.1038/s41433-025-03779-1","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>To examine the abilities of ChatGPT in writing scientific ophthalmology introductions and to compare those abilities to experienced ophthalmologists.</p><p><strong>Methods: </strong>OpenAI web interface was utilized to interact with and prompt ChatGPT 4 for generating the introductions for the selected papers. Consequently, each paper had two introductions-one drafted by ChatGPT and the other by the original author. Ten ophthalmology specialists with a minimal experience of more than 15 years, each representing distinct subspecialties-retina, neuro-ophthalmology, oculoplastic, glaucoma, and ocular oncology were provided with the two sets of introductions without revealing the origin (ChatGPT or human author) and were tasked to evaluate the introductions.</p><p><strong>Results: </strong>For each type of introduction, out of 45 instances, specialists correctly identified the source 26 times (57.7%) and erred 19 times (42.2%). The misclassification rates for introductions were 25% for experts evaluating introductions from their own subspecialty while to 44.4% for experts assessed introductions outside their subspecialty domain. In the comparative evaluation of introductions written by ChatGPT and human authors, no significant difference was identified across the assessed metrics (language, data arrangement, factual accuracy, originality, data Currency). The misclassification rate (the frequency at which reviewers incorrectly identified the authorship) was highest in Oculoplastic (66.7%) and lowest in Retina (11.1%).</p><p><strong>Conclusions: </strong>ChatGPT represents a significant advancement in facilitating the creation of original scientific papers in ophthalmology. The introductions generated by ChatGPT showed no statistically significant difference compared to those written by experts in terms of language, data organization, factual accuracy, originality, and the currency of information. In addition, nearly half of them being indistinguishable from the originals. 
Future research endeavours should explore ChatGPT-4's utility in composing other sections of research papers and delve into the associated ethical considerations.</p>","PeriodicalId":12125,"journal":{"name":"Eye","volume":" ","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Chat GPT vs an experienced ophthalmologist: evaluating chatbot writing performance in ophthalmology.\",\"authors\":\"Gabriel Katz, Ofira Zloto, Avner Hostovsky, Ruth Huna-Baron, Iris Ben-Bassat Mizrachi, Zvia Burgansky, Alon Skaat, Vicktoria Vishnevskia-Dai, Ido Didi Fabian, Oded Sagiv, Ayelet Priel, Benjamin S Glicksberg, Eyal Klang\",\"doi\":\"10.1038/s41433-025-03779-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>To examine the abilities of ChatGPT in writing scientific ophthalmology introductions and to compare those abilities to experienced ophthalmologists.</p><p><strong>Methods: </strong>OpenAI web interface was utilized to interact with and prompt ChatGPT 4 for generating the introductions for the selected papers. Consequently, each paper had two introductions-one drafted by ChatGPT and the other by the original author. Ten ophthalmology specialists with a minimal experience of more than 15 years, each representing distinct subspecialties-retina, neuro-ophthalmology, oculoplastic, glaucoma, and ocular oncology were provided with the two sets of introductions without revealing the origin (ChatGPT or human author) and were tasked to evaluate the introductions.</p><p><strong>Results: </strong>For each type of introduction, out of 45 instances, specialists correctly identified the source 26 times (57.7%) and erred 19 times (42.2%). The misclassification rates for introductions were 25% for experts evaluating introductions from their own subspecialty while to 44.4% for experts assessed introductions outside their subspecialty domain. In the comparative evaluation of introductions written by ChatGPT and human authors, no significant difference was identified across the assessed metrics (language, data arrangement, factual accuracy, originality, data Currency). The misclassification rate (the frequency at which reviewers incorrectly identified the authorship) was highest in Oculoplastic (66.7%) and lowest in Retina (11.1%).</p><p><strong>Conclusions: </strong>ChatGPT represents a significant advancement in facilitating the creation of original scientific papers in ophthalmology. The introductions generated by ChatGPT showed no statistically significant difference compared to those written by experts in terms of language, data organization, factual accuracy, originality, and the currency of information. In addition, nearly half of them being indistinguishable from the originals. 
Future research endeavours should explore ChatGPT-4's utility in composing other sections of research papers and delve into the associated ethical considerations.</p>\",\"PeriodicalId\":12125,\"journal\":{\"name\":\"Eye\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2025-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Eye\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1038/s41433-025-03779-1\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"OPHTHALMOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Eye","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1038/s41433-025-03779-1","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
Chat GPT vs an experienced ophthalmologist: evaluating chatbot writing performance in ophthalmology.
Purpose: To examine the ability of ChatGPT to write scientific ophthalmology introductions and to compare that ability with that of experienced ophthalmologists.
Methods: The OpenAI web interface was used to prompt ChatGPT-4 to generate introductions for the selected papers. Each paper therefore had two introductions: one drafted by ChatGPT and the other by the original author. Ten ophthalmology specialists, each with more than 15 years of experience and representing distinct subspecialties (retina, neuro-ophthalmology, oculoplastics, glaucoma, and ocular oncology), were given both sets of introductions without the origin (ChatGPT or human author) being revealed, and were asked to evaluate them.
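The study used the ChatGPT web interface, but the generation step is straightforward to reproduce programmatically. The sketch below uses the OpenAI Python client; the model name, prompt wording, and helper function are assumptions for illustration and are not the authors' protocol.

```python
# Illustrative sketch only: the study used the ChatGPT web interface, not the
# API. The model name and prompt wording here are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_introduction(title: str, abstract: str) -> str:
    """Ask the model to draft an Introduction section for a given paper."""
    prompt = (
        "Write the Introduction section of a scientific ophthalmology paper "
        f"titled '{title}'. Base it on this abstract:\n\n{abstract}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```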
Results: Out of 45 evaluations of each type of introduction, specialists correctly identified the source 26 times (57.7%) and erred 19 times (42.2%). The misclassification rate was 25% when experts evaluated introductions from their own subspecialty and 44.4% when they assessed introductions outside their subspecialty domain. In the comparative evaluation of introductions written by ChatGPT and human authors, no significant difference was identified across the assessed metrics (language, data arrangement, factual accuracy, originality, and data currency). The misclassification rate (the frequency at which reviewers incorrectly identified the authorship) was highest in oculoplastics (66.7%) and lowest in retina (11.1%).
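The reported percentages follow directly from the counts, and a quick check shows the overall identification accuracy sits close to chance. The snippet below reproduces the arithmetic and runs an exact binomial test against 50%; the abstract does not state which statistical test the authors used, so the choice of test here is an assumption for illustration only.

```python
# Reproduce the reported arithmetic and sketch one plausible significance
# check (an exact binomial test against 50% chance). The paper does not
# specify its test, so this is an illustrative assumption.
from scipy.stats import binomtest

correct, total = 26, 45
print(f"correct identifications: {correct / total:.1%}")            # 57.8% (reported as 57.7%)
print(f"misclassifications:      {(total - correct) / total:.1%}")  # 42.2%

result = binomtest(correct, total, p=0.5)
print(f"p-value vs. chance: {result.pvalue:.3f}")
```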
Conclusions: ChatGPT represents a significant advancement in facilitating the creation of original scientific papers in ophthalmology. The introductions generated by ChatGPT showed no statistically significant difference from those written by experts in terms of language, data organization, factual accuracy, originality, and currency of information. In addition, nearly half of them were indistinguishable from the originals. Future research endeavours should explore ChatGPT-4's utility in composing other sections of research papers and delve into the associated ethical considerations.
Journal introduction:
Eye seeks to provide the international practising ophthalmologist with high-quality, academically rigorous articles on the latest global clinical and laboratory-based research. Its core aim is to advance the science and practice of ophthalmology. Whilst principally aimed at the practising clinician, the journal contains material of interest to a wider readership, including optometrists, orthoptists, other health care professionals and research workers in all aspects of the field of visual science worldwide. Eye is the official journal of The Royal College of Ophthalmologists.
Eye encourages the submission of original articles covering all aspects of ophthalmology including: external eye disease; oculo-plastic surgery; orbital and lacrimal disease; ocular surface and corneal disorders; paediatric ophthalmology and strabismus; glaucoma; medical and surgical retina; neuro-ophthalmology; cataract and refractive surgery; ocular oncology; ophthalmic pathology; ophthalmic genetics.