Bruno Pellozo Cerqueira, Vinicius Cappellette da Silva Leite, Carla Gonzaga França, Fernando Sergio Leitão Filho, Sonia Maria Faresin, Ricardo Gassmann Figueiredo, Andrea Antunes Cetlin, Lilian Serrasqueiro Ballini Caetano, José Baddini-Martinez
DOI: 10.36416/1806-3756/e20240388
Journal: Jornal Brasileiro De Pneumologia, 51(3), e20240388
Published: 2025-09-08 (eCollection 2025)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12401147/pdf/
Evaluation of the accuracy of ChatGPT in answering asthma-related questions.
Objective: To evaluate the quality of ChatGPT answers to asthma-related questions, as assessed from the perspectives of asthma specialists and laypersons.
Methods: Seven asthma-related questions were submitted to ChatGPT (version 4) between May 3 and May 4, 2024. The questions were standardized and submitted with no memory of previous conversations to avoid bias. Six pulmonologists with extensive expertise in asthma acted as judges, independently assessing the quality and reproducibility of the answers from the perspectives of asthma specialists and laypersons. A Likert scale ranging from 1 to 4 was used, and the content validity coefficient was calculated to assess the level of agreement among the judges.
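The abstract does not state which variant of the content validity coefficient was used; a common choice for expert-panel studies like this one is the Hernández-Nieto formulation, which divides the judges' mean rating by the maximum possible score and subtracts a small correction for chance agreement. A minimal sketch under that assumption (the judge scores shown are hypothetical, not taken from the study):

```python
def content_validity_coefficient(ratings, max_score=4):
    """Hernández-Nieto content validity coefficient for one item.

    ratings: one Likert score (1..max_score) per judge.
    Returns the corrected coefficient CVCc = CVCi - Pe, where
    CVCi is the mean rating divided by the maximum score and
    Pe = (1/J)**J penalizes chance agreement among J judges.
    """
    j = len(ratings)
    cvc_i = (sum(ratings) / j) / max_score  # raw coefficient
    pe = (1 / j) ** j                       # chance-agreement correction
    return cvc_i - pe

# Six judges rating one answer on the 1-4 scale (hypothetical scores)
print(round(content_validity_coefficient([3, 3, 4, 3, 4, 3]), 3))  # 0.833
```

With six judges the correction term is negligible (about 2e-5), so an item clears the 0.80 threshold mentioned in the Results roughly when its mean rating exceeds 3.2 out of 4.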
Results: The evaluations showed variability in the quality of the answers provided by ChatGPT. From the perspective of asthma specialists, the scores ranged from 2 to 3, with greater divergence in questions 2, 3, and 5. From the perspective of laypersons, the content validity coefficient exceeded 0.80 for four of the seven questions, with most answers being correct despite a lack of significant depth.
Conclusions: Although ChatGPT performed well in providing answers to laypersons, the answers that it provided to specialists were less accurate and more superficial. Although AI has the potential to provide useful information to the public, it should not replace medical guidance. Critical analysis of AI-generated information remains essential for health care professionals and laypersons alike, especially for complex conditions such as asthma.
About the journal:
The Brazilian Journal of Pulmonology publishes scientific articles that contribute to the improvement of knowledge in the field of lung diseases and related areas.