AI-assisted statistical review of 100 oncology research articles: compliance with SAMPL guidelines

Impact Factor 3.0 · CAS Region 4 (Medicine) · JCR Q2 (Medicine, Research & Experimental)
Michal Ordak
Citations: 0

Abstract


Background

Ensuring accurate statistical reporting is critical in oncology research, where data-driven conclusions impact clinical decision-making. Despite standardized guidelines such as the Statistical Analyses and Methods in the Published Literature (SAMPL), adherence remains inconsistent. This study evaluates the performance of Gemini Advanced 2.0 Flash, an AI model, in assessing compliance with SAMPL guidelines in oncology research articles.

Methods

A total of 100 original research articles published in four peer-reviewed oncology journals (October 2024–February 2025) were analyzed. Gemini Advanced 2.0 Flash assessed adherence to ten key SAMPL guidelines, categorizing each as "not met," "partially met," or "fully met." AI evaluations were compared with independent assessments by a statistical editor, with agreement quantified using Cohen's Kappa coefficient.

Results

The overall weighted Kappa coefficient was 0.77 (95% CI: 0.60–0.94), indicating substantial agreement between AI and manual assessment. Full agreement (Kappa = 1) was found for four guidelines, including naming statistical packages and reporting confidence intervals. High agreement was observed for specifying statistical methods (Kappa = 0.85) and confirming test assumptions (Kappa = 0.75). Moderate agreement was noted for summarizing non-normally distributed data (Kappa = 0.42) and specifying test directionality (Kappa = 0.43). The lowest agreement (Kappa = 0.37) was observed in multiple comparison adjustments due to missing justifications for post hoc tests.
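The agreement statistic above is a weighted Cohen's Kappa over the three ordered categories ("not met," "partially met," "fully met"). As a minimal sketch of how such a coefficient is computed, assuming linear disagreement weights and using invented ratings (not the study's actual data):

```python
from collections import Counter

def weighted_kappa(rater1, rater2, categories):
    """Linear-weighted Cohen's kappa for two raters over ordered categories.

    kappa_w = 1 - (observed weighted disagreement) / (chance-expected weighted disagreement)
    Assumes both raters use at least two of the categories (otherwise the
    chance-expected disagreement is zero and kappa is undefined).
    """
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    joint = Counter(zip(rater1, rater2))        # observed joint rating counts
    m1, m2 = Counter(rater1), Counter(rater2)   # marginal counts per rater
    observed = expected = 0.0
    for a in categories:
        for b in categories:
            w = abs(idx[a] - idx[b]) / (k - 1)  # linear disagreement weight in [0, 1]
            observed += w * joint.get((a, b), 0) / n
            expected += w * (m1[a] / n) * (m2[b] / n)
    return 1.0 - observed / expected

# Invented example: six guideline items rated by the AI and by a statistical editor.
cats   = ["not met", "partially met", "fully met"]
ai     = ["fully met", "partially met", "not met", "fully met", "fully met", "partially met"]
editor = ["fully met", "fully met", "not met", "fully met", "partially met", "partially met"]
print(round(weighted_kappa(ai, editor, cats), 2))  # → 0.57
```

Linear weights penalize a "not met" vs. "fully met" disagreement twice as heavily as an adjacent-category one, which is why weighted kappa is the conventional choice for ordinal rating scales like this one.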

Conclusion

AI-assisted evaluation showed substantial agreement with expert assessment, demonstrating its potential in statistical review. However, discrepancies in specific guidelines suggest human oversight remains essential for ensuring statistical rigor in oncology research. Further refinement of AI models may enhance their reliability in scientific publishing.
Source journal

Current Research in Translational Medicine
Subject area: Biochemistry, Genetics and Molecular Biology (General)
CiteScore: 7.00
Self-citation rate: 4.90%
Articles per year: 51
Review time: 45 days
Journal description: Current Research in Translational Medicine is a peer-reviewed journal publishing worldwide clinical and basic research in the fields of hematology, immunology, infectiology, hematopoietic cell transplantation, and cellular and gene therapy. The journal considers for publication English-language editorials, original articles, reviews, and short reports, including case reports. Contributions are intended to draw attention to experimental medicine and translational research. Current Research in Translational Medicine periodically publishes thematic issues and is indexed in all major international databases (2017 Impact Factor: 1.9). Core areas covered are: hematology, immunology, infectiology, hematopoietic cell transplantation, and cellular and gene therapy.