Tucker Callanan, Josue Marquez, Claire Pisani, Phillip Schmitt, John Pietro, Miaoyan Chen, John Milner, Mohammad Daher, Luka Katz, Jonathan Liu, Alan H Daniels
Journal of Bone and Joint Surgery, American Volume. Journal Article. Published 2025-05-30. DOI: 10.2106/JBJS.24.01462 (https://doi.org/10.2106/JBJS.24.01462)
Evaluating Artificial Intelligence-Based Writing Assistance Among Published Orthopaedic Studies: Detection and Trends for Future Interpretation.
Background: The integration of artificial intelligence (AI), particularly large language models (LLMs), into scientific writing has led to questions about its ethics, prevalence, and impact in orthopaedic literature. While tools have been developed to detect AI-generated content, the interpretation of AI detection percentages and their clinical relevance remain unclear. The aim of this study was to quantify AI involvement in published orthopaedic manuscripts and to establish a statistical threshold for interpreting AI detection percentages.
Methods: To establish a baseline, 300 manuscripts published in the year 2000 were analyzed for AI-generated content with use of ZeroGPT. This was followed by an analysis of 3,374 consecutive orthopaedic manuscripts published after the release of ChatGPT. A 95% confidence interval was calculated in order to set a threshold for significant AI involvement. Manuscripts with AI detection percentages above this threshold (32.875%) were considered to have significant AI involvement in their content generation.
Results: Empirical analysis of the 300 pre-AI-era manuscripts revealed a mean AI detection percentage (and standard deviation [SD]) of 10.84% ± 11.02%. Among the 3,374 post-AI-era manuscripts analyzed, 16.7% exceeded the AI detection threshold of 32.875% (2 SDs above the pre-AI-era baseline mean), indicating significant AI involvement. No significant difference was found between primary manuscripts and review studies (percentage with significant AI involvement, 16.4% and 18.2%, respectively; p = 0.40). The rate of significant AI involvement differed across journals, ranging from 5.6% in The American Journal of Sports Medicine to 38.3% in The Journal of Bone & Joint Surgery (p < 0.001).
Conclusions: This study examined AI assistance in the writing of published orthopaedic manuscripts and provides the first evidence-based threshold for interpreting AI detection percentages. Our results revealed significant AI involvement in 16.7% of recently published orthopaedic literature. This finding highlights the importance of clear guidelines, ethical standards, responsible AI use, and improved detection tools to maintain the quality, authenticity, and integrity of orthopaedic research.
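The threshold described in the Methods and Results (the pre-AI-era mean plus 2 SDs) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `detection_threshold` and `share_above` are hypothetical, the use of the sample SD is an assumption (the abstract does not say which SD was used), and the short score lists are made-up placeholders rather than study data. With the rounded summary values from the abstract, the arithmetic gives 10.84 + 2 × 11.02 ≈ 32.88; the reported 32.875% presumably reflects unrounded inputs.

```python
from statistics import mean, stdev


def detection_threshold(baseline_scores, k=2.0):
    """Return mean + k * SD of baseline AI detection percentages.

    Assumption: the sample SD (n - 1 denominator) is used; the abstract
    does not specify which SD the authors used.
    """
    return mean(baseline_scores) + k * stdev(baseline_scores)


def share_above(scores, threshold):
    """Return the fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Arithmetic check using the rounded summary statistics in the abstract.
reported_mean, reported_sd = 10.84, 11.02
approx_threshold = reported_mean + 2 * reported_sd  # approximately 32.88

# Hypothetical placeholder data (NOT from the study), for illustration only.
toy_baseline = [0.0, 10.0, 20.0]      # mean 10, sample SD 10
toy_recent = [5.0, 30.0, 31.0, 40.0]  # detection percentages to classify

toy_threshold = detection_threshold(toy_baseline)  # 10 + 2 * 10 = 30
toy_share = share_above(toy_recent, toy_threshold)

if __name__ == "__main__":
    print(f"Approximate threshold from reported values: {approx_threshold:.2f}%")
    print(f"Toy threshold: {toy_threshold:.1f}%, share above: {toy_share:.0%}")
```

Applied to real data, `baseline_scores` would hold the detector output for the 300 manuscripts from 2000, and `share_above` would be evaluated on the 3,374 post-ChatGPT manuscripts.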
About the journal:
The Journal of Bone & Joint Surgery (JBJS) has been the most valued source of information for orthopaedic surgeons and researchers for over 125 years and is the gold standard in peer-reviewed scientific information in the field. A core journal and essential reading for general as well as specialist orthopaedic surgeons worldwide, The Journal publishes evidence-based research to enhance the quality of care for orthopaedic patients. Standards of excellence and high quality are maintained in everything we do, from the science of the content published to the customer service we provide. JBJS is an independent, non-profit journal.