Luke Farrow, Arslan Raja, Mingjun Zhong, Lesley Anderson
{"title":"A systematic review of natural language processing applications in Trauma & Orthopaedics.","authors":"Luke Farrow, Arslan Raja, Mingjun Zhong, Lesley Anderson","doi":"10.1302/2633-1462.63.BJO-2024-0081.R1","DOIUrl":null,"url":null,"abstract":"<p><strong>Aims: </strong>Prevalence of artificial intelligence (AI) algorithms within the Trauma & Orthopaedics (T&O) literature has greatly increased over the last ten years. One increasingly explored aspect of AI is the automated interpretation of free-text data often prevalent in electronic medical records (known as natural language processing (NLP)). We set out to review the current evidence for applications of NLP methodology in T&O, including assessment of study design and reporting.</p><p><strong>Methods: </strong>MEDLINE, Allied and Complementary Medicine (AMED), Excerpta Medica Database (EMBASE), and Cochrane Central Register of Controlled Trials (CENTRAL) were screened for studies pertaining to NLP in T&O from database inception to 31 December 2023. An additional grey literature search was performed. NLP quality assessment followed the criteria outlined by Farrow et al in 2021 with two independent reviewers (classification as absent, incomplete, or complete). Reporting was performed according to the Synthesis-Without Meta-Analysis (SWiM) guidelines. The review protocol was registered on the Prospective Register of Systematic Reviews (PROSPERO; registration no. CRD42022291714).</p><p><strong>Results: </strong>The final review included 31 articles (published between 2012 and 2021). The most common subspeciality areas included trauma, arthroplasty, and spine; 13% (4/31) related to online reviews/social media, 42% (13/31) to clinical notes/operation notes, 42% (13/31) to radiology reports, and 3% (1/31) to systematic review. According to the reporting criteria, 16% (5/31) were considered good quality, 74% (23/31) average quality, and 6% (2/31) poor quality. The most commonly absent reporting criteria were evaluation of missing data (26/31), sample size calculation (31/31), and external validation of the study results (29/31 papers). Code and data availability were also poorly documented in most studies.</p><p><strong>Conclusion: </strong>Application of NLP is becoming increasingly common in T&O; however, published article quality is mixed, with few high-quality studies. There are key consistent deficiencies in published work relating to NLP which ultimately influence the potential for clinical application. Open science is an important part of research transparency that should be encouraged in NLP algorithm development and reporting.</p>","PeriodicalId":34103,"journal":{"name":"Bone & Joint Open","volume":"6 3","pages":"264-274"},"PeriodicalIF":2.8000,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11879473/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bone & Joint Open","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1302/2633-1462.63.BJO-2024-0081.R1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ORTHOPEDICS","Score":null,"Total":0}
引用次数: 0
Abstract
Aims: Prevalence of artificial intelligence (AI) algorithms within the Trauma & Orthopaedics (T&O) literature has greatly increased over the last ten years. One increasingly explored aspect of AI is the automated interpretation of free-text data often prevalent in electronic medical records (known as natural language processing (NLP)). We set out to review the current evidence for applications of NLP methodology in T&O, including assessment of study design and reporting.
Methods: MEDLINE, Allied and Complementary Medicine (AMED), Excerpta Medica Database (EMBASE), and Cochrane Central Register of Controlled Trials (CENTRAL) were screened for studies pertaining to NLP in T&O from database inception to 31 December 2023. An additional grey literature search was performed. NLP quality assessment followed the criteria outlined by Farrow et al in 2021 with two independent reviewers (classification as absent, incomplete, or complete). Reporting was performed according to the Synthesis-Without Meta-Analysis (SWiM) guidelines. The review protocol was registered on the Prospective Register of Systematic Reviews (PROSPERO; registration no. CRD42022291714).
Results: The final review included 31 articles (published between 2012 and 2021). The most common subspeciality areas included trauma, arthroplasty, and spine; 13% (4/31) related to online reviews/social media, 42% (13/31) to clinical notes/operation notes, 42% (13/31) to radiology reports, and 3% (1/31) to systematic review. According to the reporting criteria, 16% (5/31) were considered good quality, 74% (23/31) average quality, and 6% (2/31) poor quality. The most commonly absent reporting criteria were evaluation of missing data (26/31), sample size calculation (31/31), and external validation of the study results (29/31 papers). Code and data availability were also poorly documented in most studies.
Conclusion: Application of NLP is becoming increasingly common in T&O; however, published article quality is mixed, with few high-quality studies. There are key consistent deficiencies in published work relating to NLP which ultimately influence the potential for clinical application. Open science is an important part of research transparency that should be encouraged in NLP algorithm development and reporting.