Epidemiologic Method Review at Scale: Assessing Charlson Comorbidity Versioning Using a LLM
Joshua T Fuchs, Cara Johnson, Nathan Foster, Peter J Leese
medRxiv: the preprint server for health sciences. Published 2025-09-25. DOI: 10.1101/2025.09.23.25336010 (https://doi.org/10.1101/2025.09.23.25336010)
Abstract
Background: The Charlson Comorbidity Index (CCI) is widely used in epidemiologic studies. However, many versions of the CCI have been developed since the original method was published in 1987, and it is unclear which version is used most frequently and how version utilization in research has changed over time.
Objective: We present an approach using a Large Language Model (LLM) to extract data from articles by detecting which specific CCI version is employed.
Design: We designed a series of prompts that guided the Llama 3.3 70B Instruct model through identifying and extracting the references each article used to calculate the CCI.
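A prompt series of the kind described in Design could be sketched as below. This is a minimal illustration, not the authors' pipeline: the prompt wording, the two-step split, and the function names (`uses_cci`, `cited_cci_references`, `extract`) are all assumptions, and the model is passed in as a plain callable so that no particular Llama client API is implied.

```python
from typing import Callable, List

# Assumption: the model is any function that maps a prompt string to a
# completion string. A real Llama 3.3 70B Instruct client would be wrapped
# to match this shape.
LLM = Callable[[str], str]


def uses_cci(article_text: str, llm: LLM) -> bool:
    """Step 1 (hypothetical prompt): does the article calculate a CCI at all?"""
    prompt = (
        "Does the following article calculate a Charlson Comorbidity Index? "
        "Answer YES or NO.\n\n" + article_text
    )
    return llm(prompt).strip().upper().startswith("YES")


def cited_cci_references(article_text: str, llm: LLM) -> List[str]:
    """Step 2 (hypothetical prompt): list the references cited for the CCI."""
    prompt = (
        "List the references this article cites for how the Charlson "
        "Comorbidity Index was calculated, one per line. "
        "Answer NONE if there are none.\n\n" + article_text
    )
    lines = [line.strip() for line in llm(prompt).splitlines()]
    # Drop blank lines and the explicit NONE sentinel.
    return [line for line in lines if line and line.upper() != "NONE"]


def extract(article_text: str, llm: LLM) -> List[str]:
    """Run the prompts in sequence; skip step 2 when no CCI is used."""
    if not uses_cci(article_text, llm):
        return []
    return cited_cci_references(article_text, llm)
```

Splitting the task into a gating question and an extraction question keeps each prompt narrow, which is one plausible reading of "a series of prompts"; the abstract does not say how many prompts were used or what they contained.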
Setting: We analyzed 31,767 articles published since 2012 to evaluate the landscape of CCI implementation. The articles were sourced from the PubMed Central Open Access subset.
Measurements: For each article, we measured which version of the CCI was used, if any.
Results: Among articles that cite only a single method version, 63% cite only the original CCI publication. That publication cannot be applied in the modern real-world data era, which leaves it ambiguous how the CCI is being calculated.
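The proportion reported in Results has a specific denominator: articles citing exactly one CCI version, not all articles. The sketch below shows that calculation on invented toy data. The version labels, the input format, and the function name are assumptions for illustration, and the toy numbers are not the paper's data.

```python
from typing import Dict, Set

# Hypothetical label for the original 1987 CCI publication.
ORIGINAL = "charlson_1987"


def share_citing_only_original(versions_by_article: Dict[str, Set[str]]) -> float:
    """Among articles citing exactly one version, the fraction citing the original."""
    single = [v for v in versions_by_article.values() if len(v) == 1]
    if not single:
        raise ValueError("no article cites exactly one version")
    only_original = sum(1 for v in single if ORIGINAL in v)
    return only_original / len(single)


# Invented toy input: article id -> set of CCI versions detected in it.
toy = {
    "a": {"charlson_1987"},
    "b": {"quan_2005"},
    "c": {"charlson_1987", "quan_2005"},  # cites two versions: excluded
    "d": set(),                            # cites none: excluded
    "e": {"charlson_1987"},
}
share = share_citing_only_original(toy)  # 2 of the 3 single-version articles
```

Articles that cite several versions or none are left out of the denominator, which matches the wording "articles that cite only a single method version".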
Limitations: For articles that did not reference one of the CCI versions we searched for, we were unable to determine whether the paper used a different version, created an implementation specific to that paper, or was ambiguous about how the CCI was calculated.
Conclusion: This paper introduces a generalizable approach to scaling methods literature review beyond what is typically possible through human review. We validate the approach and demonstrate its value by applying it to the CCI.