The inappropriateness of internal consistency testing and factor analysis for formative indicators: comment on Felicia et al, 2024
Xin Meng, Daqiu Wang, Yan Huo, Wenhan Shang, Aiping Wang
Journal of Clinical Epidemiology 181 (2025), Article 111748. doi:10.1016/j.jclinepi.2025.111748. Published February 28, 2025.
Large language models for conducting systematic reviews: on the rise, but not yet ready for use—a scoping review
Judith-Lisa Lieberum, Markus Töws, Maria-Inti Metzendorf, Felix Heilmeyer, Waldemar Siemens, Christian Haverkamp, Daniel Böhringer, Joerg J. Meerpohl, Angelika Eisele-Metzger
Journal of Clinical Epidemiology 181 (2025), Article 111746. doi:10.1016/j.jclinepi.2025.111746. Published February 26, 2025. Open access.

Background and Objectives: Machine learning promises versatile help in the creation of systematic reviews (SRs). Recently, further developments in the form of large language models (LLMs) and their application in SR conduct have attracted attention. We aimed to provide an overview of LLM applications in SR conduct in health research.

Methods: We systematically searched MEDLINE, Web of Science, IEEE Xplore, ACM Digital Library, Europe PMC (preprints), and Google Scholar, and conducted an additional hand search (last search: February 26, 2024). We included scientific articles in English or German, published from April 2021 onwards, building on the results of a mapping review that had not yet identified LLM applications to support SRs. Two reviewers independently screened studies for eligibility; after piloting, one reviewer extracted data, which a second reviewer checked.

Results: Our database search yielded 8054 hits, and we identified 33 articles from our hand search. We finally included 37 articles on LLM support. LLM approaches covered 10 of 13 defined SR steps, most frequently literature search (n = 15, 41%), study selection (n = 14, 38%), and data extraction (n = 11, 30%). The most frequently used LLM was Generative Pretrained Transformer (GPT) (n = 33, 89%). Validation studies predominated (n = 21, 57%). In half of the studies, authors evaluated LLM use as promising (n = 20, 54%), one-quarter as neutral (n = 9, 24%), and one-fifth as nonpromising (n = 8, 22%).

Conclusion: Although LLMs show promise in supporting SR creation, fully established or validated applications are often lacking. The rapid increase in research on LLMs for evidence synthesis production highlights their growing relevance.

Plain Language Summary: Systematic reviews are a crucial tool in health research where experts carefully collect and analyze all available evidence on a specific research question. Creating these reviews is typically time- and resource-intensive, often taking months or even years to complete, as researchers must thoroughly search, evaluate, and synthesize an immense number of scientific studies. For the present article, we conducted a review to understand how new artificial intelligence (AI) tools, specifically large language models (LLMs) like Generative Pretrained Transformer (GPT), can be used to help create systematic reviews in health research. We searched multiple scientific databases and finally found 37 relevant articles. We found that LLMs have been tested to help with various parts of the systematic review process, particularly in 3 main areas: searching scientific literature (41% of studies), selecting relevant studies (38%), and extracting important information from these studies (30%). GPT was the most commonly used LLM, appearing in 89% of the studies.
{"title":"Editorial, April 2025","authors":"David Tovey, Andrea C. Tricco","doi":"10.1016/j.jclinepi.2025.111749","DOIUrl":"10.1016/j.jclinepi.2025.111749","url":null,"abstract":"","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"180 ","pages":"Article 111749"},"PeriodicalIF":7.3,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143609145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Two-decade health-related quality of life and performance on physical function tests in midaged women: findings from a prospective cohort study
Leticia W. Ribeiro, Gregore I. Mielke, Jenny Doust, Gita D. Mishra
Journal of Clinical Epidemiology 181 (2025), Article 111730. doi:10.1016/j.jclinepi.2025.111730. Published February 25, 2025. Open access.

Objectives: In younger populations, health-related quality of life (HRQoL) is measured more commonly than performance on objective physical function tests, and the associations between HRQoL and performance on such tests are unclear. This study investigates the association between HRQoL measures across adulthood and performance on physical function tests in midaged women.

Methods: Data were derived from 499 women born during 1973–1978 in the Menarche-to-PreMenopause Study, a substudy of the Australian Longitudinal Study on Women's Health. HRQoL was assessed every 3 years from ages 18–23 years to 40–45 years using the eight Short Form Health Survey subscales. Generalized estimating equation (GEE) models examined the associations between HRQoL over 22 years and three performance tests at a mean age of 44.6 years: handgrip strength (kg), chair rise (sec), and standing balance (sec). Worse performance was defined as the lowest tertile of the sample.

Results: Several HRQoL subscales showed longitudinal associations with performance. Repeatedly lower scores on nearly all subscales, except social functioning and mental health, were linked to worse chair rise performance. Bodily pain was associated with all three tests; women reporting more pain across the 22-year follow-up had 50% higher odds of worse chair rise performance and 30% higher odds of both worse handgrip strength and worse balance. Women with lower physical functioning scores had higher odds of worse grip (odds ratio 1.4, 95% CI 1.1–1.9) and worse chair rise performance (odds ratio 1.4, 95% CI 1.4–2.6).

Conclusion: Poorer HRQoL from early-to-mid adulthood was associated with worse physical performance in midaged Australian women, particularly on the chair rise test.
Assessing the quality of prediction models in health care using the Prediction model Risk Of Bias ASsessment Tool (PROBAST): an evaluation of its use and practical application
Tabea Kaul, Johanna A.A. Damen, Laure Wynants, Ben Van Calster, Maarten van Smeden, Lotty Hooft, Karel G.M. Moons
Journal of Clinical Epidemiology 181 (2025), Article 111732. doi:10.1016/j.jclinepi.2025.111732. Published February 25, 2025. Open access.

Background and Objectives: Since 2019, the Prediction model Risk Of Bias ASsessment Tool (PROBAST; www.probast.org) has supported methodological quality assessments of prediction model studies. Most prediction model studies are rated at "high" risk of bias (ROB), and researchers report low interrater reliability (IRR) using PROBAST. We aimed to (1) assess the IRR of PROBAST ratings between assessors of the same study and understand reasons for discrepancies, (2) determine which items contribute most to domain-level ROB ratings, and (3) explore the impact of consensus meetings.

Study Design and Setting: We used PROBAST assessments from a systematic review of diagnostic and prognostic COVID-19 prediction models as a case study. Assessors were international experts in prediction model studies or their reviews. We assessed IRR using prevalence-adjusted bias-adjusted kappa (PABAK) before consensus meetings, examined item ratings per domain-level ROB judgment, and evaluated the impact of consensus meetings by identifying rating changes after discussion.

Results: We analyzed 2167 PROBAST assessments from 27 assessor pairs covering 760 prediction models: 384 developments, 242 validations, and 134 mixed assessments (including both). IRR as measured by PABAK was higher for overall ROB judgments (development: 0.82 [0.76; 0.89]; validation: 0.78 [0.68; 0.88]) than for domain- and item-level judgments. Some PROBAST items frequently contributed to domain-level ROB judgments, eg, 3.5 Outcome blinding and 4.1 Sample size. Consensus discussions mainly led to item-level changes and never to overall ROB rating changes.

Conclusion: Within this case study, PROBAST assessments showed high IRR at the overall ROB level, with some variation at the item and domain levels. To reduce variability, PROBAST assessors should standardize item- and domain-level judgments and hold well-structured consensus meetings between assessors of the same study.

Plain Language Summary: The Prediction model Risk Of Bias ASsessment Tool (PROBAST; www.probast.org) provides a set of items to assess the quality of medical studies on so-called prediction tools that calculate an individual's probability of having or developing a certain disease or health outcome. Previous research found low interrater reliability (IRR; ie, how consistently two assessors rate aspects of the same study) when using PROBAST. To understand why this is the case, we conducted a large study involving more than 30 experts from around the world, all of whom applied PROBAST to the same set of prediction tool studies. Based on more than 2150 PROBAST assessments, we identified which PROBAST items led to the most disagreements between raters and explored reasons for these disagreements.
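PABAK itself is a one-line adjustment of observed agreement: it replaces the chance-agreement term in Cohen's kappa with 1/k for k rating categories, reducing to 2 × observed agreement − 1 for binary ratings. A small sketch of the statistic, using invented ratings rather than the study's assessments:

```python
# Prevalence-adjusted bias-adjusted kappa (PABAK) for paired ratings.
# The example ratings below are made up, not the study's data.
def pabak(rater_a, rater_b, k=2):
    """PABAK = (k * Po - 1) / (k - 1), where Po is observed agreement."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    po = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
    return (k * po - 1) / (k - 1)

# e.g. two assessors' overall ROB judgments (1 = high risk, 0 = low risk)
a = [1, 1, 0, 1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 1, 1, 1, 0]
print(pabak(a, b))  # 7/8 agreement -> PABAK = 0.75
```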
Application of methodological strategies to address unmeasured confounding in real-world vaccine safety and effectiveness study: a systematic review
Jinxin Guo, Tiansheng Wang, Hui Cao, Qinyi Ma, Yuchuan Tang, Tong Li, Lu Wang, Yang Xu, Siyan Zhan
Journal of Clinical Epidemiology 181 (2025), Article 111737. doi:10.1016/j.jclinepi.2025.111737. Published February 25, 2025.

Objectives: Evaluations of vaccine safety and effectiveness using real-world data are often challenged by unmeasured confounding. This study aimed to review the application of methods to address unmeasured confounding in observational vaccine safety and effectiveness research.

Study Design and Setting: We conducted a systematic review (PROSPERO: CRD42024519882) and searched PubMed, Web of Science, Embase, and Scopus for epidemiological studies investigating influenza and COVID-19 vaccines as exposures and respiratory and cardiovascular diseases as outcomes, published between January 1, 2017, and December 31, 2023. Data on study design and statistical analyses were extracted from eligible articles.

Results: A total of 913 studies were included, of which 42 (4.6%, 42/913) accounted for unmeasured confounding through statistical correction (31.0%, 13/42) or confounding detection or quantification (78.6%, 33/42). Negative controls were employed in 24 (57.1%, 24/42) studies (2 [8.3%, 2/24] for confounding correction and 22 [91.7%, 22/24] for confounding detection or quantification), followed by the E-value (31.0%, 13/42), prior event rate ratio (11.9%, 5/42), regression discontinuity design (7.1%, 3/42), instrumental variables (4.8%, 2/42), and difference-in-differences (2.4%, 1/42). A total of 871 (95.4%, 871/913) studies did not address unmeasured confounding, although 38.9% (355/913) reported it as a study limitation.

Conclusion: Unmeasured confounding in real-world vaccine safety and effectiveness studies remains underexplored. Current research primarily employs confounding detection or quantification, notably negative controls and the E-value, which do not yield adjusted effect estimates. While some studies used correction methods such as instrumental variables, regression discontinuity designs, and negative controls, these face challenges from their stringent assumptions. Future efforts should prioritize developing valid methodologies to mitigate unmeasured confounding.
Performance of the revised World Health Organization cardiovascular disease risk prediction models for the Middle East and North Africa: a validation study in the Tehran Lipid and Glucose Study
Mahin Nomali, Mehdi Yaseri, Saharnaz Nedjat, Fereidoun Azizi, Mohammad Ali Mansournia, Hossein Navid, Goodarz Danaei, Mark Woodward, Noushin Fahimfar, Ewout Steyerberg, Davood Khalili
Journal of Clinical Epidemiology 182 (2025), Article 111736. doi:10.1016/j.jclinepi.2025.111736. Published February 25, 2025.

Objectives: We aimed to evaluate the performance of the revised World Health Organization (WHO) models in predicting the 10-year risk of cardiovascular disease (CVD) in Iran, as part of the Middle East and North Africa (MENA) region.

Study Design and Setting: We analyzed data from the Tehran Lipid and Glucose Study (TLGS), including 5162 participants (2241 men) aged 40–80 years without CVD at baseline (the third examination, 2006–2008), for the occurrence of CVD (myocardial infarction, coronary heart disease death, and stroke). We assessed the statistical performance of the original and regionally recalibrated models, both laboratory- and non-laboratory-based, using discrimination (C-statistic) and calibration (calibration plot and observed-to-expected [O:E] ratio), and their clinical performance using net benefit (NB), the proportion of true positives (TP) in the total population penalized by a weighted count of false positives (FP).

Results: During the 10-year follow-up, 307 CVD events occurred. The cumulative incidence of CVD was 9.0% (95% CI: 8.0%–10.0%) in men and 4.0% (3.0%–5.0%) in women. For the laboratory-based model, the C-statistic was 0.72 (0.68–0.75) in men and 0.83 (0.80–0.86) in women; for the non-laboratory-based model, it was 0.70 (0.66–0.73) and 0.82 (0.79–0.86), respectively. The laboratory-based model underpredicted risk (O:E = 1.20 [1.00–1.33] for men and 1.40 [1.13–1.60] for women). At a risk threshold of 10%, the NB of the laboratory-based model was 0.03 (0.02–0.04) for men and 0.01 (0.004–0.01) for women; these values became zero or negative at thresholds over 20%. Regionally recalibrated models overestimated risk (O:E < 1) and showed lower NB.

Conclusion: The loss of specificity was not sufficiently offset by the gain in sensitivity of the regionally recalibrated models compared with the original models.

Plain Language Summary: In this study, we assessed the performance of the World Health Organization (WHO) cardiovascular disease (CVD) risk models in Iran, which is part of the Middle East and North Africa (MENA) region. Regarding the statistical performance of the models, both the original and regionally recalibrated WHO models had good discriminative ability. Concerning calibration, another component of statistical performance, the original models underestimated the actual risk, while the recalibrated versions overestimated it. Regarding the clinical performance of the models, both the original and regionally recalibrated versions were clinically useful at a risk threshold of 10%.
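The two validation metrics reported here, the O:E ratio and net benefit, are straightforward to compute from observed outcomes and predicted risks. A toy sketch on invented data (not the TLGS cohort):

```python
# Calibration (O:E ratio) and clinical utility (net benefit) for a risk model.
# The outcome and risk arrays below are fabricated toy values.
import numpy as np

def oe_ratio(y, p):
    """Observed events / expected events (sum of predicted risks).
    O:E > 1 indicates underprediction; O:E < 1 indicates overprediction."""
    return y.sum() / p.sum()

def net_benefit(y, p, threshold):
    """NB = TP/N - FP/N * pt/(1-pt): true positives per person, with false
    positives penalized by the odds of the decision threshold pt."""
    n = len(y)
    treat = p >= threshold
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

y = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])  # observed 10-year CVD events
p = np.array([0.05, 0.12, 0.31, 0.08, 0.22,
              0.04, 0.15, 0.40, 0.02, 0.09])  # model-predicted risks
print(oe_ratio(y, p))           # 3 / 1.48 ~ 2.0 -> underprediction in this toy data
print(net_benefit(y, p, 0.10))  # NB at the 10% threshold used in the study
```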
Patient- and public-driven health research: a model of coleadership and partnership in research priority setting using a modified James Lind Alliance approach
Wasifa Zarin, Sharmila Sreetharan, Amanda Doherty-Kirby, Michael Scott, Elaine Zibrowski, Charlene Soobiah, Meghan Elliott, Sabrina Chaudhry, Safa Al-Khateeb, Clara Tam, Ba Pham, Sharon E. Straus, Andrea C. Tricco
Journal of Clinical Epidemiology 181 (2025), Article 111731. doi:10.1016/j.jclinepi.2025.111731. Published February 25, 2025. Open access.

Objectives: To describe the Strategy for Patient-Oriented Research Evidence Alliance's methodological approach to systematically identify 23 high-priority health research topics (three in 2021 and 20 in 2023) from patient partners (including caregivers) and members of the public across Canada and beyond.

Study Design and Setting: In 2021 and 2023, we collaborated with patient and public partners to co-design and co-conduct two priority-setting initiatives. These initiatives involved a diverse group of patients, members of the public, clinicians, researchers, and health system decision-makers in systematically and collectively prioritizing research topics based on their perceived importance and anticipated impact. We used a modified James Lind Alliance approach in which all participants were engaged as equal partners. The prioritization process consisted of the following steps:
1) identifying and collecting research priorities from patients and the public;
2) summarizing the research priorities gathered;
3) conducting semistructured interviews (one-on-one or focus groups, depending on the number of submissions for each unique topic), searching the literature on each topic for relevant knowledge syntheses, appraising the quality of the evidence using the AMSTAR 2 (A MeaSurement Tool to Assess systematic Reviews) checklist, and preparing lay summaries (1–2 pages) for each unique topic using a predefined template cocreated with patient partners;
4) conducting a priority-setting exercise with a multidisciplinary panel, consisting of an interim rating questionnaire scoring each topic on nine questions followed by a virtual workshop to reach consensus on the final rating and ranking of topics; and
5) facilitating research by funding selected topics and providing capacity-building support to research teams.
In 2023, we conducted a formal process evaluation of engagement, transparency, information management, and considerations of values and context.

Results: A total of 98 topics were received across the two research priority-setting initiatives. Approximately half of the submissions came from individuals who identified as patients (2021: 45% [n = 5]; 2023: 52% [n = 45]); the rest identified as caregivers or members of the public. Topics spanned 26 health themes, with arthritis and osteoporosis (27% [n = 3]) the most common theme in 2021 and quality of care (26% [n = 23]) in 2023. Twenty-three of the 98 topics submitted by patients and the public were selected as priorities. The formal process evaluation in 2023 found that 85% of respondents who participated in the priority-setting panel "strongly agreed" that their experience was valuable and that they would participate again in a future initiative. The 23 prioritized projects are currently being co-led with the patient and public partner topic submitters.
{"title":"Artificial intelligence to semiautomate trustworthiness assessment of randomized controlled trials: response to Au et al.","authors":"Hinpetch Daungsupawong, Viroj Wiwanitkit","doi":"10.1016/j.jclinepi.2025.111734","DOIUrl":"10.1016/j.jclinepi.2025.111734","url":null,"abstract":"","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":" ","pages":"111734"},"PeriodicalIF":7.3,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143494752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carbon emissions associated with clinical trials: a scoping review
Frank You, Taylor Coffey, Daniel Powell, Paula R. Williamson, Katie Gillies
Journal of Clinical Epidemiology 181 (2025), Article 111733. doi:10.1016/j.jclinepi.2025.111733. Published February 22, 2025. Open access.

Objectives: To review and synthesize available evidence on carbon emissions associated with clinical trials to inform future research on the design and delivery of greener trials.

Study Design and Setting: We performed a scoping review following the Joanna Briggs Institute guidance and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews. A systematic search was conducted in MEDLINE (Ovid) from January 1, 2007, to April 15, 2024, with no geographic or language restrictions, complemented by forward and backward citation analysis (snowballing). We included all types of research literature in the context of clinical trials reporting any aspect of trial-specific carbon emissions.

Results: Twenty-two articles were eligible and included in the review. Most included studies (n = 17, 77%) were published between 2020 and 2024. Over half of the included studies (n = 13, 59%) were primary research articles, the majority reporting carbon audits of trials and their associated processes. The remaining literature comprised secondary studies (n = 3, 14%) and opinion pieces (n = 6, 27%). Diverse and evolving approaches to studying trial-related carbon emissions were identified, alongside several carbon hotspots, including those associated with trial-related travel, trial facilities, and the sample lifecycle.

Conclusion: The literature on carbon emissions associated with clinical trials has focused on carbon audits of trials and their associated processes. Efforts to quantify trials' carbon output vary in both methods and results. Despite the development and evolution of carbon measurement tools, strategies to mitigate trial-specific carbon emissions are still much needed.

Plain Language Summary: Clinical trials are important to the development of medicine and health care, but they have substantial unintended environmental impacts, especially in the form of carbon emissions. We looked at the literature to understand how carbon emissions generated by clinical trials have been measured, which trial components are carbon heavy, and what could be done to reduce trials' carbon output. We found 22 relevant articles, of which 13 were primary research studies. Twelve of these primary studies measured the carbon output of a range of trials. Their results varied considerably because of a host of factors, such as the number of trials analyzed, trial duration, geographic scope, the trial processes measured, and the methods used to quantify carbon emissions. Despite varied definitions of carbon hotspots, several trial activities, including trial-related travel and meetings, trial facilities, and sample and laboratory activities, were found to be carbon heavy across studies.