Retrospective use of the Pragmatic-Explanatory Continuum Indicator Summary-2 trial design tool to assess design choices in randomized controlled trials: an empirical review
Andrew Willis, Frances Shiely, Alison H. Howie, Shaun Treweek, Monica Taljaard, Kirsty Loudon, Ellen Murphy, Aarian Bhakoo, Yasaman Yazdani, Frank Ward, Perrine Janiaud, Andrea Haren, Aileen Yining Liang, Clare Robinson, Daisy Deng, Lars Hemkens, Evelyn O'Sullivan Greene, Laura Slattery, Merrick Zwarenstein
Journal of Clinical Epidemiology, volume 187, Article 111959. Published 2025-09-01. DOI: 10.1016/j.jclinepi.2025.111959. https://www.sciencedirect.com/science/article/pii/S0895435625002926
Citation count: 0
Abstract
Background and Objective
The Pragmatic-Explanatory Continuum Indicator Summary-2 (PRECIS-2) tool has been widely used to help investigators design randomized trials by aligning design choices with an explanatory or pragmatic primary trial intention. PRECIS-2 is increasingly being used to retrospectively assess the degree of pragmatism or explanatoriness of published trials within reviews. There is little information on the interrater reliability of the tool and no consensus on the preferred method of achieving an accurate and reliable judgment of trial “pragmatism” when using PRECIS-2 retrospectively. The aims of this study were to assess the level of pragmatism or explanatoriness of trials that cite PRECIS-2 and to assess the interrater reliability of PRECIS-2 using different scoring approaches. We compared agreement between two independent ratings within a single pair with agreement between consensus scores reached by two independent pairs of reviewers, and we examined whether widening the agreement criteria increased interrater reliability.
Methods
Thirty randomized controlled trials (RCTs) were randomly selected from trials citing the PRECIS-2 tool. Two pairs of reviewers, each comprising a clinician and a methodologist, were trained. Reviewers scored each trial independently and then reached a consensus score within their pair. Agreement between reviewers within pairs and between consensus scores across pairs was assessed using kappa statistics for each of the nine PRECIS-2 domains.
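As a rough illustration of the agreement measure named in the Methods, the sketch below computes an unweighted Cohen's kappa for two raters on a single domain. The abstract does not state which kappa variant was used, so the unweighted form is an assumption, and the function name `cohen_kappa` and the scores are made up for illustration rather than taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Assumption: unweighted kappa; the abstract does not name the variant.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the identical score.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement, from each rater's marginal score distribution.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up PRECIS-2 scores (1 = very explanatory, 5 = very pragmatic)
# for one domain across six hypothetical trials.
reviewer_1 = [5, 4, 4, 3, 5, 2]
reviewer_2 = [5, 3, 4, 4, 5, 1]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree exactly on half of the trials, but a quarter of that agreement is expected by chance, which is why kappa is lower than the raw agreement rate.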
Results
RCTs citing PRECIS-2 had predominantly pragmatic design features. Interrater reliability within pairs was low across all domains; the highest kappa values were found in the analysis (0.32) and follow-up (0.33) domains. Agreement across pairs on the consensus scores was similarly low. When agreement was redefined as “within 1-point difference on the scoring scale,” agreement between reviewers and between reviewer pairs was above 70% for eight domains, but no improvement was obtained for the remaining domain.
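The widened agreement criterion described in the Results can be sketched as a percent-agreement calculation with a tolerance. This is a minimal illustration, not the authors' analysis code: the function name `percent_agreement`, its `tolerance` parameter, and the scores are all hypothetical.

```python
def percent_agreement(scores_a, scores_b, tolerance=0):
    """Percentage of items where two scores differ by at most `tolerance`.

    tolerance=0 is exact agreement; tolerance=1 is the widened
    "within 1-point difference" criterion.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    hits = sum(abs(a - b) <= tolerance for a, b in zip(scores_a, scores_b))
    return 100.0 * hits / len(scores_a)


# Made-up scores on the five-point scale for six hypothetical trials.
reviewer_1 = [5, 4, 4, 3, 5, 2]
reviewer_2 = [5, 2, 4, 4, 5, 1]

exact = percent_agreement(reviewer_1, reviewer_2)
widened = percent_agreement(reviewer_1, reviewer_2, tolerance=1)
print(f"exact: {exact:.1f}%, within 1 point: {widened:.1f}%")
```

Note that raw percent agreement, unlike kappa, makes no correction for agreement expected by chance, so the two measures are not directly comparable.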
Conclusion
Trials citing PRECIS-2 tend to have predominantly pragmatic design features. When PRECIS-2 was used to retrospectively score trial publications, agreement between consensus scores across pairs of reviewers was no better than agreement within pairs. Reconfiguring the PRECIS-2 scoring scale and improving scoring guidance may provide a more meaningful, easily interpreted measure of “pragmatism” for trialists wishing to use PRECIS-2 as a review tool.
Plain Language Summary
The Pragmatic-Explanatory Continuum Indicator Summary-2 (PRECIS-2) tool is designed to help researchers match their design decisions to the intended purpose of their trial. The intention of a trial can be “explanatory,” which improves our understanding of how an intervention works, or “pragmatic,” which supports decision-making in health care. Increasingly, the tool has been used for a secondary purpose: in systematic reviews. Here the tool is used to judge the level of “pragmatism” or “explanatoriness” of trials included in the review to aid the understanding of trial results. However, there is debate on the most reliable means of making this judgment. Sometimes judgments are made by one reviewer; other times, by multiple reviewers. Our study evaluated the interrater reliability of two methods of scoring trial publications using PRECIS-2: individual reviewer scores and pairs of reviewers agreeing on a consensus score. We found that neither method we tested produced a reliable judgment using PRECIS-2, and the scores from two reviewers agreeing on a consensus were no more reliable than scores from a single reviewer. We performed an additional analysis that showed that simplifying the scoring from the original five-point scale to a three-point scale may give a more reliable judgment of the “pragmatism” or “explanatoriness” of published trials. This simpler method of scoring should be encouraged for retrospective use of PRECIS-2 in systematic reviews.
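To make the idea of a simplified scale concrete, the sketch below collapses five-point scores to three points and compares exact agreement before and after. The abstract does not say how the five categories were grouped, so the mapping `COLLAPSE_5_TO_3` is purely an assumption, as are the helper names and the made-up scores.

```python
# Hypothetical grouping (NOT taken from the study): scores 1-2 become 1
# (more explanatory), 3 becomes 2 (intermediate), 4-5 become 3 (more pragmatic).
COLLAPSE_5_TO_3 = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def collapse(scores, mapping=COLLAPSE_5_TO_3):
    """Map five-point scores onto the assumed three-point scale."""
    return [mapping[s] for s in scores]


def exact_agreement(scores_a, scores_b):
    """Proportion of items on which two raters gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    return sum(a == b for a, b in zip(scores_a, scores_b)) / len(scores_a)


# Made-up five-point scores for six hypothetical trials.
reviewer_1 = [5, 4, 4, 3, 5, 2]
reviewer_2 = [4, 4, 5, 3, 5, 1]

before = exact_agreement(reviewer_1, reviewer_2)
after = exact_agreement(collapse(reviewer_1), collapse(reviewer_2))
print(f"five-point: {before:.2f}, three-point: {after:.2f}")
```

Disagreements between adjacent scores at the same end of the scale (4 versus 5, or 1 versus 2) disappear after collapsing, which is one plausible reason a coarser scale could yield higher agreement.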
About the journal:
The Journal of Clinical Epidemiology strives to enhance the quality of clinical and patient-oriented healthcare research by advancing and applying innovative methods in conducting, presenting, synthesizing, disseminating, and translating research results into optimal clinical practice. Special emphasis is placed on training new generations of scientists and clinical practice leaders.