Making the best use of quantitative fecal immunochemical test results in colorectal cancer screening

Journal of Internal Medicine Pub Date : 2024-06-18 DOI:10.1111/joim.13812

Hermann Brenner, Michael Hoffmeister

Volume 296, Issue 2, pages 118-120. Full text: https://onlinelibrary.wiley.com/doi/10.1111/joim.13812

Abstract

Fecal immunochemical tests (FITs) have become the most widely used tests for colorectal cancer (CRC) screening [1]. They detect the vast majority of CRCs and some proportion of advanced precancerous neoplasms [2], and modeling studies suggest that annual or biennial FIT-based screening programs have the potential to substantially lower the burden of CRC incidence and mortality [3]. Yet, uncertainty prevails with respect to the optimal use of FITs regarding a number of key parameters of screening programs, such as the starting age of screening, screening intervals, and positivity thresholds of FITs. In this issue, Westerberg et al. reported most valuable results from the baseline exam of the large Swedish SCREESCO screening trial that may inform the design and planning of screening programs [4]. In particular, the study allows thorough evaluation of the tradeoffs between increasing the positive predictive value (PPV) and the decreasing numbers needed to undergo colonoscopy (“numbers needed to scope”, NNS) on one hand, and decreasing sensitivity on the other hand, when increasing the FIT positivity threshold from 10 μg hemoglobin (Hb)/g feces to higher levels. These data may be most valuable for multiple purposes, including the provision of key background information for more comprehensive modeling of the effectiveness and cost-effectiveness of various screening strategies.

However, a number of additional factors require careful consideration when interpreting the results. In the SCREESCO trial, two FITs were applied per screening round, and the overall test result was rated as positive if at least one of the two FITs showed an Hb concentration >10 μg/g feces in the baseline scenario, or >20, 40, 60, 80, 120, or 160 μg/g in the alternative scenarios. By contrast, most screening programs employ just one FIT per screening round. Defining the result as positive if one of two tests is positive increases sensitivity and decreases specificity compared to the application of a single test. This implies that comparable positivity rates and sensitivity would be expected at somewhat lower cutoffs in one-sample than in two-sample testing, which should be kept in mind when interpreting the presented data. It remains an open question whether two-sample testing is worth the extra effort and cost. Results (almost) equivalent to those reported by Westerberg et al. could possibly be obtained with one-sample testing by lowering the FIT cutoff [5]. Further analyses of the dataset by Westerberg et al. may offer unique opportunities to answer this question.
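The effect of the "either test positive" rule can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two samples are conditionally independent given disease status, which two FIT samples from the same person are unlikely to satisfy exactly. The function name and the input values are hypothetical and are not taken from SCREESCO.

```python
def either_positive(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Sensitivity and specificity when a screening round counts as
    positive if at least one of two tests is positive.

    Assumption (illustrative only): the two tests are conditionally
    independent given disease status.
    """
    # A case is missed only if both tests miss it.
    combined_sensitivity = 1 - (1 - sensitivity) ** 2
    # A disease-free person is negative only if both tests are negative.
    combined_specificity = specificity ** 2
    return combined_sensitivity, combined_specificity


# Hypothetical single-test values, not SCREESCO estimates.
sens2, spec2 = either_positive(sensitivity=0.30, specificity=0.95)
print(f"two-sample sensitivity: {sens2:.3f}")
print(f"two-sample specificity: {spec2:.3f}")
```

Under this simplifying assumption the combined sensitivity can only rise and the combined specificity can only fall relative to a single test, which is the direction of the tradeoff described above. With correlated samples the shifts would be smaller.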

Another important aspect to keep in mind is that all the results reported by Westerberg et al. refer to a first-round FIT screening. With annual or biennial FIT-based screening, as recommended and practiced in many countries, the prevalences of advanced neoplasms will decrease at subsequent screening rounds. Although this may have a limited impact on sensitivity and specificity, PPVs would be expected to be lower, and NNS would be expected to be higher in subsequent screening rounds. Moreover, the starting age of screening in the SCREESCO screening trial was 60 years, whereas much lower starting ages, for example, at 50 or even 45 years of age, are implemented in most screening programs [6]. As prevalences of colorectal neoplasms are lower at younger ages, this would again imply that PPVs would be expected to be lower, and NNS would be expected to be higher than those reported from the SCREESCO trial.
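The dependence of PPV and NNS on prevalence follows from Bayes' rule and can be sketched as below. The sketch takes NNS as the reciprocal of PPV (an assumed working definition) and holds sensitivity and specificity fixed. All numeric inputs are hypothetical placeholders, not estimates from SCREESCO or any other study.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


def nns(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Number needed to scope per detected case, taken here as 1 / PPV."""
    return 1 / ppv(prevalence, sensitivity, specificity)


# Hypothetical values: the same test applied at two prevalences, e.g. a
# first screening round versus a later round or a younger population.
for prevalence in (0.10, 0.05):
    print(
        f"prevalence={prevalence:.2f}  "
        f"PPV={ppv(prevalence, 0.30, 0.95):.2f}  "
        f"NNS={nns(prevalence, 0.30, 0.95):.1f}"
    )
```

With test characteristics held constant, halving the prevalence lowers the PPV and raises the NNS, which is why first-round results from a population aged 60 cannot be transferred directly to later rounds or to younger starting ages.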

Another very relevant question addressed by Westerberg et al. is whether and how sex differences in CRC epidemiology or screening test performance should be reflected in the interpretation of FIT results. It is well established that men have higher age-specific and age-standardized incidence and prevalence of colorectal neoplasms than women [7]. This implies that at any given age and FIT cutoff, positivity rates and PPVs are expected to be higher, and NNS are expected to be lower, for men than for women, an observation that was also made in the study by Westerberg et al. These sex differences could be accounted for by using a lower cutoff for men than for women, by which equal PPVs and NNS could be achieved and limited colonoscopy capacities could be used in the most efficient possible way. A potential drawback of this approach, however, would be that women would have a lower chance than men of having their neoplasms detected early or prevented. In an attempt to achieve “gender fairness” in terms of equal positivity rates for women and men, the Swedish screening program of Stockholm–Gotland even set a higher positivity threshold for men (80 μg/g) than for women (40 μg/g) [8]. However, such an approach further increases rather than decreases gender discrepancies in PPVs and NNS. The resulting high NNS for women may be a rather inefficient use of limited colonoscopy capacities and appears to require careful reconsideration.

A question the study by Westerberg et al. could not address, but which is carefully discussed by the authors, arises from the lack of colonoscopy results for participants with fecal Hb concentrations below 10 μg Hb/g feces. As a result, only relative rather than absolute sensitivities could be derived, and it is unclear how the prevalence of neoplasms among participants in the various categories of FIT positivity compares to that among the vast majority of FIT-negative screening participants. Such information can be derived from other studies conducted in the setting of screening colonoscopies [9]. These data show that even people in the “low positive range”, with fecal Hb concentrations between 10 and 25 μg/g feces, have a 3.5-fold risk of carrying any advanced neoplasm (AN) compared to the vast majority (>80%) of screening participants, who have Hb concentrations below 8 μg/g feces. The strongly increased risk already present in the “low positive range” suggests that the observations by Westerberg et al. of higher PPVs and lower NNS with higher FIT cutoffs than with the 10 μg Hb/g cutoff should not be interpreted as support for higher FIT cutoffs. Although higher cutoffs may, in some instances, be inevitable due to limited colonoscopy resources, colonoscopic follow-up appears to be warranted after FIT-based detection of a ≥3.5-fold increased risk of AN compared to the vast majority of the screening population.
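The distinction between relative and absolute sensitivity can be made concrete with a short sketch. Relative sensitivity at a given cutoff is taken here as the number of cases detected at that cutoff divided by the number detected at the lowest (reference) cutoff. The counts and the function name are invented for illustration and do not come from SCREESCO.

```python
def relative_sensitivity(
    detected_by_cutoff: dict[int, int], reference_cutoff: int = 10
) -> dict[int, float]:
    """Cases detected at each cutoff as a fraction of those detected at
    the reference cutoff.

    Cases among participants below the reference cutoff were never
    colonoscoped, so they are unknown and absolute sensitivity cannot
    be computed from these counts alone.
    """
    reference = detected_by_cutoff[reference_cutoff]
    return {cutoff: n / reference for cutoff, n in detected_by_cutoff.items()}


# Hypothetical counts of advanced neoplasms detected at each cutoff (μg/g).
hypothetical_counts = {10: 200, 20: 170, 40: 140, 80: 100}
for cutoff, rel in relative_sensitivity(hypothetical_counts).items():
    print(f"cutoff {cutoff:>3} μg/g: relative sensitivity {rel:.2f}")
```

The relative sensitivity at the reference cutoff is 1 by construction. Converting these ratios to absolute sensitivities would require the number of cases among FIT-negative participants, which is the information supplied by screening colonoscopy studies.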

The final goal of CRC screening should be to lower CRC incidence and mortality as much and as efficiently as possible. Randomized controlled trials (RCTs) require very large sample sizes and a long time from conception until long-term incidence and mortality outcomes become available, even if only two different screening strategies are compared. This strongly limits the implementation and use of RCTs for evaluating innovative screening approaches. Well-designed modeling studies may be a promising and rational complementary approach for the timely evaluation of novel screening strategies [10]. The detailed results on diagnostic performance parameters of FIT reported by Westerberg et al., along with data from screening colonoscopy cohorts, may be most valuable in informing such modeling studies for FIT-based and alternative screening approaches.

Hermann Brenner: Writing—original draft; writing—review and editing; conceptualization. Michael Hoffmeister: Writing—review and editing.

The authors have no conflicts of interest to declare.
