Unreliable evidence from problematic risk of bias assessments: Comment on Begh et al., ‘Electronic cigarettes and subsequent cigarette smoking in young people: A systematic review’
{"title":"Unreliable evidence from problematic risk of bias assessments: Comment on Begh et al., ‘Electronic cigarettes and subsequent cigarette smoking in young people: A systematic review’","authors":"Sam Egger, Martin McKee","doi":"10.1111/add.70143","DOIUrl":null,"url":null,"abstract":"<p>It is widely acknowledged that cohort studies consistently find young people who use e-cigarettes are more likely to start smoking compared with non-users [<span>1</span>]. It is also recognised that in many countries, youth smoking prevalence has declined for decades and continues to decline [<span>2</span>]. However, these findings, while not necessarily contradictory, have been portrayed as inconsistent by e-cigarette and tobacco manufacturers, to emphasise uncertainty and doubt. Manufacturers often focus on results from ecological studies to minimise perceived risks to youth and argue against precautionary regulations [<span>3, 4</span>]. Given these complexities, high-quality systematic reviews integrating evidence from cohort and ecological studies are welcome, but they must assess both study types fairly and appropriately. Unfortunately, the recent review ‘Electronic cigarettes and subsequent cigarette smoking in young people: A systematic review’ [<span>5</span>] fails to do so.</p><p>Before addressing specific issues with the review, we must note that there is no established instrument for assessing the risk of bias (ROB) in ecological studies, at least not one recommended by Cochrane, the leading authority on systematic review methodology. The ROB assessment tools endorsed by Cochrane for non-randomised studies, including those by Morgan <i>et al</i>. [<span>6</span>], ROBINS-I (risk of bias in non-randomised studies of interventions) [<span>7</span>] or ROBINS-E (risk of bias in non-randomised studies of exposures) [<span>8</span>], are primarily designed for prospective cohort studies. While these include suggestions for possible adaptations for other individual-level studies, such as case–control studies, they lack guidance for ecological studies. Consequently, the ROB instrument used in this review to assess ecological studies appears largely self-designed and does not align with those tools, despite the authors’ claim that their instrument was adapted from Morgan <i>et al</i>.</p><p>A close examination of the review’s ROB assessment tool (https://osf.io/svgud) reveals a recurring pattern of ‘problematic standards’ in the design and application of ROB criteria. We consider them ‘problematic’ because they lead to unduly harsh ROB assessments of prospective cohort studies (individual-level studies) and/or unduly lenient ROB assessments of ecological studies (population-level studies). In many cases, they take the form of ‘double standards’, as the same or similar criteria could have been applied to both study types but were not. In Appendix S1, we detail 17 examples of ‘problematic standards’. The first two, relating to the ROB domain of ‘bias due to confounding’, are described as follows.</p><p>The first problematic standard concerns the requirement of instrumental variables (IVs) in cohort studies. The ROB criteria used in the review specify that for cohort studies to be classified as being at ‘low’ or ‘moderate’ risk of ‘bias due to confounding’, they must be ‘instrumental variable designs’. All other cohort studies are deemed to be at ‘serious’ or ‘critical’ risk of bias. 
This is despite prospective cohort studies being regarded as one of the most appropriate non-randomised study designs for determining cause and effect [<span>8-17</span>] when properly conducted, addressing issues such as potential confounding factors, selection bias and missing data. Given this extremely rigid requirement, even a well-adjusted cohort study that carefully controls for multiple confounding factors, including factors related to smoking and risk-taking behaviour, could only be classified as ‘serious risk’ of confounding at best. Given this, it is unsurprising that all 40 cohort studies assessed for ROB were classified as ‘serious’ or ‘critical’ risk of confounding.</p><p>So why do the authors privilege IV designs over all others? Unfortunately, they do not justify their inclusion anywhere in the review, and the term ‘instrumental variable’, or ‘IV’, is mentioned only once throughout the entire article, and even then no explanation is given as to why IVs would result in less biased estimates. This is concerning because IV analysis is a controversial statistical method that is rarely used in prospective cohort studies because of its important limitations, including potential biases [<span>18-20</span>]. Using an IV can sometimes introduce more bias than not adjusting for confounding factors, as it may amplify the impact of unmeasured confounding [<span>20</span>]. Moreover, although the review authors claim their ROB methods were informed by Morgan <i>et al</i>., neither Morgan <i>et al</i>. [<span>6</span>] nor ROBINS-E [<span>8</span>] mentions the use of IVs. While the ROBINS-I guidance article [<span>7</span>] briefly acknowledges the potential use of IVs it does so about an entirely different issue, the risk of ‘bias due to deviations from intended interventions’.</p><p>Worryingly, the insistence on IV designs for cohort studies also represents a ‘double standard’. If IV designs are considered essential for addressing confounding bias in cohort studies, why was this not also applied to ecological studies? If it had, 26 of the 27 included ecological studies would have been classified as having a ‘serious risk’ of confounding bias or worse (instead of the reported 11), as only one used IVs [<span>21</span>].</p><p>The authors might argue that ecological studies should not be held to the same standard as cohort studies, perhaps because of the challenges of identifying or obtaining suitable IVs in ecological research. However, this argument is fundamentally flawed. The assessment of a study’s ROB should be based solely on its inherent ROB. Whether there are practical obstacles to reducing bias is not relevant.</p><p>In summary, the review’s rigid and arbitrary requirement for cohort studies to be IV designs to be classified as ‘low risk’ or even ‘moderate risk’ of confounding is not only inconsistent with standard epidemiological practices but is also unsupported by Morgan <i>et al</i>., ROBINS-I and ROBINS-E. 
Having introduced this requirement for cohort studies without any apparent justification, their failure to apply it to ecological studies demonstrates how the review’s ROB methods unfairly disadvantage cohort studies.</p><p>This problematic standard is that ecological studies are classified as ‘low risk’ of ‘bias due to confounding’ if they meet the following conditions: ‘Natural experiments OR Parallel trends assumptions are tested and met AND dose response is tested for AND there are no concurrent policy changes or concurrent policy changes are controlled for AND fixed effects for place and time over which exposure varies are included’.</p><p>The review finds that 48% (<i>n</i> = 13) of the 27 ecological studies included are at ‘low risk’ of ‘bias due to confounding’. This seems surprising given the well-established limitations of ecological studies in controlling for confounding bias [<span>9-11, 22, 23</span>]. However, once again, this finding is less surprising when one looks at the criteria for classifying ecological studies as ‘low risk’. They comprise five considerations: four are interconnected and must all be met, while the fifth (‘natural experiments’) is independent and only needs to be satisfied. Despite their seemingly comprehensive nature, the four interconnected criteria: (i) provide only a superficial assessment of confounding; (ii) offer weak control of confounding; or (iii) are unrelated to the issue of confounding altogether.</p><p>The ‘parallel trends assumption’ is necessary for valid difference-in-differences analyses but cannot be empirically verified. Even if trends appear similar before the exposure, this does not guarantee that confounding factors are properly accounted for, as it fails to address unobserved confounding factors that may emerge or change after the exposure begins. ‘Dose–response’ is important for establishing causal exposure–outcome relationships and is part of the GRADE (Grading of Recommendations, Assessment, Development and Evaluations) assessment, but it is not relevant to the domain of confounding. ‘Fixed effects for place and time’ sounds impressive, but it simply refers to including fixed-effect covariates for time and place. While this provides limited control for certain characteristics that are constant in the location of the study and are subject to consistent trends, it does little to control for the many potential confounding factors that change within places over time. Examples here include changes in price, marketing and availability. Additionally, in a simple two-group design (where place defines the exposure), the effects of place and exposure cannot be separated owing to perfect collinearity. ‘Concurrent policy changes’ is the only potential confounding factor specific to smoking and vaping mentioned in the criteria. However, even if we accept that ecological studies could adequately control for concurrent policy changes (which is questionable given the complexity of potential interactions), this approach ignores a wide range of other social, economic and behavioural factors that may shift over time and influence outcomes.</p><p>A more problematic aspect of the ROB criteria is that any study labelled as a ‘natural experiment’ is automatically classified as being at ‘low risk’ of confounding, regardless of whether it takes steps to address confounding. 
Natural experiments employ various statistical methods, such as interrupted time series analysis or synthetic controls, and can provide valuable insights when certain conditions are met, such as a discrete intervention, rapid implementation, and short lags between implementation and impact. However, as with any study, all assumptions must be tested, sensitivity analyses conducted and falsifiability tests conducted. Just because something is a ‘natural experiment’, it cannot be deemed ‘low risk’, regardless of whether any efforts are made to control for confounding. This is akin to claiming that all cohort studies are inherently ‘low risk’ for confounding bias because they have an exposed and a non-exposed group, and do not need to control for confounding factors.</p><p>Given this standard, the authors’ ROB assessments effectively imply that 48% of the 27 ecological studies are equivalent to high-quality randomised trials in terms of confounding control. This is so implausible that it raises serious concerns about the credibility of the entire review.</p><p>However, there is another concern. Two review group members were credited with providing ‘… expert input on the risk of bias due to confounding associated with different population-level study designs’. Notably, they are co-authors on 10 of the 27 ecological studies included in the review [<span>21, 25-33</span>], and of those 10 studies, eight were classified as being at ‘low risk’ of confounding bias. In other words, the same experts who helped design the ROB criteria benefited from an assessment that deemed 80% of their ecological studies equivalent to high-quality randomised trials.</p><p>In this letter, we described two problematic ROB standards from Begh <i>et al</i>.’s systematic review, with 15 additional issues detailed in Appendix S1. In these two examples, we showed how the review creates a significant methodological imbalance by imposing an unjustified requirement for IV designs in cohort studies while using lenient and inconsistently justified ROB criteria for ecological studies. This leads to cohort studies – one of the most effective non-randomised study designs for determining cause and effect when properly conducted – to be dismissed while ecological studies are elevated to an implausible level of credibility in terms of confounding control.</p><p>The insistence on IV designs for cohort studies is particularly problematic, as IV analyses are rarely used in prospective cohort studies because of their limitations. The authors fail to justify their inclusion, and established ROB assessment tools do not endorse this approach. Consequently, all 40 cohort studies were classified as having a ‘serious’ or ‘critical’ ROB due to confounding.</p><p>Conversely, the review’s treatment of ecological studies is far more permissive. Despite well-documented limitations, nearly half (48%) were classified as being at ‘low risk’ for confounding. The ROB criteria rely on weak or irrelevant factors, such as the ‘parallel trends assumption’ (which cannot be empirically verified) and ‘dose–response’ (which pertains to causal inference rather than confounding). Most concerning is the blanket classification of all ‘natural experiments’ as being at ‘low risk’ for confounding, regardless of whether they take steps to address bias. This creates an illusion of methodological rigour while shielding ecological studies from scrutiny.</p><p>What appears to be a potential conflict of interest raises further concern. 
Two review group members who helped design the ROB criteria are co-authors of 10 of the included ecological studies, eight of which were rated as being at ‘low risk’ for confounding.</p><p>Because we did not have the capacity to conduct a complete re-analysis of Berg <i>et al</i>.’s ROB assessments with an appropriate ROB instrument, we cannot be certain how their flawed methodology affected their conclusions. Ultimately, however, this review’s selective and inconsistent design and application of ROB criteria has the potential to distort the scientific landscape. Rather than fairly synthesising the evidence, the review systematically disadvantages cohort studies while presenting ecological studies as more reliable than they truly are. Given the public health implications of e-cigarette regulation, these methodological biases risk shaping policy based on unreliable evidence.</p><p><b>Sam Egger</b>: Conceptualization; investigation; writing—original draft. <b>Martin McKee</b>: Conceptualization; writing—review and editing.</p><p>Martin McKee is a past president of the British Medical Association and of the European Public Health Association, both of which have expressed concerns about the public health impact and appropriate regulation of e-cigarettes. Sam Egger is a statistical editor on the editorial board of the Cochrane Breast Cancer Group, serves as a peer reviewer for the Cochrane Database of Systematic Reviews, and has expertise in the relationship between adolescent vaping and future smoking.</p>","PeriodicalId":109,"journal":{"name":"Addiction","volume":"120 11","pages":"2355-2358"},"PeriodicalIF":5.3000,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/add.70143","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Addiction","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/add.70143","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHIATRY","Score":null,"Total":0}
Abstract
It is widely acknowledged that cohort studies consistently find young people who use e-cigarettes are more likely to start smoking compared with non-users [1]. It is also recognised that in many countries, youth smoking prevalence has declined for decades and continues to decline [2]. However, these findings, while not necessarily contradictory, have been portrayed as inconsistent by e-cigarette and tobacco manufacturers, to emphasise uncertainty and doubt. Manufacturers often focus on results from ecological studies to minimise perceived risks to youth and argue against precautionary regulations [3, 4]. Given these complexities, high-quality systematic reviews integrating evidence from cohort and ecological studies are welcome, but they must assess both study types fairly and appropriately. Unfortunately, the recent review ‘Electronic cigarettes and subsequent cigarette smoking in young people: A systematic review’ [5] fails to do so.
Before addressing specific issues with the review, we must note that there is no established instrument for assessing the risk of bias (ROB) in ecological studies, at least not one recommended by Cochrane, the leading authority on systematic review methodology. The ROB assessment tools endorsed by Cochrane for non-randomised studies, including those by Morgan et al. [6], ROBINS-I (risk of bias in non-randomised studies of interventions) [7] or ROBINS-E (risk of bias in non-randomised studies of exposures) [8], are primarily designed for prospective cohort studies. While these include suggestions for possible adaptations for other individual-level studies, such as case–control studies, they lack guidance for ecological studies. Consequently, the ROB instrument used in this review to assess ecological studies appears largely self-designed and does not align with those tools, despite the authors’ claim that their instrument was adapted from Morgan et al.
A close examination of the review’s ROB assessment tool (https://osf.io/svgud) reveals a recurring pattern of ‘problematic standards’ in the design and application of ROB criteria. We consider them ‘problematic’ because they lead to unduly harsh ROB assessments of prospective cohort studies (individual-level studies) and/or unduly lenient ROB assessments of ecological studies (population-level studies). In many cases, they take the form of ‘double standards’, as the same or similar criteria could have been applied to both study types but were not. In Appendix S1, we detail 17 examples of ‘problematic standards’. The first two, relating to the ROB domain of ‘bias due to confounding’, are described as follows.
The first problematic standard concerns the requirement of instrumental variables (IVs) in cohort studies. The ROB criteria used in the review specify that for cohort studies to be classified as being at ‘low’ or ‘moderate’ risk of ‘bias due to confounding’, they must be ‘instrumental variable designs’. All other cohort studies are deemed to be at ‘serious’ or ‘critical’ risk of bias. This is despite prospective cohort studies being regarded as one of the most appropriate non-randomised study designs for determining cause and effect when properly conducted [8-17], addressing issues such as potential confounding factors, selection bias and missing data. Given this extremely rigid requirement, even a well-adjusted cohort study that carefully controls for multiple confounding factors, including factors related to smoking and risk-taking behaviour, could at best be classified as being at ‘serious risk’ of confounding. It is therefore unsurprising that all 40 cohort studies assessed for ROB were classified as being at ‘serious’ or ‘critical’ risk of confounding.
So why do the authors privilege IV designs over all others? Unfortunately, they do not justify this requirement anywhere in the review; the term ‘instrumental variable’, or ‘IV’, is mentioned only once in the entire article, and even then no explanation is given as to why IVs would yield less biased estimates. This is concerning because IV analysis is a controversial statistical method that is rarely used in prospective cohort studies because of its important limitations, including potential biases [18-20]. Using an IV can sometimes introduce more bias than not adjusting for confounding factors at all, as it may amplify the impact of unmeasured confounding [20]. Moreover, although the review authors claim their ROB methods were informed by Morgan et al., neither Morgan et al. [6] nor ROBINS-E [8] mentions the use of IVs. While the ROBINS-I guidance article [7] briefly acknowledges the potential use of IVs, it does so in relation to an entirely different issue: the risk of ‘bias due to deviations from intended interventions’.
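To make the bias-amplification point concrete, the following is a minimal simulation sketch. The data-generating process, coefficient values and variable names are our own illustrative assumptions, not taken from the review or from reference [20]: with a weak instrument and a small violation of the exclusion restriction, the IV estimate ends up further from the true effect than the crude, unadjusted estimate.

```python
# Illustrative simulation (our assumptions, not the review's): a weak
# instrument plus a small exclusion-restriction violation makes the IV
# estimate more biased than the unadjusted estimate.
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

u = rng.normal(size=n)                      # unmeasured confounder
z = rng.normal(size=n)                      # candidate instrument
x = 0.1 * z + 0.5 * u + rng.normal(size=n)  # weak instrument (coefficient 0.1)
y = 0.5 * x + 0.5 * u + 0.05 * z + rng.normal(size=n)  # z leaks into y (0.05)

beta_naive = np.cov(x, y)[0, 1] / np.var(x)        # crude estimate, ~0.70
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]  # Wald/IV estimate, ~1.00

print(f"true effect: 0.50, naive: {beta_naive:.2f}, IV: {beta_iv:.2f}")
```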
Worryingly, the insistence on IV designs for cohort studies also represents a ‘double standard’. If IV designs are considered essential for addressing confounding bias in cohort studies, why was this not also applied to ecological studies? If it had, 26 of the 27 included ecological studies would have been classified as having a ‘serious risk’ of confounding bias or worse (instead of the reported 11), as only one used IVs [21].
The authors might argue that ecological studies should not be held to the same standard as cohort studies, perhaps because of the challenges of identifying or obtaining suitable IVs in ecological research. However, this argument is fundamentally flawed. The assessment of a study’s ROB should be based solely on its inherent ROB. Whether there are practical obstacles to reducing bias is not relevant.
In summary, the review’s rigid and arbitrary requirement that cohort studies be IV designs in order to be classified as being at ‘low risk’ or even ‘moderate risk’ of confounding is not only inconsistent with standard epidemiological practices but is also unsupported by Morgan et al., ROBINS-I and ROBINS-E. Having introduced this requirement for cohort studies without any apparent justification, the authors then failed to apply it to ecological studies, demonstrating how the review’s ROB methods unfairly disadvantage cohort studies.
The second problematic standard is that ecological studies are classified as being at ‘low risk’ of ‘bias due to confounding’ if they meet the following conditions: ‘Natural experiments OR Parallel trends assumptions are tested and met AND dose response is tested for AND there are no concurrent policy changes or concurrent policy changes are controlled for AND fixed effects for place and time over which exposure varies are included’.
The review finds that 48% (n = 13) of the 27 ecological studies included are at ‘low risk’ of ‘bias due to confounding’. This seems surprising given the well-established limitations of ecological studies in controlling for confounding bias [9-11, 22, 23]. However, once again, this finding is less surprising when one looks at the criteria for classifying ecological studies as ‘low risk’. They comprise five considerations: four are interconnected and must all be met, while the fifth (‘natural experiments’) stands alone and is sufficient by itself, as restated below. Despite their seemingly comprehensive nature, the four interconnected criteria either (i) provide only a superficial assessment of confounding, (ii) offer weak control of confounding or (iii) are unrelated to the issue of confounding altogether.
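To make the logical structure of the quoted rule explicit, it can be restated as a boolean expression. This is our reading of the criteria, sketched in Python for clarity; the function and argument names are ours, not the review’s:

```python
# Our restatement of the quoted 'low risk' rule for ecological studies.
def low_risk_of_confounding(
    natural_experiment: bool,
    parallel_trends_tested_and_met: bool,
    dose_response_tested: bool,
    concurrent_policies_absent_or_controlled: bool,
    place_and_time_fixed_effects_included: bool,
) -> bool:
    return natural_experiment or (
        parallel_trends_tested_and_met
        and dose_response_tested
        and concurrent_policies_absent_or_controlled
        and place_and_time_fixed_effects_included
    )

# The 'natural_experiment' flag alone is sufficient, which is the crux of
# the concern discussed below.
print(low_risk_of_confounding(True, False, False, False, False))  # True
```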
The ‘parallel trends assumption’ is necessary for valid difference-in-differences analyses but cannot be empirically verified. Even if trends appear similar before the exposure, this does not guarantee that confounding factors are properly accounted for, as it fails to address unobserved confounding factors that may emerge or change after the exposure begins. ‘Dose–response’ is important for establishing causal exposure–outcome relationships and is part of the GRADE (Grading of Recommendations, Assessment, Development and Evaluations) assessment, but it is not relevant to the domain of confounding. ‘Fixed effects for place and time’ sounds impressive, but it simply refers to including fixed-effect covariates for time and place. While this provides limited control for characteristics that are constant within each place and for trends common to all places, it does little to control for the many potential confounding factors that change within places over time, such as changes in price, marketing and availability. Additionally, in a simple two-group design (where place defines the exposure), the effects of place and exposure cannot be separated owing to perfect collinearity. ‘Concurrent policy changes’ is the only potential confounding factor specific to smoking and vaping mentioned in the criteria. However, even if we accept that ecological studies could adequately control for concurrent policy changes (which is questionable given the complexity of potential interactions), this approach ignores a wide range of other social, economic and behavioural factors that may shift over time and influence outcomes.
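The collinearity point can be seen directly. Below is a minimal sketch with hypothetical panel data (two places, four years; all names and values are ours): when place defines the exposure, the place dummy and the exposure indicator carry identical information, so a regression cannot separate their effects.

```python
# Hypothetical two-group panel: place defines exposure, so the place fixed
# effect and the exposure indicator are perfectly collinear.
import numpy as np
import pandas as pd

years = [2015, 2016, 2017, 2018]
panel = pd.DataFrame(
    [(place, year) for place in ("A", "B") for year in years],
    columns=["place", "year"],
)
panel["exposed"] = (panel["place"] == "A").astype(float)  # place defines exposure

X = pd.get_dummies(panel[["place", "year"]].astype(str), drop_first=True)
X.insert(0, "const", 1.0)
X["exposed"] = panel["exposed"]

# exposed == const - place_B, so the design matrix is rank deficient and the
# exposure effect is not identified separately from the place effect.
rank = np.linalg.matrix_rank(X.to_numpy(dtype=float))
print(f"{X.shape[1]} columns, rank {rank}")  # 6 columns, rank 5
```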
A more problematic aspect of the ROB criteria is that any study labelled as a ‘natural experiment’ is automatically classified as being at ‘low risk’ of confounding, regardless of whether it takes steps to address confounding. Natural experiments employ various statistical methods, such as interrupted time series analysis or synthetic controls, and can provide valuable insights when certain conditions are met, such as a discrete intervention, rapid implementation and short lags between implementation and impact. However, as with any study, assumptions must be tested, sensitivity analyses performed and falsification tests conducted. A study cannot be deemed ‘low risk’ simply because it is labelled a ‘natural experiment’, irrespective of whether any effort is made to control for confounding. This is akin to claiming that all cohort studies are inherently ‘low risk’ for confounding bias because they have an exposed and a non-exposed group, and therefore need not control for confounding factors.
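For context, the following is a minimal segmented-regression (interrupted time series) sketch on simulated data; the series, dates and coefficients are hypothetical. It shows how easily such a model is fitted, and why the fit proves little on its own: any unmodelled concurrent event at the same time point is simply absorbed into the ‘level change’ term.

```python
# Hypothetical interrupted time series: segmented regression with a
# pre-existing trend, a level change at the policy date and a slope change.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
t = np.arange(48.0)                # months of observation
policy = (t >= 24).astype(float)   # hypothetical policy from month 24
y = 10 - 0.05 * t - 0.8 * policy + rng.normal(0, 0.5, t.size)

X = sm.add_constant(np.column_stack([t, policy, policy * (t - 24)]))
fit = sm.OLS(y, X).fit()
print(fit.params.round(2))  # [intercept, trend, level change, slope change]

# A concurrent, unmodelled event at month 24 would also load onto the
# 'level change' coefficient; the label 'natural experiment' does not
# remove that confounding.
```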
Given this standard, the authors’ ROB assessments effectively imply that 48% of the 27 ecological studies are equivalent to high-quality randomised trials in terms of confounding control. This is so implausible that it raises serious concerns about the credibility of the entire review.
However, there is another concern. Two review group members were credited with providing ‘… expert input on the risk of bias due to confounding associated with different population-level study designs’. Notably, they are co-authors on 10 of the 27 ecological studies included in the review [21, 25-33], and of those 10 studies, eight were classified as being at ‘low risk’ of confounding bias. In other words, the same experts who helped design the ROB criteria benefited from an assessment that deemed 80% of their ecological studies equivalent to high-quality randomised trials.
In this letter, we described two problematic ROB standards from Begh et al.’s systematic review, with 15 additional issues detailed in Appendix S1. In these two examples, we showed how the review creates a significant methodological imbalance by imposing an unjustified requirement for IV designs in cohort studies while using lenient and inconsistently justified ROB criteria for ecological studies. As a result, cohort studies – one of the most effective non-randomised study designs for determining cause and effect when properly conducted – are dismissed, while ecological studies are elevated to an implausible level of credibility in terms of confounding control.
The insistence on IV designs for cohort studies is particularly problematic, as IV analyses are rarely used in prospective cohort studies because of their limitations. The authors fail to justify this requirement, and established ROB assessment tools do not endorse it. Consequently, all 40 cohort studies were classified as having a ‘serious’ or ‘critical’ ROB due to confounding.
Conversely, the review’s treatment of ecological studies is far more permissive. Despite well-documented limitations, nearly half (48%) were classified as being at ‘low risk’ for confounding. The ROB criteria rely on weak or irrelevant factors, such as the ‘parallel trends assumption’ (which cannot be empirically verified) and ‘dose–response’ (which pertains to causal inference rather than confounding). Most concerning is the blanket classification of all ‘natural experiments’ as being at ‘low risk’ for confounding, regardless of whether they take steps to address bias. This creates an illusion of methodological rigour while shielding ecological studies from scrutiny.
What appears to be a potential conflict of interest raises further concern. Two review group members who helped design the ROB criteria are co-authors of 10 of the included ecological studies, eight of which were rated as being at ‘low risk’ for confounding.
Because we did not have the capacity to conduct a complete re-analysis of Begh et al.’s ROB assessments with an appropriate ROB instrument, we cannot be certain how their flawed methodology affected their conclusions. Ultimately, however, this review’s selective and inconsistent design and application of ROB criteria has the potential to distort the scientific landscape. Rather than fairly synthesising the evidence, the review systematically disadvantages cohort studies while presenting ecological studies as more reliable than they truly are. Given the public health implications of e-cigarette regulation, these methodological biases risk shaping policy based on unreliable evidence.
Sam Egger: Conceptualization; investigation; writing—original draft. Martin McKee: Conceptualization; writing—review and editing.
Martin McKee is a past president of the British Medical Association and of the European Public Health Association, both of which have expressed concerns about the public health impact and appropriate regulation of e-cigarettes. Sam Egger is a statistical editor on the editorial board of the Cochrane Breast Cancer Group, serves as a peer reviewer for the Cochrane Database of Systematic Reviews, and has expertise in the relationship between adolescent vaping and future smoking.