Claire Heffernan, Kirsten Koehler, Misti Levy Zamora, Colby Buehler, Drew R Gentner, Roger D Peng, Abhirup Datta
American Journal of Epidemiology, pages 185-194. Published 2025-01-08. DOI: 10.1093/aje/kwae171. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735973/pdf/
Citations: 0
Abstract
A causal machine-learning framework for studying policy impact on air pollution: a case study in COVID-19 lockdowns.
When studying the impact of policy interventions or natural experiments on air pollution, such as new environmental policies or the opening or closing of an industrial facility, careful statistical analysis is needed to separate causal changes from other confounding factors. Using COVID-19 lockdowns as a case study, we present a comprehensive framework for estimating and validating causal changes from such perturbations. We propose using flexible machine learning-based comparative interrupted time series (CITS) models for estimating such a causal effect. We outline the assumptions required to identify causal effects, showing that many common methods rely on strong assumptions that are relaxed by machine learning models. For empirical validation, we also propose a simple diagnostic criterion, guarding against false effects in baseline years when there was no intervention. The framework is applied to study the impact of COVID-19 lockdowns on atmospheric nitrogen dioxide (NO2) levels in the eastern United States. The machine learning approaches guard against false effects better than common methods and suggest decreases in NO2 levels in 4 US cities (Boston, Massachusetts; New York, New York; Baltimore, Maryland; and Washington, DC) during the pandemic lockdowns. The study showcases the importance of our validation framework in selecting a suitable method and the utility of a machine learning-based CITS model for studying causal changes in air pollution time series. This article is part of a Special Collection on Environmental Epidemiology.
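The comparison and the baseline-year diagnostic described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes synthetic data and swaps a one-predictor least-squares line in for the flexible machine learning model. The names `fit_line`, `cits_effect`, and `simulate_year`, the 120-day window, the cutoff at day 60, and the simulated shift of -4 are all hypothetical choices made for illustration. The model is fit only on pre-cutoff days, mapping a reference year to the target year, and the estimated effect is the mean gap between observed and predicted values after the cutoff. Running the same procedure on two baseline years serves as a placebo check: with no intervention, the estimate should sit near zero.

```python
import math
import random
from statistics import mean


def fit_line(x, y):
    """Closed-form least squares for y = a + b * x (linear stand-in for an ML model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cits_effect(reference, target, cutoff):
    """Average observed-minus-predicted gap in `target` from day `cutoff` onward.

    The line is fit on days before `cutoff` only, then used to predict what the
    target year would have looked like without an intervention.
    """
    intercept, slope = fit_line(reference[:cutoff], target[:cutoff])
    gaps = [obs - (intercept + slope * ref)
            for ref, obs in zip(reference[cutoff:], target[cutoff:])]
    return mean(gaps)


def simulate_year(rng, n_days, cutoff, shift=0.0):
    """Synthetic daily NO2-like series: shared seasonal cycle plus Gaussian noise."""
    series = []
    for day in range(n_days):
        seasonal = 20.0 + 5.0 * math.cos(2.0 * math.pi * day / n_days)
        value = seasonal + rng.gauss(0.0, 0.5)
        if day >= cutoff:
            value += shift  # simulated intervention effect; zero in baseline years
        series.append(value)
    return series


# Hypothetical setup: two baseline years and one intervention year.
rng = random.Random(0)
N_DAYS, CUTOFF = 120, 60
y2018 = simulate_year(rng, N_DAYS, CUTOFF)
y2019 = simulate_year(rng, N_DAYS, CUTOFF)
y2020 = simulate_year(rng, N_DAYS, CUTOFF, shift=-4.0)

effect_2020 = cits_effect(y2019, y2020, CUTOFF)   # should recover roughly the shift
placebo_2019 = cits_effect(y2018, y2019, CUTOFF)  # baseline years: should be near zero
print(f"estimated 2020 effect: {effect_2020:.2f}")
print(f"placebo 2019 effect:   {placebo_2019:.2f}")
```

A method that reports a large placebo effect in baseline years would be a poor candidate for the intervention year, which is the spirit of the diagnostic criterion the abstract proposes.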
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.