Estimating overall survival by combining administrative and hospital death data: a methodological challenge.
Authors: Pierre-Yves Cren, Clémence Leguillette, Franck Craynest, Ali Hammoudi, Maël Barthoulot, Matthieu Carton, Thomas Filleron, Sylvie Chabaud, Youenn Drouet, Marie-Cécile Le Deley
Journal: European Journal of Epidemiology (impact factor 5.9; JCR Q1, Public, Environmental & Occupational Health)
Publication date: 2025-10-21
Publication type: Journal Article
DOI: 10.1007/s10654-025-01278-x (https://doi.org/10.1007/s10654-025-01278-x)
Abstract
Since 2019, death data published by the French National Institute of Statistics and Economic Studies (INSEE) have been available, raising questions regarding methodology and potential biases in overall survival analyses. We conducted a simulation study to quantify these biases and formulate recommendations for using these data in research. We compared three approaches for estimating overall survival: (i) including only hospital data (EMR), (ii) adding deaths known to INSEE (EMR_INSEE), or (iii) considering patients without a reported death as "alive" (EMR_INSEE_IMP). The simulations varied the mortality risk of the disease studied, the rate of loss to follow-up, and the INSEE death capture rate. With the EMR_INSEE approach, the risk of bias appeared to be significant in all clinical scenarios, with a large underestimation of overall survival. When two survival curves were compared, the hazard ratio estimate was highly biased, and type I and type II errors were inflated. With the EMR_INSEE_IMP approach, the risk of bias seemed low and acceptable for clinical situations involving low mortality, especially if loss to follow-up was low. However, when mortality was intermediate or high, and especially when the risk of loss to follow-up was also high, the risk of bias called for greater vigilance. To our knowledge, this is the first study to assess the impact of using INSEE data in addition to hospital data on vital status. The simulated scenarios enabled us to quantify the biases involved and thus make recommendations on the possible strategies for using these data.
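The three approaches compared in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: the exponential death and loss-to-follow-up times, the single extraction date (`CUTOFF`), the parameter values, and the reading of EMR_INSEE_IMP as "censored alive at the extraction date" are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random


def km_survival(times, events, horizon):
    """Kaplan-Meier estimate of S(horizon) from follow-up times and event flags."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all records sharing the same time before updating the risk set.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
        at_risk -= leaving
    return surv


def simulate(n, death_rate, ltfu_rate, capture_rate, cutoff, seed=0):
    """Build the three datasets (assumed designs) for one simulated cohort."""
    rng = random.Random(seed)
    names = ("EMR", "EMR_INSEE", "EMR_INSEE_IMP")
    out = {name: ([], []) for name in names}
    for _ in range(n):
        t_death = rng.expovariate(death_rate)
        t_lost = rng.expovariate(ltfu_rate)
        hospital_end = min(t_lost, cutoff)  # last hospital news
        if t_death <= hospital_end:
            # Death observed by the hospital: identical in all approaches.
            records = {name: (t_death, 1) for name in names}
        else:
            # Death after last hospital news; the registry (assumption) records
            # deaths before the extraction date with probability capture_rate.
            captured = t_death <= cutoff and rng.random() < capture_rate
            records = {"EMR": (hospital_end, 0)}
            if captured:
                records["EMR_INSEE"] = (t_death, 1)
                records["EMR_INSEE_IMP"] = (t_death, 1)
            else:
                # EMR_INSEE keeps the censoring at last hospital news.
                records["EMR_INSEE"] = (hospital_end, 0)
                # EMR_INSEE_IMP treats the patient as alive at extraction.
                records["EMR_INSEE_IMP"] = (cutoff, 0)
        for name, (time, event) in records.items():
            out[name][0].append(time)
            out[name][1].append(event)
    return out


# Illustrative parameter values only (not taken from the paper).
DEATH_RATE, LTFU_RATE, CAPTURE_RATE = 0.2, 0.3, 0.9
CUTOFF, HORIZON = 5.0, 4.0

datasets = simulate(20000, DEATH_RATE, LTFU_RATE, CAPTURE_RATE, CUTOFF, seed=1)
true_surv = math.exp(-DEATH_RATE * HORIZON)
estimates = {
    name: km_survival(times, events, HORIZON)
    for name, (times, events) in datasets.items()
}

print(f"true S({HORIZON:g}) = {true_surv:.3f}")
for name, value in estimates.items():
    print(f"{name:>14}: {value:.3f}  (bias {value - true_surv:+.3f})")
```

Under these assumptions, EMR_INSEE keeps lost patients who later die in the risk set while censoring early those who do not, which is one way to see why it would underestimate survival. EMR_INSEE_IMP errs in the other direction only to the extent that the registry misses deaths.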
About the journal:
The European Journal of Epidemiology, established in 1985, is a peer-reviewed publication that provides a platform for discussions on epidemiology in its broadest sense. It covers various aspects of epidemiologic research and statistical methods. The journal facilitates communication between researchers, educators, and practitioners in epidemiology, including those in clinical and community medicine. Contributions from diverse fields such as public health, preventive medicine, clinical medicine, health economics, and computational biology and data science, in relation to health and disease, are encouraged. While accepting submissions from all over the world, the journal particularly emphasizes European topics relevant to epidemiology. The published articles consist of empirical research findings, developments in methodology, and opinion pieces.