Statistical Methods for Baseline Adjustment and Cohort Analysis in Korean National Health Insurance Claims Data: A Review of PSM, IPTW, and Survival Analysis With Future Directions.
{"title":"Statistical Methods for Baseline Adjustment and Cohort Analysis in Korean National Health Insurance Claims Data: A Review of PSM, IPTW, and Survival Analysis With Future Directions.","authors":"Dong Wook Kim","doi":"10.3346/jkms.2025.40.e110","DOIUrl":null,"url":null,"abstract":"<p><p>The utilization of health insurance claims data has expanded significantly, enabling researchers to conduct epidemiological studies on a large scale. This review examines key statistical methods for addressing baseline differences and conducting cohort analyses using Korean National Health Insurance claims data. Propensity score matching and inverse probability of treatment weighting are widely used to mitigate selection bias and enhance causal inference in observational studies. These methods help improve study validity by balancing covariates between treatment and control groups. Additionally, survival analysis techniques, such as the Cox proportional hazards model, are essential for assessing time-to-event outcomes and estimating hazard ratios while accounting for censoring. However, the application of these statistical methods is accompanied by challenges, including unmeasured confounding, instability in weight estimation, and violations of model assumptions. To address these limitations, emerging approaches, such as Doubly robust estimation, machine learning-based causal inference, and the marginal structural model, have gained prominence. These techniques offer greater flexibility and robustness in real-world data analysis. Future research should focus on refining methodologies for integrating high-dimensional health datasets and leveraging artificial intelligence to enhance predictive modeling and causal inference. Furthermore, the expansion of international collaborations and the adoption of standardized data models will facilitate large-scale multi-center studies. Ethical considerations, including data privacy and algorithmic transparency, should also be prioritized to ensure responsible data use. Maximizing the utility of health insurance claims data requires interdisciplinary collaboration, methodological advancements, and the implementation of rigorous statistical techniques to support evidence-based healthcare policy and improve public health outcomes.</p>","PeriodicalId":16249,"journal":{"name":"Journal of Korean Medical Science","volume":"40 8","pages":"e110"},"PeriodicalIF":3.0000,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11876781/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Korean Medical Science","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3346/jkms.2025.40.e110","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MEDICINE, GENERAL & INTERNAL","Score":null,"Total":0}
引用次数: 0
Abstract
The utilization of health insurance claims data has expanded significantly, enabling researchers to conduct epidemiological studies on a large scale. This review examines key statistical methods for addressing baseline differences and conducting cohort analyses using Korean National Health Insurance claims data. Propensity score matching and inverse probability of treatment weighting are widely used to mitigate selection bias and enhance causal inference in observational studies. These methods help improve study validity by balancing covariates between treatment and control groups. Additionally, survival analysis techniques, such as the Cox proportional hazards model, are essential for assessing time-to-event outcomes and estimating hazard ratios while accounting for censoring. However, the application of these statistical methods is accompanied by challenges, including unmeasured confounding, instability in weight estimation, and violations of model assumptions. To address these limitations, emerging approaches, such as Doubly robust estimation, machine learning-based causal inference, and the marginal structural model, have gained prominence. These techniques offer greater flexibility and robustness in real-world data analysis. Future research should focus on refining methodologies for integrating high-dimensional health datasets and leveraging artificial intelligence to enhance predictive modeling and causal inference. Furthermore, the expansion of international collaborations and the adoption of standardized data models will facilitate large-scale multi-center studies. Ethical considerations, including data privacy and algorithmic transparency, should also be prioritized to ensure responsible data use. Maximizing the utility of health insurance claims data requires interdisciplinary collaboration, methodological advancements, and the implementation of rigorous statistical techniques to support evidence-based healthcare policy and improve public health outcomes.
期刊介绍:
The Journal of Korean Medical Science (JKMS) is an international, peer-reviewed Open Access journal of medicine published weekly in English. The Journal’s publisher is the Korean Academy of Medical Sciences (KAMS), Korean Medical Association (KMA). JKMS aims to publish evidence-based, scientific research articles from various disciplines of the medical sciences. The Journal welcomes articles of general interest to medical researchers especially when they contain original information. Articles on the clinical evaluation of drugs and other therapies, epidemiologic studies of the general population, studies on pathogenic organisms and toxic materials, and the toxicities and adverse effects of therapeutics are welcome.