Machine learning for automated cause-of-death classification from 2021 to 2022 in Korea: development and validation of an ICD-10 prediction model.

IF 0.2 Q3 MEDICINE, GENERAL & INTERNAL

Ewha Medical Journal. Pub Date: 2025-07-01. Epub Date: 2025-07-28. DOI: 10.12771/emj.2025.00675

Seokmin Lee, Gyeongmin Im

Ewha Medical Journal 48(3):e45. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12362283/pdf/

Citation count: 0

Abstract

Purpose: This study evaluated the feasibility and performance of a deep learning approach utilizing the Korean Medical BERT (KM-BERT) model for the automated classification of underlying causes of death within national mortality statistics. It aimed to assess predictive accuracy throughout the cause-of-death coding workflow and to identify limitations and opportunities for further artificial intelligence (AI) integration.

Methods: We performed a retrospective prediction study using 693,587 death certificates issued in Korea between January 2021 and December 2022. Free-text fields for immediate, antecedent, and contributory causes were concatenated and used as input for fine-tuning KM-BERT. Three classification models were developed: (1) final underlying cause prediction (International Classification of Diseases, 10th Revision [ICD-10] code) from certificate inputs, (2) tentative underlying cause selection based on ICD-10 Volume 2 rules, and (3) classification of individual cause-of-death entries. Models were trained and validated using 2021 data (80% training, 20% validation) and evaluated on 2022 data. Performance metrics included overall accuracy, weighted F1 score, and macro F1 score.
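The data preparation described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the dictionary keys (`immediate`, `antecedent`, `contributory`, `year`), the `[SEP]` separator, and the random shuffle before the 80/20 split are all assumptions, since the abstract does not state how fields were joined or how the 2021 data were partitioned.

```python
import random

# Assumed separator and field names; the abstract specifies neither.
SEP = " [SEP] "
FIELDS = ("immediate", "antecedent", "contributory")


def build_input(cert):
    """Join the non-empty free-text cause fields of one certificate."""
    parts = [(cert.get(key) or "").strip() for key in FIELDS]
    return SEP.join(p for p in parts if p)


def temporal_split(certs, train_frac=0.8, seed=0):
    """Split 2021 certificates into train/validation and hold out 2022 as test."""
    dev = [c for c in certs if c["year"] == 2021]
    test = [c for c in certs if c["year"] == 2022]
    rng = random.Random(seed)
    rng.shuffle(dev)  # assumed: random partition of the 2021 data
    cut = int(len(dev) * train_frac)
    return dev[:cut], dev[cut:], test
```

Holding out a later year, rather than a random slice of both years, tests the model on certificates issued after everything it was trained on, which is closer to how such a model would be used in practice.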

Results: On 306,898 certificates from 2022, the final cause model achieved 62.65% accuracy (F1-weighted, 0.5940; F1-macro, 0.1503). The tentative cause model demonstrated 95.35% accuracy (F1-weighted, 0.9516; F1-macro, 0.4996). The individual entry model yielded 79.51% accuracy (F1-weighted, 0.7741; F1-macro, 0.9250). Error analysis indicated reduced reliability for rare diseases and for specific ICD chapters, which require supplementary administrative data.
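To make the three reported metrics concrete, the sketch below computes accuracy, weighted F1, and macro F1 from scratch on a toy label set. The labels and predictions are invented for illustration and are not study data. It shows one way accuracy and weighted F1 can stay high while macro F1 is low: a classifier that misses rare codes loses little on the first two metrics, but each missed class pulls the unweighted macro average toward zero.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return (accuracy, weighted F1, macro F1) for single-label predictions."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    f1 = {}
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in pairs)
        fp = sum(t != lab and p == lab for t, p in pairs)
        fn = sum(t == lab and p != lab for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1[lab] = 2 * tp / denom if denom else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    macro = sum(f1.values()) / len(labels)  # every class counts equally
    weighted = sum(f1[lab] * support[lab] for lab in labels) / len(pairs)
    return accuracy, weighted, macro


# Toy data (hypothetical): one common code and two rare codes,
# with a classifier that always predicts the common code.
y_true = ["I21"] * 8 + ["C34", "E85"]
y_pred = ["I21"] * 10
acc, f1_weighted, f1_macro = classification_scores(y_true, y_pred)
print(f"accuracy={acc:.4f} weighted={f1_weighted:.4f} macro={f1_macro:.4f}")
# accuracy=0.8000 weighted=0.7111 macro=0.2963
```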

Conclusion: Despite strong performance in mapping free-text inputs and selecting tentative underlying causes, there remains a need for improved data quality, administrative record integration, and model refinement. A systematic, long-term approach is essential for the broad adoption of AI in mortality statistics.


Source journal

Ewha Medical Journal MEDICINE, GENERAL & INTERNAL-

Self-citation rate

33.30%

Publication count