Data-Driven Algorithms for Classification of In- and Outpatients in the Danish National Patient Register.

IF 3.4 | Zone 2 (Medicine) | JCR Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH
Clinical Epidemiology Pub Date : 2025-02-21 eCollection Date: 2025-01-01 DOI:10.2147/CLEP.S500800
Ann-Sophie Buchardt, Pi Vejsig Madsen, Andreas Jensen
Citations: 0

Purpose: The Danish National Patient Register (DNPR) is an important data source for research providing detailed information on all hospital contacts in Denmark. With the transition from the second version of the DNPR (DNPR2) to the third version (DNPR3) in early 2019, the patient type variable (inpatient, elective outpatient, acute outpatient) was removed. This study proposes and evaluates algorithms to classify hospital contacts into these categories in DNPR3, aiming for consensus in data interpretation for researchers using Danish registries.

Patients and methods: We analyzed somatic public hospital contacts in Denmark from 2017 to 2020, with 20,882,018 unique contacts in DNPR2 and 27,694,584 in DNPR3. Several classification algorithms were developed and assessed, including department-based, contact-based, and hybrid methods, to infer patient types in DNPR3 based on contact features, such as duration and admission type. In DNPR3, where the true patient type is unknown, proxy labels were used to train classification algorithms.
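
To make the idea of a contact-based method concrete, here is a minimal sketch of classifying a contact from its duration and admission type. The abstract does not state the actual decision rules, so the `Contact` fields, the 12-hour threshold, and the function name are all hypothetical and chosen only for illustration. This is not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """One hospital contact (hypothetical fields, not the DNPR3 schema)."""
    duration_hours: float
    acute_admission: bool


# Hypothetical cutoff: the abstract does not report any threshold.
INPATIENT_MIN_HOURS = 12.0


def classify_contact(contact: Contact) -> str:
    """Assign one of the three patient types from contact features."""
    if contact.duration_hours >= INPATIENT_MIN_HOURS:
        return "inpatient"
    if contact.acute_admission:
        return "acute outpatient"
    return "elective outpatient"


examples = [Contact(30.0, True), Contact(1.5, True), Contact(0.5, False)]
labels = [classify_contact(c) for c in examples]
```

A fixed rule like this would only be a starting point. The study describes training classifiers on proxy labels in DNPR3, which a hand-written threshold does not capture.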

Results: Compared with the true patient type variable in DNPR2, our department-based classifier achieved high positive predictive values (PPVs) and sensitivities, with PPVs ranging from 95.6 to 99.5 and sensitivities from 94.1 to 99.6 across patient types. The hybrid approach improved PPVs and sensitivities for acute (PPV = 97.3, sensitivity = 96.8) and elective (PPV = 99.8, sensitivity = 99.9) outpatients. In both DNPR2 and DNPR3, agreement between the contact-based classification algorithms was high, indicating that the methods are robust and suggesting inherent patterns in the data.
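
For readers less familiar with the two reported metrics, the sketch below computes per-class PPV and sensitivity from true and predicted labels. For a given patient type, PPV is the share of contacts predicted as that type that truly are that type, and sensitivity is the share of contacts truly of that type that were predicted as such. The function name and the toy labels are assumptions for illustration and are not taken from the study.

```python
from collections import Counter


def ppv_and_sensitivity(true_labels, predicted_labels):
    """Return {label: (ppv, sensitivity)} in percent for each label."""
    pairs = Counter(zip(true_labels, predicted_labels))
    labels = sorted(set(true_labels) | set(predicted_labels))
    result = {}
    for label in labels:
        true_positive = pairs[(label, label)]
        # Contacts predicted as this label, whatever the truth.
        predicted = sum(n for (t, p), n in pairs.items() if p == label)
        # Contacts that truly carry this label, whatever the prediction.
        actual = sum(n for (t, p), n in pairs.items() if t == label)
        ppv = 100.0 * true_positive / predicted if predicted else float("nan")
        sens = 100.0 * true_positive / actual if actual else float("nan")
        result[label] = (ppv, sens)
    return result


# Toy data, invented for this example.
true = ["inpatient", "inpatient", "acute", "elective", "elective", "elective"]
pred = ["inpatient", "acute", "acute", "elective", "elective", "inpatient"]
metrics = ppv_and_sensitivity(true, pred)
```

In this toy data the elective class has a PPV of 100 because every contact predicted as elective really is elective, while its sensitivity is lower because one true elective contact was predicted as inpatient.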

Conclusion: Our study shows that all presented classification methods are suitable for categorizing patient types in DNPR2, depending on the available data. The methods also proved robust, which supports their suitability for classification in DNPR3. Future research should explore advanced techniques and comprehensive department classification for enhanced accuracy and applicability.

Source journal
Clinical Epidemiology
Category: Medicine-Epidemiology
CiteScore
6.30
Self-citation rate
5.10%
Articles published
169
Review time
16 weeks
About the journal: Clinical Epidemiology is an international, peer reviewed, open access journal. Clinical Epidemiology focuses on the application of epidemiological principles and questions relating to patients and clinical care in terms of prevention, diagnosis, prognosis, and treatment. Clinical Epidemiology welcomes papers covering these topics in the form of original research and systematic reviews. Clinical Epidemiology has a special interest in international electronic medical patient records and other routine health care data, especially as applied to safety of medical interventions, clinical utility of diagnostic procedures, understanding short- and long-term clinical course of diseases, clinical epidemiological and biostatistical methods, and systematic reviews. When considering submission of a paper utilizing publicly-available data, authors should ensure that such studies add significantly to the body of knowledge and that they use appropriate validated methods for identifying health outcomes. The journal has launched special series describing existing data sources for clinical epidemiology, international health care systems and validation studies of algorithms based on databases and registries.