Christian Brieghel, Mikkel Werling, Casper Møller Frederiksen, Mehdi Parviz, Thomas Lacoppidan, Tereza Faitova, Rebecca Svanberg Teglgaard, Noomi Vainer, Caspar da Cunha-Bang, Emelie Curovic Rotbain, Rudi Agius, Carsten Utoft Niemann
Clinical Epidemiology, volume 17, pages 131-145. Published 2025-02-20 (eCollection). DOI: 10.2147/CLEP.S479672 (https://doi.org/10.2147/CLEP.S479672). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11849980/pdf/
Impact factor: 3.4. JCR: Q1 (Public, Environmental & Occupational Health).
The Danish Lymphoid Cancer Research (DALY-CARE) Data Resource: The Basis for Developing Data-Driven Hematology.
Background: Lymphoid-lineage cancers (LC; International Classification of Diseases, 10th revision [ICD10] C81.x-C90.x, C91.1-C91.9, C95.1, C95.7, C95.9, D47.2, D47.9B, and E85.8A) share many epidemiological and clinical features, which favors meta-learning when developing medical artificial intelligence (mAI). However, large, shared datasets are largely unavailable, which limits mAI research.
Aim: To create a large-scale data repository for patients with LC to support the development of data-driven hematology.
Methods: We gathered electronic health data and built open-source processing pipelines to create a comprehensive data resource for Danish LC Research (DALY-CARE), approved for epidemiological, molecular, and data-driven research.
Results: We included all Danish adults registered with LC diagnoses since 2002 (n=65,774) and combined 10 nationwide registers, electronic health records (EHR), and laboratory data on a high-performance cloud computer to develop a secure research environment. Among other things, the data include treatments (i.e., 21,750 cytoreductive treatment plans, 21.3M outpatient prescriptions, and 12.7M in-hospital administrations), biochemical analyses (77.3M), comorbidity (14.8M ICD10 codes), pathology codes (4.5M), treatment procedures (8.3M), surgical procedures (1.0M), radiological examinations (3.3M), vital signs (18.3M values), and survival data. Here we describe the data infrastructure and give examples of how DALY-CARE has been used for molecular studies, for real-world evidence evaluating the efficacy of care, and for mAI deployed directly into EHR systems.
Conclusion: The DALY-CARE data resource allows for the development of near real-time decision-support tools and the extrapolation of clinical trial results to clinical practice, thereby improving care for patients with LC while helping to streamline health data infrastructure across cohorts and medical specialties.
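The ICD10 code list in the Background defines which diagnoses count as LC. As a minimal sketch, not the authors' pipeline, the check below shows how that list could be turned into an inclusion test. The function names (`normalize`, `is_lymphoid_cancer`) are hypothetical, and the sketch assumes codes arrive as plain strings such as "C91.1" or "D47.9B"; register exports may use other formatting (for example, prefixes), which is not handled here. It also assumes that a bare category such as "C91" with no subcode does not qualify.

```python
import re

# Three-character ICD10 categories C81 through C90: every subcode counts (C81.x-C90.x).
_FULL_CATEGORIES = {f"C{n}" for n in range(81, 91)}

# Specific subcodes named in the abstract, written without the dot.
_SUBCODE_PREFIXES = ("C951", "C957", "C959", "D472", "D479B", "E858A")


def normalize(code: str) -> str:
    """Uppercase and drop dots and whitespace, e.g. 'c91.1' -> 'C911'."""
    return re.sub(r"[.\s]", "", code).upper()


def is_lymphoid_cancer(code: str) -> bool:
    """Return True if the code falls inside the LC definition quoted in the abstract."""
    c = normalize(code)
    # C81.x-C90.x: match on the three-character category alone.
    if c[:3] in _FULL_CATEGORIES:
        return True
    # C91.1-C91.9: the fourth character must be a digit from 1 to 9 (C91.0 is excluded).
    if re.match(r"C91[1-9]", c):
        return True
    # Remaining individually listed subcodes (assumption: deeper subcodes of these also count).
    return c.startswith(_SUBCODE_PREFIXES)


if __name__ == "__main__":
    for example in ["C81.9", "C91.0", "C91.1", "D47.9", "D47.9B", "E85.8A"]:
        print(example, is_lymphoid_cancer(example))
```

Matching on normalized prefixes keeps the rule set short, but it treats any deeper subcode of a listed code as included; whether that matches the study's actual inclusion logic is not stated in the abstract.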
About the journal:
Clinical Epidemiology is an international, peer-reviewed, open access journal. Clinical Epidemiology focuses on the application of epidemiological principles and questions relating to patients and clinical care in terms of prevention, diagnosis, prognosis, and treatment.
Clinical Epidemiology welcomes papers covering these topics in the form of original research and systematic reviews.
Clinical Epidemiology has a special interest in international electronic medical patient records and other routine health care data, especially as applied to safety of medical interventions, clinical utility of diagnostic procedures, understanding short- and long-term clinical course of diseases, clinical epidemiological and biostatistical methods, and systematic reviews.
When considering submission of a paper utilizing publicly available data, authors should ensure that such studies add significantly to the body of knowledge and that they use appropriate validated methods for identifying health outcomes.
The journal has launched special series describing existing data sources for clinical epidemiology, international health care systems and validation studies of algorithms based on databases and registries.