Cross-Database Evaluation of Deep Learning Methods for Intrapartum Cardiotocography Classification

Impact factor 4.4; Zone 3, Medicine; JCR Q2 (Engineering, Biomedical)

IEEE Journal of Translational Engineering in Health and Medicine-Jtehm. Pub Date: 2025-03-05. DOI: 10.1109/JTEHM.2025.3548401

Lochana Mendis;Debjyoti Karmakar;Marimuthu Palaniswami;Fiona Brownfoot;Emerson Keenan

Volume 13, pages 123-135. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10912500/


Abstract

Continuous monitoring of fetal heart rate (FHR) and uterine contractions (UC), otherwise known as cardiotocography (CTG), is often used to assess the risk of fetal compromise during labor. However, interpreting CTG recordings visually is challenging for clinicians, given the complexity of CTG patterns, leading to poor sensitivity. Efforts to address this issue have focused on data-driven deep-learning methods to detect fetal compromise automatically. However, their progress is impeded by limited CTG training datasets and the absence of a standardized evaluation workflow, hindering algorithm comparisons. In this study, we use a private CTG dataset of 9,887 CTG recordings with pH measurements and 552 CTG recordings from the open-access CTU-UHB dataset to conduct a cross-database evaluation of six deep-learning models for fetal compromise detection. We explore the impact of input selection of FHR and UC signals, signal pre-processing, downsampling frequency, and the influence of removing intermediate pH samples from the training dataset. Our findings reveal that using only FHR and pre-processing FHR with artefact removal and interpolation significantly improves classification performance for some model architectures, whereas excluding intermediate pH samples did not significantly improve performance for any model. From our comparison of the six models, ResNet exhibited the strongest fetal compromise classification performance across both databases at a downsampling rate of 1 Hz. Finally, class activation maps from highly contributing signal regions in the ResNet model aligned with clinical knowledge of compromised FHR patterns, highlighting the model's interpretability. These insights may serve as a standardized reference for developing and comparing future work in this domain.

Clinical and Translational Impact: This study provides a standardized workflow for comparing deep-learning methods for CTG classification. Ensuring new methods show generalizability and interpretability will improve their robustness and applicability in clinical settings.
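The FHR pre-processing steps named in the abstract (artefact removal, interpolation, and downsampling to 1 Hz) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `preprocess_fhr`, the 50 to 200 bpm plausibility window, the 4 Hz input rate, linear interpolation, and block-average downsampling are all assumptions chosen for illustration.

```python
import numpy as np


def preprocess_fhr(fhr, fs_in=4.0, fs_out=1.0, lo=50.0, hi=200.0):
    """Hypothetical FHR clean-up: mask artefacts, interpolate, downsample.

    All defaults are illustrative assumptions, not values from the paper.
    """
    fhr = np.asarray(fhr, dtype=float)

    # Treat dropouts (zeros, NaNs) and implausible values as artefacts.
    valid = np.isfinite(fhr) & (fhr >= lo) & (fhr <= hi)
    if not valid.any():
        raise ValueError("no valid FHR samples to interpolate from")

    # Linearly interpolate across artefact gaps. np.interp holds the
    # nearest valid value for gaps at the start or end of the signal.
    idx = np.arange(fhr.size)
    clean = np.interp(idx, idx[valid], fhr[valid])

    # Downsample by averaging non-overlapping blocks of samples.
    # Assumes fs_in is an integer multiple of fs_out.
    factor = int(round(fs_in / fs_out))
    if factor < 1:
        raise ValueError("fs_out must not exceed fs_in")
    n = (clean.size // factor) * factor  # drop any trailing partial block
    return clean[:n].reshape(-1, factor).mean(axis=1)


# Two seconds of 4 Hz signal with a two-sample dropout in the first second.
raw = [120, 0, 0, 150, 150, 150, 150, 150]
downsampled = preprocess_fhr(raw)
```

Block averaging is used here because it needs only NumPy and acts as a crude anti-aliasing step; a real pipeline might instead use a dedicated resampling filter and limit how long a gap it is willing to interpolate across.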


Source journal

IEEE Journal of Translational Engineering in Health and Medicine-Jtehm (Engineering: Biomedical Engineering)

CiteScore: 7.40

Self-citation rate: 2.90%

Articles published: (no value given)

Review time: 27 weeks

Journal description: The IEEE Journal of Translational Engineering in Health and Medicine is an open access product that bridges the engineering and clinical worlds, focusing on detailed descriptions of advanced technical solutions to a clinical need along with clinical results and healthcare relevance. The journal provides a platform for state-of-the-art technology directions in the interdisciplinary field of biomedical engineering, embracing engineering, life sciences and medicine. A unique aspect of the journal is its ability to foster collaboration between physicians and engineers for presenting broad and compelling real-world technological and engineering solutions that can be implemented in the interest of improving quality of patient care and treatment outcomes, thereby reducing costs and improving efficiency. The journal provides an active forum for clinical research and relevant state-of-the-art technology for members of all the IEEE societies that have an interest in biomedical engineering, as well as reaching out directly to physicians and the medical community through the American Medical Association (AMA) and other clinical societies. The scope of the journal includes, but is not limited to, topics on: medical devices, healthcare delivery systems, global healthcare initiatives, and ICT-based services; technological relevance to healthcare cost reduction; technology affecting healthcare management, decision-making, and policy; advanced technical work that is applied to solving specific clinical needs.