MDGAIN-IFC: An intelligent construction method for full/refined benchmark dataset of postpartum hemorrhage based on MDGAIN and information fidelity

IF 6.8 | Zone 2, Engineering and Technology | JCR Q1, ENGINEERING, MULTIDISCIPLINARY
Xiaodan Li, Yue Zhou, Fengchun Gao, Di Cheng, Wushan Li, Kaijian Xia, Hongsheng Yin
Journal: Alexandria Engineering Journal, volume 128, pages 1057-1072
Publication date: 2025-08-20
Publication type: Journal Article
DOI: 10.1016/j.aej.2025.08.022
URL: https://www.sciencedirect.com/science/article/pii/S1110016825009196
Citations: 0

Abstract

Postpartum hemorrhage (PPH) seriously affects the quality of life of parturients and their families and imposes a heavy economic and social burden on countries worldwide. In this study, we propose a framework for constructing a PPH Full/Refined (F/R) dataset that integrates Missing Data Generative Adversarial Imputation Networks (MDGAIN) and an Information Fidelity Criterion (IFC). We first clean and restore coarse values directly in the raw PPH data, defining an outlier measure relative to the data centroids and identifying coarse values with the 3σ criterion. We then use MDGAIN to generate data that conform to the distribution of the real samples and to impute missing data. We propose the IFC for constructing refined datasets and, guided by this criterion, investigate attribute-refinement methods based on mutual information and Extreme Gradient Boosting (XGBoost). We also propose attribute-refinement methods based on information fusion. Using electronic medical records and manually curated data from 68,352 vaginal deliveries at Jinan Maternal and Child Health Hospital (Shandong, China), we construct the PPH F/R benchmark dataset. Finally, we use PPH prediction methods such as XGBoost, logistic regression (LR), and random forest to validate the consistency of the constructed PPH F/R dataset.
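The abstract does not spell out the outlier measure, so the following is only a minimal sketch of how a centroid-based 3σ rule could look, not the authors' method. It assumes that the outlier measure is the Euclidean distance of each standardized record to the data centroid, and that a record counts as a coarse value when this distance exceeds the mean distance by more than three standard deviations. The function name `flag_coarse_values` and the parameter `k` are hypothetical.

```python
import numpy as np

def flag_coarse_values(X, k=3.0):
    """Flag rows whose distance to the data centroid exceeds mean + k * std.

    Assumption (not from the paper): the outlier measure is the Euclidean
    distance to the centroid after per-attribute standardization.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each attribute so that no single unit dominates the distance.
    mu = np.nanmean(X, axis=0)
    sigma = np.nanstd(X, axis=0)
    sigma[sigma == 0] = 1.0  # constant attributes contribute zero distance
    Z = (X - mu) / sigma
    # After standardization the centroid is the origin; missing entries (NaN)
    # are skipped by nansum and therefore add nothing to the distance.
    d = np.sqrt(np.nansum(Z ** 2, axis=1))
    # 3-sigma rule applied to the distances themselves.
    threshold = d.mean() + k * d.std()
    return d > threshold, d, threshold
```

Flagged records would then be treated as missing and handed to the imputation step. Whether the paper applies the rule per record or per attribute is not stated in the abstract.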
Source journal
Alexandria Engineering Journal (Engineering, General Engineering)
CiteScore: 11.20
Self-citation rate: 4.40%
Articles published: 1015
Review time: 43 days
About the journal: Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering