Xiaodan Li , Yue Zhou , Fengchun Gao , Di Cheng , Wushan Li , Kaijian Xia , Hongsheng Yin
Alexandria Engineering Journal, vol. 128, pp. 1057-1072. Published 2025-08-20. DOI: 10.1016/j.aej.2025.08.022. Impact factor: 6.8; JCR: Q1 (Engineering, Multidisciplinary). Source: https://www.sciencedirect.com/science/article/pii/S1110016825009196
MDGAIN-IFC: An intelligent construction method for full/refined benchmark dataset of postpartum hemorrhage based on MDGAIN and information fidelity
Postpartum hemorrhage (PPH) seriously affects the quality of life of parturients and their families, and imposes a huge economic and social burden on countries around the world. In this study, we propose a PPH Full/Refined (F/R) dataset construction framework that integrates Missing Data Generative Adversarial Imputation Networks (MDGAIN) and an Information Fidelity Criterion (IFC). We first clean and restore coarse values in the raw PPH data: we define an outlier measure relative to the data centroids and identify coarse values with the 3σ criterion. We then use MDGAIN to generate data that conform to the distribution of the real samples and to impute missing data. We propose the IFC for constructing refined datasets and, guided by this criterion, investigate attribute-refinement methods based on mutual information and Extreme Gradient Boosting (XGBoost). Additionally, we propose attribute-refinement methods based on information fusion for constructing refined datasets. Using electronic medical records and manually curated data from 68,352 vaginal deliveries at Jinan Maternal and Child Health Hospital (Shandong, China), we construct the PPH F/R benchmark dataset. Finally, we use PPH prediction methods such as XGBoost, logistic regression (LR), and random forest to validate the consistency of the constructed PPH F/R dataset.
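The abstract does not give the exact form of the centroid-based outlier measure, so the following is only a minimal sketch of how a 3σ rule on distance to the data centroid might look. The function name `flag_coarse_values`, the per-column standardization, and the use of Euclidean distance are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def flag_coarse_values(X, k=3.0):
    """Flag rows whose distance to the data centroid exceeds mean + k * std.

    Hypothetical helper: the paper's actual outlier measure is not
    specified in the abstract. Missing entries (NaN) are ignored.
    """
    X = np.asarray(X, dtype=float)

    # Standardize each attribute so that no single column dominates the distance.
    mu = np.nanmean(X, axis=0)
    sigma = np.nanstd(X, axis=0)
    sigma[sigma == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - mu) / sigma

    # After standardization the centroid is the origin, so the Euclidean
    # norm of each row is its distance to the centroid.
    d = np.sqrt(np.nansum(Z ** 2, axis=1))

    # 3-sigma rule applied to the distribution of distances.
    threshold = d.mean() + k * d.std()
    return d > threshold, d, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))        # synthetic data, not PPH records
    X[10] = [25.0, -30.0, 40.0, 35.0]    # one injected coarse record
    mask, d, thr = flag_coarse_values(X)
    print("flagged rows:", np.flatnonzero(mask))
```

Flagged rows would then be treated as missing and passed on to the imputation step, which in the paper is handled by MDGAIN rather than by anything shown here.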
About the journal:
Alexandria Engineering Journal is an international journal devoted to publishing high quality papers in the field of engineering and applied science. Alexandria Engineering Journal is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). The papers published in Alexandria Engineering Journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering