{"title":"利用链式方程回归方法处理基于xgboost的多次插值缺失数据。","authors":"Zhao Jinbo, Li Yufu, Mo Haitao","doi":"10.3389/frai.2025.1553220","DOIUrl":null,"url":null,"abstract":"<p><p>This study introduces an XGBoost-MICE (Multiple Imputation by Chained Equations) method for addressing missing data in mine ventilation parameters. Using historical ventilation system data from Shangwan Coal Mine, scenarios with different missing rates (5, 10, and 15%) and iteration numbers (30 and 50) were simulated to validate the accuracy and effectiveness of the approach. The results demonstrate that as the missing rate increased from 5 to 15%, the Mean Squared Error (MSE) rose from 0.0445 to 0.3254, while the Explained Variance decreased from 0.988309 to 0.943267. Additionally, the Mean Absolute Error (MAE) increased by 0.29. Iteration experiments on the \"frictional resistance per 100 meters\" attribute showed convergence of MSE and MAE after six iterations. Overall, the XGBoost-MICE method exhibited high imputation accuracy and stable convergence across various missing data scenarios, providing robust technical support for optimizing intelligent mine ventilation systems.</p>","PeriodicalId":33315,"journal":{"name":"Frontiers in Artificial Intelligence","volume":"8 ","pages":"1553220"},"PeriodicalIF":3.0000,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12003350/pdf/","citationCount":"0","resultStr":"{\"title\":\"Handling missing data of using the XGBoost-based multiple imputation by chained equations regression method.\",\"authors\":\"Zhao Jinbo, Li Yufu, Mo Haitao\",\"doi\":\"10.3389/frai.2025.1553220\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>This study introduces an XGBoost-MICE (Multiple Imputation by Chained Equations) method for addressing missing data in mine ventilation parameters. Using historical ventilation system data from Shangwan Coal Mine, scenarios with different missing rates (5, 10, and 15%) and iteration numbers (30 and 50) were simulated to validate the accuracy and effectiveness of the approach. The results demonstrate that as the missing rate increased from 5 to 15%, the Mean Squared Error (MSE) rose from 0.0445 to 0.3254, while the Explained Variance decreased from 0.988309 to 0.943267. Additionally, the Mean Absolute Error (MAE) increased by 0.29. Iteration experiments on the \\\"frictional resistance per 100 meters\\\" attribute showed convergence of MSE and MAE after six iterations. Overall, the XGBoost-MICE method exhibited high imputation accuracy and stable convergence across various missing data scenarios, providing robust technical support for optimizing intelligent mine ventilation systems.</p>\",\"PeriodicalId\":33315,\"journal\":{\"name\":\"Frontiers in Artificial Intelligence\",\"volume\":\"8 \",\"pages\":\"1553220\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2025-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12003350/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers in Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3389/frai.2025.1553220\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/frai.2025.1553220","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
摘要
提出了一种求解矿井通风参数缺失数据的XGBoost-MICE (Multiple Imputation by Chained Equations)方法。利用上湾煤矿历史通风系统数据,模拟了不同缺失率(5%、10%和15%)和迭代次数(30和50)的场景,验证了该方法的准确性和有效性。结果表明,当缺失率从5%增加到15%时,均方误差(MSE)从0.0445增加到0.3254,解释方差从0.988309减少到0.943267。平均绝对误差(MAE)增加0.29。“每100米摩擦阻力”属性的迭代实验表明,经过6次迭代,MSE和MAE收敛。总体而言,XGBoost-MICE方法在各种缺失数据场景下具有较高的归算精度和稳定的收敛性,为优化智能矿井通风系统提供了强有力的技术支持。
Handling missing data of using the XGBoost-based multiple imputation by chained equations regression method.
This study introduces an XGBoost-MICE (Multiple Imputation by Chained Equations) method for addressing missing data in mine ventilation parameters. Using historical ventilation system data from Shangwan Coal Mine, scenarios with different missing rates (5, 10, and 15%) and iteration numbers (30 and 50) were simulated to validate the accuracy and effectiveness of the approach. The results demonstrate that as the missing rate increased from 5 to 15%, the Mean Squared Error (MSE) rose from 0.0445 to 0.3254, while the Explained Variance decreased from 0.988309 to 0.943267. Additionally, the Mean Absolute Error (MAE) increased by 0.29. Iteration experiments on the "frictional resistance per 100 meters" attribute showed convergence of MSE and MAE after six iterations. Overall, the XGBoost-MICE method exhibited high imputation accuracy and stable convergence across various missing data scenarios, providing robust technical support for optimizing intelligent mine ventilation systems.