Handling missing data of using the XGBoost-based multiple imputation by chained equations regression method
Zhao Jinbo, Li Yufu, Mo Haitao
Frontiers in Artificial Intelligence, volume 8, article 1553220. Published 2025-04-03.
DOI: 10.3389/frai.2025.1553220 (https://doi.org/10.3389/frai.2025.1553220)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12003350/pdf/
Citations: 0
Abstract
This study introduces an XGBoost-MICE (Multiple Imputation by Chained Equations) method for addressing missing data in mine ventilation parameters. Using historical ventilation system data from Shangwan Coal Mine, scenarios with different missing rates (5, 10, and 15%) and iteration numbers (30 and 50) were simulated to validate the accuracy and effectiveness of the approach. The results demonstrate that as the missing rate increased from 5 to 15%, the Mean Squared Error (MSE) rose from 0.0445 to 0.3254, while the Explained Variance decreased from 0.988309 to 0.943267. Additionally, the Mean Absolute Error (MAE) increased by 0.29. Iteration experiments on the "frictional resistance per 100 meters" attribute showed convergence of MSE and MAE after six iterations. Overall, the XGBoost-MICE method exhibited high imputation accuracy and stable convergence across various missing data scenarios, providing robust technical support for optimizing intelligent mine ventilation systems.
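The evaluation protocol described in the abstract (hide a fixed share of known values, impute them with chained equations, then score the imputations with MSE, MAE, and Explained Variance) can be sketched as follows. This is a minimal illustration and not the authors' implementation. A one-variable least-squares line stands in for the XGBoost regressor so that the example needs only the standard library, the data layout is assumed to be two numeric columns, and every function name (`mask_at_random`, `fit_line`, `chained_impute`, `mse`, `mae`, `explained_variance`) is hypothetical.

```python
import random
import statistics


def mask_at_random(rows, rate, seed=0):
    """Hide a fraction `rate` of all cells (set to None).

    Returns the masked copy and the list of hidden (row, col) positions.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = rng.sample(cells, round(rate * len(cells)))
    masked = [list(r) for r in rows]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (assumed stand-in for XGBoost)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def chained_impute(masked, n_iter=10):
    """Chained-equations imputation for a two-column table.

    Step 1: fill each missing cell with its column mean.
    Step 2: repeatedly regress each column on the other, using rows where
    the target column was observed, and overwrite only the missing cells.
    """
    missing = [[v is None for v in row] for row in masked]
    filled = [list(row) for row in masked]
    for j in range(2):
        col_mean = statistics.fmean(r[j] for r in masked if r[j] is not None)
        for i, row in enumerate(filled):
            if missing[i][j]:
                row[j] = col_mean
    for _ in range(n_iter):
        for j in range(2):
            k = 1 - j  # index of the predictor column
            obs = [r for i, r in enumerate(filled) if not missing[i][j]]
            a, b = fit_line([r[k] for r in obs], [r[j] for r in obs])
            for i, row in enumerate(filled):
                if missing[i][j]:
                    row[j] = a + b * row[k]
    return filled


def mse(truth, pred):
    return statistics.fmean((t - p) ** 2 for t, p in zip(truth, pred))


def mae(truth, pred):
    return statistics.fmean(abs(t - p) for t, p in zip(truth, pred))


def explained_variance(truth, pred):
    """1 - Var(truth - pred) / Var(truth), using population variance."""
    residuals = [t - p for t, p in zip(truth, pred)]
    return 1.0 - statistics.pvariance(residuals) / statistics.pvariance(truth)


if __name__ == "__main__":
    # Synthetic, strongly correlated columns (not the mine ventilation data).
    rng = random.Random(42)
    xs = [rng.uniform(0, 10) for _ in range(200)]
    data = [[x, 2.0 * x + 1.0 + rng.gauss(0, 0.1)] for x in xs]
    for rate in (0.05, 0.10, 0.15):
        masked, hidden = mask_at_random(data, rate, seed=1)
        filled = chained_impute(masked, n_iter=30)
        truth = [data[i][j] for i, j in hidden]
        pred = [filled[i][j] for i, j in hidden]
        print(rate, mse(truth, pred), mae(truth, pred),
              explained_variance(truth, pred))
```

Scoring only the hidden cells mirrors the simulated-missingness setup in the abstract. Replacing `fit_line` with a gradient-boosted regressor and extending the loop to more than two columns would bring the sketch closer to an XGBoost-MICE workflow, though the details of the published method are not given here.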