Tackling data scarcity in machine learning-based CFRP drilling performance prediction through a broad learning system with virtual sample generation (BLS-VSG)
Jia Ge, Zequan Yao, Ming Wu, José Humberto S. Almeida Jr, Yan Jin, Dan Sun
{"title":"Tackling data scarcity in machine learning-based CFRP drilling performance prediction through a broad learning system with virtual sample generation (BLS-VSG)","authors":"Jia Ge , Zequan Yao , Ming Wu , José Humberto S. Almeida Jr , Yan Jin , Dan Sun","doi":"10.1016/j.compositesb.2025.112701","DOIUrl":null,"url":null,"abstract":"<div><div>Machine learning (ML)-based data-driven method has emerged as a powerful tool for predicting the manufacturing performance of carbon fibre reinforced plastic (CFRP), particularly in CFRP machining, where physics-based models are computationally expensive. However, the effectiveness of ML models are often constrained by limited datasets, due to the high cost and time required for experimental data acquisition. To address this, this paper presents the first study to apply virtual sample generation (VSG) techniques to enlarge the training dataset and mitigate data scarcity in the prediction of CFRP drilling performance. A novel hybrid ML framework integrating Broad Learning System (BLS) and VSG (BLS-VSG) is proposed to combine the capability of BLS in small dataset prediction with the enlarged dataset generated by VSG. The model has been employed to predict the drilling thrust force and delamination damage under various drilling conditions (spindle speed, feed rate, point angle). Three different VSG methods (SMOTE, MD-MTD and CVT) and the number of virtual samples were evaluated in detail. Results show that VSG can effectively enlarge the training dataset and improve the prediction performance of the ML model. Specifically, VSG reduced the mean square error (MSE) and mean absolute percentage error (MAPE) for thrust force prediction by 39.0 % and 12.9 %, respectively, compared to the benchmark without VSG. For delamination factor F<sub>da</sub> prediction, MSE and MAPE were reduced by 22.6 % and 16.5 %, respectively. 
The proposed BLS-VSG model outperforms other conventional ML models (BPNN, ELM, SVR and RT) for both scenarios (with/without VSG), providing a robust and data-efficient solution for CFRP drilling performance prediction.</div></div>","PeriodicalId":10660,"journal":{"name":"Composites Part B: Engineering","volume":"305 ","pages":"Article 112701"},"PeriodicalIF":14.2000,"publicationDate":"2025-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Composites Part B: Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S135983682500602X","RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
Citations: 0
Abstract
Machine learning (ML)-based data-driven methods have emerged as powerful tools for predicting the manufacturing performance of carbon fibre reinforced plastic (CFRP), particularly in CFRP machining, where physics-based models are computationally expensive. However, the effectiveness of ML models is often constrained by limited datasets, owing to the high cost and time required for experimental data acquisition. To address this, this paper presents the first study to apply virtual sample generation (VSG) techniques to enlarge the training dataset and mitigate data scarcity in the prediction of CFRP drilling performance. A novel hybrid ML framework integrating the Broad Learning System (BLS) and VSG (BLS-VSG) is proposed to combine the capability of BLS in small-dataset prediction with the enlarged dataset generated by VSG. The model has been employed to predict the drilling thrust force and delamination damage under various drilling conditions (spindle speed, feed rate, point angle). Three VSG methods (SMOTE, MD-MTD and CVT) and the number of virtual samples were evaluated in detail. Results show that VSG can effectively enlarge the training dataset and improve the prediction performance of the ML model. Specifically, VSG reduced the mean square error (MSE) and mean absolute percentage error (MAPE) for thrust force prediction by 39.0 % and 12.9 %, respectively, compared to the benchmark without VSG. For prediction of the delamination factor F_da, MSE and MAPE were reduced by 22.6 % and 16.5 %, respectively. The proposed BLS-VSG model outperforms other conventional ML models (BPNN, ELM, SVR and RT) in both scenarios (with and without VSG), providing a robust and data-efficient solution for CFRP drilling performance prediction.
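To make the idea of virtual sample generation concrete, the following is a minimal sketch of a SMOTE-style interpolation scheme together with the two error metrics named in the abstract. It is not the authors' implementation: the abstract does not describe how SMOTE was adapted to a regression target, so interpolating the output along with the inputs is an assumption here, as are the function names (`smote_like_virtual_samples`, `mse`, `mape`), the neighbour count `k`, and the toy data, which come from an arbitrary made-up formula rather than from any drilling experiment.

```python
import numpy as np


def smote_like_virtual_samples(X, y, n_virtual, k=3, seed=0):
    """Create virtual (X, y) pairs by interpolating between a real sample
    and one of its k nearest neighbours (assumed SMOTE-style scheme)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Min-max scale inputs so no single drilling parameter dominates distances.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dist = np.linalg.norm(Xs[:, None, :] - Xs[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample, a random neighbour, and a random mixing weight.
    base = rng.integers(0, len(X), size=n_virtual)
    nb = neighbours[base, rng.integers(0, k, size=n_virtual)]
    lam = rng.random(n_virtual)

    # Assumption: the target is interpolated with the same weight as the inputs.
    X_new = X[base] + lam[:, None] * (X[nb] - X[base])
    y_new = y[base] + lam * (y[nb] - y[base])
    return X_new, y_new


def mse(y_true, y_pred):
    """Mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


# Toy inputs: columns are spindle speed, feed rate, point angle.
# Values and the target formula are arbitrary placeholders, not measured data.
rng = np.random.default_rng(1)
X_real = np.column_stack([
    rng.uniform(2000, 8000, 12),   # hypothetical spindle speed range
    rng.uniform(0.02, 0.20, 12),   # hypothetical feed rate range
    rng.uniform(90, 140, 12),      # hypothetical point angle range
])
y_real = 40.0 + 300.0 * X_real[:, 1] + 0.2 * X_real[:, 2]

X_virtual, y_virtual = smote_like_virtual_samples(X_real, y_real, n_virtual=24)
X_train = np.vstack([X_real, X_virtual])        # enlarged training inputs
y_train = np.concatenate([y_real, y_virtual])   # enlarged training targets
```

Under this scheme every virtual sample lies on the segment between two real samples, so the enlarged set stays inside the range of the measured conditions. The enlarged `X_train`, `y_train` would then be passed to whichever regressor is being evaluated, with `mse` and `mape` computed on held-out real samples only.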
About the journal:
Composites Part B: Engineering is a journal that publishes impactful research of high quality on composite materials. This research is supported by fundamental mechanics and materials science and engineering approaches. The targeted research can cover a wide range of length scales, ranging from nano to micro and meso, and even to the full product and structure level. The journal specifically focuses on engineering applications that involve high performance composites. These applications can range from low volume and high cost to high volume and low cost composite development.
The main goal of the journal is to provide a platform for the prompt publication of original and high quality research. The emphasis is on design, development, modeling, validation, and manufacturing of engineering details and concepts. The journal welcomes both basic research papers and proposals for review articles. Authors are encouraged to address challenges across various application areas. These areas include, but are not limited to, aerospace, automotive, and other surface transportation. The journal also covers energy-related applications, with a focus on renewable energy. Other application areas include infrastructure, off-shore and maritime projects, health care technology, and recreational products.