Title: Toward generalizable machine learning prediction of downskin surface roughness in laser powder bed fusion
Authors: Jigar Patel, Mihaela Vlasea, Sagar Patel
Journal: Advances in Industrial and Manufacturing Engineering, vol. 10, Article 100163 (impact factor 3.9; JCR Q2, Engineering, Industrial)
Published: 2025-05-01
DOI: 10.1016/j.aime.2025.100163
URL: https://www.sciencedirect.com/science/article/pii/S2666912925000078
Citations: 0
Abstract
Downskin surface quality in laser powder bed fusion (L-PBF) remains a challenge due to the complex, multi-scale physics governing it. While numerical or experimental approaches alone can be resource-intensive, data-driven approaches such as machine learning (ML) have the potential to be more practical. However, the generalizability of ML models currently reported in the literature is unclear; few ML models can predict reliably outside of their training domain. This study addresses these challenges by (i) demonstrating a downskin surface roughness classification model, trained on the largest reported dataset for downskin roughness (~400 downskin specimens spanning five builds and two ferrous alloys), and (ii) conducting a thorough investigation of the model’s generalizability. Additionally, this study highlights critical issues such as data imbalance, generalization to unseen data, and the importance of rigorous evaluation. By implementing robust ML practices, we focused on model performance across different training and evaluation domains. Our findings indicate satisfactory performance when using the more conservative balanced accuracy metric, achieving about 95% inter-domain and 83% intra-domain accuracy. Although there is still room for improvement, these results demonstrate a significant reduction in the risk of overfitting, thereby enhancing the classifier’s generalizability. This work underscores the importance of methodological rigor in machine learning applications, advocating for greater attention to data treatment and evaluation strategies. This approach may ultimately lead to more effective and usable ML models. The data-centric results indicated that (i) physics-informed features can improve performance during domain shifts, and (ii) increasing the size and variety of datasets allows even computationally light models to achieve favorable performance.
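The abstract calls balanced accuracy the more conservative metric under data imbalance. A minimal sketch of why that is so, assuming a made-up two-class example: the labels "smooth" and "rough", the class counts, and the function names below are hypothetical illustrations, not data or code from the study. Balanced accuracy is computed here as the unweighted mean of per-class recall.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes in y_true."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced labels (not from the study): 9 "smooth", 1 "rough".
y_true = ["smooth"] * 9 + ["rough"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["smooth"] * 10

plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
print(f"accuracy={plain:.2f}, balanced accuracy={balanced:.2f}")
```

In this toy case the majority-class predictor scores high on plain accuracy while balanced accuracy drops to chance level, because the minority class has zero recall. That gap is what makes the balanced figure the more cautious number to report on imbalanced data.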