{"title":"Prediction of Tandem Cold-Rolled Strip Flatness Based on the BiGRU-Attention-iTransformer Model","authors":"Ming-hua Liu, Ya-han Li, Da-yuan Wu, Zi-xuan Zhu","doi":"10.1007/s11837-025-07456-2","DOIUrl":null,"url":null,"abstract":"<div><p>Cold rolling strip production is a multi-stand continuous rolling process, so flatness prediction is a typical spatiotemporal series data prediction problem, which requires considering various complex factors affecting flatness and paying attention to the correlation of its spatiotemporal dimensions. Based on this, a cold rolled strip flatness prediction model is proposed, integrating a Bidirectional Gated Recurrent Unit (BiGRU), an Attention mechanism, and an Inverted Transformer (iTransformer). The model adopts a parallel structure, where one branch utilizes a BiGRU-Attention module designed to capture the spatiotemporal correlations in strip production data, with the BiGRU’s hidden layer dimension set to 128; the other branch employs an iTransformer module with a feature dimension of 256 and 8 attention heads to effectively extract key features and model the relationships between parameters using the self-attention mechanism. The features extracted from both branches are fused into a 128-dimensional vector, which is then passed through a fully connected layer for flatness prediction. The prediction results show that the error indicators MSE, RMSE and MAE of the proposed model are 0.937, 0.968 and 0.774 respectively, and the fitting performance indicator <span>\\(R^{2}\\)</span> is 0.974, which are better than the comparison models Random Forest (RF), Deep Neural Network (DNN), Long Short-Term Memory (LSTM), BiGRU, iTransformer, and BiGRU-iTransformer models. Feature ablation experiments show that for flatness, the importance of parameters is ranked as follows: forward tension (Tf), roll gap difference (Gd), work roll bending force (WR), exit thickness (Hb), rolling force (<span>\\(F\\)</span>), intermediate roll bending force (IR), strip yield strength (<span>\\(Y\\)</span>), entrance thickness (Hf), rolling speed (<span>\\(V\\)</span>), backward tension (Tb), strip width (<span>\\(B\\)</span>). The experimental results are consistent with expert knowledge and underlying mechanisms, verifying the effectiveness of the attention mechanism while enhancing interpretability and credibility.</p></div>","PeriodicalId":605,"journal":{"name":"JOM","volume":"77 8","pages":"6245 - 6259"},"PeriodicalIF":2.3000,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JOM","FirstCategoryId":"88","ListUrlMain":"https://link.springer.com/article/10.1007/s11837-025-07456-2","RegionNum":4,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATERIALS SCIENCE, MULTIDISCIPLINARY","Score":null,"Total":0}
Abstract
Cold-rolled strip production is a multi-stand continuous rolling process, so flatness prediction is a typical spatiotemporal series prediction problem: it must account for the many complex factors that affect flatness and for the correlations across the spatial and temporal dimensions. On this basis, a cold-rolled strip flatness prediction model is proposed that integrates a Bidirectional Gated Recurrent Unit (BiGRU), an attention mechanism, and an Inverted Transformer (iTransformer). The model adopts a parallel structure: one branch uses a BiGRU-Attention module, with the BiGRU's hidden layer dimension set to 128, to capture the spatiotemporal correlations in strip production data; the other branch uses an iTransformer module, with a feature dimension of 256 and 8 attention heads, to extract key features and model the relationships between parameters via self-attention. The features from both branches are fused into a 128-dimensional vector, which is passed through a fully connected layer for flatness prediction. The prediction results show that the error indicators MSE, RMSE, and MAE of the proposed model are 0.937, 0.968, and 0.774, respectively, and the fitting indicator \(R^{2}\) is 0.974, outperforming the comparison models: Random Forest (RF), Deep Neural Network (DNN), Long Short-Term Memory (LSTM), BiGRU, iTransformer, and BiGRU-iTransformer. Feature ablation experiments show that, for flatness, the parameters rank in importance as follows: forward tension (Tf), roll gap difference (Gd), work roll bending force (WR), exit thickness (Hb), rolling force (\(F\)), intermediate roll bending force (IR), strip yield strength (\(Y\)), entrance thickness (Hf), rolling speed (\(V\)), backward tension (Tb), and strip width (\(B\)). The experimental results are consistent with expert knowledge and the underlying rolling mechanisms, verifying the effectiveness of the attention mechanism while enhancing the model's interpretability and credibility.
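
To make the parallel architecture concrete, below is a minimal PyTorch sketch of the two branches as the abstract describes them. The layer names, the pooling choices, the two-layer encoder depth, and the input shape (11 process parameters over a window of time steps) are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of the parallel BiGRU-Attention / iTransformer model.
# Shapes and depths beyond those stated in the abstract are assumptions.
import torch
import torch.nn as nn

class FlatnessModel(nn.Module):
    def __init__(self, n_features=11, seq_len=32,
                 gru_hidden=128, itr_dim=256, n_heads=8, fused_dim=128):
        super().__init__()
        # Branch 1: BiGRU (hidden size 128 per direction) with additive
        # attention pooling over the time steps.
        self.bigru = nn.GRU(n_features, gru_hidden, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * gru_hidden, 1)
        # Branch 2: inverted-Transformer-style encoder. Each of the
        # n_features variates is embedded from its whole length-seq_len
        # series, and self-attention runs across variates (the "inverted"
        # axis), with feature dimension 256 and 8 heads.
        self.variate_embed = nn.Linear(seq_len, itr_dim)
        enc_layer = nn.TransformerEncoderLayer(d_model=itr_dim, nhead=n_heads,
                                               batch_first=True)
        self.itransformer = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Fuse both branches into a 128-d vector, then predict flatness.
        self.fuse = nn.Linear(2 * gru_hidden + itr_dim, fused_dim)
        self.head = nn.Linear(fused_dim, 1)

    def forward(self, x):                       # x: (batch, seq_len, n_features)
        h, _ = self.bigru(x)                    # (batch, seq_len, 256)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        branch1 = (w * h).sum(dim=1)            # (batch, 256)
        v = self.variate_embed(x.transpose(1, 2))   # (batch, n_features, 256)
        branch2 = self.itransformer(v).mean(dim=1)  # (batch, 256)
        fused = torch.relu(self.fuse(torch.cat([branch1, branch2], dim=-1)))
        return self.head(fused).squeeze(-1)     # predicted flatness value

model = FlatnessModel()
dummy = torch.randn(4, 32, 11)  # 4 samples, 32 time steps, 11 parameters
print(model(dummy).shape)       # torch.Size([4])
```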
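The reported error indicators are standard regression metrics. A minimal NumPy sketch of how they relate, assuming `y_true` and `y_pred` are 1-D arrays of measured and predicted flatness values:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R^2 for a set of predictions."""
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)                # RMSE is the square root of MSE
    mae = np.mean(np.abs(err))
    r2 = 1.0 - mse / np.var(y_true)    # R^2 = 1 - SS_res / SS_tot
    return mse, rmse, mae, r2
```

Note that the reported values are internally consistent: the RMSE of 0.968 is the square root of the MSE of 0.937.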
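A feature ablation study of the kind described can be sketched as a leave-one-feature-out loop: withhold each parameter in turn, retrain, and rank parameters by how much the test error degrades. Here `train_and_eval` is a hypothetical callback, not part of the paper, that retrains the model on the kept features and returns its test MSE.

```python
# Hypothetical leave-one-feature-out ablation; feature symbols follow the
# abstract, and train_and_eval is a placeholder supplied by the caller.
FEATURES = ["Tf", "Gd", "WR", "Hb", "F", "IR", "Y", "Hf", "V", "Tb", "B"]

def rank_feature_importance(train_and_eval, baseline_mse):
    """Rank features by the MSE increase observed when each is withheld."""
    deltas = {}
    for name in FEATURES:
        kept = [f for f in FEATURES if f != name]
        deltas[name] = train_and_eval(kept) - baseline_mse
    # A larger degradation when removed means a more important feature.
    return sorted(deltas, key=deltas.get, reverse=True)
```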
About the Journal:
JOM is a technical journal devoted to exploring the many aspects of materials science and engineering. JOM reports scholarly work that explores state-of-the-art processing, fabrication, design, and application of metals, ceramics, plastics, composites, and other materials. In pursuing this goal, JOM strives to balance the interests of the laboratory and the marketplace by reporting academic, industrial, and government-sponsored work from around the world.