{"title":"Adaptive model-agnostic meta-learning network for cross-machine fault diagnosis with limited samples","authors":"Mingzhe Mu, Hongkai Jiang, Xin Wang, Yutong Dong","doi":"10.1016/j.engappai.2024.109748","DOIUrl":"10.1016/j.engappai.2024.109748","url":null,"abstract":"<div><div>Deep learning-based methods have been extensively studied in rotating machinery defect diagnosis. However, training an accurate and robust diagnostic model is still a challenge under severe domain bias and limited samples. For this reason, a new adaptive model-agnostic meta-learning (AMAML) is proposed for cross-machine fault diagnosis with limited samples. First, a novel adaptive feature encode network is built, incorporating lightweight spatial-bilateral channel attention. This enables the network to extract critical fault information in multiple dimensions adaptively within limited samples, which improves the learning efficiency of generalized diagnostic knowledge. Then, an adaptive loss computation (ALC) method is devised, which inventively realizes the interaction between loss computation and model performance. The underfitting and overfitting dilemmas under few-shot conditions are tackled by ALC. Finally, an adaptive meta-optimization strategy is proposed for dynamically adapting the update strategy of the base learner, so that the model is always optimized in the direction of strong generalizability while obtaining high performance. Six cross-machine diagnosis tasks are conducted to verify the effectiveness of AMAML. The average diagnostic accuracy of the AMAML under the 5-shot setting reached 97.42%. 
Experiments confirm that AMAML is superior to other prevailing methods and is potentially promising for engineering applications.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"141 ","pages":"Article 109748"},"PeriodicalIF":7.5,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142758778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-scale feature fusion network based on semi-channel attention for seismic phase picking","authors":"Shuguang Zhao , Jiang Wang , Ping Huang , Fa Zhao , Fudong Zhang , Yadongyang Zhu","doi":"10.1016/j.engappai.2024.109739","DOIUrl":"10.1016/j.engappai.2024.109739","url":null,"abstract":"<div><div>In the field of seismic data processing, deep learning technologies have been widely used for seismic phase picking. However, it is difficult to take full advantage of the features extracted at different stages in existing models. In this paper, a multi-scale feature fusion network was proposed for seismic phase picking to address this problem. In the stage of feature extraction, semi-channel attention is introduced. It improves the representation ability of the model by efficiently utilizing the feature information extracted from the encoder. In the stage of decoding, a channel compression module is designed to reduce the number of feature channels. It improves the receptive field of channels. Additionally, a multi-feature fusion module is presented to integrate features at multiple scales. It reduces the loss of useful information and improves the accuracy of phase picking. The effectiveness of our network is validated on Stanford earthquake dataset, where the picking errors for phase picking are 2 ms. The parameter of our network is only 52,100. 
Compared with the earthquake transformer, it takes 42.1% less time to process 12,656 test samples on a Graphics Processing Unit.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"141 ","pages":"Article 109739"},"PeriodicalIF":7.5,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep interval type-2 generalized fuzzy hyperbolic tangent system for nonlinear regression prediction","authors":"Jianjian Zhao, Tao Zhao","doi":"10.1016/j.engappai.2024.109737","DOIUrl":"10.1016/j.engappai.2024.109737","url":null,"abstract":"<div><div>Recently, due to the rapid rise of artificial intelligence (AI), considerable progress has been made in the field of nonlinear regression prediction. However, many existing methods suffer from the issues of rule and parameter explosion and poor accuracy, particularly for high-dimensional data with uncertainty. To address these limitations, this paper proposes a deep interval type-2 generalized fuzzy hyperbolic tangent system (DIT2GFHS). First, a novel neural network-based implementation of the interval type-2 fuzzy generalized fuzzy hyperbolic tangent system (IT2GFHS) is introduced to improve the efficiency of system parameter updates and optimization. Then, using a hierarchical and block-based framework, multiple IT2GFHSs are stacked layer by layer from bottom to top to construct the DIT2GFHS, with each layer’s fuzzy subsystems being independent of the others. Additionally, DIT2GFHS incorporates optimization algorithms and the Adam optimizer for training, thereby avoiding the tedious manual parameter tuning process. The detailed analysis of the construction manner and internal mechanisms for DIT2GFHS indicates that it features a reduced number of parameters, a transparent and clear structure, strong capability in handling uncertainty, and favorable accuracy. Notably, the small number of parameters and the explicit structure reduce computational and hardware burdens while maintaining interpretability. Finally, extensive experimental studies on both relatively low-dimensional and high-dimensional datasets are conducted. The results demonstrate that DIT2GFHS achieves excellent performance with fewer parameters compared to many deep-structured models, including deep fuzzy systems and deep learning models. 
This highlights its potential impact in addressing practical nonlinear regression problems.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"141 ","pages":"Article 109737"},"PeriodicalIF":7.5,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142758779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing camouflaged object detection through contrastive learning and data augmentation techniques","authors":"Cunhan Guo , Heyan Huang","doi":"10.1016/j.engappai.2024.109703","DOIUrl":"10.1016/j.engappai.2024.109703","url":null,"abstract":"<div><div>Camouflaged object detection (COD) aims to locate and segment objects that blend into their surroundings, presenting significant challenges due to the high similarity between the objects and their background. This work introduces a novel approach, Contrastive Learning with Augmented Data (CLAD), which enhances COD performance by leveraging contrastive learning and data augmentation. Our method formulates a simplified task by placing camouflaged objects in new environments, creating positive and negative samples for contrast learning. This process strengthens the model’s ability to differentiate camouflaged objects from complex backgrounds. Furthermore, we introduce a concatenated feature enhancement module to integrate and enrich multi-scale features, improving the overall expressive power of the model. Extensive experiments on four benchmark datasets demonstrate that CLAD outperforms state-of-the-art COD methods, and its effectiveness extends to salient object detection tasks, achieving competitive results across multiple metrics.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"141 ","pages":"Article 109703"},"PeriodicalIF":7.5,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ship fuel consumption prediction based on transfer learning: Models and applications","authors":"Xi Luo , Mingyang Zhang , Yi Han , Ran Yan , Shuaian Wang","doi":"10.1016/j.engappai.2024.109769","DOIUrl":"10.1016/j.engappai.2024.109769","url":null,"abstract":"<div><div>Data-driven fuel consumption rate (FCR) prediction models largely depend on the amount of training data, which can be scarce for new ships with limited operating time. To tackle this issue, we implement three transfer learning strategies to leverage knowledge from another seven container ships to construct artificial neural network (ANN)-based FCR prediction models for a target ship with limited data. Numerical experiments reveal that the ANN models incorporating the three transfer strategies outperform the model trained solely on the target ship data, reducing mean absolute percentage error by 12.57%, 6.44%, and 16.03%, respectively. This study also investigates the impacts of target dataset size on the performance of transfer strategies using ship FCR prediction as an example, revealing that the smaller amount of available data, the greater improvement in prediction accuracy using the transfer strategy. These insights contribute to the development of effective operational solutions for enhancing ship energy efficiency and promoting sustainable shipping practices.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"141 ","pages":"Article 109769"},"PeriodicalIF":7.5,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A heterogeneous transfer learning method for fault prediction of railway track circuit","authors":"Lan Na , Baigen Cai , Chongzhen Zhang , Jiang Liu , Zhengjiao Li","doi":"10.1016/j.engappai.2024.109740","DOIUrl":"10.1016/j.engappai.2024.109740","url":null,"abstract":"<div><div>Prediction and identification of faults in track circuits are crucial for improving the safety and efficiency of railway transportation. However, due to the absence of real data, track circuit fault prediction through deep learning methods faces significant challenges. This paper proposes a novel heterogeneous transfer learning network structure for deep learning-based track circuit fault prediction. The proposed transfer learning network reduces the reliance on track circuit data when training deep learning models by utilizing public datasets from other similar tasks. First, an index describing the data distribution is used to demonstrate the transferability between heterogeneous data. Then, a heterogeneous transfer learning network structure is proposed to help train the deep learning model on the track circuit fault prediction task. Finally, the effect of transfer learning is comprehensively examined. 
The simulation experimental results show that the proposed heterogeneous transfer learning network structure can transfer useful knowledge in other similar fields for tasks in track circuit fault prediction, and the resulting model can distinguish between nine different classes with a high accuracy level over 99% on the test dataset while reducing the amount of required training data to 10% of the traditional training methods.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"140 ","pages":"Article 109740"},"PeriodicalIF":7.5,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142757173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal transformer for early alarm prediction","authors":"Nika Strem , Devendra Singh Dhami , Benedikt Schmidt , Kristian Kersting","doi":"10.1016/j.engappai.2024.109643","DOIUrl":"10.1016/j.engappai.2024.109643","url":null,"abstract":"<div><div>Alarms are an essential part of distributed control systems designed to help plant operators keep the processes stable and safe. In reality, however, alarms are often noisy and thus can be easily overlooked. Early alarm prediction can give the operator more time to assess the situation and introduce corrective actions to avoid downtime and negative impact on human safety and environment. Existing studies on alarm prediction typically rely on signals directly coupled with these alarms. However, using more sources of information could benefit early prediction by letting the model learn characteristic patterns in the interactions of signals and events. Meanwhile, multimodal deep learning has recently seen impressive developments. Combination (or fusion) of modalities has been shown to be a key success factor, yet choosing the best fusion method for a given task introduces a new degree of complexity, in addition to existing architectural choices and hyperparameter tuning. This is one of the reasons why real-world problems are still typically tackled with unimodal approaches. To bridge this gap, we introduce a multimodal Transformer model for early alarm prediction based on a combination of recent events and signal data. The model learns the optimal representation of data from multiple fusion strategies automatically. The model is validated on real-world industrial data. 
We show that our model is capable of predicting alarms with the given horizon and that the proposed multimodal fusion method yields state-of-the-art predictive performance while eliminating the need to choose among conventional fusion techniques, thus reducing tuning costs and training time.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"139 ","pages":"Article 109643"},"PeriodicalIF":7.5,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142747059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A large-scale group decision making model with a clustering algorithm based on a locality sensitive hash function","authors":"Zhangqian Mu , Yuanyuan Liu , Youlong Yang","doi":"10.1016/j.engappai.2024.109697","DOIUrl":"10.1016/j.engappai.2024.109697","url":null,"abstract":"<div><div>With the development of science and technology, an expanding array of decision-makers across various fields, including engineering and medicine, have been participating in collaborative decision-making for complex scenarios, such as earthquake relief and disease containment. The rapidly changing dynamics of real-world decision-making and the high complexity of consensus reaching among decision-makers require the development of more sophisticated models to handle these challenges. Considering the diversity and stability of group categories, this study proposes a large-scale group decision-making model based on a locality sensitive hash function. First, the volatility of attributes in real scenarios is considered, and a time-series decision matrix is constructed based on the average growth rates to make the results closer to reality. Then, hash functions are used to map the decision opinions to different dimensions and express the similarity through the Hamming distance, yielding clustering results with high stability and cohesion. To determine whether the decision-making group can reach a consensus, this study conducts hypothesis testing, adopting the idea of small probability counterfactuals to provide objective and fair standards for threshold judgment. 
Finally, a case study and comparative analysis show that the proposed method improves integrated cohesion and global consensus degree by 26.4% and 4.2%, respectively, with a better clustering effect.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"140 ","pages":"Article 109697"},"PeriodicalIF":7.5,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142744080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Solving dynamic multi-objective optimization problem of immersed tunnel elements via multi-source evolutionary information clustering method","authors":"Qinqin Fan , Wentao Huang , Moduo Yu , Qirong Tang , Qingchao Jiang","doi":"10.1016/j.engappai.2024.109741","DOIUrl":"10.1016/j.engappai.2024.109741","url":null,"abstract":"<div><div>Dynamic multi-objective optimization problems (DMOPs) are time- and space-varying, thus maintaining/improving the uncertainty degree of evolutionary information (i.e., information entropy) in the population and providing useful knowledge are two important tasks to make dynamic multi-objective evolutionary algorithms (DMOEAs) adapt to changing environments. To achieve the above objectives, a multi-source population clustering (MPC) method is proposed to assist DMOEAs in improving their tracking performance during the full-cycle optimization in the current study. In the MPC, three different information sources are used to provide diverse spatiotemporal evolutionary information, aiding DMOEAs in adapting to various changing environments. Subsequently, an enhanced spectral clustering approach is employed to group all evolutionary individuals from different information sources into many clusters/subspaces. Finally, the selected DMOEA is employed to search all subspaces in parallel via the high-performing computing method. The MPC is incorporated into a regularity model-based multi-objective estimation of distribution algorithm (called as MPC-RM-MEDA) and is compared with six famous DMOEAs on 14 10- and 30-dimensional DMOPs, which are proposed in IEEE Congress on Evolutionary computation 2018. Experimental results demonstrate that the overall tracking performance of the proposed MPC-RM-MEDA is significantly superior to that of other selected competitors in various dynamic environments. Additionally, the MPC-RM-MEDA is utilized to address a real-world DMOP involving an immersed tunnel element. 
The obtained results and comparison with the knee point-based transfer learning method verify that the MPC is an efficient and dependable approach for enhancing the tracking performance of other DMOEAs in solving actual DMOPs.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"140 ","pages":"Article 109741"},"PeriodicalIF":7.5,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142744077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lightweight advanced deep-learning models for stress detection on social media","authors":"Mohammed Qorich, Rajae El Ouazzani","doi":"10.1016/j.engappai.2024.109720","DOIUrl":"10.1016/j.engappai.2024.109720","url":null,"abstract":"<div><div>Nowadays, stress reveals itself as a ubiquitous presence, manifesting in novel forms in our modern daily life. Indeed, digital platforms and social media collect various impressions, reactions, and feelings that could provide valuable real-time sentiment data. Nevertheless, understanding stress and mental states among people is difficult because it relies on self-reporting and detecting related expressions, statements, and articulations. In this paper, we consider extracting nuanced insights and stress expressions from Reddit and Twitter posts using lightweight advanced deep-learning methods and Bidirectional Encoder Representations from Transformers (BERT) embeddings. Our findings highlight the potency of transformer BERT models, whether utilized as embedding feature extractors or as text sentiment classifiers. Moreover, the proposed lightweight deep architectural models promoted the field of stress detection in social media, achieving high classification performance. Practically, the BERT Electra model reached 85.67% accuracy on the small Reddit dataset, while our Convolutional Neural Network (CNN) model obtained 97.62% on the large Twitter dataset. 
Our contributions are not only restricted to the scientific understanding of stress but also extend to the well-being of individuals and global mental health.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"140 ","pages":"Article 109720"},"PeriodicalIF":7.5,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}