{"title":"Multi-view deep reciprocal nonnegative matrix factorization","authors":"","doi":"10.1016/j.engappai.2024.109508","DOIUrl":"10.1016/j.engappai.2024.109508","url":null,"abstract":"<div><div>Multi-view deep matrix factorization has recently gained popularity for extracting high-quality representations from multi-view data to improve the processing performance of multi-view data in pattern recognition, data mining, and machine learning. It explores the hierarchical semantics of data by performing a multi-layer decomposition on the representation matrices after decomposing the data into basis and representation matrices, but it ignores the basis matrices, which also contain valuable information about the data. Extracting high-quality bases during the deep representation learning process can facilitate the learning of high-quality representations for multi-view data. To this end, this paper proposes a novel deep nonnegative matrix factorization architecture, named <em><strong>M</strong>ulti-view <strong>D</strong>eep <strong>R</strong>eciprocal <strong>N</strong>onnegative <strong>M</strong>atrix <strong>F</strong>actorization</em> (<strong>MDRNMF</strong>), that collaborates with high-quality basis extraction, allowing the deep representation learning and high-quality basis extraction to promote each other. Based on the representations at the top layer, this paper adaptively learns the intrinsic local similarities of data within each view to capture the view-specific information. In addition, to explore the high-order data consistency across views, this paper introduces a Schatten <span><math><mi>p</mi></math></span>-norm-based low-rank regularization on the similarity tensor stacked from the view-specific similarity matrices. In this way, the proposed method can effectively explore and leverage the view-specific and consistent information of multi-view data simultaneously.
Finally, extensive experiments demonstrate the superiority of the proposed model over several state-of-the-art methods.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
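The layer-wise factorization described above can be sketched for a single view with scikit-learn's `NMF`: decompose the data matrix once, then re-factorize the resulting representation matrix. This is only a rough two-layer illustration with made-up sizes; the paper's reciprocal basis extraction, adaptive similarity learning, and Schatten p-norm tensor regularization are not reproduced here.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((60, 40))  # one nonnegative view: 60 features x 40 samples

# Layer 1: X ~ W1 @ H1 (basis and representation matrices)
nmf1 = NMF(n_components=20, init="nndsvda", max_iter=500, random_state=0)
W1 = nmf1.fit_transform(X)
H1 = nmf1.components_

# Layer 2: factorize the layer-1 representation again, H1 ~ W2 @ H2
nmf2 = NMF(n_components=8, init="nndsvda", max_iter=500, random_state=0)
W2 = nmf2.fit_transform(H1)
H2 = nmf2.components_  # top-layer representation (used for similarity learning in the paper)

# reconstruction through both layers
X_hat = W1 @ W2 @ H2
err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

In the actual MDRNMF model the two layers are optimized jointly with the basis-quality and tensor terms, not greedily layer by layer as in this sketch.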
{"title":"Enhancing particulate matter risk assessment with novel machine learning-driven toxicity threshold prediction","authors":"","doi":"10.1016/j.engappai.2024.109531","DOIUrl":"10.1016/j.engappai.2024.109531","url":null,"abstract":"<div><div>Airborne particulate matter (PM) poses significant health risks, necessitating accurate toxicity threshold determination for effective risk assessment. This study introduces a novel machine-learning (ML) approach to predict PM toxicity thresholds and identify the key physico-chemical and exposure characteristics. Five machine learning algorithms — logistic regression, support vector classifier, decision tree, random forest, and extreme gradient boosting — were employed to develop predictive models using a comprehensive dataset from existing studies. We developed models using the initial dataset and a class weight approach to address data imbalance. For the imbalanced data, the Random Forest classifier outperformed others with 87% accuracy, 81% recall, and the fewest false negatives (23). In the class weight approach, the Support Vector Classifier minimized false negatives (21), while the Random Forest model achieved superior overall performance with 86% accuracy, 80% recall, and an F1-score of 82%. Furthermore, eXplainable Artificial Intelligence (XAI) techniques, specifically SHAP (SHapley Additive exPlanations) values, were utilized to quantify feature contributions to predictions, offering insights beyond traditional laboratory approaches. This study represents the first application of machine learning for predicting PM toxicity thresholds, providing a robust tool for health risk assessment. The proposed methodology offers a time- and cost-effective alternative to classical laboratory tests, potentially revolutionizing PM toxicity threshold determination in scientific and epidemiological research. 
This innovative approach has significant implications for shaping regulatory policies and designing targeted interventions to mitigate health risks associated with airborne PM.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
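The class-weight strategy used to counter the imbalanced toxicity labels can be illustrated with scikit-learn's `class_weight="balanced"` option on a Random Forest. The data below is synthetic (a 10% minority class standing in for "toxic" samples), not the PM dataset from the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score, confusion_matrix

# synthetic stand-in for the PM data: 90% below-threshold, 10% toxic
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights classes inversely to their frequency,
# penalising missed minority (toxic) samples more heavily
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0)
clf.fit(X_tr, y_tr)

y_pred = clf.predict(X_te)
fn = confusion_matrix(y_te, y_pred)[1, 0]  # false negatives: toxic predicted safe
recall = recall_score(y_te, y_pred)        # the metric the study emphasises
```

False negatives are the costly error in this setting (a toxic sample passed as safe), which is why the study reports them alongside accuracy and recall.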
{"title":"A hybrid Convolutional Autoencoder training algorithm for unsupervised bearing health indicator construction","authors":"","doi":"10.1016/j.engappai.2024.109477","DOIUrl":"10.1016/j.engappai.2024.109477","url":null,"abstract":"<div><div>Conventional Deep Learning (DL) methods for bearing health indicator (HI) construction adopt supervised approaches, requiring expert knowledge of the component degradation trend. Since bearings experience various failure modes, assuming a particular degradation trend for HI is suboptimal. Unsupervised DL methods are scarce in this domain. They generally maximise the monotonicity of the HI built in the middle layer of an Autoencoder (AE) trained to reconstruct the run-to-failure signals. The backpropagation (BP) training algorithm is unable to perform this maximisation since the monotonicity of HI subsections corresponding to input sample batches does not guarantee the monotonicity of the whole HI. Therefore, existing methods achieve this by searching AE hyperparameters so that its BP training to minimise the reconstruction error also leads to a highly monotonic HI in its middle layer. This is done using expensive search algorithms where the AE is trained numerous times using various hyperparameter settings, rendering them impractical for large datasets. To address this limitation, a small Convolutional Autoencoder (CAE) architecture and a hybrid training algorithm combining Particle Swarm Optimisation and BP are proposed in this work to enable simultaneous maximisation of the HI monotonicity and minimisation of the reconstruction error. As a result, the HI is built by training the CAE only once. The results from three case studies demonstrate this method’s lower computational burden compared to other unsupervised DL methods.
Furthermore, the CAE-based HIs outperform the indicators built by equivalent and significantly larger models trained with a BP-based supervised approach, leading to 85% lower remaining useful life prediction errors.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
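The whole-HI monotonicity that batch-wise BP cannot optimise directly can be made concrete with a simple sign-based index over the full run-to-failure sequence. Both `monotonicity` and the combined objective below are illustrative stand-ins, not the exact formulation used in the paper.

```python
import numpy as np

def monotonicity(hi: np.ndarray) -> float:
    """Sign-based monotonicity index in [0, 1]; 1 for a strictly monotonic HI."""
    d = np.diff(hi)
    return abs((d > 0).sum() - (d < 0).sum()) / max(len(d), 1)

def hybrid_objective(recon_err: float, hi: np.ndarray, lam: float = 1.0) -> float:
    """Hypothetical combined loss: reconstruction error minus a monotonicity reward.

    A population-based optimiser such as PSO can minimise this over whole-sequence
    HIs, which batch-wise backpropagation cannot do on its own.
    """
    return recon_err - lam * monotonicity(hi)
```

Because the index is computed over the entire sequence, it only becomes available once all samples have passed through the network, which is why the paper pairs BP (for reconstruction) with a swarm optimiser (for monotonicity).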
{"title":"Multi-stage guided code generation for Large Language Models","authors":"","doi":"10.1016/j.engappai.2024.109491","DOIUrl":"10.1016/j.engappai.2024.109491","url":null,"abstract":"<div><div>Currently, although Large Language Models (LLMs) have shown significant performance in the field of code generation, their effectiveness in handling complex programming tasks remains limited. This is primarily due to the substantial distance between the problem description and the correct code, making it difficult to ensure accuracy when directly generating code. Human programmers, when faced with a complex programming problem, usually use multiple stages to solve it in order to reduce the difficulty of development. First, they analyze the problem and think about a solution plan, then they design a code architecture based on that plan, and finally they finish writing the detailed code. Based on this, we propose a multi-stage guided code generation strategy that aims to gradually shorten the transformation distance between the problem description and the correct code, thus improving the accuracy of code generation. Specifically, the approach consists of three stages: planning, design and implementation. In the planning phase, the Large Language Model (LLM) generates a solution plan based on the problem description; in the design phase, the code architecture is further designed based on the solution plan; and in the implementation phase, the previous solution plan and code architecture are utilized to guide the LLM in generating the final code. Additionally, we found that existing competition-level code generation benchmarks may overlap with the training data of the Chat Generative Pre-trained Transformer (ChatGPT), posing a risk of data leakage. To validate the above findings and circumvent this risk, we created a competition-level code generation dataset named CodeC, which contains data never used for training ChatGPT. 
Experimental results show that our method outperforms the most advanced baselines. On the CodeC dataset, our approach achieves a 34.7% relative improvement on the Pass@1 metric compared to the direct generation method of ChatGPT. We have published the relevant dataset at <span><span>https://github.com/hcode666/MSG</span><svg><path></path></svg></span> for further academic research and validation.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
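The three-stage strategy (plan, design, implement) amounts to chaining prompts, with each stage conditioned on the outputs of the previous ones. A minimal sketch follows, with a hypothetical `llm` callable standing in for whatever model API is actually used; the prompt wording is invented for illustration.

```python
def generate(llm, problem: str) -> str:
    """Multi-stage guided generation: plan -> architecture -> final code.

    `llm` is any callable mapping a prompt string to a completion string.
    """
    plan = llm(f"Analyse the problem and write a solution plan:\n{problem}")
    design = llm(f"Problem:\n{problem}\nPlan:\n{plan}\n"
                 f"Design the code architecture.")
    code = llm(f"Problem:\n{problem}\nPlan:\n{plan}\nArchitecture:\n{design}\n"
               f"Write the final code.")
    return code

# stub "LLM" for a dry run: echoes the instruction line of each prompt
stub = lambda prompt: prompt.splitlines()[-1]
result = generate(stub, "sum two integers")
```

Each stage shortens the remaining gap between problem description and code, which is the intuition the paper gives for the accuracy gains.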
{"title":"Crude oil price forecasting with multivariate selection, machine learning, and a nonlinear combination strategy","authors":"","doi":"10.1016/j.engappai.2024.109510","DOIUrl":"10.1016/j.engappai.2024.109510","url":null,"abstract":"<div><div>Crude oil price forecasting has been one of the research hotspots in the field of energy economics, which plays a crucial role in energy supply and economic development. However, numerous influencing factors bring serious challenges to crude oil price forecasting, and existing research has room for further improvement in terms of an integrated research roadmap that combines impact factor analysis with predictive modelling. This study aims to examine the impact of financial market factors on the crude oil market and to propose a nonlinear combined forecasting framework based on common variables. Four types of daily exogenous financial market variables are introduced: commodity prices, exchange rates, stock market indices, and macroeconomic indicators for ten indicators. First, various variable selection methods generate different variable subsets, providing more diversity and reliability. Next, common variables in the subset of variables are selected as key features for subsequent models. Then, four models predict crude oil prices using common features as inputs and obtain the prediction results for each model. Finally, the nonlinear mechanism of the deep learning technology is introduced to combine above single prediction results. Experimental results reveal that commodity and foreign exchange factors in financial markets are critical determinants of crude oil market volatility over the long term, as observed in experiments conducted on the West Texas Intermediate and Brent oil price datasets. The proposed model demonstrates strong performance regarding average absolute percentage error, recorded at 2.9962% and 2.4314%, respectively, indicating high forecasting accuracy and robustness. 
This forecasting framework offers an effective methodology for predicting crude oil prices and enhances understanding the crude oil market.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
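The final nonlinear combination step can be sketched by feeding single-model forecasts into a small neural network that learns how to merge them. The base learners, synthetic data, and error metric (mean absolute error here, rather than the percentage error reported in the paper) are simplified stand-ins.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((300, 4))  # stand-ins for the selected financial variables
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(300)

# two single forecasting models trained on the first 200 observations
m1 = LinearRegression().fit(X[:200], y[:200])
m2 = GradientBoostingRegressor(random_state=0).fit(X[:200], y[:200])
P = np.column_stack([m1.predict(X), m2.predict(X)])  # single-model forecasts

# nonlinear combiner: a small network merges the individual predictions
comb = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
comb.fit(P[:200], y[:200])
mae = np.mean(np.abs(comb.predict(P[200:]) - y[200:]))  # hold-out error
```

The point of the nonlinear combiner is that it can weight the member forecasts differently in different regions of the input space, which a fixed linear average cannot.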
{"title":"Data-driven drift detection and diagnosis framework for predictive maintenance of heterogeneous production processes: Application to a multiple tapping process","authors":"","doi":"10.1016/j.engappai.2024.109552","DOIUrl":"10.1016/j.engappai.2024.109552","url":null,"abstract":"<div><div>The rise of Industry 4.0 technologies has revolutionized industries, enabled seamless data access, and fostered data-driven methodologies for improving key production processes such as maintenance. Predictive maintenance has notably advanced by aligning decisions with real-time system degradation. However, data-driven approaches confront challenges such as data availability and complexity, particularly at the system level. Most approaches address component-level issues, but system complexity exacerbates problems. In the realm of predictive maintenance, this paper proposes a framework for addressing drift detection and diagnosis in heterogeneous manufacturing processes. The originality of the paper is twofold. First, this paper proposes algorithms for handling drift detection and diagnosing heterogeneous processes. Second, the proposed framework leverages several machine learning techniques (e.g., novelty detection, ensemble learning, and continuous learning) and algorithms (e.g., K-Nearest Neighbors, Support Vector Machine, Random Forest and Long-Short Term Memory) for enabling the concrete implementation and scalability of drift detection and diagnostics on industrial processes. The effectiveness of the proposed framework is validated through metrics such as accuracy, precision, recall, F1-score, and variance. Furthermore, this paper demonstrates the relevance of combining machine learning and deep learning algorithms in a production process of SEW USOCOME, a French manufacturer of electric gearmotors and a market leader. 
The results indicate a satisfactory level of accuracy in detecting and diagnosing drifts, and the adaptive learning loop effectively identifies new drift and nominal profiles, thereby validating the robustness of the framework in real industrial settings.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
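The drift-detection idea, training a novelty detector on nominal process data only and flagging departures from that profile, can be sketched with scikit-learn's `IsolationForest` (a stand-in for the detectors actually used in the framework) on synthetic data.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 1.0, size=(500, 3))  # features from healthy process cycles
drifted = rng.normal(4.0, 1.0, size=(50, 3))   # shifted operating profile

# novelty detector fitted on nominal behaviour only
det = IsolationForest(random_state=0).fit(nominal)
flags = det.predict(drifted)        # -1 = outlier, i.e. potential drift
drift_ratio = (flags == -1).mean()  # fraction of cycles flagged as drifting
```

In the full framework a diagnosis stage would then classify the flagged drift against known profiles, and the adaptive learning loop would add genuinely new profiles to the model.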
{"title":"Seafloor topography inversion from multi-source marine gravity data using multi-channel convolutional neural network","authors":"","doi":"10.1016/j.engappai.2024.109567","DOIUrl":"10.1016/j.engappai.2024.109567","url":null,"abstract":"<div><div>Seafloor topography is extremely important for marine scientific surveys and research. Current physical methods have difficulties in integrating multi-source marine gravity data and recovering non-linear features. To overcome this challenge, a multi-channel convolutional neural network (MCCNN) is employed to establish the seafloor topography model. Firstly, the MCCNN model is trained using the input data from the 64 × 64 grid points centered around the control points. The input data includes the differences in position between calculation points and surrounding grid points, gravity anomaly, vertical gravity gradient, east component of deflection of the vertical and north component of deflection of the vertical, as well as the reference terrain information. Then, the data from the 64 × 64 grid points centered around the predicted points is fed into the trained MCCNN model to obtain the predicted depth at those points. Finally, the predicted depth is utilized to establish the seafloor topography model of the study area. This method is tested in a local area located in the southern part of the Emperor Seamount Chain in the Northwest Pacific (31°N–37°N, 169°E–175°E). The root mean square of the differences between the resultant seafloor topography model and ship-borne bathymetric values at the check points is 88.48 m.
This performance is commendable compared to existing models.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
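The multi-channel input construction, a 64 × 64 window of co-registered gravity grids centred on each point, can be sketched as follows. Grid contents, the channel count, and sizes here are placeholders, and the CNN itself is omitted.

```python
import numpy as np

def extract_patch(grids: np.ndarray, row: int, col: int, size: int = 64) -> np.ndarray:
    """Cut a size x size multi-channel window centred on (row, col).

    `grids` has shape (channels, H, W); the channels stand in for gravity
    anomaly, vertical gravity gradient, east/north deflection of the vertical,
    and the reference terrain.
    """
    h = size // 2
    return grids[:, row - h:row + h, col - h:col + h]

# five placeholder grids over a 256 x 256 study area
channels = np.stack([np.random.default_rng(i).random((256, 256)) for i in range(5)])
patch = extract_patch(channels, 128, 128)  # training sample for one control point
```

At training time each patch is paired with the known (ship-borne) depth at its centre; at prediction time the same windows are cut around the points to be estimated.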
{"title":"Integrated metaheuristic approaches for estimation of fracture porosity derived from fullbore formation micro-imager logs: Reaping the benefits of stand-alone and ensemble machine learning models","authors":"","doi":"10.1016/j.engappai.2024.109545","DOIUrl":"10.1016/j.engappai.2024.109545","url":null,"abstract":"<div><div>Fracture porosity is one of the most effective parameters for reservoir productivity and recovery efficiency. This study aims to predict and improve the accuracy of fracture porosity estimation through the application of advanced machine learning (ML) algorithms. A novel approach was introduced for the first time to estimate fracture porosity by reaping the benefits of petrophysical and fullbore formation micro-imager (FMI) data based on employing various stand-alone, ensemble, optimisation and multi-variable linear regression (MVLR) algorithms. This study proposes a ground-breaking two-step committee machine (CM) model. Petrophysical data containing compressional sonic-log travel time, deep resistivity, neutron porosity and bulk density (inputs), along with FMI-derived fracture porosity values (outputs), were employed. Nine stand-alone ML algorithms, including back-propagation neural network, Takagi and Sugeno fuzzy system, adaptive neuro-fuzzy inference system, decision tree, radial basis function, extreme gradient boosting, least-squares boosting, least squares support vector regression and k-nearest neighbours, were trained for initial estimation. To improve the efficacy of stand-alone algorithms, their outputs were combined in CM structures using optimisation algorithms. This integration was applied through five optimisation algorithms, including genetic algorithm, ant colony, particle swarm, covariance matrix adaptation evolution strategy (CMA-ES) and Coyote optimisation algorithm. Considering the lowest error, the CM with CMA-ES showed superior performance. Subsequently, MVLR was employed to improve the CMs further. 
Employing MVLR to combine the CMs yielded a 57.85% decline in mean squared error and a 4.502% improvement in the correlation coefficient compared to the stand-alone algorithms. The results of the benchmark analysis validated the efficacy of this approach.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
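The final MVLR step, combining committee-member outputs with least-squares weights, can be sketched with NumPy. The member predictions below are synthetic stand-ins for the outputs of the optimised committee machines.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.random(200)  # stand-in for FMI-derived fracture porosity

# outputs of three stand-alone learners: the target plus different noise levels
preds = np.column_stack([y + s * rng.standard_normal(200)
                         for s in (0.05, 0.10, 0.20)])

# MVLR committee: least-squares weights (plus intercept) over member outputs
A = np.column_stack([preds, np.ones(len(y))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
committee = A @ w

mse_best_single = min(np.mean((preds[:, i] - y) ** 2) for i in range(3))
mse_committee = np.mean((committee - y) ** 2)
```

On the data it is fitted to, the least-squares committee can never do worse than its best member, since assigning that member a weight of one and all others zero is itself an admissible solution; the improvement the paper reports comes from exploiting the members' complementary errors.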
{"title":"A graph convolutional neural network model based on fused multi-subgraph as input and fused feature information as output","authors":"","doi":"10.1016/j.engappai.2024.109542","DOIUrl":"10.1016/j.engappai.2024.109542","url":null,"abstract":"<div><div>The graph convolution neural network (GCN)-based node classification model tackles the challenge of classifying nodes in graph data through learned feature representations. However, most existing graph neural networks primarily focus on the same type of edges, which might not accurately reflect the intricate real-world graph structure. This paper introduces a novel graph neural network model, MF-GCN, which integrates subgraphs with various edge types as input and combines feature information from each graph convolutional neural network layer to produce the final output. This model learns node feature representations by separately feeding subgraphs with different edge types into the graph convolutional layer. It then computes the weight vectors for fusing various edge type subgraphs based on the learned node features. Additionally, to efficiently extract feature information, the outputs of each graph convolution layer, without an activation function, are weighted and summed to obtain the final node features. This approach resolves the challenges of determining fusion weights and effectively extracting feature information during subgraph fusion. 
Experimental results show that the proposed model significantly improves performance on all three datasets, highlighting its effectiveness in node representation learning tasks.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
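The core fusion idea, propagating node features through each edge-type subgraph separately and then merging the results with weights, can be sketched with NumPy. In MF-GCN the fusion weights are computed from the learned node features; here they are fixed constants for illustration, and the graphs are random.

```python
import numpy as np

def norm_adj(A: np.ndarray) -> np.ndarray:
    """Symmetrically normalised adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A = A + np.eye(len(A))
    d = A.sum(axis=1)
    D = np.diag(1.0 / np.sqrt(d))
    return D @ A @ D

rng = np.random.default_rng(0)
X = rng.random((6, 4))  # node features

# two subgraphs over the same nodes, one per edge type (symmetrised)
A1 = (rng.random((6, 6)) > 0.5).astype(float)
A2 = (rng.random((6, 6)) > 0.5).astype(float)
A1, A2 = np.maximum(A1, A1.T), np.maximum(A2, A2.T)

# one propagation step per subgraph (linear part of a GCN layer)
H1, H2 = norm_adj(A1) @ X, norm_adj(A2) @ X

# weighted fusion of the per-edge-type representations
alpha = np.array([0.6, 0.4])  # learned from node features in MF-GCN; fixed here
H = alpha[0] * H1 + alpha[1] * H2
```

The model's second ingredient, summing the (pre-activation) outputs of all convolution layers into the final representation, would apply the same weighted-sum pattern across layers rather than across subgraphs.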
{"title":"Power transformer health index and life span assessment: A comprehensive review of conventional and machine learning based approaches","authors":"","doi":"10.1016/j.engappai.2024.109474","DOIUrl":"10.1016/j.engappai.2024.109474","url":null,"abstract":"<div><div>Power transformers play a critical role within the electrical power system, making their health assessment and the prediction of their remaining lifespan paramount for ensuring efficient operation and facilitating effective maintenance planning. This paper undertakes a comprehensive examination of the existing literature, with a primary focus on both conventional and cutting-edge techniques employed within this domain. The merits and demerits of recent methodologies and techniques are subjected to meticulous scrutiny and explication. Furthermore, this paper expounds upon intelligent fault diagnosis methodologies and delves into the most widely utilized intelligent algorithms for the assessment of transformer conditions. Diverse Artificial Intelligence (AI) approaches, including Artificial Neural Networks (ANN) and Convolutional Neural Network (CNN), Support Vector Machine (SVM), Random Forest (RF), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO), are elucidated, offering pragmatic solutions for enhancing the performance of transformer fault diagnosis. The amalgamation of multiple AI methodologies and the exploration of time-series analysis further contribute to the augmentation of diagnostic precision and the early detection of faults in transformers.
By furnishing a comprehensive panorama of AI applications in the field of transformer fault diagnosis, this study lays the groundwork for future research endeavors and the progression of this critical area of study.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142539823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}