Idrees A. Zahid , Shahad Sabbar Joudar , A.S. Albahri , O.S. Albahri , A.H. Alamoodi , Jose Santamaría , Laith Alzubaidi
{"title":"Unmasking large language models by means of OpenAI GPT-4 and Google AI: A deep instruction-based analysis","authors":"Idrees A. Zahid , Shahad Sabbar Joudar , A.S. Albahri , O.S. Albahri , A.H. Alamoodi , Jose Santamaría , Laith Alzubaidi","doi":"10.1016/j.iswa.2024.200431","DOIUrl":"10.1016/j.iswa.2024.200431","url":null,"abstract":"<div><p>Large Language Models (LLMs) have become a hot topic in AI due to their ability to mimic human conversation. This study compares the OpenAI Generative Pre-trained Transformer 4 (GPT-4) model, based on the GPT architecture, with Google's artificial intelligence (AI), which is based on the Bidirectional Encoder Representations from Transformers (BERT) framework, in terms of the defined capabilities and the built-in architecture. Both LLMs are prominent in AI applications. First, eight different capabilities were identified to evaluate these models, i.e. translation accuracy, text generation, factuality, creativity, intellect, deception avoidance, sentiment classification, and sarcasm detection. Next, each capability was assessed using instructions. Additionally, a categorized LLM evaluation system has been developed using ten research questions per category, based on this paper's main contributions from a prompt engineering perspective. GPT-4 and Google AI successfully answered 85 % and 68.7 % of the study prompts, respectively. GPT-4 understands prompts better than Google AI, even with verbal flaws, and tolerates grammatical errors. Moreover, the GPT-4 based approach was more precise, accurate, and succinct than Google AI, which was sometimes verbose and less realistic. While GPT-4 beats Google AI in terms of translation accuracy, text generation, factuality, intellectuality, creativity, and deception avoidance, Google AI outperforms GPT-4 in sarcasm detection. Both models performed sentiment classification properly. 
More importantly, a human panel of judges was used to assess and evaluate the model comparisons. Statistical analysis of the judges' ratings revealed more robust results based on examining the specific uses, limitations, and expectations of both GPT-4 and Google AI-based approaches. Finally, the two approaches' transformers, parameter sizes, and attention mechanisms have been examined.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200431"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324001054/pdfft?md5=b6c9fa39bd05b579aebb48986c20b9ec&pid=1-s2.0-S2667305324001054-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-scale transformer network for super-resolution of visible and thermal air images","authors":"Hèdi Fkih , Abdelaziz Kallel , Zied Chtourou","doi":"10.1016/j.iswa.2024.200429","DOIUrl":"10.1016/j.iswa.2024.200429","url":null,"abstract":"<div><p>Reference image-based Super-Resolution (RefSR) is introduced to improve the quality of a Low-resolution (LR) input image by leveraging the additional information provided by a High-Resolution (HR) reference image (Ref). While existing RefSR methods focus on thermal or visible flows separately, they often struggle to enhance the resolution of small objects such as Mini/Micro UAVs (Unmanned Aerial Vehicle) due to the resolution disparities between the input and reference images. To cope with these challenges when dealing with UAV early detection in context of video surveillance, we propose ThermoVisSR, a multiscale texture transformer for enhancing the Super-Resolution (SR) of visible and thermal images of Mini/Micro UAVs. Our approach tries to reconstruct the fine details of these objects while preserving their approximation (the body form and color of the different scene objects) already contained in the LR image. Hence, our model is divided up into two streams dealing separately with approximation and detail reconstruction. In the first one, we introduce a Convolution Neural Network (CNN) fusion backbone to extract the Low-Frequency (LF) approximation from the original LR image pairs. In the second one and to extract the details from the Ref image, our approach involves blending features from both visible and thermal sources to make the most of what each offer. Subsequently, we introduce the High-Frequency Texture Transformer (HFTT) across various resolutions of the merged features to ensure an accurate correspondence matching and significant transfer of High-Frequency (HF) patches from Ref to LR images. 
Moreover, to adapt the injection to the different bands well, we incorporate the separable software decoder (SSD) into the HFTT allowing to capture channel-specific details during the reconstruction phase. We validated our approach using a newly created dataset of Air images of Mini/Micro UAVs. Experimental results demonstrate that the proposed model consistently outperforms the state-of-the-art approaches on both qualitative and quantitative assessments.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200429"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324001030/pdfft?md5=708b3e9003aec9aee059364d6ad6c586&pid=1-s2.0-S2667305324001030-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing production efficiency in semiconductor enterprises by an improve and optimized biogeographical optimization algorithm based on three-layer coding","authors":"Jiaqi Liu","doi":"10.1016/j.iswa.2024.200432","DOIUrl":"10.1016/j.iswa.2024.200432","url":null,"abstract":"<div><p>To address low production efficiency and the inability to fully unleash production capacity in current semiconductor enterprises, a production efficiency optimization model for semiconductor enterprises has been studied and constructed. This model transforms the problem of low production efficiency into a problem of locating and solving the decoupling point of enterprise customer orders, and comprehensively considers sudden changes in enterprise production orders when locating and solving that decoupling point. The study proposes a three-layer coding (TLC) mechanism to improve and optimize the biogeographical optimization algorithm, and uses the improved biogeographical optimization (IOBO) algorithm to solve the production efficiency optimization problem of semiconductor enterprises. The results show that the proposed IBBO-TCL algorithm converges quickly and has the minimum root mean square error after convergence. The method can also accurately solve the decoupling point of customer orders for semiconductor enterprises. 
The method proposed in the study has effectively improved the production efficiency of semiconductor enterprises and has guiding significance for optimizing enterprise structure.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"24 ","pages":"Article 200432"},"PeriodicalIF":0.0,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324001066/pdfft?md5=e9ab62490d3350f471d60b1cd26f905c&pid=1-s2.0-S2667305324001066-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142164057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Harol Mauricio Gámez-Albán, Ruben Guisson, Annelies De Meyer
{"title":"Optimizing the organization of the first mile in agri-food supply chains with a heterogeneous fleet using a mixed-integer linear model","authors":"Harol Mauricio Gámez-Albán, Ruben Guisson, Annelies De Meyer","doi":"10.1016/j.iswa.2024.200426","DOIUrl":"10.1016/j.iswa.2024.200426","url":null,"abstract":"<div><p>Consumers are increasingly demanding high-quality food, which presents significant challenges for agricultural supply chains. While the majority of research in the agri-food sector has concentrated on optimizing logistics costs and meeting demand by focusing on minimizing the last mile, the complexity of the first mile in the agricultural supply chain has been less explored. Farmers must efficiently manage the harvesting process and the transportation of harvested produce to consolidation centers to ensure the delivery of high-quality products. This paper addresses this research gap by introducing a mixed-integer programming model that leverages vehicle routing problem concepts to optimize the logistics processes involved in transporting harvested products from various fields to a central depot. The primary objective is to minimize total logistics costs associated with visiting different fields during a pick-up round using multiple vehicles. The model has been applied to a case study involving an agricultural cooperative in Greece as part of the European BBTWINS project, which aims to enhance agri-food value chain digitalization for improved resource efficiency. 
The results demonstrate that organizing the first mile of the agri-food supply chain with a cooled vehicle for pick-up rounds can reduce logistics costs by up to 40% compared to the current practices.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200426"},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324001005/pdfft?md5=e0c6a4a25c889be39eb6364bf1524305&pid=1-s2.0-S2667305324001005-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142084379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating explainable machine learning and user-centric model for diagnosing cardiovascular disease: A novel approach","authors":"Gangani Dharmarathne , Madhusha Bogahawaththa , Upaka Rathnayake , D.P.P. Meddage","doi":"10.1016/j.iswa.2024.200428","DOIUrl":"10.1016/j.iswa.2024.200428","url":null,"abstract":"<div><p>Conventional machine learning techniques in diagnosing cardiovascular disease have a limitation owing to the lack of interpretability of models. This study utilised an explainable machine learning approach to predict the likelihood of having CVD. Four machine learning models were employed for CVD diagnosis: Decision Tree (DT), K-Nearest Neighbor (KNN), Random Forest (RF), and Extreme Gradient Boost (XGB). Shapley Additive Explanations (SHAP) were used to provide reasoning for the models' predictions. Using these models and explanations, a user interface was developed to assist in diagnosing CVD. All four classification models demonstrated good accuracy in diagnosing CVD, with the KNN model showcasing the best performance (Accuracy: 71 %). 
SHAP provided the reasoning behind KNN predictions, and the predictive interface was developed by embedding these explanations to provide transparency behind the model's decisions.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200428"},"PeriodicalIF":0.0,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324001029/pdfft?md5=40d5d256c670f94ade9890d52463a6a1&pid=1-s2.0-S2667305324001029-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention mechanism enhanced LSTM networks for latency prediction in deterministic MEC networks","authors":"Zhonglu Zou, Xin Yan, Yongshi Yuan, Zilin You, Liming Chen","doi":"10.1016/j.iswa.2024.200425","DOIUrl":"10.1016/j.iswa.2024.200425","url":null,"abstract":"<div><p>In deterministic mobile edge computing (MEC) networks, accurately predicting latency is critical for optimizing resource allocation and enhancing quality of service (QoS). This paper introduces a novel approach leveraging attention mechanism enhanced long short-term memory (LSTM) networks to predict latency in MEC networks. The proposed model integrates attention mechanisms into LSTM networks to capture temporal dependencies and emphasize relevant features in the input data, thereby improving the prediction accuracy. Extensive experiments are conducted using practical MEC network data, demonstrating that the proposed approach significantly outperforms traditional LSTM and other baseline models in terms of prediction accuracy and computational efficiency. Additionally, we analyze the impact of various configurations in the attention mechanism and LSTM on the model performance, providing insights into the optimal settings. 
The findings of this study contribute to the advancement of latency prediction techniques in deterministic MEC networks, facilitating more efficient and reliable network management.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200425"},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324000991/pdfft?md5=ee47f3714c07656cbf13489f3b8c15dd&pid=1-s2.0-S2667305324000991-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142084378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-source heterogeneous data fusion method for intelligent systems in the Internet of Things","authors":"Rongrong Sun , Yuemei Ren","doi":"10.1016/j.iswa.2024.200424","DOIUrl":"10.1016/j.iswa.2024.200424","url":null,"abstract":"<div><p>The advent of the Internet of Things (IoT) has revolutionized the field of intelligent system development by providing an extensive amount of data from IoT devices, essential for the management of these systems and the creation of innovative services. This data covers various aspects, including creation at the physical layer, transmission through the network layer, and processing within the application layer. This study presents a groundbreaking approach to amalgamating multi-source and varied data within intelligent systems leveraging IoT technology. Our approach seeks to optimize the integration of diverse datasets by examining the correlations between different data types using a novel mixed information gain strategy, leading to effective data fusion. It capitalizes on the computational and storage capacities of systems for seamless integration and augments the analysis of information, thereby improving the useability of data in intelligent systems. 
Simulation tests confirm the superiority of our method, demonstrating a remarkable improvement in performance in the fusion of dynamic, multi-source heterogeneous data compared to conventional techniques.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200424"},"PeriodicalIF":0.0,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S266730532400098X/pdfft?md5=2b7b7b15f5cc697370e951edb65b1983&pid=1-s2.0-S266730532400098X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142002118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Md Al Amin Sarker, Bharanidharan Shanmugam, Sami Azam, Suresh Thennadil
{"title":"Enhancing smart grid load forecasting: An attention-based deep learning model integrated with federated learning and XAI for security and interpretability","authors":"Md Al Amin Sarker, Bharanidharan Shanmugam, Sami Azam, Suresh Thennadil","doi":"10.1016/j.iswa.2024.200422","DOIUrl":"10.1016/j.iswa.2024.200422","url":null,"abstract":"<div><p>Smart grid is a transformative advancement that modernized the traditional power system for effective electricity management, and involves optimized energy distribution by load forecasting. Precise load forecasting provides the best utilization of energy resources and increases sustainability. Dynamic changes of several connected factors, such as temporal and geographical variability, pose challenges to accurate load prediction. Integrating Artificial Intelligence (AI) in the smart grid can enhance the performance of the forecasting process by capturing these changes. This study investigated load forecasting tasks on four different datasets. Several preprocessing and augmentation techniques are applied to increase the data quality. An attention-based 1D-CNN-GRU model is proposed to capture the temporal patterns from the time-series data, and the hyperparameters of the model are optimized using a particle swarm optimization (PSO) algorithm that also accelerates the convergence and results in an efficient training session. Empirical evaluations highlight that the proposed model substantially minimizes the loss, reflecting the ability to make accurate predictions. It obtains MAE values of 0.12, 0.8, 16.48, and 82.64 for the four datasets. Moreover, the explainable AI (XAI) technique is applied using Shapley Additive explanations (SHAP) to interpret the model prediction, providing the feature ranking based on their prediction score. Moreover, this study utilizes federated learning, enables collaborative training, maintains the privacy of the grid data, and secures the process comprehensively. 
The aggregation mechanism in federated learning is modified using pruning-based methods that reduce the parameters and computational cost, resulting in a more efficient framework. Integrating all these approaches provides valuable insights for developing a load forecasting model and outlines potential contributions in the smart grid domain.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200422"},"PeriodicalIF":0.0,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324000966/pdfft?md5=6bef1c0253b216dd874359b6617d6b66&pid=1-s2.0-S2667305324000966-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141962031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Juan G. Colonna , Ahmed A. Fares , Márcio Duarte , Ricardo Sousa
{"title":"Process mining embeddings: Learning vector representations for Petri nets","authors":"Juan G. Colonna , Ahmed A. Fares , Márcio Duarte , Ricardo Sousa","doi":"10.1016/j.iswa.2024.200423","DOIUrl":"10.1016/j.iswa.2024.200423","url":null,"abstract":"<div><p>Process Mining offers a powerful framework for uncovering, analyzing, and optimizing real-world business processes. Petri nets provide a versatile means of modeling process behavior. However, traditional methods often struggle to effectively compare complex Petri nets, hindering their potential for process enhancement. To address this challenge, we introduce PetriNet2Vec, an unsupervised methodology inspired by Doc2Vec. This approach converts Petri nets into embedding vectors, facilitating the comparison, clustering, and classification of process models. We validated our approach using the PDC Dataset, comprising 96 diverse Petri net models. The results demonstrate that PetriNet2Vec effectively captures the structural properties of process models, enabling accurate process classification and efficient process retrieval. Specifically, our findings highlight the utility of the learned embeddings in two key downstream tasks: process classification and process retrieval. In process classification, the embeddings allowed for accurate categorization of process models based on their structural properties. In process retrieval, the embeddings enabled efficient retrieval of similar process models using cosine distance. 
These results demonstrate the potential of PetriNet2Vec to significantly enhance process mining capabilities.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200423"},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324000978/pdfft?md5=c5510ffdcb881b0b3214984ffc9b29e1&pid=1-s2.0-S2667305324000978-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141962773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ali Albada , Muataz Salam Al-Daweri , Rabie A. Ramadan , Khalid Al. Qatiti , Li Haoyang , Peng Shutong
{"title":"Determinates of investor opinion gap around IPOs: A machine learning approach","authors":"Ali Albada , Muataz Salam Al-Daweri , Rabie A. Ramadan , Khalid Al. Qatiti , Li Haoyang , Peng Shutong","doi":"10.1016/j.iswa.2024.200420","DOIUrl":"10.1016/j.iswa.2024.200420","url":null,"abstract":"<div><p>The current study examines the factors influencing investor opinions on issues related to listed firms during the first day of Initial Public Offerings (IPOs), focusing on a sample of 350 fixed-priced IPOs listed on the Malaysian stock exchange (Bursa Malaysia) from 2004 to 2021. This research contributes to existing literature by employing various machine learning methods, which address the limitations of traditional linear regression models commonly used in previous studies. Specifically, five methods—extra tree regressor (ETR), single feature selection (SFS), reverse single feature (RSF), recursive feature elimination (RFE), and sequential modelling feature adding (SMFA)—are utilized to assess the importance of features in predicting the investor opinion gap within the dataset.</p><p>The study's experiments indicate that these methods effectively mitigate noisy data, enhancing their reliability for this type of analysis. 
The findings provide valuable insights for regulators regarding safeguarding investors' rights to information disclosed in prospectuses.</p></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"23 ","pages":"Article 200420"},"PeriodicalIF":0.0,"publicationDate":"2024-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667305324000942/pdfft?md5=f261c79633047d7bfb957563e6c75844&pid=1-s2.0-S2667305324000942-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141846813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}