{"title":"GIRCS: An effective evolutionary scheme to solve SBM-RCPSP scheduling problems for industrial production planning","authors":"Loc Nguyen The , Huu Dang Quoc , Hao Nguyen Thi","doi":"10.1016/j.iswa.2025.200522","DOIUrl":"10.1016/j.iswa.2025.200522","url":null,"abstract":"<div><div>Resource Constrained Project Scheduling Problem (RCPSP) is a fundamental scheduling problem that has attracted much attention from researchers for many years. Many variants of this problem have been modeled, and many different approaches have been proposed and published in journals. However, the classical mathematical models of RCPSP still have limitations that make them unsuitable for direct application in practical projects. This paper introduces practical applications and classifications of the RCPSP problem. After describing some common extensions of the original RCPSP problem, we briefly introduce three approaches that have been used to solve those extensions, including exact, heuristic, and metaheuristic algorithms. We define a novel scheduling problem named SBM-RCPSP (Skill-Based Makespan-RCPSP) which overcomes the limitations of previous variants of the RCPSP problem. The Graham representation of the SBM-RCPSP problem is introduced, and then the problem is proven to be NP-Hard. To solve the SBM-RCPSP problem, we propose an evolutionary algorithm called GIRCS inspired by Cuckoo Search and improved to reduce the total project execution time. Experimental results demonstrate that the proposed scheme finds more efficient schedules than previous solutions.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200522"},"PeriodicalIF":0.0,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143894997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced AI techniques for root disease classification in dental X-rays using deep learning and metaheuristic approach","authors":"Prem Enkvetchakul , Surajet Khonjun , Rapeepan Pitakaso , Thanatkij Srichok , Peerawat Luesak , Chutchai Kaewta , Sarayut Gonwirat , Chawis Boonmee , Matus Noowattana , Thitinon Srisuwandee","doi":"10.1016/j.iswa.2025.200526","DOIUrl":"10.1016/j.iswa.2025.200526","url":null,"abstract":"<div><div>Root dental diseases remain among the most diagnostically challenging conditions in oral healthcare, often leading to treatment delays and suboptimal outcomes. This study is motivated by the limitations of existing automated diagnostic systems, which tend to focus on superficial abnormalities and overlook complex root pathologies such as pulpal infections, periapical lesions, and progressive periodontitis. To bridge this critical gap, we propose an advanced AI-based classification model that integrates ensemble deep learning architectures with a hybrid metaheuristic optimization strategy: the non-population-based Artificial Multiple Intelligence System (np-AMIS) for image augmentation and the population-based AMIS (pop-AMIS) for adaptive decision fusion. This dual-phase approach enhances feature diversity, classification robustness, and computational efficiency. The model was trained and validated on two proprietary datasets, TD-1 and TD-2, achieving classification accuracies of 98.87 % and 98.41 %, respectively. It was further implemented in a real-world application via the Automated Teeth Disease and Abnormality Classification System (A-TD-A-CS), demonstrating 98.95 % accuracy, a rapid response time of 1.5 s, and a System Usability Scale (SUS) score of 94.5 from dental professionals. The system's ability to accurately identify multiple root disease categories highlights its clinical viability and transformative potential. In addition to its current performance, this study lays the groundwork for future extensions to multi-center datasets and cross-modality diagnostics using cone-beam CT or intraoral scans, further advancing intelligent dental care.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200526"},"PeriodicalIF":0.0,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OPT-IQA: Automated camera parameters tuning framework with IQA-guided optimization","authors":"Jan-Henner Roberg, Vladyslav Mosiichuk, Ricardo Silva, Luís Rosado","doi":"10.1016/j.iswa.2025.200520","DOIUrl":"10.1016/j.iswa.2025.200520","url":null,"abstract":"<div><div>In industrial visual inspection, computer vision-based AI systems play a pivotal role, with performance dependent on the quality of the acquired images and changes in environmental conditions. Modern cameras adapt to these varying environments by allowing the tuning of a wide range of camera parameters that significantly change the characteristics of the acquired images. While some parameters are already automatically adjusted in most cameras (e.g., exposure, focus, white balance), others are static and remain at their default values (e.g., brightness, contrast, color-saturation, sharpness). Adaptively adjusting these non-automatic (NAUTO) parameters significantly influences both image quality and the performance of automated visual inspection systems. This work introduces OPT-IQA, a novel framework to automate NAUTO parameter tuning. The proposed approach is based on an optimization process guided by Image Quality Assessment (IQA) metrics that measure human-understandable image quality characteristics, thus enhancing the interpretability of the parameters’ selection process. The framework is built modularly, including a Camera Abstraction Layer to ensure its camera-agnostic nature and a Region-of-Interest Selection Module to select the target region of the inspected object. It also facilitates the seamless integration of supplementary IQA metrics and optimization algorithms to support additional use cases. By using an IQA-guided optimization process based on a reference image, our results show that OPT-IQA alleviates the burden of manually adjusting NAUTO parameters in response to varying illumination conditions, whether caused by shifts in natural elements (e.g., weather) or human-induced changes (e.g., reconfiguration of assembly lines).</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200520"},"PeriodicalIF":0.0,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving speaker-independent visual language identification using deep neural networks with training batch augmentation","authors":"Jacob L. Newman","doi":"10.1016/j.iswa.2025.200517","DOIUrl":"10.1016/j.iswa.2025.200517","url":null,"abstract":"<div><div>Visual Language Identification (VLID) is concerned with using the appearance and movement of the mouth to determine the identity of spoken language. VLID has applications where conventional audio based approaches are ineffective due to acoustic noise, or where an audio signal is unavailable, such as remote surveillance. The main challenge associated with VLID is the speaker-dependency of image based visual recognition features, which bear little meaningful correspondence between speakers.</div><div>In this work, we examine a novel VLID task using video of 53 individuals reciting the Universal Declaration of Human Rights in their native languages of Arabic, English or Mandarin. We describe a speaker-independent, five fold cross validation experiment, where the task is to discriminate the language spoken in 10 s videos of the mouth. We use the YOLO object detection algorithm to track the mouth through time, and we employ an ensemble of 3D Convolutional and Recurrent Neural Networks for this classification task. We describe a novel approach to the construction of training batches, in which samples are duplicated, then reversed in time to form a <em>distractor</em> class. This method encourages the neural networks to learn the discriminative temporal features of language rather than the identity of individual speakers.</div><div>The maximum accuracy obtained across all three language experiments was 84.64%, demonstrating that the system can distinguish languages to a good degree, from just 10 s of visual speech. A 7.77% improvement in classification accuracy was obtained using our distractor class approach compared to normal batch selection. The use of ensemble classification consistently outperformed the results of individual networks, increasing accuracies by up to 7.27%. In a two language experiment intended to provide a comparison with our previous work, we observed an absolute improvement in classification accuracy of 3.6% (90.01% compared to 83.57%).</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200517"},"PeriodicalIF":0.0,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Meyer wavelet neural networks procedure for prediction, pantograph and delayed singular models","authors":"Zulqurnain Sabir , Hafiz Abdul Wahab , Mohamed R. Ali , Shahid Ahmad Bhat","doi":"10.1016/j.iswa.2024.200457","DOIUrl":"10.1016/j.iswa.2024.200457","url":null,"abstract":"<div><div>This work addresses the numerical solution of the nonlinear form of prediction, pantograph, and delayed differential singular models (NPPD-DSMs) by exploiting Meyer wavelet neural networks (MWNNs). The optimization is accomplished using the local and global search paradigms of the active-set approach (ASA) and genetic algorithm (GA), i.e., MWNNs-GA-ASA. An objective function is designed using the NPPD-DSMs and the corresponding boundary conditions, which is optimized through the GA-ASA paradigms. The obtained numerical outcomes of the NPPD-DSMs are compared with the true results to verify the correctness of the designed MWNNs-GA-ASA. The absolute error for solving the NPPD-DSMs is negligibly small, as shown in the plots, demonstrating the stability and effectiveness of the MWNNs-GA-ASA. To confirm the reliability of the procedure, performance measures based on different statistical operators are presented over multiple trials for the NPPD-DSMs.</div><div>Mathematics Subject Classification. Primary 68T07; Secondary 03D15, 90C60.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200457"},"PeriodicalIF":0.0,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143855189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GAN-ViT-CMFD: A novel framework integrating generative adversarial networks and vision transformers for enhanced copy-move forgery detection and classification with spectral clustering","authors":"Jyothsna Ravula, Nilu Singh","doi":"10.1016/j.iswa.2025.200524","DOIUrl":"10.1016/j.iswa.2025.200524","url":null,"abstract":"<div><div>Copy-move forgery detection (CMFD) is a critical task in digital forensics to ensure the authenticity of visual content, as the prevalence of advanced editing tools has made it increasingly easy to tamper with images. Such forgeries can have severe implications in fields like journalism, legal evidence, and cybersecurity. The motivation for adopting a hybrid Generative Adversarial Network (GAN)-Vision Transformer (ViT) approach arises from the need for robust models capable of handling the complexities of forgery patterns while ensuring high detection accuracy. This study proposes a hybrid framework, GAN-ViT-CMFD, integrating GANs and ViTs to address these challenges. GANs are employed to generate realistic forged images, creating an augmented dataset that enhances model robustness. ViTs extract powerful feature representations, leveraging their ability to capture long-range dependencies and intricate patterns in image data. Spectral clustering is then applied to the feature space to segregate forged and original image features, which are subsequently fed into a Convolutional Neural Network (CNN)-based classifier for forgery detection and classification.</div><div>The proposed model demonstrates superior performance, achieving a training accuracy of 99.62 % and a validation accuracy of 99.0 %, with training and validation losses of 0.0352 and 0.0269, respectively. Evaluation metrics further affirm its effectiveness, with an accuracy of 99.02 %, precision of 97.92 %, recall of 99.89 %, and F1-score of 98.95 %. Additionally, the model achieves an exceptional ROC-AUC score of 99.88 %. These outcomes demonstrate the effectiveness of the GAN-ViT approach for CMFD, highlighting its potential to reinforce the reliability of image authenticity verification across various domains.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200524"},"PeriodicalIF":0.0,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-enhanced LSTM for high-value customer behavior prediction: Insights from Thailand’s E-commerce sector","authors":"Rattapol Kasemrat, Tanpat Kraiwanit","doi":"10.1016/j.iswa.2025.200523","DOIUrl":"10.1016/j.iswa.2025.200523","url":null,"abstract":"<div><div>The rapid growth of e-commerce in emerging markets like Thailand has presented businesses with both opportunities and challenges. One critical challenge lies in accurately identifying high-value customers amidst vast amounts of transactional data. Effective predictive models must not only deliver high accuracy but also provide transparency to guide actionable business decisions. Predicting high-value customers is particularly important in these markets due to evolving consumer behaviors and increasing competition.</div><div>This study introduces an attention-enhanced Long Short-Term Memory (LSTM) model to predict high-value customer behavior in Thailand's e-commerce sector, addressing the challenges of achieving high predictive accuracy while ensuring interpretability. The novelty of this research lies in integrating an attention mechanism within the LSTM framework, enabling the identification of key customer behaviors—such as total purchase amount, purchase frequency, and monthly purchase frequency—that significantly influence high-value customer classification. By leveraging transactional data from a leading Thai e-commerce platform, the model delivers outstanding predictive performance with accuracy rates of 99.75 % (training), 99.77 % (validation), and 99.83 % (testing), coupled with low error metrics (RMSE: 0.0391, MAE: 0.0039).</div><div>The attention mechanism enhances model transparency by identifying influential behavioral features, thereby enabling actionable insights that align with customer segmentation and targeted marketing strategies. Compared to traditional LSTM models, this approach demonstrates superior predictive power and interpretability, making it an effective tool for e-commerce platforms seeking to optimize customer retention and engagement strategies.</div><div>This study significantly contributes to advancing machine learning applications in e-commerce by showcasing how attention mechanisms can address the dual needs of predictive accuracy and transparency. The practical benefits of this model are particularly relevant for emerging markets like Thailand, where consumer behaviors and competitive dynamics are evolving rapidly. Future research should investigate the scalability of this approach across diverse datasets and markets, incorporating additional data sources such as demographic and social media information, to further enhance its applicability and robustness.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200523"},"PeriodicalIF":0.0,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143855190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative analysis on using GPT and BERT for automated vulnerability scoring","authors":"Seyedeh Leili Mirtaheri , Andrea Pugliese , Narges Movahed , Reza Shahbazian","doi":"10.1016/j.iswa.2025.200515","DOIUrl":"10.1016/j.iswa.2025.200515","url":null,"abstract":"<div><div>Large language models and transformers such as GPT and BERT have shown great improvements in many domains including cybersecurity. A constantly increasing number of vulnerabilities necessitates automated vulnerability scoring systems. Therefore, a deeper understanding of GPT and BERT compatibility with the requirements of the cybersecurity domain seems inevitable for system designers. The BERT model’s family is known to be optimized in understanding contextual relationships with a bidirectional approach, while the GPT models perform unidirectional processing with generative capabilities. Automated vulnerability scoring systems require both capabilities: analyzing the vulnerability and augmenting the vulnerability descriptions. On the other hand, powerful GPT models are often more resource-intensive in comparison with the BERT family. This paper presents a comprehensive comparative analysis of GPT and BERT in terms of their text classification performance, utilizing the vulnerability description classification task. We outline a thorough theoretical and experimental comparison of the models, regarding their architectures, training objectives, and fine-tuning, as well as their text classification performance. We evaluate these models on the vulnerability description classification task and employ rigorous evaluation metrics to shed light on their relative strengths and shortcomings. We also evaluate hybrid architectures that benefit from combining GPT and BERT at the same time. Our experiment results show that they can effectively leverage the complementary strengths of both GPT and BERT, namely generation and comprehension, leading to further improvements in classification performance.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200515"},"PeriodicalIF":0.0,"publicationDate":"2025-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143855188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-driven multi-model architecture for unbalanced network traffic intrusion detection via extreme gradient boosting","authors":"Oluwadamilare Harazeem Abdulganiyu , Taha Ait Tchakoucht , Ahmed El Hilali Alaoui , Yakub Kayode Saheed","doi":"10.1016/j.iswa.2025.200519","DOIUrl":"10.1016/j.iswa.2025.200519","url":null,"abstract":"<div><div>Network Intrusion Detection Systems (NIDS) face significant challenges in identifying rare attack instances due to the inherent class imbalance and diversity in network traffic. This imbalance, often characterized by a dominance of benign network traffic data, reduces the effectiveness of traditional detection methods. To address this, we proposed CWFLAM-VAE, an attention-driven multi-model architecture that combines Class-Wise Focal Loss, Variational Autoencoder, and Extreme Gradient Boosting. CWFLAM-VAE generates synthetic rare-class attack data while preserving the original feature distribution, mitigating imbalance and improving classification performance. The effectiveness of our proposed system was evaluated on two datasets: NSL-KDD, which exhibits a skewed distribution of network traffic favoring the majority class, and CSE-CIC-IDS2018, where approximately 83 % of the data consists of benign network traffic. We compared our method with existing sampling techniques (SMOTE, ROS, ADASYN, RUS) and existing classifiers (Logistic Regression, KNN, SVM, Decision Tree, LSTM, CNN). The experimental findings distinctly reveal the efficacy of the CWFLAM-VAE in resolving class imbalance concerns, with Extreme Gradient Boosting surpassing alternative machine learning techniques, particularly in the detection of rare instances of attack traffic, with F-scores of 97.6 % and 98.1 % and false positive rates of 0.17 and 0.27 on the two datasets, respectively.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200519"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143829648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advancing Forex prediction through multimodal text-driven model and attention mechanisms","authors":"Fatima Dakalbab , Ayush Kumar , Manar Abu Talib , Qassim Nasir","doi":"10.1016/j.iswa.2025.200518","DOIUrl":"10.1016/j.iswa.2025.200518","url":null,"abstract":"<div><div>The Forex market, characterized by high volatility and complexity, presents a significant challenge for accurate prediction of currency price movements. Traditional approaches often rely on either technical indicators or sentiment analysis, limiting their ability to capture the interplay between diverse data modalities. This research work introduces a novel multimodal deep learning framework that integrates technical analysis and sentiment analysis through a cross-modal attention mechanism, enabling a comprehensive understanding of market dynamics. The proposed model leverages innovative alignment techniques to synchronize sentiment from news articles with historical price trends, facilitating robust multiclass prediction of Forex price directions. To evaluate its effectiveness, the model was tested on three major currency pairs—EUR/USD, GBP/USD, and USD/JPY—using k-fold cross-validation. Multiple attention configurations, including no attention, self-attention, bi-cross attention, and a hybrid approach, were implemented to assess the impact of attention mechanisms on prediction performance. Experimental results highlight the superiority of the hybrid attention mechanism, which consistently outperformed single-modality models and other configurations across key metrics, such as the Matthews correlation coefficient, accuracy, directional accuracy, and F1-score. These findings underscore the importance of integrating sentiment and technical data for enhanced Forex prediction. This study contributes to the growing field of multimodal financial forecasting, offering a foundation for future research incorporating advanced risk metrics, real-time trading systems, and broader market applications.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"26 ","pages":"Article 200518"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143839014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}