{"title":"A hybrid three-way recommendation considering users variability","authors":"Yu Xie , Jilin Yang , Youlei Meng , Xianyong Zhang","doi":"10.1016/j.engappai.2025.111610","DOIUrl":"10.1016/j.engappai.2025.111610","url":null,"abstract":"<div><div>Hybrid recommender systems leverage diverse information sources and techniques to enhance performance. Nevertheless, integrating users’ multifaceted preferences remains challenging due to the uneven data. Meanwhile, information insufficiency introduces uncertainty in recommendations while existing strategies (i.e., recommend or not-recommend) lack the flexibility to address it. Additionally, these works mainly overlook that ratings not only reflect preferences but imply users’ attitudes toward the strategies, leading to the same recommendation rule despite users distinctly. To solve these issues, a Hybrid Three-Way Recommender (HTWR) system is proposed to formulate personalized three-way rules. Specifically, users’ historical and predictive preferences are captured via tags and ratings while integrated based on the user’s data distribution. Then, the theory of three-way decision is introduced to address such uncertainty by offering the option of defer-recommend. Finally, the users variability is formally given and incorporated into the loss function to obtain personalized rules. Experiments on three public datasets validate the superiority and flexibility of the proposed HTWR.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111610"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144570221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperspectral image denoising via group-sparsity constrained low-rank matrix triple factorization and spatial–spectral residual total variation","authors":"Xiaozhen Xie, Yangyang Song","doi":"10.1016/j.engappai.2025.111600","DOIUrl":"10.1016/j.engappai.2025.111600","url":null,"abstract":"<div><div>Mixed noise, such as Gaussian noise, impulse noise, deadline noise, stripe noise, and many others, distorts the hyperspectral image (HSI), usually causing severe difficulties in subsequent applications. Due to the rise of artificial intelligence technology, matrix triple factorization is attached importance again in the field of HSI denoising. However, for convenient computations, these factor matrices are commonly imposed by the orthogonality, which is inconsistent with the physical meanings in practice. To address this issue, this article proposes a group-sparsity constrained triple factorization method to explore the shared sparse pattern and yields a tighter approximation to the low-rank prior. Specifically, the Casorati matrix of each local cube in HSIs, is firstly decomposed into a core matrix and two factor matrices. Then, the group-sparsity regularization is imposed on the factor matrices and the core matrix, simultaneously representing the low-rank and sparse prior in local cubes. Moreover, we also use the tensor group-sparsity based spatial–spectral residual total variation to globally explore the shared sparse pattern in both spatial and spectral difference images of HSIs. Ultimately, the group-sparsity constrained local low-rank matrix triple factorization and global spatial–spectral residual total variation model is proposed for HSI denoising. In the framework of the alternating direction method of multipliers, the proposed model can be solved efficiently. Simulated and real HSI experiments demonstrate the effectiveness of the proposed model. Across all datasets and noise conditions, our method achieves an average increase of nearly 1.93 decibels in overall peak signal-to-noise ratio compared to state-of-the-art HSI denoising methods.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111600"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144570222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An attention-guided multi-scale feature cascade network for underwater fish counting","authors":"Hanyu Zhang , Mengping Dong , Fei Li , Zhenbo Li , Ping Hu","doi":"10.1016/j.engappai.2025.111608","DOIUrl":"10.1016/j.engappai.2025.111608","url":null,"abstract":"<div><div>Visual counting is essential for advancing fisheries intelligence, but fish scale variation in open underwater environments has made underwater fish counting a constant challenge. Therefore, we propose an Attention-guided Multi-scale Feature Cascade Network, named AMFCNet, which resolves scale variation and improves the accuracy of fish counting in complex underwater environments. AMFCNet utilizes a multi-scale attention gate for multi-scale feature fusion, and integrates a multi-scale convolution module to capture complex spatial relationships. It also employs a multi-head supervision fusion strategy to mask irrelevant regions, ensuring targeted learning for each scale and generating high-quality multi-scale density maps. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the proposed dataset with the lowest computational cost, significantly outperforming 11 mainstream counting methods. It also achieves excellent results on other publicly available underwater datasets, with Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Normalized Absolute Error (NAE) values of 1.26, 1.71, and 0.08, respectively. This method shows significant potential for practical applications in aquaculture, such as in marine ranching and pond farming, to assess fish growth conditions and adjust feeding strategies accordingly.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111608"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144570224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial intelligence-enabled defect detection method and engineering application of ceramic mug","authors":"Wenjie Mao, Hu Wu, Shilong Xie, Linyuxuan Li, Xianhai Yang","doi":"10.1016/j.engappai.2025.111648","DOIUrl":"10.1016/j.engappai.2025.111648","url":null,"abstract":"<div><div>In the manufacturing process of ceramic mugs, the detection of micro-surface defects faces technical challenges of high difficulty and low efficiency, and efficient and high-quality production lines are crucial to maintaining market competitiveness. The goal of this study is to improve the accuracy and efficiency of detection by developing a defect detection algorithm and equipment based on deep learning, thereby improving product quality and reducing production costs. The research uses the visual algorithm You Only Look Once version 8 (YOLOv8) in the field of artificial intelligence as the baseline model. Firstly, a slice pre-training layer is designed to reduce the memory loss of large images to the graphics card. Secondly, the model structure is reconstructed to adapt to small target detection. In addition, a mixed local channel cross-stage feature fusion module is proposed to enhance the recognition ability of small targets. Finally, a detection head with shared parameters is designed to further reduce the number of parameters. In terms of engineering application, a set of test equipment was developed and the corresponding software was written. Experiments show that the accuracy of the algorithm is 21.3 % higher than that of YOLOv8, and the parameter amount is reduced by 67 %. Compared with manual detection, the equipment efficiency is increased by 47.06 %, and the detection success rate is 99.6 %. Therefore, the research in this paper provides an efficient and reliable solution for industrial automation detection.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111648"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144571399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Seismic damage states prediction of in-service bridges using feature-enhanced swin transformer without reliance on damage indicators","authors":"Yalin Li , Zhen Sun , Sujith Mangalathu , Yaqi Li , Hao Yang , Weidong He","doi":"10.1016/j.engappai.2025.111651","DOIUrl":"10.1016/j.engappai.2025.111651","url":null,"abstract":"<div><div>To achieve real-time and efficient evaluation of seismic damage to structures, this study proposes an improved deep learning-based model, the deep feature-enhanced Swin Transformer (CC-SwinT). This model overcomes the influence of service life on the time-varying damage indicators of bridges. By eliminating the reliance on seismic performance and damage indicators, it predicts the seismic damage state of in-service bridges based solely on the response of structure. The CC-SwinT model integrates continuous wavelet transform (CWT) technology and the context anchored attention (CAA) mechanism to enhance the extraction of structure response features of bridge piers. This integration enables the model to effectively mine time-frequency characteristics and capture non-local long-term dependencies in structure responses. To comprehensively train the CC-SwinT model, a structure response database for in-service bridges was constructed based on a data-driven objectives, analyzing the impacts of service conditions on the seismic performance of bridges. Subsequently, transfer learning methods were applied, and the performance of the CC-SwinT framework was evaluated using various metrics to highlight its exceptional feature extraction and prediction capabilities. Furthermore, the Gradient-weighted Class Activation Mapping (Grad-CAM) interpretability technique was used to explore the decision-making process and feature focus of CC-SwinT. The findings of this study provide a valuable reference for seismic damage prediction of in-service structures and rapid post-earthquake rescue response.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111651"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144571400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transformer-based auto-encoder with combined multi-head-attention for industrial soft-sensor modeling","authors":"Yanhong Li, Shiwei Gao, Wenfeng Zhao","doi":"10.1016/j.engappai.2025.111681","DOIUrl":"10.1016/j.engappai.2025.111681","url":null,"abstract":"<div><div>Soft-sensor modeling is common in industrial production, but high data dimensionality, a lack of labeled features, and inadequate methods complicate extracting nonlinear feature representations. This paper proposes a Transformer-based auto-encoder with a combined multi-head-attention approach (TAE-CMHA) for soft-sensor modeling, which offers advantages for nonlinear feature representation. It introduces a combined multi-head-attention mechanism (CMHA) that improves feature-extraction accuracy and robustness. The Transformer's global feature extraction capabilities are leveraged in the auto-encoder for better nonlinear feature extraction. Additionally, label information optimizes the auto-encoder's reconstruction loss function which improves feature acquisition for predicting target outputs. Compared to supervised methods, the unsupervised auto-encoder uses abundant unlabeled industrial data to improve generalizability. Experiments were conducted on the industrial steam flow and debutanizer column datasets. The results show that the mean squared error (MSE) of the proposed method reaches a minimum of 0.00297 and the coefficient of determination (R<sup>2</sup>) is 0.881 in debutanizer column datasets, which shows the advantages of the model.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111681"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144570225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sketch & Fetch: Draw a logo sketch to fetch the suspect from surveillance footage","authors":"Yogameena Balasubramanian , Nagavani Chandrasekaran","doi":"10.1016/j.engappai.2025.111663","DOIUrl":"10.1016/j.engappai.2025.111663","url":null,"abstract":"<div><div>Logo Sketch-assisted Suspect Retrieval (LSSR) enables rapid identification of individuals based on eyewitness-drawn logo sketches, especially when facial details are obscured or unavailable in surveillance footage. The proposed Sketch-Fetch Generative Adversial Network (SF-GAN) translates sketches into realistic logo images, while the enhanced E-YOLOv7 (Elite-You Only Look Once version 7) detects logos on clothing in real-time. Local self-similarity descriptors with Euclidean matching are used to retrieve the query person. SF-GAN is trained on 100 sketch-image logo classes and shows adaptability to new designs. It achieves a low FID (Fréchet Inception Distance) score of 0.2, indicating high-quality generation. The system is tested on benchmark datasets under challenging conditions, including blur, occlusion, and low resolution. Achieving 95.6 % accuracy, the LSSR framework outperforms state-of-the-art approaches in logo-based suspect retrieval.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111663"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144570227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contrast gas detection: Improving infrared gas semantic segmentation with static background","authors":"Jue Wang , Jianzhi Fan , Tianshuo Yuan , Dong Luo , Guohua Jiao , Wei Chen","doi":"10.1016/j.engappai.2025.111604","DOIUrl":"10.1016/j.engappai.2025.111604","url":null,"abstract":"<div><div>Accurate and efficient identification and segmentation of gases play an important role in industrial processes and public safety. Although Optical Gas Imaging (OGI) has proven effective in acquiring images and videos of gas leaks, accurately determining the leakage points and their diffusion areas in real-world conditions remains a significant challenge. To address this, we propose a novel gas segmentation annotation methodology. This approach applies differencing between gas leakage scenes and corresponding static background to delineate precise leakage regions, which are subsequently annotated manually. The resulting dataset, consisting of 926 static background-gas image pairs extracted from 38 comprehensive video sequences, provides a robust foundation for advancing gas segmentation research. Based on this dataset, we present a novel semantic segmentation model specifically designed for the unique characteristics of gas leakage scenarios. Our model integrates a gas contrast attention mechanism to capitalize on static background information, resulting in improved segmentation precision. Comparative evaluations demonstrate that the proposed model achieves state-of-the-art performance on our dataset, outperforming widely adopted semantic segmentation models. All code and the dataset will be made publicly available at <span><span>https://github.com/ProAlize/CGNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111604"},"PeriodicalIF":7.5,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144571401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A trustworthy and explainable deep learning framework for skin lesion detection in smart dermatology","authors":"Mohammad A. Eita , Hamada Rizk","doi":"10.1016/j.engappai.2025.111594","DOIUrl":"10.1016/j.engappai.2025.111594","url":null,"abstract":"<div><div>The rapid evolution of artificial intelligence (AI) and deep learning profoundly impacts medical imaging, where it significantly enhances diagnostic accuracy. However, the effective deployment of AI systems in clinical settings, especially for skin lesion detection and diagnosis, requires not only high accuracy but also transparency and robustness to gain the trust of healthcare professionals. This is particularly crucial considering the challenges posed by varying sensor quality, lighting conditions, and lesion diversity. In this paper, we introduce a novel framework based on the You Only Look Once (YOLO) model that addresses these critical needs by enhancing both the explainability and performance of skin lesion detection models. Early and accurate identification of skin lesions is essential for the timely treatment and management of dermatological conditions. Traditional diagnostic methods, such as visual assessments by dermatologists, are often labor-intensive, subject to interpretative variability, and prone to inaccuracies, especially in cases involving atypical or subtle lesions. Our approach incorporates advanced data augmentation techniques to improve the model’s generalization capabilities across diverse clinical conditions. Additionally, we integrate saliency maps to provide visual explanations of the model’s predictions, allowing clinicians to understand the decision-making process and ensuring alignment with established clinical knowledge. Comparative analyses with the state-of-the-art models highlight the superior performance of our proposed framework, with significant improvements in the harmonic mean of precision and recall (F1-Score), and the Mean Average Precision (mAP50). The results underscore the effectiveness of our framework and how it advances the application of trustworthy AI in dermatology, paving the way for more reliable and informed clinical decisions in the diagnosis and treatment of skin conditions.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111594"},"PeriodicalIF":7.5,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144563350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prior-driven refinement network for small organ-at-risk segmentation in head and neck cancer","authors":"Taibao Wang , Yifan Gao , Bingyu Liang , Qin Wang","doi":"10.1016/j.engappai.2025.111605","DOIUrl":"10.1016/j.engappai.2025.111605","url":null,"abstract":"<div><div>Accurate segmentation of small organs-at-risk (OARs) in computed tomography (CT) images is crucial for radiotherapy treatment planning in head and neck cancer. However, the low soft tissue contrast, small spatial structures, and the limited training data pose significant challenges for automated segmentation methods. This paper proposes prior-driven refinement network (PRNet), a novel deep learning-based approach that leverages the foundation model’s general-purpose representations and domain-specific knowledge to tackle these challenges. PRNet builds upon the initial coarse segmentation and refines small organs by utilizing the coarse segmentation as prior knowledge. PRNet inherits its architecture from the Segment Anything Model (SAM) but incorporates a novel prior encoder and mask refinement transformer, enabling the fusion of domain-specific knowledge with SAM’s robust representations.The architecture of PRNet is inherited from the Segment Anything Model (SAM), with the addition of the prior encoder and the mask refinement transformer, allowing for the fusion of domain-specific knowledge with SAM’s robust representations. Experiments on three public datasets demonstrate PRNet’s superior performance, with average Dice scores of 75.14% ± 12.81%, 76.56% ± 12.90%, and 82.83 ± 13.49% respectively. These results represent improvements of 3.61%, 3.64%, and 5.14% over current state-of-the-art methods. Moreover, experiments on four diverse datasets demonstrate PRNet’s generalizability across different anatomical regions and imaging modalities, including liver tumors, myocardial pathologies, and thoracic organs. Our proposed method shows potential for improving clinical radiotherapy planning workflows and contributing to more precise treatment delivery in head and neck cancer patients.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111605"},"PeriodicalIF":7.5,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144563920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}