Intelligent Systems with Applications: Latest Articles

Learning how to transfer: A lifelong domain knowledge distillation framework for continual MRC
Intelligent Systems with Applications Pub Date : 2025-03-08 DOI: 10.1016/j.iswa.2025.200497
Songze Li , Zhijing Wu , Runmin Cao , Xiaohan Zhang , Yifan Wang , Hua Xu , Kai Gao
Abstract: Machine Reading Comprehension (MRC) has attracted wide attention in recent years. It can reflect how well a machine understands human language. Benefiting from increasingly large-scale benchmarks and pre-trained language models, many MRC models have achieved remarkable success and even exceeded human performance. However, real-world MRC systems need to learn incrementally from a continuous data stream over time without access to previously seen data; such a system is called a Continual MRC system. It is a great challenge to learn a new domain incrementally without catastrophically forgetting previous knowledge. In this paper, MK-MRC (an extension of MA-MRC), a continual MRC framework with uncertainty-aware fixed Memory and lifelong domain Knowledge distillation, is proposed. MK-MRC is a memory-replay-based method, in which a fixed-size memory buffer stores a small number of samples from previous domain data, along with an uncertainty-aware updating strategy applied when new domain data arrives. For incremental learning, MK-MRC fully uses the domain adaptation and transfer relationship between memory and new domain data through several domain knowledge distillation strategies. Compared with MA-MRC, MK-MRC additionally introduces more strategies to strengthen the ability of continual learning, such as data augmentation and special task-related knowledge distillation. Experimental results show that MK-MRC yields consistent improvement compared with strong baselines and has a substantial incremental learning ability without catastrophic forgetting under four continual span-extractive and multiple-choice MRC settings.
Volume 26, Article 200497
Citations: 0
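The abstract above describes a fixed-size memory buffer with an uncertainty-aware updating strategy but does not spell out the rule. The following is a minimal sketch of one plausible reading, assuming that the buffer keeps the samples with the highest uncertainty scores; the class name `FixedMemory`, the scores, and the keep-most-uncertain rule are all illustrative assumptions, not the authors' method.

```python
import heapq


class FixedMemory:
    """Fixed-capacity buffer that keeps the highest-uncertainty samples.

    Hypothetical illustration only: the keep-most-uncertain rule is an
    assumption, not taken from the MK-MRC paper.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []      # min-heap of (uncertainty, insertion_id, sample)
        self._counter = 0    # unique tie-breaker so samples are never compared

    def add(self, sample, uncertainty):
        entry = (uncertainty, self._counter, sample)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif uncertainty > self._heap[0][0]:
            # Evict the least uncertain stored sample.
            heapq.heapreplace(self._heap, entry)

    def samples(self):
        # Most uncertain first.
        return [s for _, _, s in sorted(self._heap, reverse=True)]


memory = FixedMemory(capacity=2)
for sample, score in [("q1", 0.2), ("q2", 0.9), ("q3", 0.5), ("q4", 0.1)]:
    memory.add(sample, score)
print(memory.samples())
```

A min-heap keyed on the score makes each update O(log k) for a buffer of size k, which keeps the buffer cheap to maintain as new domain data streams in.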
Minimum cost of job assignment in polynomial time by adaptive unbiased filtering and branch-and-bound algorithm with the best predictor
Intelligent Systems with Applications Pub Date : 2025-03-04 DOI: 10.1016/j.iswa.2025.200502
Jeeraporn Werapun, Witchaya Towongpaichayont, Anantaporn Hanskunatai
Abstract: The minimum cost of job assignment (Min-JA) is one of the practical NP-hard problems in managing optimization for science and engineering applications. Formally, the optimal solution of the Min-JA can be computed by the branch-and-bound (BnB) algorithm (with the efficient predictor) in O(n!), where n is the problem size, and in O(n^3) in the best case, but that best case hardly occurs. Currently, metaheuristic algorithms, such as genetic algorithms (GA) and swarm-optimization algorithms, are extensively studied for polynomial-time solutions. Recently, unbiased filtering (in search-space reduction) could solve some NP-hard problems, such as 0/1-knapsack and multiple 0/1-knapsacks with Latin square (LS) of m-capacity ranking, for the ideal solutions in polynomial time. To solve the Min-JA problem, we propose adaptive unbiased filtering (AU-filtering) in O(n^3) with a new hybrid (search-space) reduction (of the indirect metaheuristic strategy and the exact BnB). The innovation and contribution of our AU-filtering is achieved through three main steps: (1) find 9 + n effective job-orders for the good initial solutions (by the indirect assignment with UP: unbiased predictor); (2) improve the top 9 solutions by the indirect improvement of the significant job-orders (by Latin square of n permutations plus n complex mod-functions); and (3) classify objects (from three of the best solutions) for AU-filtering (on large n) with deep-reduction (on smaller n'), and repeat (1)-(3) until n' < 6, at which point the exact BnB is applied. In experiments, the proposed AU-filtering was evaluated by a simulation study, where its ideal results outperformed the best results of the hybrid swarm-GA algorithm on a variety of 2D datasets (n ≤ 1000).
Volume 26, Article 200502
Citations: 0
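To make the problem statement in the abstract above concrete, here is a minimal sketch of the exhaustive O(n!) baseline: try every one-to-one assignment of jobs to workers and keep the cheapest. This is not the AU-filtering method or the authors' branch-and-bound; the function name `min_job_assignment` and the 3x3 cost matrix are hypothetical.

```python
from itertools import permutations


def min_job_assignment(cost):
    """Exhaustively search all n! one-to-one assignments of jobs to workers.

    cost[w][j] is the cost of giving job j to worker w.
    Returns (minimum total cost, tuple mapping worker index -> job index).
    """
    n = len(cost)
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[worker][job] for worker, job in enumerate(perm))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm


# Hypothetical cost matrix: rows are workers, columns are jobs.
cost = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
best_cost, best_perm = min_job_assignment(cost)
print(best_cost, best_perm)  # 9 (1, 0, 2)
```

The factorial growth of the loop is what makes search-space reduction attractive; the sketch is only practical for very small n, in the spirit of the small-instance cutoff the abstract mentions for the exact step.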
A unified prompt-based framework for few-shot multimodal language analysis
Intelligent Systems with Applications Pub Date : 2025-03-01 DOI: 10.1016/j.iswa.2025.200498
Xiaohan Zhang , Runmin Cao , Yifan Wang , Songze Li , Hua Xu , Kai Gao , Lunsong Huang
Abstract: Multimodal language analysis is a trending topic in NLP. It relies on large-scale annotated data, which is scarce due to its time-consuming and labor-intensive nature. Multimodal prompt learning has shown promise in low-resource scenarios. However, previous works either cannot handle semantically complex tasks or involve too few modalities. In addition, most of them focus only on prompting the language modality, disregarding the untapped potential of other modalities. We propose a unified prompt-based framework for few-shot multimodal language analysis. Specifically, based on a pretrained language model, our model can handle semantically complex tasks involving text, audio and video modalities. To enable more effective utilization of video and audio modalities by the language model, we introduce semantic alignment pre-training to bridge the semantic gap between them and the language model, alongside an effective fusion method for video and audio modalities. Additionally, we introduce a novel and effective prompt method, the Multimodal Prompt Encoder, to prompt the entirety of multimodal information. Extensive experiments conducted on six datasets under four multimodal language subtasks demonstrate the effectiveness of our approach.
Volume 26, Article 200498
Citations: 0
A comprehensive fusion model for improved pneumonia prediction based on KNN-wavelet-GLCM and a residual network
Intelligent Systems with Applications Pub Date : 2025-02-28 DOI: 10.1016/j.iswa.2025.200492
Asmaa Shati , Ghulam Mubashar Hassan , Amitava Datta
Abstract: Pneumonia is a severe disease that contributes to global mortality rates, emphasizing the critical need for early detection to improve patient survival. Chest radiography (X-ray) images serve as a fundamental diagnostic tool in clinical practice to detect various lung abnormalities. However, medical images, particularly X-rays, contain crucial data that are often imperceptible to the human eye. This study presents a novel fusion model (Res-WG-KNN) based on a soft voting ensemble strategy to predict pneumonia from chest X-ray images. It utilizes 2D-discrete wavelet decomposition and texture features from the Gray Level Co-occurrence Matrix (GLCM) with supervised machine learning, alongside raw X-ray images using a modified Residual Network ResNet-50. The proposed model was evaluated using two public pneumonia X-ray image datasets: one for adult patients, called the Radiological Society of North America (RSNA) dataset, and one for pediatric patients, called the Kermany dataset. These datasets differ in both size and image format, with the RSNA dataset using DICOM images and the Kermany dataset using JPEG images. The use of a soft voting technique in the proposed model effectively enhances classification performance beyond current benchmarks, achieving 97.0% accuracy and 0.97 AUC on the RSNA dataset, and 99.0% accuracy with 0.99 AUC on the Kermany dataset for pneumonia prediction.
Volume 26, Article 200492
Citations: 0
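The soft voting strategy named in the abstract above can be sketched generically as follows: each classifier outputs class probabilities, the probabilities are averaged (optionally with weights), and the class with the highest average wins. The function `soft_vote`, the three model outputs, and the equal weighting are illustrative assumptions; the abstract does not state how Res-WG-KNN weights its branches.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-class probabilities from several classifiers.

    prob_lists: one probability vector per classifier, all the same length.
    Returns (averaged probabilities, index of the winning class).
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total_weight = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total_weight
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return averaged, label


# Hypothetical outputs for one X-ray, ordered as [normal, pneumonia].
model_a = [0.40, 0.60]
model_b = [0.70, 0.30]
model_c = [0.20, 0.80]
averaged, label = soft_vote([model_a, model_b, model_c])
print(averaged, label)
```

Unlike hard (majority) voting, this lets a confident model outweigh two lukewarm ones, which is the usual motivation for combining heterogeneous feature pipelines this way.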
Emulating fundamental analysts: Analytical stage-based multi-agent framework enhanced with expert guidance and Preference-Anchored Likelihood Adjustment
Intelligent Systems with Applications Pub Date : 2025-02-27 DOI: 10.1016/j.iswa.2025.200496
Tao Xu , Zhe Piao , Tadashi Mukai , Yuri Murayama , Kiyoshi Izumi
Abstract: With the rapid advancement of large language models (LLMs), some studies have explored their potential for predicting stock prices based on financial texts. However, previous research often overlooked the depth of analysis generated by LLMs, resulting in reasoning processes inferior to those of human analysts. In fundamental investing, which requires in-depth company analysis, conclusions from imperfect reasoning lack persuasiveness. In this study, inspired by the analysis process of human analysts, we propose an “Analytical Stage-Based Multi-Agent Framework” to enable LLMs to perform in-depth fundamental analysis. This framework divides the analysis into multiple stages, assigning an LLM agent to each. We enhance each agent’s capabilities for its specific task through expert guidance or fine-tuning, allowing them to collectively emulate the workflow of human analysts. Furthermore, we introduce Preference-Anchored Likelihood Adjustment, a new method for fine-tuning LLMs. This approach addresses the decline in likelihood of generating correct responses that occurs after using existing preference alignment methods. It employs an objective function with two terms: one to increase likelihood and another to preserve aligned preference. We conducted experiments using our framework to analyze company earnings releases. We evaluated the analysis quality based on comprehensiveness and logical soundness, while correctness was assessed by using stock prices as the ground truth to calculate the Matthews correlation coefficient and F1 score. Results demonstrate that even without expert guidance and fine-tuning, our multi-agent framework can enhance LLMs in both analysis quality and correctness. When combined with expert guidance and fine-tuning, the performance is further improved.
Volume 26, Article 200496
Citations: 0
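The two correctness metrics named in the abstract above have standard definitions for binary labels, sketched below from confusion-matrix counts. The up/down label sequences are made-up illustrative data, not results from the paper, and the helper names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical data: 1 = price went up, 0 = price went down.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f1_score(tp, fp, fn), matthews_corrcoef(tp, tn, fp, fn))  # 0.75 0.5
```

MCC uses all four cells of the confusion matrix, so it stays informative when up and down movements are imbalanced, whereas F1 ignores true negatives.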
Multi-Agent Reinforcement Learning for Cybersecurity: Classification and survey
Intelligent Systems with Applications Pub Date : 2025-02-25 DOI: 10.1016/j.iswa.2025.200495
Salvo Finistrella, Stefano Mariani, Franco Zambonelli
Abstract: In the face of a rapidly evolving threat landscape, traditional cybersecurity measures (such as signature-based detection and static rules on firewalls, intrusion detection systems (IDS) and antivirus software) often lag behind sophisticated cyber attacks. Through a review of existing literature, we examine the shortcomings of traditional cybersecurity methods and how these can be surpassed with the application of Reinforcement Learning (RL) based methods. This study classifies RL-based approaches to cybersecurity, aimed at enhancing detection, mitigation and response to cyber attacks, along two orthogonal dimensions: the RL frameworks used (e.g. single-agent vs. multi-agent) and the network configuration where they are deployed (e.g. host-based or network-based cybersecurity). The goal is to help researchers and practitioners interested in the field quickly understand the opportunities for RL-based cybersecurity depending on the network environment to be protected, and to point them to representative articles in the field. Finally, we emphasize the importance of further research and development to address challenges such as computational complexity, generalization and data quality.
Volume 26, Article 200495
Citations: 0
EHR-protect: A steganographic framework based on data-transformation to protect electronic health records
Intelligent Systems with Applications Pub Date : 2025-02-24 DOI: 10.1016/j.iswa.2025.200493
Adifa Widyadhani Chanda D'Layla , Ntivuguruzwa Jean De La Croix , Tohari Ahmad , Fengling Han
Abstract: The increasing digitization of healthcare systems and the shift to Electronic Health Records (EHRs) have introduced critical security challenges, including unauthorized access, data breaches, and confidentiality risks. For example, the rapid exchange of sensitive health data between stakeholders highlights the need for secure data-sharing mechanisms. To address these challenges, steganography emerges as a critical solution by embedding sensitive information within other data forms, reducing the likelihood of unauthorized access and ensuring patient confidentiality. This study presents EHR-Protect, an innovative steganographic framework designed to secure EHRs by embedding them within medical images. Unlike general-purpose images, medical images are susceptible to distortions as they serve as diagnostic tools. EHR-Protect uses logarithmic pixel transformation and adaptive techniques such as difference expansion and EHR magnitude reduction to minimize distortions in carrier medical images. The results of EHR-Protect demonstrate its effectiveness in securely embedding EHRs into medical images with minimal distortions. The proposed method achieves a high Peak Signal-to-Noise Ratio (PSNR) of 91.90 dB and a perfect Structural Similarity Index Measure (SSIM) of 1, ensuring image quality is maintained. MSE values across different cover images show minimal increases, even as secret data payloads rise from 10 to 100 kilobits, indicating controlled distortion. The results confirm that EHR-Protect outperforms existing methods, offering a robust solution for securing EHR data without compromising medical image integrity.
Volume 26, Article 200493
Citations: 0
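The distortion metrics quoted in the abstract above follow standard definitions: MSE is the mean squared pixel difference between the cover and stego images, and PSNR = 10 * log10(MAX^2 / MSE). The sketch below assumes 8-bit pixels (MAX = 255) and uses a tiny made-up 2x2 image; the bit depth of the images used in the paper is not stated in the abstract.

```python
import math


def mse(cover, stego):
    """Mean squared error between two equally sized 2D pixel arrays."""
    flat_cover = [p for row in cover for p in row]
    flat_stego = [p for row in stego for p in row]
    return sum((a - b) ** 2 for a, b in zip(flat_cover, flat_stego)) / len(flat_cover)


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 cover image and a stego image with one pixel changed by 1.
cover = [[100, 101], [102, 103]]
stego = [[100, 102], [102, 103]]
error = mse(cover, stego)
quality = psnr(cover, stego)
print(error, round(quality, 2))
```

Because PSNR grows as MSE shrinks, a larger embedded payload normally lowers PSNR; reporting MSE across payload sizes, as the abstract does, shows how quickly that degradation sets in.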
Binary classification with Fuzzy-Bayesian logistic regression using Gaussian fuzzy numbers
Intelligent Systems with Applications Pub Date : 2025-02-24 DOI: 10.1016/j.iswa.2025.200494
Georgios Charizanos , Haydar Demirhan , Duygu İçen
Abstract: Binary classification is a critical task in pattern recognition applications in artificial intelligence and machine learning. The main weakness of binary classifiers is their sensitivity towards the imbalance in the number of observations in the binary classes and separation by a subset of features. Although various robust approaches have been introduced against these issues, they need prolonged runtimes, limiting their applicability in artificial intelligence applications or for large datasets. In this study, we introduce a new binary classification framework called the fuzzy-Bayesian logistic regression, which incorporates robust Bayesian logistic regression with fuzzy classification using Gaussian fuzzy numbers. The proposed method improves classification performance while providing significant gains in computation time. We benchmark the proposed method with eight fuzzy, Bayesian, and machine learning classifiers using seventeen datasets. The results indicate that the fuzzy-Bayesian logistic regression outperforms all benchmark methods across all datasets in terms of six performance indicators. Moreover, the proposed method is shown to be significantly more efficient than its closest competitor, improving computational efficiency. The proposed method provides a promising binary classifier for a wide range of applications with its computational efficiency and robustness towards imbalance and separation issues in the data.
Volume 26, Article 200494
Citations: 0
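For readers unfamiliar with the term used in the abstract above, a Gaussian fuzzy number is commonly defined by a bell-shaped membership function mu(x) = exp(-(x - m)^2 / (2 * s^2)) with center m and spread s. The sketch below only evaluates that textbook definition; how the paper combines it with Bayesian logistic regression is not described in the abstract, and the center and spread values are hypothetical.

```python
import math


def gaussian_membership(x, center, spread):
    """Membership degree of x in a Gaussian fuzzy number (center, spread).

    Returns 1.0 at the center and decays smoothly toward 0 away from it.
    """
    return math.exp(-((x - center) ** 2) / (2 * spread ** 2))


# Hypothetical fuzzy number "about 0.5" with spread 0.1.
center, spread = 0.5, 0.1
at_center = gaussian_membership(0.5, center, spread)
one_spread_away = gaussian_membership(0.6, center, spread)
print(at_center, round(one_spread_away, 4))
```

The smooth decay is what distinguishes this from a crisp threshold: a value one spread away from the center still belongs to the fuzzy number with degree exp(-0.5), roughly 0.61.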
A multi-class hybrid variational autoencoder and vision transformer model for enhanced plant disease identification
Intelligent Systems with Applications Pub Date : 2025-02-19 DOI: 10.1016/j.iswa.2025.200490
Folasade Olubusola Isinkaye , Michael Olusoji Olusanya , Ayobami Andronicus Akinyelu
Abstract: Agriculture is considered a driver of economic growth, as it accounts for 6.4% of global gross domestic product (GDP), and in low-income countries it can account for more than 25% of GDP. Plants supply more than 80% of the food consumed by humans and are the main source of nutrition for animals. Plant diseases pose a major risk to global food security, as they account for losses of between 10 and 30% of the global harvest every year. Deep learning techniques like convolutional neural networks successfully identify image-based diseases but struggle with capturing long-range contextual information. This makes them less robust on noisy or high-resolution images. Their high computational and memory demands also limit scalability for large datasets. To overcome these issues, we propose a hybrid model that combines Variational Autoencoders and Vision Transformers for enhanced accuracy and robustness of plant disease classification. The Variational Autoencoder reduces image dimensionality while preserving essential features, and the Vision Transformer captures global relationships to enhance accuracy and scalability, especially in multi-class disease classification. The experiment used images of corn, potato, and tomato plant leaves from the publicly available PlantVillage dataset. On-the-fly data augmentation was applied to further increase the robustness of the model. The proposed model achieved a classification accuracy of 93.2%. This technique provides a reliable and efficient solution for identifying multiple plant diseases across various crops. It enhances agricultural productivity and supports food security efforts.
Volume 26, Article 200490
Citations: 0
Content moderation assistance through image caption generation
Intelligent Systems with Applications Pub Date : 2025-02-16 DOI: 10.1016/j.iswa.2025.200489
Liam Kearns
Abstract: The rapid growth in digital media creation has led to an increased challenge in content moderation. Manual moderation is susceptible to slower response times, and automated moderation to false positives arising from unpredictable user inputs. Image caption generation has been suggested as a viable content moderation tool, but there is a lack of real-world deployment in this context. In this work, a collaborative approach is taken, where a machine learning model is used to assist human moderators in the approval and rejection of media within a scavenger hunt game. The proposed model is trained on the Flickr30k and MS COCO datasets to generate captions for images. The results demonstrate a 13% reduction in review times, indicating that human–machine collaboration contributes to mitigating the risk of unsustainable review backlog growth. Furthermore, fine-tuning the model led to a 28% reduction in review times when compared to the untuned model. Notably, this paper contributes to knowledge by demonstrating caption generation as a viable content moderation tool in addition to its sensitivity to accurate captions, whereby false positives risk a deterioration in moderator response time.
Volume 25, Article 200489
Citations: 0