{"title":"Enhancing misogyny detection in bilingual texts using explainable AI and multilingual fine-tuned transformers","authors":"Ehtesham Hashmi, Sule Yildirim Yayilgan, Muhammad Mudassar Yamin, Mohib Ullah","doi":"10.1007/s40747-024-01655-1","DOIUrl":"https://doi.org/10.1007/s40747-024-01655-1","url":null,"abstract":"<p>Gendered disinformation undermines women’s rights, democratic principles, and national security by worsening societal divisions through authoritarian regimes’ intentional weaponization of social media. Online misogyny is a harmful societal issue that threatens to turn digital platforms into environments hostile and inhospitable to women. Despite the severity of this issue, efforts to persuade digital platforms to strengthen their protections against gendered disinformation are frequently ignored, highlighting the difficulty of countering online misogyny in the face of commercial interests. This growing concern underscores the need for effective measures to create safer online spaces, where respect and equality prevail, ensuring that women can participate fully and freely without fear of harassment or discrimination. This study addresses the challenge of detecting misogynous content in bilingual (English and Italian) online communications. Utilizing FastText word embeddings and explainable artificial intelligence techniques, we introduce a model that improves both the interpretability and the accuracy of detecting misogynistic language. For an in-depth analysis, we implemented a range of experiments spanning classic machine learning methodologies, conventional deep learning approaches, and recent transformer-based models with both language-specific and multilingual capabilities. 
This paper strengthens misogyny detection methodologies by applying incremental learning to recent datasets of tweets and posts from sources such as Facebook, Twitter, and Reddit, with our proposed approach outperforming existing methods on these datasets in metrics such as accuracy, F1-score, precision, and recall. This process involved refining hyperparameters, employing optimization techniques, and utilizing generative configurations. By implementing Local Interpretable Model-agnostic Explanations (LIME), we further elucidate the rationale behind the model’s predictions, enhancing understanding of its decision-making process.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"216 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
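As a loose illustration of the LIME step mentioned in the abstract above: LIME perturbs one input, queries the black-box model, and fits a weighted linear surrogate whose coefficients act as per-token explanations. Everything below (the tokens, the stand-in classifier, the kernel width) is a hypothetical minimal sketch, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = ["women", "are", "great", "engineers"]
weights_true = np.array([0.1, 0.0, -1.2, -0.8])  # stand-in black-box weights

def predict_proba(masks):
    # masks: (n, n_tokens) binary array; 1 keeps a token, 0 drops it
    logits = masks @ weights_true
    return 1.0 / (1.0 + np.exp(-logits))

# 1) Sample perturbed inputs by randomly dropping tokens
masks = rng.integers(0, 2, size=(500, len(tokens)))
probs = predict_proba(masks)

# 2) Weight samples by proximity to the original (all-tokens-present) input
distances = (len(tokens) - masks.sum(axis=1)) / len(tokens)
sample_w = np.exp(-(distances ** 2) / 0.25)

# 3) Fit a weighted linear surrogate; its coefficients explain the prediction
W = np.diag(sample_w)
X = np.hstack([masks, np.ones((len(masks), 1))])  # add intercept column
coef = np.linalg.lstsq(W @ X, W @ probs, rcond=None)[0]
explanation = dict(zip(tokens, coef[:-1]))
```

Tokens the stand-in model scores negatively receive negative surrogate coefficients, which is the kind of per-word rationale LIME surfaces for a misogyny classifier.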
{"title":"Deep weighted survival neural networks to survival risk prediction","authors":"Hui Yu, Qingyong Wang, Xiaobo Zhou, Lichuan Gu, Zihao Zhao","doi":"10.1007/s40747-024-01670-2","DOIUrl":"https://doi.org/10.1007/s40747-024-01670-2","url":null,"abstract":"<p>Survival risk prediction models have become important tools for clinicians to improve cancer treatment decisions. In the medical field, using gene expression data to build deep survival neural network models significantly improves the accuracy of survival prognosis. However, building an efficient method that further improves the accuracy of cancer-specific survival risk prediction remains challenging, notably because of the data noise problem. To solve this problem, we propose a <u>d</u>iversity <u>r</u>eweighted deep survival neural <u>net</u>work method with <u>g</u>rid <u>o</u>ptimization (DRGONet) to improve the accuracy of cancer-specific survival risk prediction. Specifically, reweighting adjusts the weight assigned to each data point in the dataset based on its importance or relevance, thereby mitigating the impact of noisy or irrelevant data and improving model performance. Incorporating diversity into the objectives of the multiple learning models helps minimize bias and improve learning outcomes. Furthermore, hyperparameters are optimized with grid optimization. Experimental results have demonstrated that our proposed approach has significant advantages (an improvement of about 5%) in real-world medical scenarios, outperforming state-of-the-art comparison methods by a large margin. Our study highlights the significance of using DRGONet to overcome the limitations of building accurate survival prediction models. 
By implementing our technique in cancer research, we hope to reduce the suffering experienced by cancer patients and improve the effectiveness of treatment.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"4 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
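The reweighting idea described in the abstract above can be sketched in its simplest form: samples with unusually high loss are treated as likely noise and down-weighted before the next training step. This is a generic loss-based reweighting toy, not DRGONet's actual diversity-reweighting scheme; the loss distributions are fabricated for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# 95 typical samples plus 5 with very large loss (likely label noise)
losses = np.concatenate([rng.normal(0.5, 0.1, 95),
                         rng.normal(5.0, 0.5, 5)])

# Inverse-loss weighting, normalized so the average weight is 1
weights = 1.0 / (losses + 1e-8)
weights = weights / weights.mean()

# Suspected-noise samples now contribute far less to a weighted loss
clean_mean = weights[:95].mean()
noisy_mean = weights[95:].mean()
```

In a training loop these weights would multiply each sample's loss term, so the gradient is dominated by the trustworthy points.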
{"title":"Influence maximization under imbalanced heterogeneous networks via lightweight reinforcement learning with prior knowledge","authors":"Kehong You, Sanyang Liu, Yiguang Bai","doi":"10.1007/s40747-024-01666-y","DOIUrl":"https://doi.org/10.1007/s40747-024-01666-y","url":null,"abstract":"<p>Influence Maximization (IM) stands as a central challenge within the domain of complex network analysis, with the primary objective of identifying an optimal seed set of a predetermined size that maximizes the reach of influence propagation. Over time, numerous methodologies have been proposed to address the IM problem. However, one class of networks, referred to as Imbalanced Heterogeneous Networks (IHNs), which are widespread in social settings, urban and rural areas, and merchandising, presents challenges in achieving high-quality solutions. In this work, we introduce the Lightweight Reinforcement Learning algorithm with Prior knowledge (LRLP), which leverages the Struc2Vec graph embedding technique, capturing the structural similarity of nodes, to generate vector representations for nodes within the network. In detail, LRLP incorporates prior knowledge, based on a group of centralities, into the initial experience pool, which accelerates reinforcement learning training toward better solutions. Additionally, the node embedding vectors are fed into a Deep Q Network (DQN) to commence the lightweight training process. Experimental evaluations conducted on synthetic and real networks showcase the effectiveness of the LRLP algorithm. Notably, the improvement appears more pronounced as the scale of the network grows. We also analyze the effect of different graph embedding algorithms and of prior knowledge on the results, and examine several parameters, such as the number of seed set selections <i>T</i>, the embedding dimension <i>d</i>, and the network update frequency <i>C</i>. 
Remarkably, reducing the number of seed set selections <i>T</i> preserves solution quality while lowering the algorithm’s computational cost.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"11 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
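The "prior knowledge based on centralities" step above has a very simple core: rank nodes by a cheap centrality (degree, here) and use the top-k as initial seed candidates to populate the experience pool. The toy graph below is hypothetical; LRLP uses a group of centralities, of which degree is just the simplest representative.

```python
# Adjacency lists of a small illustrative graph (hypothetical)
graph = {
    "a": ["b", "c", "d", "e"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
    "e": ["a", "f"],
    "f": ["e"],
}

def top_k_degree_seeds(adj, k):
    # Degree centrality: number of neighbors of each node
    degree = {node: len(neigh) for node, neigh in adj.items()}
    return sorted(degree, key=degree.get, reverse=True)[:k]

seeds = top_k_degree_seeds(graph, 2)
```

Such a seed set is not optimal in general, but as a prior it gives the DQN non-random starting experiences, which is what accelerates training.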
{"title":"ATBHC-YOLO: aggregate transformer and bidirectional hybrid convolution for small object detection","authors":"Dandan Liao, Jianxun Zhang, Ye Tao, Xie Jin","doi":"10.1007/s40747-024-01652-4","DOIUrl":"https://doi.org/10.1007/s40747-024-01652-4","url":null,"abstract":"<p>Object detection using UAV images is a current research focus in the field of computer vision, with frequent advancements in recent years. However, many methods are ineffective for challenging UAV images that feature uneven object scales, sparse spatial distribution, and dense occlusions. We propose a new algorithm for detecting small objects in UAV images, called ATBHC-YOLO. Firstly, the MS-CET module is introduced to enhance the model’s focus on global sparse features in the spatial distribution of small objects. Secondly, the BHC-FB module is proposed to address the large-scale variance of small objects and enhance the perception of local features. Finally, a more appropriate loss function, WIoU, is used to penalise the quality variance of small object samples and further enhance the model’s detection accuracy. Comparison experiments on the DIOR and VEDAI datasets validate the effectiveness and robustness of the improved method. In experiments on the publicly available UAV benchmark dataset VisDrone, ATBHC-YOLO outperforms the state-of-the-art method (YOLOv7) by 3.5%.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"128 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
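The WIoU loss mentioned in the abstract above belongs to the IoU family: every such loss starts from the plain intersection-over-union between a predicted and a ground-truth box, then adds a weighting term. The sketch below computes only the base IoU (boxes as [x1, y1, x2, y2]); WIoU's focusing weights are omitted.

```python
def iou(box_a, box_b):
    # Intersection rectangle between two axis-aligned boxes
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7
score = iou([0, 0, 2, 2], [1, 1, 3, 3])
```

An IoU-based loss is then typically 1 − IoU, optionally scaled by a per-sample weight, which is where WIoU's handling of low-quality small-object samples comes in.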
{"title":"DADNet: text detection of arbitrary shapes from drone perspective based on boundary adaptation","authors":"Jun Liu, Jianxun Zhang, Ting Tang, Shengyuan Wu","doi":"10.1007/s40747-024-01617-7","DOIUrl":"https://doi.org/10.1007/s40747-024-01617-7","url":null,"abstract":"<p>The rapid development of drone technology has made drones one of the essential tools for acquiring aerial information. The detection and localization of text information through drones greatly enhance their understanding of the environment, enabling tasks of significant importance such as community commercial planning and autonomous navigation in intelligent environments. However, the unique perspective and complex environment during drone photography lead to various challenges in text detection, including diverse text shapes, large-scale variations, and background interference, making traditional methods inadequate. To address this issue, we propose a drone-based text detection method based on boundary adaptation. We first conduct an in-depth analysis of text characteristics from a drone’s perspective. Using ResNet50 as the backbone network, we introduce the proposed Hybrid Text Attention Mechanism into the backbone network to enhance the perception of text regions in the feature extraction module. Additionally, we propose a Spatial Feature Fusion Module to adaptively fuse text features of different scales, thereby enhancing the model’s adaptability. Furthermore, we introduce a text detail transformer by incorporating a local feature extractor into the transformer of the text detail boundary iteration optimization module. This enables the precise optimization and localization of text boundaries by reducing the interference of complex backgrounds, eliminating the need for complex post-processing. 
Extensive experiments on challenging text detection datasets and drone-based text detection datasets validate the high robustness and state-of-the-art performance of our proposed method, laying a solid foundation for practical applications.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"26 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142610211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
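The "adaptively fuse text features of different scales" step in the DADNet abstract above can be sketched as a softmax-weighted sum: per-scale scores (which a real model would predict from the input) are normalized into fusion weights. The scores and map sizes below are illustrative stand-ins, not the paper's Spatial Feature Fusion Module.

```python
import numpy as np

def fuse(features, scores):
    # features: list of (H, W) maps already at a common resolution
    # scores: one scalar per scale; softmax turns them into fusion weights
    w = np.exp(scores - np.max(scores))
    w = w / w.sum()
    fused = sum(wi * f for wi, f in zip(w, features))
    return fused, w

f_fine = np.ones((4, 4)) * 1.0    # stand-in fine-scale feature map
f_coarse = np.ones((4, 4)) * 3.0  # stand-in coarse-scale feature map
fused, w = fuse([f_fine, f_coarse], np.array([0.0, 2.0]))
```

Because the weights are input-dependent in a real model, the fusion can lean toward fine features for small distant text and coarse features for large text.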
{"title":"Segment anything model for few-shot medical image segmentation with domain tuning","authors":"Weili Shi, Penglong Zhang, Yuqin Li, Zhengang Jiang","doi":"10.1007/s40747-024-01625-7","DOIUrl":"https://doi.org/10.1007/s40747-024-01625-7","url":null,"abstract":"<p>Medical image segmentation constitutes a crucial step in the analysis of medical images, possessing extensive applications and research significance within the realm of medical research and practice. Convolutional neural networks have achieved great success in medical image segmentation. However, acquiring large labeled datasets remains unattainable due to the substantial expertise and time required for image labeling, as well as heightened patient privacy concerns. To cope with scarce medical image data, we propose a powerful network, Domain Tuning SAM for Medical images (DT-SAM). We construct an encoder utilizing a parameter-efficient fine-tuning strategy and SAM. This strategy selectively updates a small fraction of the weight increments while preserving the majority of the pre-trained weights in the SAM encoder, consequently reducing the required number of training samples. Meanwhile, our approach retains only the SAM encoder structure while incorporating a decoder similar to the U-Net decoder and redesigned skip connections that concatenate encoder-extracted features, which effectively decode the features extracted by the encoder and preserve edge information. We have conducted comprehensive experiments on three publicly available medical image segmentation datasets. The combined experimental results show that our method can effectively perform few-shot medical image segmentation. With just one labeled sample, it achieves a Dice score of 63.51%, an HD of 17.94, and an IoU of 73.55% on the Heart task; an average Dice score of 46.01%, an HD of 10.25, and an IoU of 65.92% on the Prostate task; and Dice, HD, and IoU of 88.67%, 10.63, and 90.19% on BUSI. 
Remarkably, with few training samples, our method consistently outperforms various SAM-based and CNN-based methods.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"163 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142610215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
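The parameter-efficient idea above — update a small fraction of weight increments while freezing the pre-trained weights — can be sketched in the spirit of a LoRA-style low-rank update. This is a generic illustration, not DT-SAM's exact scheme; the shapes and rank are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4

W_frozen = rng.normal(size=(d_in, d_out))      # pre-trained weight, never updated
A = rng.normal(scale=0.01, size=(d_in, rank))  # trainable low-rank factor
B = np.zeros((rank, d_out))                    # trainable; zero-init keeps output unchanged at start

def forward(x):
    # Effective weight = frozen weight + low-rank increment A @ B
    return x @ (W_frozen + A @ B)

trainable = A.size + B.size
fraction = trainable / (W_frozen.size + trainable)
```

Only A and B receive gradients, so here roughly 11% of the parameters are trainable; with zero-initialized B the fine-tuned layer starts out exactly equal to the pre-trained one.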
{"title":"Mix-layers semantic extraction and multi-scale aggregation transformer for semantic segmentation","authors":"Tianping Li, Xiaolong Yang, Zhenyi Zhang, Zhaotong Cui, Zhou Maoxia","doi":"10.1007/s40747-024-01650-6","DOIUrl":"https://doi.org/10.1007/s40747-024-01650-6","url":null,"abstract":"<p>Recently, a number of vision transformer models for semantic segmentation have been proposed, the majority achieving impressive results. However, they lack the ability to exploit the intrinsic position and channel features of the image and are less capable of multi-scale feature fusion. This paper presents a semantic segmentation method that successfully combines attention and multi-scale representation, thereby enhancing performance and efficiency. A mix-layers semantic extraction and multi-scale aggregation transformer decoder (MEMAFormer) is proposed, which consists of two components: a mix-layers dual channel semantic extraction module (MDCE) and a semantic aggregation pyramid pooling module (SAPPM). The MDCE incorporates a multi-layers cross attention module (MCAM) and an efficient channel attention module (ECAM). In MCAM, horizontal connections between encoder and decoder stages are employed as feature queries for the attention module. The hierarchical feature maps derived from different encoder and decoder stages are integrated into keys and values. To address long-term dependencies, ECAM selectively emphasizes interdependent channel feature maps by integrating relevant features across all channels. The dimensionality of the feature maps is reduced by pyramid pooling, which cuts the amount of computation without compromising performance. SAPPM comprises several distinct pooling kernels that extract context with a deeper flow of information, forming multi-scale features by integrating various feature sizes. 
The MEMAFormer-B0 model demonstrates superior performance compared to SegFormer-B0, exhibiting gains of 4.8%, 4.0% and 3.5% on the ADE20K, Cityscapes and COCO-stuff datasets, respectively.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"98 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142610214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
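The pyramid pooling step described above pools one feature map at several kernel sizes and collects the resulting context maps. The sketch below is a generic adaptive-average-pooling pyramid with illustrative pool sizes, not SAPPM's exact configuration.

```python
import numpy as np

def adaptive_avg_pool(feat, out_size):
    # feat: (H, W) feature map, pooled down to (out_size, out_size)
    h, w = feat.shape
    pooled = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            hs, he = i * h // out_size, (i + 1) * h // out_size
            ws, we = j * w // out_size, (j + 1) * w // out_size
            pooled[i, j] = feat[hs:he, ws:we].mean()
    return pooled

feat = np.arange(64, dtype=float).reshape(8, 8)
# Pool at three scales; coarser bins summarize wider context
pyramid = [adaptive_avg_pool(feat, s) for s in (1, 2, 4)]
```

The pooled maps are far smaller than the input, which is why the multi-scale context comes at little computational cost; a real module would then upsample and concatenate them with the original features.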
{"title":"Two-stage deep reinforcement learning method for agile optical satellite scheduling problem","authors":"Zheng Liu, Wei Xiong, Zhuoya Jia, Chi Han","doi":"10.1007/s40747-024-01667-x","DOIUrl":"https://doi.org/10.1007/s40747-024-01667-x","url":null,"abstract":"<p>This paper investigates the agile optical satellite scheduling problem, which aims to arrange an observation sequence and observation actions for observation tasks. Existing research mainly aims to maximize the number of completed tasks or the total priorities of the completed tasks but ignores the influence of the observation actions on the imaging quality. Besides, the conventional exact methods and heuristic methods can hardly obtain a high-quality solution in a short time due to the complicated constraints and considerable solution space of this problem. Thus, this paper proposes a two-stage scheduling framework with two-stage deep reinforcement learning to address this problem. First, the scheduling process is decomposed into a task sequencing stage and an observation scheduling stage, and a mathematical model with complex constraints and two-stage optimization objectives is established to describe the problem. Then, a pointer network with a local selection mechanism and a rough pruning mechanism is constructed as the sequencing network to generate an executable task sequence in the task sequencing stage. Next, a decomposition strategy decomposes the executable task sequence into multiple sub-sequences in the observation scheduling stage, and the observation scheduling process of these sub-sequences is modeled as a concatenated Markov decision process. A neural network is designed as the observation scheduling network to determine observation actions for the sequenced tasks, which is well trained by the soft actor-critic algorithm. 
Finally, extensive experiments show that the proposed method, along with the designed mechanisms and strategy, is superior to comparison algorithms in terms of solution quality, generalization performance, and computation efficiency.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"127 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142610212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
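The task sequencing stage above selects one task per step while masking tasks already placed in the sequence, which is the core mechanic of a pointer-style decoder. The sketch below replaces the trained network with a random pairwise score matrix (a stand-in, not the paper's sequencing network) and keeps only the greedy masked selection loop.

```python
import numpy as np

rng = np.random.default_rng(3)
n_tasks = 5
# scores[i, j]: stand-in preference for observing task j right after task i
scores = rng.normal(size=(n_tasks, n_tasks))

def greedy_sequence(scores, start=0):
    seq, current = [start], start
    visited = np.zeros(len(scores), dtype=bool)
    visited[start] = True
    while not visited.all():
        # Mask already-sequenced tasks so they cannot be picked again
        step_scores = np.where(visited, -np.inf, scores[current])
        current = int(np.argmax(step_scores))
        visited[current] = True
        seq.append(current)
    return seq

sequence = greedy_sequence(scores)
```

In the actual method the per-step scores come from the pointer network and sampling replaces argmax during training; the masking guarantees the output is a valid permutation of the tasks.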
{"title":"Relieving popularity bias in recommendation via debiasing representation enhancement","authors":"Junsan Zhang, Sini Wu, Te Wang, Fengmei Ding, Jie Zhu","doi":"10.1007/s40747-024-01649-z","DOIUrl":"https://doi.org/10.1007/s40747-024-01649-z","url":null,"abstract":"<p>The interaction data used for training recommender systems often exhibit a long-tail distribution. Such highly imbalanced data distribution results in an unfair learning process among items. Contrastive learning alleviates the above issue by data augmentation. However, it lacks consideration of the significant disparity in popularity between items and may even introduce false negatives during the data augmentation, misleading user preference prediction. To address this issue, we combine contrastive learning with a weighted model for negative validation. By penalizing identified false negatives during training, we limit their potential harm within the training process. Meanwhile, to tackle the scarcity of supervision signals for unpopular items, we design Popularity Associated Modeling to mine the correlation among items. Then we guide unpopular items to learn hidden features favored by specific users from their associated popular items, which provides effective supplementary information for their representation modeling. 
Extensive experiments on three real-world datasets demonstrate that our proposed model not only outperforms state-of-the-art baselines in recommendation performance, with Recall@20 improvements of 4.2%, 2.4% and 3.6% across the datasets, but also shows significant effectiveness in relieving popularity bias.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"24 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142610216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
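The false-negative handling described above has a simple skeleton: a sampled "negative" item that the model already scores highly may be something the user actually likes, so its contribution is penalized rather than trusted. The scores, threshold, and penalty below are illustrative stand-ins, not the paper's learned weighting model.

```python
import numpy as np

# Stand-in model scores for five sampled negative items
neg_scores = np.array([0.10, 0.20, 0.95, 0.15, 0.88])

threshold = 0.8  # above this, a "negative" is suspected to be a false negative
penalty = 0.1    # weight applied to suspected false negatives
weights = np.where(neg_scores > threshold, penalty, 1.0)

# Weighted average contribution of the negatives to a contrastive-style loss
neg_loss = (weights * neg_scores).sum() / weights.sum()
```

Down-weighting the two high-scoring candidates limits the harm a mislabeled negative can do to the user-preference gradient, which is the intent of validating negatives during training.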
{"title":"Early stroke behavior detection based on improved video masked autoencoders for potential patients","authors":"Meng Wang, Guanci Yang, Kexin Luo, Yang Li, Ling He","doi":"10.1007/s40747-024-01610-0","DOIUrl":"https://doi.org/10.1007/s40747-024-01610-0","url":null,"abstract":"<p>Stroke is a prevalent cerebrovascular disease characterized by high incidence and disability rates. To enhance the early perception and detection of potential stroke patients, early stroke behavior detection based on improved Video Masked Autoencoders (VideoMAE) for potential patients (EPBR-PS) is proposed. The proposed method begins with a novel time interval-based sampling strategy, capturing video frame sequences enriched with sparse motion features. After establishing the masking mechanism for adjacent frames and pixel blocks within these sequences, EPBR-PS employs a pipeline mask strategy to extract spatiotemporal features effectively. Then, a local convolutional attention mechanism is designed to capture local dynamic feature information; central to EPBR-PS is the integration of this local convolutional attention mechanism with VideoMAE's multi-head attention mechanism. This integration facilitates the simultaneous leveraging of global high-level semantics and local dynamic feature information. A dual attention mechanism-based method is proposed to fuse these global and local features. The optimal parameters of EPBR-PS were then determined through experiments on the learning rate and the fusion weights of different features. On the NTU-ST dataset, comparative analysis with eight models demonstrated the superiority of EPBR-PS, evidenced by an average recognition accuracy of 89.61%, surpassing the benchmark VideoMAE by 1.67%. On the HMDB51 dataset, EPBR-PS achieves a Top-1 accuracy of 71.31%, 0.73% higher than that of VideoMAE, providing viable behavior detection for perceiving early signs of potential stroke in the home environment. 
This code is available at https://github.com/wang-325/EPBR-PS/.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"72 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
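The pipeline mask strategy over adjacent frames described above can be sketched as a "tube" mask: the same patch positions are hidden in every frame of the clip, so the autoencoder cannot trivially copy a masked patch from a neighboring frame. The frame count, patch count, and mask ratio below are illustrative, not EPBR-PS's exact settings.

```python
import numpy as np

rng = np.random.default_rng(5)
n_frames, n_patches, mask_ratio = 8, 196, 0.9

# Sample one set of masked patch indices, shared by all frames
n_masked = int(n_patches * mask_ratio)
masked_idx = rng.choice(n_patches, size=n_masked, replace=False)

mask = np.zeros((n_frames, n_patches), dtype=bool)
mask[:, masked_idx] = True  # identical mask replicated along the time axis
```

Replicating one spatial mask along time is what distinguishes tube-style masking from independent per-frame masking, and it forces the encoder to model motion rather than exploit temporal redundancy.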