{"title":"MSSA: Multispectral Semantic Alignment for Cross-Modality Infrared-RGB Person Reidentification","authors":"Qingshan Chen;Moyan Zhang;Zhenzhen Quan;Yumeng Zhang;Mikhail G. Mozerov;Chao Zhai;Hongjuan Li;Yujun Li","doi":"10.1109/TCSS.2024.3403691","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3403691","url":null,"abstract":"The widespread deployment of dual-camera systems has laid a solid foundation for practical applications of infrared (IR)-RGB cross-modality person reidentification (ReID). However, the inherent modality differences between RGB and IR images cause significant intra-class variances in the feature space for individuals of the same identity. Current methods typically employ various network architectures for image style transfer or for extracting modality-invariant features, yet they overlook information extraction from the most fundamental spectral semantic features. Building on existing approaches, we propose a multi-spectral semantic alignment (MSSA) architecture aimed at aligning fine-grained spectral semantic features across both intra-modality and inter-modality perspectives. Through modality center semantic alignment (MCSA) learning, we comprehensively mitigate differences in identity features of different modalities. Moreover, to attenuate the discriminative information unique to a single modality, we introduce the modality reliability intensification (MRI) loss to enhance the reliability of identity information. Finally, to tackle the challenge that inter-modality intra-class disparities surpass inter-modality inter-class differences, we leverage the dynamic discriminative center (DDC) loss to further bolster the discriminability of reliable information. 
Through extensive experiments conducted on the SYSU-MM01, RegDB, and LLCM datasets, we demonstrate the substantial advantages of the proposed MSSA over other state-of-the-art methods.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 6","pages":"7568-7583"},"PeriodicalIF":4.5,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142761530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MMPF: Multimodal Purification Fusion for Automatic Depression Detection","authors":"Biao Yang;Miaomiao Cao;Xianlin Zhu;Suhong Wang;Changchun Yang;Rongrong Ni;Xiaofeng Liu","doi":"10.1109/TCSS.2024.3411616","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3411616","url":null,"abstract":"Depression is a common mental disorder that requires objective and valid assessment tools. However, purely data-driven methods cannot satisfy the clinical diagnostic criteria for automatic depression detection (ADD), and the instability and heterogeneity of multimodal data have not been fully resolved. Therefore, we propose a novel auxiliary tool for ADD based on multimodal purification fusion (MMPF). Initially, a prior constraint gating (PCG) strategy is used to inject doctors’ constraints into depression data to guide and constrain the learning process. Then, we introduce text and audio encoders to extract unpurified features from preprocessed depression data. Afterward, multimodal purification refinement is proposed to extract unintersected common and specific features from unpurified features, generating purified features. Meanwhile, we leverage a multiperspective contrastive learning (MCL) strategy to enhance unpurified and purified features. Finally, modality interaction (MI) based on the transformer is proposed to conduct multimodal fusion. A dynamic corrective learning (DCL) strategy is introduced to tackle modality imbalances and inconsistent sentiment. 
MMPF is evaluated on the Distress Analysis Interview Corpus Wizard of Oz and performs promisingly in unimodal and multimodal depression detection, indicating its significant role in ADD.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 6","pages":"7421-7434"},"PeriodicalIF":4.5,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142761451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SIA-Net: Sparse Interactive Attention Network for Multimodal Emotion Recognition","authors":"Shuzhen Li;Tong Zhang;C. L. Philip Chen","doi":"10.1109/TCSS.2024.3409715","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3409715","url":null,"abstract":"Multimodal emotion recognition (MER) integrates multiple modalities to identify the user's emotional state, which is the core technology of natural and friendly human–computer interaction systems. Currently, many researchers have explored comprehensive multimodal information for MER, but few consider that comprehensive multimodal features may contain noisy, useless, or redundant information, which interferes with emotional feature representation. To tackle this challenge, this article proposes a sparse interactive attention network (SIA-Net) for MER. In SIA-Net, the sparse interactive attention (SIA) module mainly consists of intramodal sparsity and intermodal sparsity. The intramodal sparsity provides sparse but effective unimodal features for multimodal fusion. The intermodal sparsity adaptively sparsifies intramodal and intermodal interactive relations and encodes them into sparse interactive attention. The sparse interactive attention, with a small number of nonzero weights, then acts on multimodal features to highlight a few important features and suppress numerous redundant features. Furthermore, the intramodal sparsity and intermodal sparsity are deep sparse representations that make unimodal features and multimodal interactions sparse without complicated optimization. 
The extensive experimental results show that SIA-Net achieves superior performance on three widely used datasets.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 5","pages":"6782-6794"},"PeriodicalIF":4.5,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142368421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VLOG: Vehicle Identity Verification Based on Local and Global Behavior Analysis","authors":"Zhong Li;Yubo Kong;Jie Luo;Yifei Meng;Changjun Jiang","doi":"10.1109/TCSS.2024.3414587","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3414587","url":null,"abstract":"Internet of Vehicles (IoV) improves traffic safety and efficiency by wireless communications among vehicles and infrastructures. To ensure secure communications in IoV, the problem of vehicle identity security must be solved before deployment. In this article, we propose a quick-response behavior-based vehicle identity verification method, called VLOG, for solving identity theft in IoV. This method is based on the idea that a vehicle usually has relatively stable traveling habits/behavior. If we detect unusual behavior, the vehicle's identity may have been stolen. VLOG captures vehicles’ latent behavior models from both local and global aspects, and further merges the local and global models into a comprehensive behavior-based identity verification model. In the local part, we give a 2-D Gaussian model to fit the behavior data. In the global part, we learn vehicles’ traveling preferences under a secure multiparty computation framework while considering behavior volatility. The results of experiments based on a real-world vehicular trace dataset show the best performance of VLOG in terms of accuracy, F1 score, and cost. Meanwhile, VLOG also performs well in the area under the curve and precision-recall curve. 
Besides, since our model is prepared in advance, the verification response time is short when a vehicle needs to be verified.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 5","pages":"7032-7044"},"PeriodicalIF":4.5,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142368258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RAH! RecSys–Assistant–Human: A Human-Centered Recommendation Framework With LLM Agents","authors":"Yubo Shu;Haonan Zhang;Hansu Gu;Peng Zhang;Tun Lu;Dongsheng Li;Ning Gu","doi":"10.1109/TCSS.2024.3404039","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3404039","url":null,"abstract":"The rapid evolution of the web has led to an exponential growth in content. Recommender systems play a crucial role in human–computer interaction (HCI) by tailoring content based on individual preferences. Despite their importance, challenges persist in balancing recommendation accuracy with user satisfaction, addressing biases while preserving user privacy, and solving cold-start problems in cross-domain situations. This research argues that addressing these issues is not solely the recommender systems’ responsibility, and a human-centered approach is vital. We introduce the recommender system, assistant, and human (RAH) framework, an innovative solution with large language model (LLM)-based agents such as perceive, learn, act, critic, and reflect, emphasizing the alignment with user personalities. The framework utilizes the learn-act-critic loop and a reflection mechanism for improving user alignment. Using real-world data, our experiments demonstrate the RAH framework's efficacy in various recommendation domains, from reducing human burden to mitigating biases and enhancing user control. 
Notably, our contributions provide a human-centered recommendation framework that partners effectively with various recommendation models.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 5","pages":"6759-6770"},"PeriodicalIF":4.5,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142368278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybridized Approach for Enhanced Fake Review Detection","authors":"Shu Xu;Haoqi Cuan;Zhichao Yin;Chunyong Yin","doi":"10.1109/TCSS.2024.3411635","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3411635","url":null,"abstract":"User reviews on online consumption platforms are crucial for both consumers and merchants, serving as a reference for purchase decisions and product improvement. However, fake reviews can mislead consumers and harm merchant profits and reputation. Developing effective methods for detecting deceptive reviews is crucial to protecting the interests of both parties. In recent years, research on fake review detection has focused on improving machine learning and neural network methods to enhance the accuracy of fake review detection, neglecting the fundamental and necessary work of text feature representation for reviews. High-quality review text feature representation affects or even determines the quality and performance of fake review detection methods. The increasing prevalence of fake reviews results in a more complex distribution within the feature space of review texts, thus necessitating review embedding methods that exhibit comprehensive semantic comprehension and contextual awareness of review texts. To improve the quality of textual feature representation, we propose a review-embedding attention-based long short-term memory (A-LSTM) method that can encode the global semantics of reviews and detect deception in the review content. A-LSTM uses attention gates to discover the importance of words; analyzing this importance helps distinguish the characteristics of real and fake reviews. We also propose an attention loss function to address the problem of class imbalance. 
On the Yelp dataset, the accuracy of deceptive review detection has increased to 90.9%.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 6","pages":"7448-7466"},"PeriodicalIF":4.5,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142761531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LIHAN: A Lattice-Guided Incomplete Heterogeneous Information Network Embedding Model for Node Classification","authors":"Guangxu Mei;Ziyu Guo;Li Pan;Qian Li;Feng Li;Shijun Liu","doi":"10.1109/TCSS.2024.3405569","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3405569","url":null,"abstract":"Real-world heterogeneous information networks (HINs) are modeled as heterogeneous graphs, in which features and structures are often incomplete. Existing models employ manual imputation or dynamic adjustment to populate the incomplete data. However, there are some limitations in incomplete heterogeneous graph representation learning: 1) using populated data may lose content and high-level interaction information of HINs, and even lead to negative impacts on the performance of downstream tasks; and 2) existing models fail to utilize the high-order heterogeneous structures in original incomplete network data. To resolve the above issues, in this article, we propose a lattice-based incomplete heterogeneous structural attention network (LIHAN) for learning incomplete heterogeneous node embeddings. LIHAN first constructs a characteristic lattice and a structure lattice by mining characteristic sets and structure sets according to the partial order relations among them. Then, an improved lattice-based heterogeneous dual-attention mechanism is used to learn the heterogeneous node representations. Extensive node classification experiments are conducted on five open datasets to verify the superior performance of the proposed LIHAN model over the state-of-the-art models. Experimental results illustrate that LIHAN outperforms other methods on the micro-F1 and macro-F1 in node classification tasks. 
Moreover, experiments on different levels of lattices and the parameter sensitivity analysis show great stability throughout the experiments.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 6","pages":"7411-7420"},"PeriodicalIF":4.5,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142761447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decomposing Neuroanatomical Heterogeneity of Autism Spectrum Disorder Across Different Developmental Stages Using Morphological Multiplex Network Model","authors":"Xiang Fu;Ying Wang;Jialong Li;Hongmin Cai;Xinyan Zhang;Zhijun Yao;Minqiang Yang;Weihao Zheng","doi":"10.1109/TCSS.2024.3411113","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3411113","url":null,"abstract":"Autism spectrum disorder (ASD) is accompanied by impaired social cognition and behavior. The expense of supporting patients with ASD turns into a significant problem for society. Parsing neurobiological subtypes is a crucial way for delineating the heterogeneity in autistic brains, with significant implications for improving ASD diagnosis and promoting the development of personalized intervention models. Nevertheless, a comprehensive understanding of the heterogeneity in cortical morphology of ASD is still lacking, and the question of whether neuroanatomical subtypes remain stable during cortical development remains unclear. Here, we used T1-weighted images of 515 male patients with ASD, including 216 autistic children (6–11 years), 187 adolescents (12–17 years), and 112 young adults (18–29 years), along with 595 age and gender-matched typically developing (TD) individuals. Cortical thickness (CT), surface area (SA), and volumes of cortical (CV) and subcortical (SV) regions were extracted. A single network layer was established by calculating the covariance of each feature across brain regions between participants, thereby constructing a multilayer intersubject covariance network. Applying a community detection algorithm to multilayer networks derived from different feature combinations, we observed that the network comprising CT and CV layers exhibited the most prominent modular organization, resulting in three subtypes of ASD for each of the three age groups. Subtypes within the corresponding age group significantly differed in terms of brain morphology and clinical scales. 
Furthermore, the subtypes of children with ASD underwent reorganization with development, transitioning from childhood to adolescence and adulthood, rather than persisting consistently. Additionally, subtype categorization largely improved the diagnostic accuracy of ASD compared to diagnosing the entire ASD cohort. These findings demonstrated distinct neuroanatomical manifestations of ASD subtypes across various developmental periods, highlighting the significance of age-related subtyping in facilitating the etiology and diagnosis of ASD.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 5","pages":"6557-6567"},"PeriodicalIF":4.5,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142368442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Iterative Role Negotiation via the Bilevel GRA++ With Decision Tolerance","authors":"Qian Jiang;Dongning Liu;Haibin Zhu;Shijue Wu;Naiqi Wu;Xin Luo;Yan Qiao","doi":"10.1109/TCSS.2024.3409893","DOIUrl":"https://doi.org/10.1109/TCSS.2024.3409893","url":null,"abstract":"Role negotiation (RN) is situated at the initial stage of the role-based collaboration (RBC) methodology and is independent of the subsequent agent evaluation and role assignment (RA) processes. RN determines the roles and the resource requirements for each role. In existing RBC-related research, RN is assumed to be static. This means that the roles and the resource requirements for each role are predetermined by decision-makers. However, the resources allocated to each role can vary. In that case, iterative RN outcomes lead to different RA results. There may not be a direct dominance relationship between different RA outcomes, especially when solving group role assignment (GRA) with multiple objectives (GRA++) problems, which makes the problem even more complex. To address these concerns, we introduce the original bilevel GRA++ (BGRA++) model. Specifically, at the lower level of BGRA++, a strategy is designed for quantifying iterative RNs. For the upper level, we introduce the novel GRA-NSGA-II algorithm for the RA process. We then introduce the concept of decision tolerance to assist decision-makers in selecting the optimal solution from the multiple RNs. Last, simulation experiments are conducted to verify the robustness and practicability of the proposed method. 
Comparisons and discussions show that the proposed solution is highly competitive for solving the GRA++ problem with iterative RN.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"11 6","pages":"7484-7499"},"PeriodicalIF":4.5,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142761448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}