Information Fusion — Latest Articles

Span-based syntactic feature fusion for aspect sentiment triplet extraction
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-14 · DOI: 10.1016/j.inffus.2025.103078
Guangtao Xu, Zhihao Yang, Bo Xu, Ling Luo, Hongfei Lin
Abstract: Aspect sentiment triplet extraction (ASTE) is a particularly challenging subtask of aspect-based sentiment analysis, and span-based methods are currently among the mainstream solutions. However, existing span-based methods focus only on semantic information and neglect syntactic information, which has proven effective in aspect-based sentiment classification. In this work, we combine syntactic information with the span-based method according to task characteristics and propose a span-based syntactic feature fusion (SSFF) model for ASTE. First, we introduce part-of-speech information to assist span category prediction. Second, we introduce dependency distance information to assist sentiment polarity prediction. Introducing this syntactic information clearly separates the learning objectives of the two stages of the span-based method, which effectively improves its performance. Experiments on the widely used public ASTE-V2 dataset demonstrate that SSFF significantly improves the performance of the span-based method and outperforms all baseline models, achieving new state-of-the-art results.
Vol. 120, Article 103078 · Citations: 0
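The dependency distance that SSFF feeds into sentiment prediction can be read as the shortest path between two tokens in the sentence's dependency tree. A minimal sketch, assuming a head-array tree representation (the function name and representation are ours, not the paper's):

```python
from collections import defaultdict, deque

def dependency_distance(heads, i, j):
    """Shortest-path distance between tokens i and j in a dependency tree.

    heads[k] is the index of token k's head, with -1 marking the root.
    """
    # Build an undirected adjacency list from the head array.
    graph = defaultdict(list)
    for child, head in enumerate(heads):
        if head >= 0:
            graph[child].append(head)
            graph[head].append(child)
    # Breadth-first search from i until j is reached.
    seen, queue = {i}, deque([(i, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == j:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return -1  # unreachable (malformed tree)
```

For a chain 0 → 1 ← 2 ← 3 (token 1 is the root), tokens 0 and 3 are three arcs apart, a signal that an opinion word far from its aspect in the tree is a weaker sentiment cue.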
Behavior-Pred: A semantic-enhanced trajectory pre-training framework for motion forecasting
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-13 · DOI: 10.1016/j.inffus.2025.103086
Jianxin Shi, Jinhao Chen, Yuandong Wang, Tao Feng, Zhen Yang, Tianyu Wo
Abstract: Predicting the future movements of dynamic traffic agents is crucial for autonomous systems, and effectively understanding their behavioral patterns is key to accurate prediction. Inspired by the success of the pre-training and fine-tuning paradigm in artificial intelligence, we develop Behavior-Pred, a semantic-enhanced trajectory pre-training framework for motion forecasting in the autonomous driving domain. We design two kinds of pre-training tasks, fine-grained reconstruction and coarse-grained contrastive learning, to learn better representations of both historical and future behaviors as well as their pattern consistency. For fine-grained reconstruction, we use a timestep-level masking strategy along the time dimension, which preserves historical and future patterns better than agent-based masking. For coarse-grained contrastive learning, we design a similarity-based loss function to capture the consistency between historical and future patterns. Overall, Behavior-Pred learns more comprehensive behavioral semantics via these multi-granularity pre-training tasks. Experimental results demonstrate that our framework outperforms various baselines.
Vol. 120, Article 103086 · Citations: 0
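Timestep-level masking, as opposed to masking an agent's entire trajectory, hides individual time slices of a (T, D) trajectory tensor so both past and future patterns survive. A small sketch under assumed shapes (zeroing masked steps is our simplification; a learned mask token is also common):

```python
import numpy as np

def timestep_mask(traj, mask_ratio=0.4, rng=None):
    """Mask whole timesteps of a trajectory tensor of shape (T, D).

    Returns the masked copy and the boolean mask that was applied,
    illustrating time-dimensional (timestep-level) masking.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    T = traj.shape[0]
    n_mask = int(T * mask_ratio)
    idx = rng.choice(T, size=n_mask, replace=False)
    mask = np.zeros(T, dtype=bool)
    mask[idx] = True
    masked = traj.copy()
    masked[mask] = 0.0  # zero out the selected timesteps
    return masked, mask
```

A reconstruction objective would then train the encoder to recover `traj[mask]` from the surviving timesteps.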
DUET: Dually guided knowledge distillation from explicit feedback
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-13 · DOI: 10.1016/j.inffus.2025.103098
Hong-Kyun Bae, Jiyeon Kim, Jongwuk Lee, Sang-Wook Kim
Abstract: Various knowledge distillation (KD) methods for recommender systems have recently been introduced to achieve two goals: (i) inference faster than the cumbersome model (the teacher) and (ii) accuracy higher than the compact model (the student). Despite their success, they focus solely on KD with implicit feedback. We argue that handling collaborative filtering with explicit feedback, which represents different degrees of user preference, is also crucial. Toward this goal, we propose DUET (Dually gUided knowlEdge disTillation), a novel KD framework for recommender systems. We first observe that explicit feedback can be interpreted as two types of user preference: pre-use preference and post-use preference. Motivated by this characteristic, we fuse knowledge from the teacher's pre- and post-use preferences by employing two teachers. Teacher #1, trained with pre-use preferences, selects some items among the unrated ones. Teacher #2, trained with post-use preferences, determines the soft labels (predicted post-use preferences) of the items chosen by teacher #1. Finally, the student is trained with both the hard labels (observed post-use preferences) of rated items and the soft labels of the items selected by teacher #1. Extensive experiments show that DUET consistently outperforms state-of-the-art KD methods on three benchmark datasets, beating RD, CD, DE-RRD, BD, and TD by up to 13.6%, 18.6%, 16.8%, 9.6%, and 18.6% in NDCG@10, respectively.
Vol. 120, Article 103098 · Citations: 0
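The dual-teacher labeling step can be sketched as follows: rated items keep their observed (hard) ratings, while among the unrated items, the top-k scored by teacher #1's pre-use model receive teacher #2's predicted (soft) post-use ratings. All names and array shapes here are illustrative assumptions, not the paper's API; `np.nan` marks unrated items.

```python
import numpy as np

def duet_student_targets(ratings, teacher1_scores, teacher2_preds, top_k=2):
    """Build one user's student training targets from explicit feedback.

    ratings         : observed ratings, np.nan where unrated (hard labels)
    teacher1_scores : teacher #1's pre-use preference scores per item
    teacher2_preds  : teacher #2's predicted post-use ratings (soft labels)
    """
    targets = ratings.copy()
    unrated = np.where(np.isnan(ratings))[0]
    # Teacher #1 selects the most promising unrated items...
    chosen = unrated[np.argsort(-teacher1_scores[unrated])[:top_k]]
    # ...and teacher #2 supplies their soft labels.
    targets[chosen] = teacher2_preds[chosen]
    return targets
```

Items neither rated nor selected remain `np.nan` and would simply be excluded from the student's loss.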
D2Fusion: Dual-domain fusion with feature superposition for Deepfake detection
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-13 · DOI: 10.1016/j.inffus.2025.103087
Xueqi Qiu, Xingyu Miao, Fan Wan, Haoran Duan, Tejal Shah, Varun Ojha, Yang Long, Rajiv Ranjan
Abstract: Deepfake detection is crucial for curbing the harm Deepfakes cause to society. However, current detection methods fail to thoroughly explore artifact information across different domains due to insufficient intrinsic interactions, i.e., the fusion and coordination of features extracted from different domains, which are crucial for recognizing complex forgery clues. Aiming at more generalized Deepfake detection, we introduce a novel bi-directional attention module that captures the local positional information of artifact clues in the spatial domain, enabling accurate artifact localization and addressing the coarse processing of artifact features. Because this module may not capture global, subtle forgery information in the artifact features (e.g., textures or edges), we further employ a fine-grained frequency attention module in the frequency domain, obtaining high-frequency fine-grained features that carry this global and subtle forgery information. Although the features from these domains can be improved effectively and independently, fusing them directly does not improve detection performance. We therefore propose a feature superposition strategy that complements information from the spatial and frequency domains: feature components are cast as wave-like tokens that are updated based on their phase, amplifying the distinctions between authentic and artifact features. Our method demonstrates significant improvements over state-of-the-art (SOTA) methods on five public Deepfake datasets in capturing abnormalities across different manipulation operations and real-life scenarios. In intra-dataset evaluations, D2Fusion surpasses the baseline accuracy by nearly 2.5%; in cross-manipulation evaluations, it exceeds the baseline AUC by up to 6.15%; in multi-source manipulation evaluations, it exceeds SOTA methods by up to 14.62% in precision (P), 10.26% in F1-score, and 15.13% in recall (R); and in cross-dataset experiments, it exceeds the baseline AUC by up to 6.25%. In practice, D2Fusion can help improve content moderation on social media and aid forensic investigations by accurately identifying tampered content.
Vol. 120, Article 103087 · Citations: 0
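The wave intuition behind feature superposition can be illustrated with a toy: treat each feature entry as a wave with amplitude |x| and a phase (0 for positive values, pi for negative), then sum the spatial and frequency waves. In-phase components reinforce each other and out-of-phase components cancel. This is our illustrative reading of the phase idea, not the paper's formulation:

```python
import numpy as np

def superpose(spatial_feat, freq_feat):
    """Toy phase-based superposition of two real-valued feature vectors."""
    def wave(x):
        # Amplitude |x|, phase 0 for x >= 0 and pi otherwise.
        phase = np.where(x >= 0, 0.0, np.pi)
        return np.abs(x) * np.exp(1j * phase)
    # Sum the two waves and read out the resulting amplitude.
    return np.abs(wave(spatial_feat) + wave(freq_feat))
```

Agreeing entries (1.0 and 1.0) superpose to 2.0, while conflicting ones (1.0 and -1.0) cancel to 0.0, which is the amplification/cancellation behavior the abstract describes.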
From failure to fusion: A survey on learning from bad machine learning models
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-13 · DOI: 10.1016/j.inffus.2025.103122
M.Z. Naser
Abstract: Machine learning (ML) models are ubiquitous across diverse applications; however, only a fraction achieve optimal performance, often leading to the deployment of a single model while the others are dismissed as experimental failures. This paper challenges that commonly accepted practice by systematically investigating the utility of suboptimal ML models. We posit that these models encapsulate valuable information about data biases, architectural limitations, and systemic misalignments, which can be leveraged to enhance overall system performance. Central to our approach is the integration of information fusion techniques, which combine heterogeneous data sources to robustly analyze and contextualize the errors and biases present in underperforming models. Our methodology includes advanced negative knowledge distillation as well as error-based curriculum learning frameworks derived from multiple data modalities. We propose a comprehensive debugging framework that uses meta-learning for failure detection and correction, enabling continuous improvement through rigorous cross-validation and iterative refinement. This study stresses the importance of documenting negative outcomes to promote transparency and to foster the interdisciplinary collaboration needed to build resilient and generalizable ML systems, particularly in information fusion. We advocate a paradigm shift in the ML community and urge researchers and institutions to systematically harness the insights derived from so-called "failed" models. We conclude by discussing several challenges and possible pathways for future research.
Vol. 120, Article 103122 · Citations: 0
Ethereum fraud detection via joint transaction language model and graph representation learning
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-13 · DOI: 10.1016/j.inffus.2025.103074
Jianguo Sun, Yifan Jia, Yanbin Wang, Ye Tian, Sheng Zhang
Abstract: Ethereum faces growing fraud threats. Current fraud detection methods, whether based on graph neural networks or sequence models, fail to consider the semantic information and similarity patterns within transactions, and they do not exploit the potential synergy of combining both types of models. To address these challenges, we propose TLMG4Eth, which combines a transaction language model with graph-based methods to capture the semantic, similarity, and structural features of Ethereum transaction data. We first propose a transaction language model that converts numerical transaction data into meaningful transaction sentences, enabling the model to learn explicit transaction semantics. We then build a transaction attribute similarity graph to learn transaction similarity information, capturing intuitive insights into transaction anomalies, and construct an account interaction graph to capture the structural information of the account transaction network. We employ a deep multi-head attention network to fuse the transaction semantic and similarity embeddings, and propose jointly training this attention network with the account interaction graph to obtain the synergistic benefits of both. Our model improves on state-of-the-art methods by 9.62% to 13.2% on two public datasets and a newly introduced dataset. Code: https://github.com/lincozz/TLmGNN
Vol. 120, Article 103074 · Citations: 0
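Turning a numeric transaction record into a "transaction sentence" can be as simple as rendering its fields through a natural-language template so a language model can exploit its pretrained semantics. The field names and template below are illustrative assumptions, not the paper's exact scheme:

```python
def transaction_to_sentence(tx):
    """Render one Ethereum transaction record as a plain-English sentence.

    tx is a dict with illustrative keys: from, to, value (ether),
    block, and gas.
    """
    return (
        f"account {tx['from']} sent {tx['value']} ether "
        f"to account {tx['to']} at block {tx['block']} "
        f"paying {tx['gas']} gas"
    )
```

A corpus of such sentences per account could then be tokenized and fed to a standard language model to obtain the semantic embeddings that the similarity graph complements.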
Accurate and automatic spatiotemporal calibration for multi-modal sensor system based on continuous-time optimization
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-13 · DOI: 10.1016/j.inffus.2025.103071
Shengyu Li, Shuolong Chen, Xingxing Li, Yuxuan Zhou, Shiwen Wang
Abstract: Intelligent robotic applications such as unmanned aerial vehicles (UAVs) and autonomous driving generally rely on multi-modal sensor fusion to reach higher levels of autonomy, and an accurate, consistent inter-sensor spatiotemporal relationship is a fundamental prerequisite for fusing heterogeneous sensor information. Nevertheless, current calibration frameworks typically require specialized tools or additional infrastructure, making them labor-intensive and applicable only to certain sensor combinations. To address this, we propose an accurate and easy-to-use spatiotemporal calibration framework tailored to today's primary sensors, including the inertial measurement unit (IMU), LiDAR, camera, and Radar; it extends seamlessly to other sensors that can independently recover ego-motion or ego-velocity, such as wheel odometry and GPS devices. A rigorous multistage initialization approach first obtains reasonable initial guesses of the spatiotemporal parameters without prior knowledge of the environment or specialized movements. Following an IMU-centric design, the spatiotemporal parameters of the other sensors relative to the IMU are then jointly optimized and refined via continuous-time batch estimation, without requiring overlapping fields of view (FoVs) among the exteroceptive sensors. A comprehensive series of experiments in both simulation and real-world scenarios demonstrates that the proposed method achieves calibration accuracy comparable to state-of-the-art target-based methods and outperforms targetless methods in consistency and repeatability.
Vol. 120, Article 103071 · Citations: 0
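Continuous-time batch estimation typically parameterizes the trajectory as a spline over uniformly spaced control points, so a sensor measurement at any timestamp can be tied to the trajectory inside one optimization. A 1-D uniform cubic B-spline evaluator sketches the idea (real calibration systems use SE(3) splines; this simplification and the function name are ours):

```python
import numpy as np

def bspline_eval(ctrl, t, dt=1.0):
    """Evaluate a uniform cubic B-spline trajectory at time t.

    ctrl holds scalar control points at uniform spacing dt; segment i
    spans [i*dt, (i+1)*dt) and is shaped by control points i..i+3.
    """
    i = int(t / dt)        # segment index
    u = t / dt - i         # normalized position within the segment
    p0, p1, p2, p3 = ctrl[i : i + 4]
    # Standard uniform cubic B-spline basis (sums to 1 for any u).
    return (
        (1 - u) ** 3 * p0
        + (3 * u**3 - 6 * u**2 + 4) * p1
        + (-3 * u**3 + 3 * u**2 + 3 * u + 1) * p2
        + u**3 * p3
    ) / 6.0
```

During calibration, the residual of a measurement taken at timestamp t (shifted by the unknown time offset) is expressed through such an evaluation, making temporal and spatial parameters jointly optimizable.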
RSRule: Relation-level semantic-driven rule learning for explainable extrapolation on temporal knowledge graphs
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-12 · DOI: 10.1016/j.inffus.2025.103080
Kai Chen, Xiaojuan Zhao, Xin Song, Ye Wang, Zhibin Dong, Feng Xie, Aiping Li, Yue Han, Changjian Li
Abstract: Explainability is crucial for extrapolation reasoning on Temporal Knowledge Graphs (TKGs): elucidating the reasoning process lets us understand and validate extrapolation results, ensuring their validity and reliability. Among extrapolation methods, rule-based approaches have significant advantages thanks to their explicit rules and explainable reasoning paths. However, current rule-based methods rely primarily on statistics during rule learning and thus depend heavily on the quantity and quality of the data; in reality, TKGs often suffer from incompleteness and strong sparsity, which severely limits their performance. To address these issues, we propose RSRule, a novel relation-level semantic-driven rule-based method for explainable extrapolation reasoning that fuses relation-level semantics into the rule learning process. Specifically, we focus on diverse contextual positional patterns within TKGs and introduce an innovative heterogeneous relation graph to learn relation-level semantics, while employing a relative time encoding to capture the periodic and non-periodic aspects of temporal evolution. By fusing semantic information into rule learning, RSRule computes rule scores that consider both statistical and semantic aspects. Extensive experiments demonstrate the promising capacity of RSRule in five respects: superiority, improvement, explainability, robustness, and generalization.
Vol. 120, Article 103080 · Citations: 0
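A relative time encoding that captures periodic structure is commonly built from sinusoids at geometrically spaced frequencies, the transformer-style recipe. The sketch below is a plausible reading of such an encoding, not necessarily RSRule's exact formulation:

```python
import numpy as np

def relative_time_encoding(dt, dim=8, base=10000.0):
    """Encode a relative time offset dt as a dim-dimensional vector.

    Half the dimensions are sines and half are cosines at frequencies
    1 / base**(k/half); low-frequency pairs vary slowly (long-range,
    non-periodic trend) while high-frequency pairs expose periodicity.
    """
    half = dim // 2
    freqs = 1.0 / (base ** (np.arange(half) / half))
    angles = dt * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])
```

Such a vector can be added to (or concatenated with) a relation embedding so that rules learn to score differently for, say, one-day versus one-year gaps between events.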
Tensor-driven face recognition: Integrating super-resolution and multilinear subspace learning for low-resolution images
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-11 · DOI: 10.1016/j.inffus.2025.103075
Sana Bellili, Abdelmalik Ouamane, Ammar Chouchane, Yassine Himeur, Shadi Atalla, Wathiq Mansoor, Salah Bourennane, Faycal Bensaali
Abstract: Low-resolution face recognition poses substantial challenges in computer vision and biometrics, especially in real-world applications such as surveillance, where images are often degraded by limited resolution, low lighting, and noise. These constraints impair facial detail and obscure distinguishing features, making high accuracy difficult for conventional recognition systems. This paper introduces a novel approach that combines super-resolution techniques with multilinear subspace learning to enhance recognition accuracy under low-resolution conditions. The proposed methodology represents multidimensional face data with high-order tensors and leverages Tensor Cross-View Quadratic Discriminant Analysis (TXQDA) as a powerful multilinear subspace learning tool. We evaluated the system in two scenarios. First, we assessed very low-resolution face images, downscaled to three spatial scales (16x16, 32x32, and 48x48), to examine how multilinear subspace learning affects recognition accuracy. Second, we applied three advanced super-resolution models (Real-ESRGAN, SRGAN, and SRResNet), exploring the effect of a 4x upscaling factor on recognition accuracy. Validated on the Labeled Faces in the Wild (LFW), CelebFaces Attributes (CelebA), and QMUL-SurvFace datasets, our method achieved best-in-class verification rates of 99.17% on LFW, 90.90% on CelebA, and 76.90% on QMUL-SurvFace, underscoring its effectiveness in challenging low-resolution conditions and its potential for biometric security applications.
Vol. 120, Article 103075 · Citations: 0
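The very low-resolution test images in such protocols are typically produced by downscaling high-resolution faces; a simple average-pooling downscaler illustrates the operation (a stand-in for whatever interpolation the paper actually uses):

```python
import numpy as np

def downscale(img, factor):
    """Downscale a grayscale image by integer factor via average pooling.

    E.g. a 48x48 face with factor 3 becomes 16x16; each output pixel is
    the mean of a factor x factor block.
    """
    h, w = img.shape[:2]
    h2, w2 = h // factor, w // factor
    img = img[: h2 * factor, : w2 * factor]  # crop to a multiple of factor
    return img.reshape(h2, factor, w2, factor).mean(axis=(1, 3))
```

Stacking such images at several scales yields the multi-scale tensor data that TXQDA then projects into a discriminative subspace.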
Cross-modal multi-relational graph reasoning: A novel model for multimodal textbook comprehension
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-03-11 · DOI: 10.1016/j.inffus.2025.103082
Lingyun Song, Wenqing Du, Xiaolin Han, Xinbiao Gan, Xiaoqi Wang, Xuequn Shang
Abstract: The ability to comprehensively understand multimodal textbook content is crucial for developing advanced intelligent tutoring systems and educational tools powered by generative AI. Earlier studies have advanced the understanding of multimodal educational content by examining static cross-modal graphs that model the relationships between visual objects and textual words; this, however, fails to account for the changes in relationship structure that characterize visual-textual relationships across different cross-modal tasks. To tackle this issue, we present the Cross-Modal Multi-Relational Graph Reasoning (CMRGR) model. It analyzes a wide range of interactions between the visual and textual components of textbooks and dynamically adapts its internal representation using contextual signals across tasks, an indispensable capability for generative AI systems aimed at educational applications. We evaluate CMRGR on three multimodal textbook datasets, demonstrating its superiority over state-of-the-art baselines in generating accurate classifications and answers.
Vol. 120, Article 103082 · Citations: 0