Title: MLFormer: Unleashing Efficiency Without Attention for Multimodal Knowledge Graph Embedding
Authors: Meng Wang; Changyu Li; Feiyu Chen; Jie Shao; Ke Qin; Shuang Liang
DOI: 10.1109/TCSS.2025.3620089
Journal: IEEE Transactions on Computational Social Systems, vol. 12, no. 6, pp. 5536-5549
Publication date: 2025-11-03
URL: https://ieeexplore.ieee.org/document/11224713/
Citation count: 0
Abstract
Multimodal knowledge graphs (MMKGs) have gained widespread adoption across various domains. However, existing transformer-based methods for MMKG representation learning primarily focus on enhancing representation performance, while overlooking time and memory costs, which reduces model efficiency. To tackle these limitations, we introduce a multimodal lightweight transformer (MLFormer) model, which not only ensures robust representation capabilities but also considerably improves computational efficiency. We find that the self-attention mechanism in transformers leads to substantial performance overheads. As a result, we optimize the traditional MMKGE model in two aspects: modality processing and modality fusion, by incorporating a filter gate and Fourier transform. Our experimental results on real-world multimodal knowledge graph completion datasets demonstrate that MLFormer achieves significant improvements in computational efficiency while maintaining competitive performance.
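The abstract's core idea is replacing quadratic-cost self-attention with a Fourier transform for token mixing, combined with a filter gate. The paper's exact MLFormer architecture is not given here, so the sketch below is only a minimal illustration of the general technique (an FNet/GFNet-style mixing layer): a 2-D FFT mixes information across the sequence and hidden dimensions in O(n log n) instead of attention's O(n²), and a hypothetical elementwise gate filters the frequency components. All function and variable names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fourier_mixing(x, gate=None):
    """Attention-free token mixing (illustrative, FNet/GFNet-style).

    x    : (seq_len, hidden_dim) array of token embeddings.
    gate : optional (seq_len, hidden_dim) multiplicative filter applied
           in the frequency domain -- a stand-in for a learned filter gate.
    Returns the real part of the inverse transform, same shape as x.
    """
    freq = np.fft.fft2(x)          # mix across sequence and hidden dims
    if gate is not None:
        freq = freq * gate         # hypothetical filter gate on frequencies
    return np.fft.ifft2(freq).real

rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 16))       # (seq_len, hidden_dim)
gate = rng.uniform(0.0, 1.0, size=(8, 16))  # toy gate; learned in practice
mixed = fourier_mixing(tokens, gate)
print(mixed.shape)  # (8, 16)
```

With `gate=None` and no frequency filtering, the round-trip FFT/inverse-FFT recovers the input exactly; the efficiency claim comes from doing mixing in O(n log n) rather than computing an n×n attention matrix.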
About the journal:
IEEE Transactions on Computational Social Systems focuses on topics such as the modeling, simulation, analysis, and understanding of social systems from a quantitative and/or computational perspective. "Systems" include man-man, man-machine, and machine-machine organizations and adversarial situations, as well as social media structures and their dynamics. More specifically, the transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, computational behavior modeling, and their applications.