Multimodal multimedia information retrieval through the integration of fuzzy clustering, OWA-based fusion, and Siamese neural networks

Impact factor: 3.2 | Zone 1 (Mathematics) | JCR Q2: Computer Science, Theory & Methods
Saeid Sattari, Sinan Kalkan, Adnan Yazici
Journal: Fuzzy Sets and Systems, vol. 515, Article 109419
Publication type: Journal Article
Published: 2025-04-18
DOI: 10.1016/j.fss.2025.109419
URL: https://www.sciencedirect.com/science/article/pii/S0165011425001587
Citations: 0

Abstract

This paper presents an end-to-end, scalable, and flexible framework for multimodal multimedia information retrieval (MMIR). This framework is designed to handle multiple data modalities, such as visual, audio, and text, frequently encountered in real-world applications. By integrating these different data types, this framework facilitates a more holistic understanding of information, thus improving the accuracy and reliability of retrieval tasks. One of the strengths of this framework is its ability to learn semantic relationships within and between modalities through advanced deep neural networks. These networks are trained on query-hit pairs generated from query logs. A major innovation of this approach lies in the efficient handling of multimodal data uncertainty through an improved fuzzy clustering technique. Additionally, the search process is refined through the use of triplet-loss Siamese networks for sophisticated reranking, as well as a novel fusion approach using the ordered weighted average (OWA) operator to combine the ranks of different retrieval systems. This framework leverages parallel processing and transfer learning for efficient feature extraction across different modalities, thus significantly improving scalability and adaptability. Performance has been rigorously evaluated through comprehensive testing on six widely recognized multimodal datasets. The results indicate that this integrated approach, which combines clustering ranking, triplet loss Siamese network for reranking, OWA-based fusion, and the alternative adaptive fuzzy means method (AAFCM) for soft clustering, consistently outperforms all previous configurations reported in the literature. Our experimental results, supported by extensive statistical analysis, confirm the effectiveness and robustness of this approach in MMIR.
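The abstract mentions combining the ranks of different retrieval systems with the ordered weighted average (OWA) operator but gives no formula. The following is a minimal sketch of generic OWA rank fusion, not the authors' implementation. The function names `owa` and `fuse_rankings`, the linear rank-to-score conversion, the zero score for items a system did not return, and the example weights are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: the i-th weight multiplies the i-th largest value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Weights attach to positions in the sorted order, not to specific sources.
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def fuse_rankings(rankings: List[List[str]], weights: Sequence[float]) -> List[str]:
    """Fuse several best-first rankings into one list (hypothetical helper).

    Assumption: a rank is turned into a score in (0, 1] by linear scaling,
    and an item missing from a ranking scores 0 for that system.
    """
    items = sorted({item for ranking in rankings for item in ranking})
    n = len(items)
    fused: Dict[str, float] = {}
    for item in items:
        scores = [
            (n - ranking.index(item)) / n if item in ranking else 0.0
            for ranking in rankings
        ]
        fused[item] = owa(scores, weights)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(items, key=lambda item: (-fused[item], item))


if __name__ == "__main__":
    systems = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
    print(fuse_rankings(systems, [0.5, 0.3, 0.2]))  # ['b', 'a', 'c']
```

With weights `[1, 0, 0]` the operator reduces to the maximum of the per-system scores, and with `[0, 0, 1]` to the minimum, so the weight vector controls how optimistic the fusion is.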
Source journal
Fuzzy Sets and Systems (Mathematics; Computer Science, Theory & Methods)
CiteScore: 6.50
Self-citation rate: 17.90%
Articles published: 321
Review time: 6.1 months
Journal description: Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies. In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.