Cross-modal multi-label image classification modeling and recognition based on nonlinear

Impact factor: 2.4 · JCR Q2 (Engineering, Mechanical)
Shuping Yuan, Yang Chen, Cheng Ye, Mohammed Wasim Bhatt, Mhalasakant Saradeshmukh, Md. Shamim Hossain
Journal: Nonlinear Engineering - Modeling and Application
Published: 2023-01-01
DOI: 10.1515/nleng-2022-0194 (https://doi.org/10.1515/nleng-2022-0194)
Citations: 1

Abstract

In multi-label image recognition, predicting the labels that co-occur in an image has recently become a popular strategy. Previous work has concentrated on capturing label correlations but has neglected to fuse image features and label embeddings properly, which substantially affects the model's convergence efficiency and limits further gains in recognition accuracy. To better classify labeled training samples of the corresponding categories in image classification, a cross-modal multi-label image classification modeling and recognition method based on nonlinear is proposed. Multi-label classification models based on deep convolutional neural networks are constructed separately for the visual and text modalities. The visual classification model uses natural images and simple single-label biomedical images to perform heterogeneous and homogeneous transfer learning, capturing general-domain features as well as features specific to the biomedical domain, while the text classification model uses the descriptive text of simple biomedical images to perform homogeneous transfer learning. Experimental results show that the multi-label classification model combining the two modalities achieves a Hamming loss close to the best result reported for the evaluation task, and that the macro-averaged F1 score rises from 0.20 to 0.488, reported as an improvement of about 52.5%. The cross-modal multi-label image classification algorithm better alleviates overfitting in most classes and shows better cross-modal retrieval performance. In addition, the effectiveness and rationality of the two cross-modal mapping techniques are verified.
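The abstract reports results in terms of Hamming loss and macro-averaged F1. As a reading aid, the sketch below shows how these two metrics are commonly computed for multi-label predictions. It is not the authors' evaluation code: the function names, the toy label matrices, and the convention of scoring a label with no positives as F1 = 0 are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[int]]  # rows = samples, columns = labels, entries 0 or 1


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of (sample, label) slots where prediction and truth differ."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def macro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no true or predicted positives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Hypothetical toy data: 4 images, 3 candidate labels.
Y_TRUE = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
Y_PRED = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

if __name__ == "__main__":
    print(f"Hamming loss: {hamming_loss(Y_TRUE, Y_PRED):.3f}")
    print(f"Macro F1:     {macro_f1(Y_TRUE, Y_PRED):.3f}")
```

Because macro averaging weights every label equally, rare labels influence the score as much as frequent ones, so the two metrics can move quite differently on imbalanced label sets.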
Source journal
CiteScore: 6.20
Self-citation rate: 3.60%
Articles published: 49
Review time: 44 weeks
About the journal: The Journal of Nonlinear Engineering aims to be a platform for sharing original research on theoretical, experimental, practical, and applied nonlinear phenomena in engineering. It serves as a forum for exchanging ideas on, and applications of, nonlinear problems across engineering disciplines. Articles are considered for publication if they explore nonlinearities in engineering systems, for example by offering realistic mathematical modeling, using nonlinearity in new designs, stabilizing systems, explaining system behavior through nonlinearity, optimizing systems based on nonlinear interactions, or developing algorithms that exploit nonlinear elements.