Zaiyu Pan;Shuangtian Jiang;Xiao Yang;Hai Yuan;Jun Wang
{"title":"为缺失模态的多模态生物识别生成分层跨模态图像","authors":"Zaiyu Pan;Shuangtian Jiang;Xiao Yang;Hai Yuan;Jun Wang","doi":"10.1109/TIFS.2025.3559802","DOIUrl":null,"url":null,"abstract":"Multimodal biometric recognition has shown great potential in identity authentication tasks and has attracted increasing interest recently. Currently, most existing multimodal biometric recognition algorithms require test samples with complete multimodal data. However, it often encounters the problem of missing modality data and thus suffers severe performance degradation in practical scenarios. To this end, we proposed a hierarchical cross-modal image generation for palmprint and palmvein based multimodal biometric recognition with missing modality. First, a hierarchical cross-modal image generation model is designed to achieve the pixel alignment of different modalities and reconstruct the image information of missing modality. Specifically, a cross-modal texture transfer network is utilized to implement the texture style transformation between different modalities, and then a cross-modal structure generation network is proposed to establish the correlation mapping of structural information between different modalities. Second, multimodal dynamic sparse feature fusion model is presented to obtain more discriminative and reliable representations, which can also enhance the robustness of our proposed model to dynamic changes in image quality of different modalities. The proposed model is evaluated on three multimodal biometric benchmark datasets, and experimental results demonstrate that our proposed model outperforms recent mainstream incomplete multimodal learning models.","PeriodicalId":13492,"journal":{"name":"IEEE Transactions on Information Forensics and Security","volume":"20 ","pages":"4308-4321"},"PeriodicalIF":6.3000,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hierarchical Cross-Modal Image Generation for Multimodal Biometric Recognition With Missing Modality\",\"authors\":\"Zaiyu Pan;Shuangtian Jiang;Xiao Yang;Hai Yuan;Jun Wang\",\"doi\":\"10.1109/TIFS.2025.3559802\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multimodal biometric recognition has shown great potential in identity authentication tasks and has attracted increasing interest recently. Currently, most existing multimodal biometric recognition algorithms require test samples with complete multimodal data. However, it often encounters the problem of missing modality data and thus suffers severe performance degradation in practical scenarios. To this end, we proposed a hierarchical cross-modal image generation for palmprint and palmvein based multimodal biometric recognition with missing modality. First, a hierarchical cross-modal image generation model is designed to achieve the pixel alignment of different modalities and reconstruct the image information of missing modality. Specifically, a cross-modal texture transfer network is utilized to implement the texture style transformation between different modalities, and then a cross-modal structure generation network is proposed to establish the correlation mapping of structural information between different modalities. Second, multimodal dynamic sparse feature fusion model is presented to obtain more discriminative and reliable representations, which can also enhance the robustness of our proposed model to dynamic changes in image quality of different modalities. 
The proposed model is evaluated on three multimodal biometric benchmark datasets, and experimental results demonstrate that our proposed model outperforms recent mainstream incomplete multimodal learning models.\",\"PeriodicalId\":13492,\"journal\":{\"name\":\"IEEE Transactions on Information Forensics and Security\",\"volume\":\"20 \",\"pages\":\"4308-4321\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2025-04-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Information Forensics and Security\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10962253/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Forensics and Security","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10962253/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Hierarchical Cross-Modal Image Generation for Multimodal Biometric Recognition With Missing Modality
Multimodal biometric recognition has shown great potential in identity authentication tasks and has attracted increasing interest in recent years. Most existing multimodal biometric recognition algorithms require test samples with complete multimodal data; in practical scenarios, however, modality data are often missing, which causes severe performance degradation. To this end, we propose a hierarchical cross-modal image generation framework for palmprint- and palmvein-based multimodal biometric recognition with missing modality. First, a hierarchical cross-modal image generation model is designed to achieve pixel alignment between modalities and to reconstruct the image information of the missing modality. Specifically, a cross-modal texture transfer network performs the texture style transformation between modalities, and a cross-modal structure generation network then establishes the correlation mapping of structural information between modalities. Second, a multimodal dynamic sparse feature fusion model is presented to obtain more discriminative and reliable representations and to enhance the robustness of the proposed model to dynamic changes in the image quality of different modalities. The proposed model is evaluated on three multimodal biometric benchmark datasets, and the experimental results demonstrate that it outperforms recent mainstream incomplete multimodal learning models.
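The abstract outlines a two-stage architecture: a generator that reconstructs the missing modality from the observed one (texture transfer followed by structure generation), and a dynamic sparse fusion of the per-modality features. Below is a minimal conceptual sketch in PyTorch of how such a pipeline could be wired together; the class names (HierarchicalCrossModalGenerator, DynamicSparseFusion), layer sizes, and the 0.1 sparsity threshold are illustrative assumptions, not the authors' implementation.

# A minimal, illustrative sketch (assumption: PyTorch) of the two-stage pipeline
# the abstract describes. All class and variable names are hypothetical and do
# not reproduce the authors' implementation.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Small conv-BN-ReLU block shared by the toy sub-networks below."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)


class HierarchicalCrossModalGenerator(nn.Module):
    """Stage 1: reconstruct the missing modality from the observed one,
    first transferring texture style and then refining structure."""
    def __init__(self):
        super().__init__()
        self.texture_transfer = nn.Sequential(
            ConvBlock(1, 32), ConvBlock(32, 32), nn.Conv2d(32, 1, 3, padding=1))
        self.structure_gen = nn.Sequential(
            ConvBlock(2, 32), ConvBlock(32, 32), nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, observed):
        coarse = self.texture_transfer(observed)                 # texture-style translation
        refined = self.structure_gen(
            torch.cat([observed, coarse], dim=1))                # structure-aware refinement
        return refined


class DynamicSparseFusion(nn.Module):
    """Stage 2: fuse per-modality embeddings with input-dependent, sparsified
    weights, a stand-in for the dynamic sparse feature fusion in the abstract."""
    def __init__(self, dim=128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))

    def forward(self, feat_a, feat_b):
        w = self.gate(torch.cat([feat_a, feat_b], dim=-1))       # per-sample modality weights
        w = torch.where(w > 0.1, w, torch.zeros_like(w))         # zero out low-confidence weights
        w = w / w.sum(dim=-1, keepdim=True).clamp_min(1e-6)      # renormalise remaining weights
        return w[:, :1] * feat_a + w[:, 1:] * feat_b


if __name__ == "__main__":
    palmprint = torch.randn(4, 1, 128, 128)                      # observed modality (toy batch)
    palmvein_hat = HierarchicalCrossModalGenerator()(palmprint)  # reconstructed missing modality
    embed = nn.Sequential(nn.Flatten(), nn.Linear(128 * 128, 128))
    fused = DynamicSparseFusion(dim=128)(embed(palmprint), embed(palmvein_hat))
    print(palmvein_hat.shape, fused.shape)                       # (4, 1, 128, 128), (4, 128)

In this toy version the gate simply thresholds softmax weights to approximate sparsity; the fusion model described in the paper is more elaborate.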
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, and surveillance, as well as systems applications that incorporate these features.