Authors: Vijay John, Yasutomo Kawanishi
Venue: 2023 18th International Conference on Machine Vision and Applications (MVA)
Published: 2023-07-23
DOI: 10.23919/MVA57639.2023.10215818 (https://doi.org/10.23919/MVA57639.2023.10215818)
Combining Knowledge Distillation and Transfer Learning for Sensor Fusion in Visible and Thermal Camera-based Person Classification
Visible and thermal camera-based sensor fusion has been shown to address the limitations of visible camera-based person classification and to make it more robust. In this paper, we propose to further improve the accuracy of visible-thermal person classification using transfer learning, knowledge distillation, and the vision transformer. The visible-thermal person classifier is implemented as a vision transformer and trained with transfer learning and knowledge distillation. To train it, visible and thermal teacher models are also implemented as vision transformers. The multimodal classifier learns from the two teachers through a novel loss function that incorporates knowledge distillation. The proposed method is validated on the public Speaking Faces dataset. A comparative analysis with baseline algorithms and an ablation study are performed. The results show that the proposed framework improves classification accuracy.
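
The abstract does not give the form of the loss, so the following is only a rough sketch of how a student classifier could learn from a visible teacher and a thermal teacher at once. It assumes a generic temperature-scaled distillation loss (a hard-label cross-entropy term plus a KL term per teacher), which may differ from the loss proposed in the paper. The function names and the `temperature` and `alpha` parameters are hypothetical and chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Hard-label cross-entropy for one sample."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def two_teacher_distillation_loss(student_logits, visible_teacher_logits,
                                  thermal_teacher_logits, label,
                                  temperature=2.0, alpha=0.5):
    """Assumed loss: (1 - alpha) * CE + alpha * mean KL to the two teachers.

    This is a generic distillation formulation, not the paper's loss.
    """
    hard_term = cross_entropy(student_logits, label)
    student_soft = softmax(student_logits, temperature)
    soft_term = 0.0
    for teacher_logits in (visible_teacher_logits, thermal_teacher_logits):
        teacher_soft = softmax(teacher_logits, temperature)
        soft_term += kl_divergence(teacher_soft, student_soft)
    # Average over the two teachers; T^2 keeps gradient scale comparable.
    soft_term *= temperature ** 2 / 2.0
    return (1.0 - alpha) * hard_term + alpha * soft_term


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]
    visible_teacher = [2.5, 0.0, -1.5]
    thermal_teacher = [1.5, 1.0, -0.5]
    print(two_teacher_distillation_loss(student, visible_teacher,
                                        thermal_teacher, label=0))
```

In this sketch each teacher contributes equally to the soft term. Weighting the visible and thermal teachers differently would be a natural variation, but the abstract gives no indication of whether the paper does so.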