Minkang Chai, Lu Wei, Zheng Qian, Ran Zhang, Ye Zhu, Baoqing Zhou
Engineering Applications of Artificial Intelligence, Volume 147, Article 110379. Published 2025-02-26. DOI: 10.1016/j.engappai.2025.110379. Impact factor: 8.0 (JCR Q1, Automation & Control Systems). Source: https://www.sciencedirect.com/science/article/pii/S0952197625003793
Food multi-factor decoupling recognition based on progressive spatial-frequency attention distillation learning
With the widespread application of image recognition technology in daily life, food image recognition faces challenges such as diverse categories and complex forms. Particularly when dealing with subtle differences between similar food items, imbalanced categories, feature ambiguities, and classification confusion caused by the coupling of multiple factors in food representation, existing models still have room for improvement in their recognition accuracy and generalization ability. Therefore, constructing a recognition model that can precisely differentiate food categories while effectively addressing the complexities of coupled factors has become a key issue in this field. In response to these challenges, we propose the innovative Progressive Spatial-Frequency Distillation Network (PSFDNet). By utilizing a unique multidimensional progressive learning strategy combined with an adaptive spatial-frequency attention mechanism, the model significantly enhances its feature extraction and discrimination capabilities within complex food structures. Additionally, we introduce the food correlation evaluation loss to decouple the mutual interference among food features effectively, thereby improving the accuracy and robustness of food image recognition. Extensive experiments verified the outstanding performance of PSFDNet across datasets, demonstrating a notable increase of 0.87% in the Top-1 recognition accuracy and a 50% increase in inference speed. Particularly in recognizing food images characterized by highly coupled features and extremely imbalanced categories, PSFDNet exhibited significant performance advantages over other methods.
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.