Multi-optimization scheme for in-situ training of memristor neural network based on contrastive learning
Authors: Feier Xiong, Yue Zhou, Xiaofang Hu, Shukai Duan
Journal: Applied Intelligence, volume 55, issue 2 (impact factor 3.5; JCR Q2, Computer Science, Artificial Intelligence)
Publication date: 2024-12-09
DOI: 10.1007/s10489-024-05957-2
URL: https://link.springer.com/article/10.1007/s10489-024-05957-2
Memristors and their crossbar structures have been widely studied and shown to be naturally suited to vector-matrix multiplication (VMM) in neural networks, making them an attractive hardware substrate for deploying models on smart edge devices. However, such systems commonly receive a large amount of useless information, and non-ideal device characteristics also degrade training accuracy and efficiency. To address these problems, we integrate contrastive learning (CL) into the in-situ training process on the memristor crossbar, improving the model's feature extraction capability and robustness. To make contrastive learning integrate better with the crossbar, we propose a multi-optimization scheme covering the network loss function, the model deployment method, and the gradient calculation process. We also propose compensation strategies for the key non-ideal characteristics that we analyzed and fitted. Test results show that, under the proposed scheme, the deployed model attains high accuracy early in training, reaching 83.18% after only 2 epochs, and quickly achieves an accuracy 3.99% higher than the average accuracy of existing algorithms, with energy consumption reduced by a factor of about 8.
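As a rough illustration of why a crossbar suits VMM, the following minimal sketch models an ideal crossbar read in NumPy. Each column current is the sum of input voltages times cell conductances (Ohm's law and Kirchhoff's current law). The differential-pair weight mapping, the function names, and the conductance limits `g_min` and `g_max` are assumptions made for illustration; they are not taken from the paper, and the non-ideal characteristics the abstract mentions are not modeled.

```python
import numpy as np

def weights_to_conductances(W, g_min=1e-6, g_max=1e-4):
    """Map a signed weight matrix onto two non-negative conductance arrays.

    Assumed scheme: each weight is stored as a differential pair
    (G_pos - G_neg). g_min and g_max are hypothetical device limits
    in siemens. W must contain at least one non-zero entry.
    """
    scale = (g_max - g_min) / np.max(np.abs(W))
    G_pos = g_min + scale * np.clip(W, 0, None)    # positive part of W
    G_neg = g_min + scale * np.clip(-W, 0, None)   # magnitude of negative part
    return G_pos, G_neg, scale

def crossbar_vmm(v_in, G_pos, G_neg, scale):
    """Ideal crossbar read: currents add up along each column."""
    i_pos = v_in @ G_pos            # column currents of the positive array
    i_neg = v_in @ G_neg            # column currents of the negative array
    return (i_pos - i_neg) / scale  # rescale currents back to weight units

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))            # 4 inputs, 3 outputs
x = rng.uniform(0.0, 0.2, size=4)      # input voltages (hypothetical range)
G_pos, G_neg, scale = weights_to_conductances(W)
y = crossbar_vmm(x, G_pos, G_neg, scale)
```

In this idealized model the offset `g_min` cancels in the subtraction, so `y` matches the software product `x @ W` up to floating-point error. Real devices would deviate from this, which is the kind of gap compensation strategies are meant to close.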
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.