{"title":"Robust supervised learning based on tensor network method","authors":"Y. W. Chen, K. Guo, Y. Pan","doi":"10.1109/YAC.2018.8406391","DOIUrl":null,"url":null,"abstract":"The formalism of Tensor Network (TN) provides a compact way to approximate many-body quantum states with 1D chain of tensors. The 1D chain of tensors is found to be efficient in capturing the local correlations between neighboring subsystems, and machine learning approaches have been proposed using artificial neural networks (NN) of similar structure. However, a long chain of tensors is difficult to train due to exploding and vanishing gradients. In this paper, we propose methods to decompose the long-chain TN into short chains, which could improve the convergence property of the training algorithm by allowing stable stochastic gradient descent (SGD). In addition, the short-chain methods are robust to network initializations. Numerical experiments show that the short-chain TN achieves almost the same classification accuracy on MNIST dataset as LeNet-5 with less trainable network parameters and connections.","PeriodicalId":226586,"journal":{"name":"2018 33rd Youth Academic Annual Conference of Chinese Association of Automation (YAC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 33rd Youth Academic Annual Conference of Chinese Association of Automation (YAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/YAC.2018.8406391","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
The Tensor Network (TN) formalism provides a compact way to approximate many-body quantum states with a 1D chain of tensors. Such a 1D chain is efficient at capturing local correlations between neighboring subsystems, and machine learning approaches have been proposed using artificial neural networks (NN) of similar structure. However, a long chain of tensors is difficult to train because of exploding and vanishing gradients. In this paper, we propose methods to decompose the long-chain TN into short chains, which could improve the convergence of the training algorithm by allowing stable stochastic gradient descent (SGD). In addition, the short-chain methods are robust to network initialization. Numerical experiments show that the short-chain TN achieves nearly the same classification accuracy on the MNIST dataset as LeNet-5, with fewer trainable parameters and connections.
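
To make the short-chain idea concrete, here is a minimal NumPy sketch of one plausible realization: an image's pixels are mapped to local feature vectors, split into groups, each group is contracted by its own short 1D chain of tensors, and the chain outputs are combined by a linear readout into class scores. The chain length, bond dimension, feature map, and readout layer here are illustrative assumptions, not the authors' exact construction; in practice all tensors would be trained jointly with SGD.

```python
# Minimal sketch of a short-chain tensor network classifier (assumed shapes
# and feature map; the paper's exact decomposition may differ).
import numpy as np

def feature_map(x):
    # Map each pixel value in [0, 1] to a 2-dimensional local feature vector.
    return np.stack([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)], axis=-1)

def contract_chain(tensors, features):
    # Contract one short 1D chain of 3-index tensors (bond, phys, bond)
    # with the local feature vectors, sweeping left to right.
    v = np.ones(tensors[0].shape[0])            # left boundary vector
    for A, phi in zip(tensors, features):
        v = np.einsum('i,ipj,p->j', v, A, phi)  # absorb one site
    return v                                    # right boundary vector

rng = np.random.default_rng(0)
n_pixels, chain_len, bond_dim, n_classes = 784, 49, 4, 10
n_chains = n_pixels // chain_len                # 16 short chains per image

# One short chain of randomly initialized tensors per pixel group
# (these are the trainable parameters).
chains = [[rng.normal(scale=0.1, size=(bond_dim, 2, bond_dim))
           for _ in range(chain_len)] for _ in range(n_chains)]
# Hypothetical linear readout combining the chain outputs into class scores.
readout = rng.normal(scale=0.1, size=(n_chains * bond_dim, n_classes))

x = rng.random(n_pixels)                        # a flattened 28x28 image
phi = feature_map(x).reshape(n_chains, chain_len, 2)
chain_outs = np.concatenate([contract_chain(chains[k], phi[k])
                             for k in range(n_chains)])
scores = chain_outs @ readout                   # per-class scores
print(scores.shape)                             # (10,)
```

Because each short chain contracts only a small number of sites, the products of tensors that appear in the gradients stay short, which is the intuition behind the improved stability of SGD claimed in the abstract.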