G. Dong, K. Felker, Alexey Svyatkovskiy, W. Tang, J. Kates-Harbeck
Fully Convolutional Spatio-Temporal Models for Representation Learning in Plasma Science
Journal: arXiv: Computational Physics
Publication date: 2020-07-20
DOI: 10.1615/JMACHLEARNMODELCOMPUT.2021037052 (https://doi.org/10.1615/JMACHLEARNMODELCOMPUT.2021037052)
Citations: 8
Abstract
We have trained a fully convolutional spatio-temporal model for fast and accurate representation learning in the challenging exemplar application area of fusion energy plasma science. The onset of major disruptions is a critically important fusion energy science (FES) issue that must be resolved for advanced tokamaks. While a variety of statistical methods have been used to address the problem of tokamak disruption prediction and control, recent approaches based on deep learning have proven particularly compelling. In the present paper, we introduce further improvements to the fusion recurrent neural network (FRNN) software suite. Until now, FRNN has relied on the long short-term memory (LSTM) variant of recurrent neural networks to leverage the temporal information in the data. Here, we implement the temporal convolutional neural network (TCN) architecture and apply it to the time-dependent input signals, thus rendering the FRNN architecture fully convolutional. This allows highly optimized convolution operations to carry the majority of the computational load of training, which reduces training time and enables the effective use of high-performance computing (HPC) resources for hyperparameter tuning. At the same time, the TCN-based architecture achieves equal or better predictive performance when compared with the LSTM architecture on a large, representative fusion database. Across data-rich scientific disciplines, these results have implications for the resource-effective training of general spatio-temporal feature extractors based on deep learning. Moreover, this challenging exemplar case study illustrates the advantages of a predictive platform with flexible architecture selection options that can be readily tuned and adapted to the prediction needs that increasingly arise in large modern observational datasets.
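The core operation behind a TCN, as mentioned in the abstract, is a causal dilated convolution over time. The sketch below is a minimal pure-Python illustration of that operation only; it is not the FRNN implementation. The function names, the toy signal, and the averaging kernel are assumptions made for illustration, and a full TCN would typically add residual connections, nonlinearities, and learned weights.

```python
def causal_dilated_conv1d(signal, kernel, dilation=1):
    """Causal 1-D convolution (illustrative helper, not part of FRNN).

    output[t] uses only signal[t], signal[t - dilation], ...,
    so no information from future time steps leaks into the output.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start count as zero (left padding)
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input time steps that influence one output of the stacked layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Toy time series and a two-tap averaging kernel (both hypothetical).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [0.5, 0.5]

# Two stacked layers with dilations 1 and 2.
layer1 = causal_dilated_conv1d(signal, kernel, dilation=1)
layer2 = causal_dilated_conv1d(layer1, kernel, dilation=2)

print(layer1)
print(layer2)
print(receptive_field(len(kernel), [1, 2]))
```

Because every output step is computed independently of the others, the whole sequence can be processed in parallel, unlike an LSTM, which must step through time sequentially. This is one plausible reading of why convolutions can carry most of the training load efficiently.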