FULLY CONVOLUTIONAL SPATIO-TEMPORAL MODELS FOR REPRESENTATION LEARNING IN PLASMA SCIENCE

arXiv: Computational Physics. Pub Date: 2020-07-20. DOI: 10.1615/JMACHLEARNMODELCOMPUT.2021037052

G. Dong, K. Felker, Alexey Svyatkovskiy, W. Tang, J. Kates-Harbeck


Citations: 8

Abstract

We have trained a fully convolutional spatio-temporal model for fast and accurate representation learning in the challenging exemplar application area of fusion energy plasma science. The onset of major disruptions is a critically important fusion energy science (FES) issue that must be resolved for advanced tokamaks. While a variety of statistical methods have been used to address the problem of tokamak disruption prediction and control, recent approaches based on deep learning have proven particularly compelling. In the present paper, we introduce further improvements to the fusion recurrent neural network (FRNN) software suite. Up to now, FRNN was based on the long short-term memory (LSTM) variant of recurrent neural networks to leverage the temporal information in the data. Here, we implement and apply the temporal convolutional neural network (TCN) architecture to the time-dependent input signals, thus rendering the FRNN architecture fully convolutional. This allows highly optimized convolution operations to carry the majority of the computational load of training, thus enabling a reduction in training time and the effective use of high-performance computing (HPC) resources for hyperparameter tuning. At the same time, the TCN-based architecture achieves equal or better predictive performance when compared with the LSTM architecture for a large, representative fusion database. Across data-rich scientific disciplines, these results have implications for the resource-effective training of general spatio-temporal feature extractors based on deep learning. Moreover, this challenging exemplar case study illustrates the advantages of a predictive platform with flexible architecture selection options capable of being readily tuned and adapted for responding to prediction needs that increasingly arise in large modern observational datasets.
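To make the idea of a temporal convolutional network more concrete, the following is a minimal sketch of the generic building block such architectures rely on: a causal, dilated 1D convolution, stacked with growing dilation. It is not taken from the FRNN code base. The function names (`causal_dilated_conv1d`, `receptive_field`, `tcn_stack`), the fixed kernel weights, the ReLU nonlinearity, and the dilation schedule 1, 2, 4 are all assumptions chosen for illustration, and plain Python is used instead of a deep learning framework so that the arithmetic stays visible.

```python
from typing import List, Sequence


def causal_dilated_conv1d(signal: Sequence[float],
                          kernel: Sequence[float],
                          dilation: int = 1) -> List[float]:
    """Causal 1D convolution: out[t] uses only signal[t], signal[t - d], ..."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation steps, never forward
            if idx >= 0:            # indices before the start act as zero padding
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input time steps that can influence one output of the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


def tcn_stack(signal: Sequence[float],
              kernels: Sequence[Sequence[float]],
              dilations: Sequence[int]) -> List[float]:
    """Apply causal dilated convolutions in sequence, with ReLU after each layer."""
    x = list(signal)
    for kernel, d in zip(kernels, dilations):
        x = [max(0.0, v) for v in causal_dilated_conv1d(x, kernel, d)]
    return x


if __name__ == "__main__":
    # Hypothetical example: an impulse at t = 0 passed through three layers.
    impulse = [1.0] + [0.0] * 9
    kernels = [[1.0, 1.0]] * 3   # illustrative fixed weights, not learned
    dilations = [1, 2, 4]        # assumed doubling schedule
    print(tcn_stack(impulse, kernels, dilations))
    print(receptive_field(2, dilations))
```

Because every output position can be computed independently of the others, a layer of this kind maps onto batched convolution kernels, whereas an LSTM must step through time sequentially. This is the general property behind the training-time argument in the abstract; the sketch itself makes no claim about the specific layer sizes or hyperparameters used by the authors.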

