Self-supervised learning of materials concepts from crystal structures via deep neural networks
Yuta Suzuki, Tatsunori Taniai, Kotaro Saito, Y. Ushiku, Kanta Ono
Machine Learning: Science and Technology, published 2022-11-11. DOI: 10.1088/2632-2153/aca23d
Abstract: Material development involves laborious processes to explore the vast materials space. The key to accelerating these processes is understanding the structure-functionality relationships of materials. Machine learning has enabled large-scale analysis of underlying relationships between materials via their vector representations, or embeddings. However, the learning of material embeddings spanning most known inorganic materials has remained largely unexplored due to the expert knowledge and effort required to annotate large-scale materials data. Here we show that our self-supervised deep learning approach can successfully learn material embeddings from crystal structures of over 120 000 materials, without any annotations, to capture the structure-functionality relationships among materials. These embeddings revealed the profound similarity between materials, or ‘materials concepts’, such as cuprate superconductors and lithium-ion battery materials, from the unannotated structural data. Consequently, our results enable us to both draw a large-scale map of the materials space, capturing various materials concepts, and measure the functionality-aware similarities between materials. Our findings will enable more strategic approaches to material development.
{"title":"Constraints on parameter choices for successful time-series prediction with echo-state networks","authors":"L. Storm, K. Gustavsson, B. Mehlig","doi":"10.1088/2632-2153/aca1f6","DOIUrl":"https://doi.org/10.1088/2632-2153/aca1f6","url":null,"abstract":"Echo-state networks are simple models of discrete dynamical systems driven by a time series. By selecting network parameters such that the dynamics of the network is contractive, characterized by a negative maximal Lyapunov exponent, the network may synchronize with the driving signal. Exploiting this synchronization, the echo-state network may be trained to autonomously reproduce the input dynamics, enabling time-series prediction. However, while synchronization is a necessary condition for prediction, it is not sufficient. Here, we study what other conditions are necessary for successful time-series prediction. We identify two key parameters for prediction performance, and conduct a parameter sweep to find regions where prediction is successful. These regions differ significantly depending on whether full or partial phase space information about the input is provided to the network during training. We explain how these regions emerge.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43939500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coot optimization based Enhanced Global Pyramid Network for 3D hand pose estimation","authors":"Pallavi Malavath, N. Devarakonda","doi":"10.1088/2632-2153/ac9fa5","DOIUrl":"https://doi.org/10.1088/2632-2153/ac9fa5","url":null,"abstract":"Due to its importance in various applications that need human-computer interaction (HCI), the field of 3D hand pose estimation (HPE) has recently got a lot of attention. The use of technological developments, such as deep learning networks has accelerated the development of reliable 3D HPE systems. Therefore, in this paper, a 3D HPE based on Enhanced Global Pyramid Network (EGPNet) is proposed. Initially, feature extraction is done by backbone model of DetNetwork with improved EGPNet. The EGPNet is enhanced by the Smish activation function. After the feature extraction, the HPE is performed based on 3D pose correction network. Additionally, to enhance the estimation performance, Coot optimization algorithm is used to optimize the error between estimated and ground truth hand pose. The effectiveness of the proposed method is experimented on Bharatanatyam, yoga, Kathakali and sign language datasets with different networks in terms of area under the curve, median end-point-error (EPE) and mean EPE. The Coot optimization is also compared with existing optimization algorithms.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2022-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44140783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A robust estimator of mutual information for deep learning interpretability
D. Piras, H. Peiris, A. Pontzen, Luisa Lucie-Smith, Ningyuan Guo, B. Nord
Machine Learning: Science and Technology, published 2022-10-31. DOI: 10.1088/2632-2153/acc444
Abstract: We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning (DL) models. To accurately estimate MI from a finite number of samples, we present GMM-MI (pronounced ‘Jimmie’), an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. GMM-MI is computationally efficient, robust to the choice of hyperparameters and provides the uncertainty on the MI estimate due to the finite sample size. We extensively validate GMM-MI on toy data for which the ground truth MI is known, comparing its performance against established MI estimators. We then demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We train DL models to encode high-dimensional data within a meaningful compressed (latent) representation, and use GMM-MI to quantify both the level of disentanglement between the latent variables, and their association with relevant physical quantities, thus unlocking the interpretability of the latent representation. We make GMM-MI publicly available in a GitHub repository.
Machine-learning-assisted Monte Carlo fails at sampling computationally hard problems
Simone Ciarella, J. Trinquier, M. Weigt, F. Zamponi
Machine Learning: Science and Technology, published 2022-10-20. DOI: 10.1088/2632-2153/acbe91
Abstract: Several strategies have been recently proposed in order to improve Monte Carlo sampling efficiency using machine learning tools. Here, we challenge these methods by considering a class of problems that are known to be exponentially hard to sample using conventional local Monte Carlo at low enough temperatures. In particular, we study the antiferromagnetic Potts model on a random graph, which reduces to the coloring of random graphs at zero temperature. We test several machine-learning-assisted Monte Carlo approaches, and we find that they all fail. Our work thus provides good benchmarks for future proposals for smart sampling algorithms.
How robust are modern graph neural network potentials in long and hot molecular dynamics simulations?
Sina Stocker, J. Gasteiger, Florian Becker, Stephan Günnemann, Johannes T. Margraf
Machine Learning: Science and Technology, published 2022-10-11. DOI: 10.1088/2632-2153/ac9955
Abstract: Graph neural networks (GNNs) have emerged as a powerful machine learning approach for the prediction of molecular properties. In particular, recently proposed advanced GNN models promise quantum chemical accuracy at a fraction of the computational cost. While the capabilities of such advanced GNNs have been extensively demonstrated on benchmark datasets, there have been few applications in real atomistic simulations. Here, we therefore put the robustness of GNN interatomic potentials to the test, using the recently proposed GemNet architecture as a testbed. Models are trained on the QM7-x database of organic molecules and used to perform extensive molecular dynamics simulations. We find that low test set errors are not sufficient for obtaining stable dynamics and that severe pathologies sometimes only become apparent after hundreds of ps of dynamics. Nonetheless, highly stable and transferable GemNet potentials can be obtained with sufficiently large training sets.
DIGS: deep inference of galaxy spectra with neural posterior estimation
G. Khullar, B. Nord, A. Ćiprijanović, J. Poh, Fei Xu
Machine Learning: Science and Technology, published 2022-10-10. DOI: 10.1088/2632-2153/ac98f4
Abstract: With the advent of billion-galaxy surveys with complex data, the need of the hour is to efficiently model galaxy spectral energy distributions (SEDs) with robust uncertainty quantification. The combination of simulation-based inference (SBI) and amortized neural posterior estimation (NPE) has been successfully used to analyse simulated and real galaxy photometry both precisely and efficiently. In this work, we utilise this combination and build on existing literature to analyse simulated noisy galaxy spectra. We present a proof-of-concept study that demonstrates (a) efficient analysis of galaxy SEDs and inference of galaxy parameters with physically interpretable uncertainties; and (b) amortized calculation of posterior distributions of said galaxy parameters at the modest cost of a few galaxy fits with Markov chain Monte Carlo (MCMC) methods. We utilise the SED generator and inference framework Prospector to generate simulated spectra, and train NPE on a dataset of 2 × 10⁶ spectra (corresponding to a five-parameter SED model). We show that SBI, with its combination of fast and amortized posterior estimation, is capable of inferring accurate galaxy stellar masses and metallicities. Our uncertainty constraints are comparable to or moderately weaker than those of traditional inverse modelling with Bayesian MCMC methods (e.g. 0.17 and 0.26 dex in stellar mass and metallicity for a given galaxy, respectively). We also find that our inference framework conducts rapid SED inference (0.9–1.2 × 10⁵ galaxy spectra via SBI/NPE at the cost of one MCMC-based fit). With this work, we set the stage for further work that focuses on SED fitting of galaxy spectra with SBI, in the era of JWST galaxy survey programs and the wide-field Roman Space Telescope spectroscopic surveys.
Investigation of inverse design of multilayer thin-films with conditional invertible neural networks
Alexander Luce, Ali Mahdavi, H. Wankerl, F. Marquardt
Machine Learning: Science and Technology, published 2022-10-10. DOI: 10.1088/2632-2153/acb48d
Abstract: In this work, we apply conditional invertible neural networks (cINNs) to inversely design multilayer thin-films given an optical target, in order to overcome limitations of state-of-the-art optimization approaches. Usually, state-of-the-art algorithms depend on a set of carefully chosen initial thin-film parameters or employ neural networks which must be retrained for every new application. We aim to overcome those limitations by training the cINN to learn the loss landscape of all thin-film configurations within a training dataset. We show that cINNs can generate a stochastic ensemble of proposals for thin-film configurations that are reasonably close to the desired target, depending only on random variables. By further refining the proposed configurations with a local optimization, we show that the generated thin-films reach the target with significantly greater precision than comparable state-of-the-art approaches. Furthermore, we tested the generative capabilities on samples outside of the training data distribution and found that the cINN was able to predict thin-films for out-of-distribution targets as well. The results suggest that, to improve the generative design of thin-films, it is instructive to use established and new machine learning methods in conjunction to obtain the most favorable results.
{"title":"A detailed study of interpretability of deep neural network based top taggers","authors":"Ayush Khot, M. Neubauer, Avik Roy","doi":"10.1088/2632-2153/ace0a1","DOIUrl":"https://doi.org/10.1088/2632-2153/ace0a1","url":null,"abstract":"Recent developments in the methods of explainable artificial intelligence (XAI) allow researchers to explore the inner workings of deep neural networks (DNNs), revealing crucial information about input–output relationships and realizing how data connects with machine learning models. In this paper we explore interpretability of DNN models designed to identify jets coming from top quark decay in high energy proton–proton collisions at the Large Hadron Collider. We review a subset of existing top tagger models and explore different quantitative methods to identify which features play the most important roles in identifying the top jets. We also investigate how and why feature importance varies across different XAI metrics, how correlations among features impact their explainability, and how latent space representations encode information as well as correlate with physically meaningful quantities. Our studies uncover some major pitfalls of existing XAI methods and illustrate how they can be overcome to obtain consistent and meaningful interpretation of these models. We additionally illustrate the activity of hidden layers as neural activation pattern diagrams and demonstrate how they can be used to understand how DNNs relay information across the layers and how this understanding can help to make such models significantly simpler by allowing effective model reoptimization and hyperparameter tuning. These studies not only facilitate a methodological approach to interpreting models but also unveil new insights about what these models learn. Incorporating these observations into augmented model design, we propose the particle flow interaction network model and demonstrate how interpretability-inspired model augmentation can improve top tagging performance.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2022-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44415275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variational quantum one-class classifier","authors":"Gunhee Park, Joonsuk Huh, D. Park","doi":"10.1088/2632-2153/acafd5","DOIUrl":"https://doi.org/10.1088/2632-2153/acafd5","url":null,"abstract":"One-class classification (OCC) is a fundamental problem in pattern recognition with a wide range of applications. This work presents a semi-supervised quantum machine learning algorithm for such a problem, which we call a variational quantum one-class classifier (VQOCC). The algorithm is suitable for noisy intermediate-scale quantum computing because the VQOCC trains a fully-parameterized quantum autoencoder with a normal dataset and does not require decoding. The performance of the VQOCC is compared with that of the one-class support vector machine (OC-SVM), the kernel principal component analysis (PCA), and the deep convolutional autoencoder (DCAE) using handwritten digit and Fashion-MNIST datasets. The numerical experiment examined various structures of VQOCC by varying data encoding, the number of parameterized quantum circuit layers, and the size of the latent feature space. The benchmark shows that the classification performance of VQOCC is comparable to that of OC-SVM and PCA, although the number of model parameters grows only logarithmically with the data size. The quantum algorithm outperformed DCAE in most cases under similar training conditions. Therefore, our algorithm constitutes an extremely compact and effective machine learning model for OCC.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44218315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}