Self-supervised representations and node embedding graph neural networks for accurate and multi-scale analysis of materials
Jian-Gang Kong, Ke-Lin Zhao, Jian Li, Qing-Xu Li, Yu Liu, Rui Zhang, Jia-Ji Zhu and Kai Chang
Machine Learning: Science and Technology, 2024-07-17. DOI: 10.1088/2632-2153/ad612b
Abstract: Supervised machine learning algorithms, such as graph neural networks (GNNs), have successfully predicted material properties. However, the superior performance of GNNs usually relies on end-to-end learning from large material datasets, which can obscure multi-scale physical insight into materials. Moreover, labeling data consumes substantial resources and inevitably introduces errors, which constrains prediction accuracy. We propose to train the GNN model by self-supervised learning on the node and edge information of the crystal graph. Compared with popular manually constructed material descriptors, the self-supervised atomic representation achieves better prediction performance on material properties. Furthermore, it may provide physical insight by tuning the range of information included. Applying the self-supervised atomic representations to magnetic moment datasets, we show how they extract rules and information from magnetic materials. To incorporate rich physical information into the GNN model, we develop the node embedding graph neural network (NEGNN) framework and show significant improvements in prediction performance. The self-supervised material representation and the NEGNN framework can extract in-depth information from materials and can be applied to small datasets with increased prediction accuracy.
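The abstract does not give the NEGNN implementation; as a rough sketch of the underlying idea — aggregating neighbor information over a crystal graph and concatenating it onto per-node features — here is a minimal, hypothetical example (the graph, feature vectors, and function names are all illustrative, not from the paper):

```python
import numpy as np

# Toy crystal graph: 3 atoms with undirected edges (0-1), (1-2), (0-2).
edges = [(0, 1), (1, 2), (0, 2)]
x = np.array([[1.0, 0.0],   # hypothetical per-atom feature vectors
              [0.0, 1.0],
              [1.0, 1.0]])

def aggregate(x, edges):
    """One mean-aggregation message-passing step over the graph."""
    out = np.zeros_like(x)
    deg = np.zeros(len(x))
    for i, j in edges:
        out[i] += x[j]; out[j] += x[i]
        deg[i] += 1.0;  deg[j] += 1.0
    return out / deg[:, None]

h = aggregate(x, edges)
# Concatenate the aggregated neighborhood context with the raw features,
# mimicking how a fixed (e.g. self-supervised) embedding could be appended
# to the node inputs of a downstream GNN.
node_embedding = np.concatenate([x, h], axis=1)
print(node_embedding.shape)  # (3, 4)
```

Stacking more aggregation steps before concatenation would widen the range of structural information each node sees, which is the kind of range tuning the abstract alludes to.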
Benchmarking of machine learning interatomic potentials for reactive hydrogen dynamics at metal surfaces
Wojciech G Stark, Cas van der Oord, Ilyes Batatia, Yaolong Zhang, Bin Jiang, Gábor Csányi and Reinhard J Maurer
Machine Learning: Science and Technology, 2024-07-14. DOI: 10.1088/2632-2153/ad5f11
Abstract: Simulations of chemical reaction probabilities in gas-surface dynamics require ensemble averages over many tens of thousands of reaction events to predict dynamical observables that can be compared to experiments. At the same time, the energy landscape needs to be accurately mapped, as small errors in barriers can lead to large deviations in reaction probabilities. This poses a particularly interesting challenge for machine learning interatomic potentials, which are becoming well-established tools for accelerating molecular dynamics simulations. We compare state-of-the-art machine learning interatomic potentials, with a particular focus on their inference performance on CPUs and their suitability for high-throughput simulation of reactive chemistry at surfaces. The considered models include polarizable atom interaction neural networks (PaiNN), recursively embedded atom neural networks (REANN), the MACE equivariant graph neural network, and atomic cluster expansion (ACE) potentials. The models are applied to a dataset of reactive molecular hydrogen scattering on low-index surface facets of copper. All models are assessed for their accuracy, time-to-solution, and ability to simulate reactive sticking probabilities as a function of the rovibrational initial state and kinetic incidence energy of the molecule. REANN and MACE models provide the best balance between accuracy and time-to-solution and can be considered the current state of the art in gas-surface dynamics. PaiNN models require many features for the best accuracy, which causes significant losses in computational efficiency. ACE models provide the fastest time-to-solution; however, models trained on the existing dataset were not able to achieve sufficiently accurate predictions in all cases.
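The ensemble averaging the abstract describes reduces, for a single initial condition, to a fraction of reactive trajectories with a binomial error bar; a minimal sketch (the trajectory counts are invented for illustration):

```python
import numpy as np

def sticking_probability(outcomes):
    """Reaction probability from many trajectory outcomes (1 = stuck,
    0 = scattered), with the binomial standard error used to judge
    statistical convergence of the ensemble average."""
    outcomes = np.asarray(outcomes, dtype=float)
    n = outcomes.size
    p = outcomes.mean()
    err = np.sqrt(p * (1.0 - p) / n)
    return p, err

# 10,000 hypothetical trajectories at one incidence energy; 1,500 stick.
outcomes = np.concatenate([np.ones(1500), np.zeros(8500)])
p, err = sticking_probability(outcomes)
print(p, err)  # 0.15 with an error of roughly 0.0036
```

The 1/sqrt(n) scaling of the error is why tens of thousands of events per condition are needed, and hence why CPU inference speed matters so much in this benchmark.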
{"title":"Deep probabilistic direction prediction in 3D with applications to directional dark matter detectors","authors":"Majd Ghrear, Peter Sadowski and Sven E Vahsen","doi":"10.1088/2632-2153/ad5f13","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f13","url":null,"abstract":"We present the first method to probabilistically predict 3D direction in a deep neural network model. The probabilistic predictions are modeled as a heteroscedastic von Mises-Fisher distribution on the sphere , giving a simple way to quantify aleatoric uncertainty. This approach generalizes the cosine distance loss which is a special case of our loss function when the uncertainty is assumed to be uniform across samples. We develop approximations required to make the likelihood function and gradient calculations stable. The method is applied to the task of predicting the 3D directions of electrons, the most complex signal in a class of experimental particle physics detectors designed to demonstrate the particle nature of dark matter and study solar neutrinos. Using simulated Monte Carlo data, the initial direction of recoiling electrons is inferred from their tortuous trajectories, as captured by the 3D detectors. For keV electrons in a 70% He 30% CO2 gas mixture at STP, the new approach achieves a mean cosine distance of 0.104 (26∘) compared to 0.556 (64∘) achieved by a non-machine learning algorithm. We show that the model is well-calibrated and accuracy can be increased further by removing samples with high predicted uncertainty. 
This advancement in probabilistic 3D directional learning could increase the sensitivity of directional dark matter detectors.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"24 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141586996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
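The paper's stabilized loss is not reproduced in the abstract, but the von Mises-Fisher negative log-likelihood on S^2 has a standard closed form, using log(sinh k) = k + log1p(-exp(-2k)) - log 2 for numerical stability at large concentration. A sketch (not the authors' exact parameterization):

```python
import numpy as np

def vmf_nll(mu, kappa, x):
    """Negative log-likelihood of a von Mises-Fisher distribution on the
    2-sphere: density C(kappa) * exp(kappa * mu.x) with
    C(kappa) = kappa / (4*pi*sinh(kappa)).
    mu: predicted unit direction, kappa: predicted concentration,
    x: observed unit direction."""
    # Stable log(sinh(kappa)) for moderate-to-large kappa.
    log_sinh = kappa + np.log1p(-np.exp(-2.0 * kappa)) - np.log(2.0)
    log_norm = np.log(kappa) - np.log(4.0 * np.pi) - log_sinh
    return -(log_norm + kappa * np.dot(mu, x))

mu = np.array([0.0, 0.0, 1.0])
aligned = vmf_nll(mu, 10.0, np.array([0.0, 0.0, 1.0]))
opposite = vmf_nll(mu, 10.0, np.array([0.0, 0.0, -1.0]))
print(aligned < opposite)  # True: aligned observations are more likely
```

With kappa held fixed across samples, minimizing this loss is equivalent to minimizing the mean cosine distance, which is the special case the abstract mentions; letting the network predict kappa per sample gives the heteroscedastic uncertainty.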
CResU-Net: a method for landslide mapping using deep learning
Thang M Pham, Nam Do, Ha T T Pham, Hanh T Bui, Thang T Do and Manh V Hoang
Machine Learning: Science and Technology, 2024-07-10. DOI: 10.1088/2632-2153/ad5f17
Abstract: Landslides, which can occur due to earthquakes and heavy rainfall, pose significant challenges across large areas. To effectively manage these disasters, it is crucial to have fast and reliable automatic detection methods for mapping landslides. In recent years, deep learning methods, particularly convolutional and fully convolutional neural networks, have been successfully applied to various fields, including landslide detection, with remarkable accuracy and high reliability. However, most of these models achieved high detection performance on high-resolution satellite images. In this research, we introduce a modified Residual U-Net combined with the Convolutional Block Attention Module for automatic landslide mapping. The proposed method is trained and assessed using freely available datasets acquired from Sentinel-2 sensors, digital elevation models, and slope data from ALOS PALSAR, with a spatial resolution of 10 m. Compared to the original ResU-Net model, the proposed architecture achieved higher accuracy, with the F1-score improving by 9.1% for the landslide class. It also offers a lower computational cost, requiring 1.38 giga multiply-accumulate operations (GMACs) to execute the model, compared to 2.68 GMACs for the original model. The source code is available at https://github.com/manhhv87/LandSlideMapping.git.
{"title":"End-to-end simulation of particle physics events with flow matching and generator oversampling","authors":"F Vaselli, F Cattafesta, P Asenov, A Rizzi","doi":"10.1088/2632-2153/ad563c","DOIUrl":"https://doi.org/10.1088/2632-2153/ad563c","url":null,"abstract":"The simulation of high-energy physics collision events is a key element for data analysis at present and future particle accelerators. The comparison of simulation predictions to data allows looking for rare deviations that can be due to new phenomena not previously observed. We show that novel machine learning algorithms, specifically Normalizing Flows and Flow Matching, can be used to replicate accurate simulations from traditional approaches with several orders of magnitude of speed-up. The classical simulation chain starts from a physics process of interest, computes energy deposits of particles and electronics response, and finally employs the same reconstruction algorithms used for data. Eventually, the data are reduced to some high-level analysis format. Instead, we propose an end-to-end approach, simulating the final data format directly from physical generator inputs, skipping any intermediate steps. We use particle jets simulation as a benchmark for comparing both <italic toggle=\"yes\">discrete</italic> and <italic toggle=\"yes\">continuous</italic> Normalizing Flows models. The models are validated across a variety of metrics to identify the most accurate. We discuss the scaling of performance with the increase in training data, as well as the generalization power of these models on physical processes different from the training one. We investigate sampling multiple times from the same physical generator inputs, a procedure we name <italic toggle=\"yes\">oversampling</italic>, and we show that it can effectively reduce the statistical uncertainties of a dataset. 
This class of ML algorithms is found to be capable of learning the expected detector response independently of the physical input process. The speed and accuracy of the models, coupled with the stability of the training procedure, make them a compelling tool for the needs of current and future experiments.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"38 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141573283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
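The statistical benefit of oversampling can be illustrated with a simple variance decomposition. This is my own illustrative split, not a formula from the paper: if the variance of an observable separates into an event-level (generator) part and a detector-response part, then averaging over N generator events, each simulated K times, shrinks the first term as 1/N but the second as 1/(N*K):

```python
def oversampled_variance(var_event, var_response, n_events, k_oversample):
    """Variance of the sample mean when each of n_events generator inputs
    is passed through the stochastic simulator k_oversample times.
    Assumes the two noise sources are independent and additive."""
    return var_event / n_events + var_response / (n_events * k_oversample)

# Hypothetical numbers: event-level variance 4.0, response variance 1.0.
v1 = oversampled_variance(4.0, 1.0, 1000, 1)    # no oversampling
v10 = oversampled_variance(4.0, 1.0, 1000, 10)  # 10x oversampling
print(v1, v10)  # 0.005 vs 0.0041
```

Oversampling therefore helps most when the detector-response spread dominates, which is consistent with the abstract's claim that it reduces the statistical uncertainty of a generated dataset without needing more generator events.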
{"title":"Uncertainty quantification by direct propagation of shallow ensembles","authors":"Matthias Kellner and Michele Ceriotti","doi":"10.1088/2632-2153/ad594a","DOIUrl":"https://doi.org/10.1088/2632-2153/ad594a","url":null,"abstract":"Statistical learning algorithms provide a generally-applicable framework to sidestep time-consuming experiments, or accurate physics-based modeling, but they introduce a further source of error on top of the intrinsic limitations of the experimental or theoretical setup. Uncertainty estimation is essential to quantify this error, and to make application of data-centric approaches more trustworthy. To ensure that uncertainty quantification is used widely, one should aim for algorithms that are accurate, but also easy to implement and apply. In particular, including uncertainty quantification on top of an existing architecture should be straightforward, and add minimal computational overhead. Furthermore, it should be easy to manipulate or combine multiple machine-learning predictions, propagating uncertainty over further modeling steps. We compare several well-established uncertainty quantification frameworks against these requirements, and propose a practical approach, which we dub direct propagation of shallow ensembles, that provides a good compromise between ease of use and accuracy. We present benchmarks for generic datasets, and an in-depth study of applications to the field of atomistic machine learning for chemistry and materials. 
These examples underscore the importance of using a formulation that allows propagating errors without making strong assumptions on the correlations between different predictions of the model.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"13 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141550027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
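The correlation point in the last sentence can be made concrete: propagating each ensemble member through the downstream operation before taking statistics carries covariances along automatically, whereas summing variances assumes independence. A minimal sketch with invented per-member predictions (not the paper's data or API):

```python
import numpy as np

# Hypothetical per-member predictions from a shallow (last-layer) ensemble.
a = np.array([1.0, 1.2, 0.9, 1.1])   # e.g. predicted energy of fragment A
b = np.array([2.0, 2.2, 1.9, 2.1])   # e.g. fragment B, correlated with A

# Direct propagation: apply the downstream operation (here, a sum) to each
# member first, then compute statistics, so correlations are preserved.
s = a + b
mean, var = s.mean(), s.var(ddof=1)

# Naive propagation sums the individual variances, assuming independence,
# and misses the covariance term entirely.
naive_var = a.var(ddof=1) + b.var(ddof=1)
print(var, naive_var)  # the direct estimate is larger here: a and b co-vary
```

For these perfectly correlated toy members, the directly propagated variance is twice the naive one; in general the two differ by twice the covariance, which the member-wise approach never has to estimate explicitly.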
{"title":"AMCG: a graph dual atomic-molecular conditional molecular generator","authors":"Carlo Abate, Sergio Decherchi and Andrea Cavalli","doi":"10.1088/2632-2153/ad5bbf","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5bbf","url":null,"abstract":"Drug design is both a time consuming and expensive endeavour. Computational strategies offer viable options to address this task; deep learning approaches in particular are indeed gaining traction for their capability of dealing with chemical structures. A straightforward way to represent such structures is via their molecular graph, which in turn can be naturally processed by graph neural networks. This paper introduces AMCG, a dual atomic-molecular, conditional, latent-space, generative model built around graph processing layers able to support both unconditional and conditional molecular graph generation. Among other features, AMCG is a one-shot model allowing for fast sampling, explicit atomic type histogram assignation and property optimization via gradient ascent. The model was trained on the Quantum Machines 9 (QM9) and ZINC datasets, achieving state-of-the-art performances. Together with classic benchmarks, AMCG was also tested by generating large-scale sampled sets, showing robustness in terms of sustainable throughput of valid, novel and unique molecules.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"44 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141550028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The R-mAtrIx Net","authors":"Shailesh Lal, Suvajit Majumder and Evgeny Sobko","doi":"10.1088/2632-2153/ad56f9","DOIUrl":"https://doi.org/10.1088/2632-2153/ad56f9","url":null,"abstract":"We provide a novel neural network architecture that can: i) output R-matrix for a given quantum integrable spin chain, ii) search for an integrable Hamiltonian and the corresponding R-matrix under assumptions of certain symmetries or other restrictions, iii) explore the space of Hamiltonians around already learned models and reconstruct the family of integrable spin chains which they belong to. The neural network training is done by minimizing loss functions encoding Yang–Baxter equation, regularity and other model-specific restrictions such as hermiticity. Holomorphy is implemented via the choice of activation functions. We demonstrate the work of our neural network on the spin chains of difference form with two-dimensional local space. In particular, we reconstruct the R-matrices for all 14 classes. We also demonstrate its utility as an Explorer, scanning a certain subspace of Hamiltonians and identifying integrable classes after clusterisation. The last strategy can be used in future to carve out the map of integrable spin chains with higher dimensional local space and in more general settings where no analytical methods are available.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"11 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141552911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TomOpt: differential optimisation for task- and constraint-aware design of particle detectors in the context of muon tomography
Giles C Strong, Maxime Lagrange, Aitor Orio, Anna Bordignon, Florian Bury, Tommaso Dorigo, Andrea Giammanco, Mariam Heikal, Jan Kieseler, Max Lamparth, Pablo Martínez Ruíz del Árbol, Federico Nardi, Pietro Vischia and Haitham Zaraket
Machine Learning: Science and Technology, 2024-07-02. DOI: 10.1088/2632-2153/ad52e7
Abstract: We describe a software package, TomOpt, developed to optimise the geometrical layout and specifications of detectors designed for tomography by scattering of cosmic-ray muons. The software exploits differentiable programming for the modelling of muon interactions with detectors and scanned volumes, the inference of volume properties, and the optimisation cycle performing the loss minimisation. In doing so, we provide the first demonstration of end-to-end-differentiable and inference-aware optimisation of particle physics instruments. We study the performance of the software on a relevant benchmark scenario and discuss its potential applications. Our code is available on GitHub (Strong et al 2024, available at: https://github.com/GilesStrong/tomopt).
Mixed noise and posterior estimation with conditional deepGEM
Paul Hagemann, Johannes Hertrich, Maren Casfor, Sebastian Heidenreich and Gabriele Steidl
Machine Learning: Science and Technology, 2024-07-01. DOI: 10.1088/2632-2153/ad5926
Abstract: We develop an algorithm for jointly estimating the posterior and the noise parameters in Bayesian inverse problems, motivated by indirect measurements and applications from nanometrology with a mixed noise model. We propose to solve the problem by an expectation maximization (EM) algorithm. Based on the current noise parameters, we learn in the E-step a conditional normalizing flow that approximates the posterior. In the M-step, we propose to find the noise parameter updates again by an EM algorithm, which has analytical formulas. We compare training the conditional normalizing flow with the forward and reverse Kullback–Leibler divergences, and show that our model is able to incorporate information from many measurements, unlike previous approaches.