Ihda Chaerony Siffa, Markus M Becker, Klaus-Dieter Weltmann and Jan Trieschmann
{"title":"Towards a machine-learned Poisson solver for low-temperature plasma simulations in complex geometries","authors":"Ihda Chaerony Siffa, Markus M Becker, Klaus-Dieter Weltmann and Jan Trieschmann","doi":"10.1088/2632-2153/ad4230","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4230","url":null,"abstract":"Poisson’s equation plays an important role in modeling many physical systems. In electrostatic self-consistent low-temperature plasma (LTP) simulations, Poisson’s equation is solved at each simulation time step, which can amount to a significant computational cost for the entire simulation. In this paper, we describe the development of a generic machine-learned Poisson solver specifically designed for the requirements of LTP simulations in complex 2D reactor geometries on structured Cartesian grids. Here, the reactor geometries can consist of inner electrodes and dielectric materials as often found in LTP simulations. The approach leverages a hybrid CNN-transformer network architecture in combination with a weighted multiterm loss function. We train the network using highly randomized synthetic data to ensure the generalizability of the learned solver to unseen reactor geometries. The results demonstrate that the learned solver is able to produce quantitatively and qualitatively accurate solutions. Furthermore, it generalizes well on new reactor geometries such as reference geometries found in the literature. To increase the numerical accuracy of the solutions required in LTP simulations, we employ a conventional iterative solver to refine the raw predictions, especially to recover the high-frequency features not resolved by the initial prediction. With this, the proposed learned Poisson solver provides the required accuracy and is potentially faster than a pure GPU-based conventional iterative solver. 
This opens up new possibilities for developing a generic and high-performing learned Poisson solver for LTP systems in complex geometries.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140888473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Bal, T Brandes, F Iemmi, M Klute, B Maier, V Mikuni and T K Årrestad
{"title":"Distilling particle knowledge for fast reconstruction at high-energy physics experiments","authors":"A Bal, T Brandes, F Iemmi, M Klute, B Maier, V Mikuni and T K Årrestad","doi":"10.1088/2632-2153/ad43b1","DOIUrl":"https://doi.org/10.1088/2632-2153/ad43b1","url":null,"abstract":"Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational resources, in particular on edge devices. In this article, we consider proton-proton collisions at the High-Luminosity Large Hadron Collider (HL-LHC) and demonstrate a successful knowledge transfer from an event-level graph neural network (GNN) to a particle-level small deep neural network (DNN). Our algorithm, DistillNet, is a DNN that is trained to learn about the provenance of particles, as provided by the soft labels that are the GNN outputs, to predict whether or not a particle originates from the primary interaction vertex. The results indicate that for this problem, which is one of the main challenges at the HL-LHC, there is minimal loss during the transfer of knowledge to the small student network, while significantly improving the computational resource needs compared to the teacher. This is demonstrated for the distilled student network on a CPU, as well as for a quantized and pruned student network deployed on a field-programmable gate array. Our study proves that knowledge transfer between networks of different complexity can be used for fast artificial intelligence (AI) in high-energy physics that improves the expressiveness of observables over non-AI-based reconstruction algorithms. Such an approach can become essential at the HL-LHC experiments, e.g. 
to comply with the resource budget of their trigger stages.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140888488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carlos Granero Belinchon and Manuel Cabeza Gallucci
{"title":"A multiscale and multicriteria generative adversarial network to synthesize 1-dimensional turbulent fields","authors":"Carlos Granero Belinchon and Manuel Cabeza Gallucci","doi":"10.1088/2632-2153/ad43b3","DOIUrl":"https://doi.org/10.1088/2632-2153/ad43b3","url":null,"abstract":"This article introduces a new neural network stochastic model to generate a 1-dimensional stochastic field with turbulent velocity statistics. Both the model architecture and training procedure are grounded in the Kolmogorov and Obukhov statistical theories of fully developed turbulence, thus guaranteeing descriptions of (1) energy distribution, (2) energy cascade and (3) intermittency across scales in agreement with experimental observations. The model is a generative adversarial network (GAN) with multiple multiscale optimization criteria. First, we use three physics-based criteria: the variance, skewness and flatness of the increments of the generated field, which respectively retrieve the turbulent energy distribution, energy cascade and intermittency across scales. Second, the GAN criterion, based on reproducing statistical distributions, is used on segments of different lengths of the generated field. Furthermore, to mimic multiscale decompositions frequently used in turbulence studies, the model architecture is fully convolutional with kernel sizes varying along the multiple layers of the model. 
To train our model, we use turbulent velocity signals from grid turbulence at Modane wind tunnel.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140881389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jaime Carracedo-Cosme, Prokop Hapala and Rubén Pérez
{"title":"Atomic force microscopy simulations for CO-functionalized tips with deep learning","authors":"Jaime Carracedo-Cosme, Prokop Hapala and Rubén Pérez","doi":"10.1088/2632-2153/ad3ee6","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3ee6","url":null,"abstract":"Atomic force microscopy (AFM) operating in the frequency modulation mode with a metal tip functionalized with a CO molecule is able to image the internal structure of molecules with an unprecedented resolution. The interpretation of these images is often difficult, making the support of theoretical simulations important. Current simulation methods, particularly the most accurate ones, require expertise and resources to perform ab initio calculations for the necessary inputs (i.e. charge density and electrostatic potential of the molecule). Here, we propose a computationally inexpensive and fast alternative to the physical simulation of these AFM images based on a conditional generative adversarial network (CGAN) that avoids all force calculations and uses a 2D ball–and–stick depiction of the molecule as its only input. We discuss the performance of the model when trained with different subsets extracted from the previously published QUAM-AFM database. 
Our CGAN reproduces accurately the intramolecular contrast observed in the simulated images for quasi–planar molecules, but has limitations for molecules with a substantial internal corrugation, due to the strictly 2D character of the input.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140840481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shaoxun Fan, Andrew L Hitt, Ming Tang, Babak Sadigh and Fei Zhou
{"title":"Accelerate microstructure evolution simulation using graph neural networks with adaptive spatiotemporal resolution","authors":"Shaoxun Fan, Andrew L Hitt, Ming Tang, Babak Sadigh and Fei Zhou","doi":"10.1088/2632-2153/ad3e4b","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3e4b","url":null,"abstract":"Surrogate models driven by sizeable datasets and scientific machine-learning methods have emerged as an attractive microstructure simulation tool with the potential to deliver predictive microstructure evolution dynamics with huge savings in computational costs. Taking 2D and 3D grain growth simulations as an example, we present a completely overhauled computational framework based on graph neural networks with not only excellent agreement to both the ground truth phase-field methods and theoretical predictions, but enhanced accuracy and efficiency compared to previous works based on convolutional neural networks. These improvements can be attributed to the graph representation, which offers both improved predictive power and a more flexible data structure amenable to adaptive mesh refinement. As the simulated microstructures coarsen, our method can adaptively adopt remeshed grids and larger timesteps to achieve further speedup. The data-to-model pipeline, together with training procedures and source code, is provided.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140840480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Ashtari Esfahani, S Böser, N Buzinsky, M C Carmona-Benitez, R Cervantes, C Claessens, L de Viveiros, M Fertl, J A Formaggio, J K Gaison, L Gladstone, M Grando, M Guigue, J Hartse, K M Heeger, X Huyan, A M Jones, K Kazkaz, M Li, A Lindman, A Marsteller, C Matthé, R Mohiuddin, B Monreal, E C Morrison, R Mueller, J A Nikkel, E Novitski, N S Oblath, J I Peña, W Pettus, R Reimann, R G H Robertson, L Saldaña, M Schram, P L Slocum, J Stachurska, Y-H Sun, P T Surukuchi, A B Telles, F Thomas, M Thomas, L A Thorne, T Thümmler, L Tvrznikova, W Van De Pontseele, B A VanDevender, T E Weiss, T Wendler, E Zayas and A Ziegler
{"title":"Deep learning based event reconstruction for cyclotron radiation emission spectroscopy","authors":"A Ashtari Esfahani, S Böser, N Buzinsky, M C Carmona-Benitez, R Cervantes, C Claessens, L de Viveiros, M Fertl, J A Formaggio, J K Gaison, L Gladstone, M Grando, M Guigue, J Hartse, K M Heeger, X Huyan, A M Jones, K Kazkaz, M Li, A Lindman, A Marsteller, C Matthé, R Mohiuddin, B Monreal, E C Morrison, R Mueller, J A Nikkel, E Novitski, N S Oblath, J I Peña, W Pettus, R Reimann, R G H Robertson, L Saldaña, M Schram, P L Slocum, J Stachurska, Y-H Sun, P T Surukuchi, A B Telles, F Thomas, M Thomas, L A Thorne, T Thümmler, L Tvrznikova, W Van De Pontseele, B A VanDevender, T E Weiss, T Wendler, E Zayas and A Ziegler","doi":"10.1088/2632-2153/ad3ee3","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3ee3","url":null,"abstract":"The objective of the cyclotron radiation emission spectroscopy (CRES) technology is to build precise particle energy spectra. This is achieved by identifying the start frequencies of charged particle trajectories which, when exposed to an external magnetic field, leave semi-linear profiles (called tracks) in the time–frequency plane. Due to the need for excellent instrumental energy resolution in application, highly efficient and accurate track reconstruction methods are desired. Deep learning convolutional neural networks (CNNs), which are particularly suited to information-sparse data and offer precise foreground localization, may be utilized to extract track properties from measured CRES signals (called events) with relative computational ease. In this work, we develop a novel machine learning based model which operates a CNN and a support vector machine in tandem to perform this reconstruction. 
A primary application of our method is shown on simulated CRES signals which mimic those of the Project 8 experiment, a novel effort to extract the unknown absolute neutrino mass value from a precise measurement of the tritium β−-decay energy spectrum. When compared to a point-clustering based technique used as a baseline, we show a relative gain of 24.1% in event reconstruction efficiency and comparable performance in accuracy of track parameter reconstruction.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140840505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering quantum circuit components with program synthesis","authors":"Leopoldo Sarra, Kevin Ellis and Florian Marquardt","doi":"10.1088/2632-2153/ad4252","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4252","url":null,"abstract":"Despite rapid progress in the field, it is still challenging to discover new ways to leverage quantum computation: all quantum algorithms must be designed by hand, and quantum mechanics is notoriously counterintuitive. In this paper, we study how artificial intelligence, in the form of program synthesis, may help overcome some of these difficulties, by showing how a computer can incrementally learn concepts relevant to quantum circuit synthesis with experience, and reuse them in unseen tasks. In particular, we focus on the decomposition of unitary matrices into quantum circuits, and show how, starting from a set of elementary gates, we can automatically discover a library of useful new composite gates and use them to decompose increasingly complicated unitaries.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140840482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Random feedback alignment algorithms to train neural networks: why do they align?","authors":"Dominique Chu and Florian Bacho","doi":"10.1088/2632-2153/ad3ee5","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3ee5","url":null,"abstract":"Feedback alignment algorithms are an alternative to backpropagation to train neural networks, whereby some of the partial derivatives that are required to compute the gradient are replaced by random terms. This essentially transforms the update rule into a random walk in weight space. Surprisingly, learning still works with those algorithms, including training of deep neural networks. The performance of FA is generally attributed to an alignment of the update of the random walker with the true gradient—the eponymous gradient alignment—which drives an approximate gradient descent. The mechanism that leads to this alignment remains unclear, however. In this paper, we use mathematical reasoning and simulations to investigate gradient alignment. We observe that the feedback alignment update rule has fixed points, which correspond to extrema of the loss function. We show that gradient alignment is a stability criterion for those fixed points. It is only a necessary criterion for algorithm performance. 
Experimentally, we demonstrate that high levels of gradient alignment can lead to poor algorithm performance and that the alignment is not always driving the gradient descent.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140840534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Oliver Anton, Victoria A Henderson, Elisa Da Ros, Ivan Sekulic, Sven Burger, Philipp-Immanuel Schneider and Markus Krutzik
{"title":"Review and experimental benchmarking of machine learning algorithms for efficient optimization of cold atom experiments","authors":"Oliver Anton, Victoria A Henderson, Elisa Da Ros, Ivan Sekulic, Sven Burger, Philipp-Immanuel Schneider and Markus Krutzik","doi":"10.1088/2632-2153/ad3cb6","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3cb6","url":null,"abstract":"The generation of cold atom clouds is a complex process which involves the optimization of noisy data in high dimensional parameter spaces. Optimization can be challenging both in and especially outside of the lab due to lack of time, expertise, or access for lengthy manual optimization. In recent years, it was demonstrated that machine learning offers a solution since it can optimize high dimensional problems quickly, without knowledge of the experiment itself. In this paper we present results showing the benchmarking of nine different optimization techniques and implementations, alongside their ability to optimize a rubidium (Rb) cold atom experiment. The investigations are performed on a 3D 87Rb molasses with 10 and 18 adjustable parameters, respectively, where the atom number obtained by absorption imaging was chosen as the test problem. 
We further compare the best performing optimizers under different effective noise conditions by reducing the signal-to-noise ratio of the images via adapting the atomic vapor pressure in the 2D+ magneto-optical trap and the detection laser frequency stability.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140812025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ryan Raikman, Eric A Moreno, Ekaterina Govorkova, Ethan J Marx, Alec Gunny, William Benoit, Deep Chatterjee, Rafia Omer, Muhammed Saleem, Dylan S Rankin, Michael W Coughlin, Philip C Harris and Erik Katsavounidis
{"title":"GWAK: gravitational-wave anomalous knowledge with recurrent autoencoders","authors":"Ryan Raikman, Eric A Moreno, Ekaterina Govorkova, Ethan J Marx, Alec Gunny, William Benoit, Deep Chatterjee, Rafia Omer, Muhammed Saleem, Dylan S Rankin, Michael W Coughlin, Philip C Harris and Erik Katsavounidis","doi":"10.1088/2632-2153/ad3a31","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3a31","url":null,"abstract":"Matched-filtering detection techniques for gravitational-wave (GW) signals in ground-based interferometers rely on having well-modeled templates of the GW emission. Such techniques have been traditionally used in searches for compact binary coalescences (CBCs), and have been employed in all known GW detections so far. However, interesting science cases aside from compact mergers do not yet have accurate enough modeling to make matched filtering possible, including core-collapse supernovae and sources where stochasticity may be involved. Therefore the development of techniques to identify sources of these types is of significant interest. In this paper, we present a method of anomaly detection based on deep recurrent autoencoders to enhance the search region to unmodeled transients. We use a semi-supervised strategy that we name ‘Gravitational Wave Anomalous Knowledge’ (GWAK). While the semi-supervised approach to this problem entails a potential reduction in accuracy compared to fully supervised methods, it offers a generalizability advantage by enhancing the reach of experimental sensitivity beyond the constraints of pre-defined signal templates. We construct a low-dimensional embedded space using the GWAK method, capturing the physical signatures of distinct signals on each axis of the space. By introducing signal priors that capture some of the salient features of GW signals, we allow for the recovery of sensitivity even when an unmodeled anomaly is encountered. 
We show that regions of the GWAK space can identify CBCs, detector glitches and also a variety of unmodeled astrophysical sources.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140805928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}