{"title":"Improving materials property predictions for graph neural networks with minimal feature engineering","authors":"Guoing Cong, Victor Fung","doi":"10.1088/2632-2153/acefab","DOIUrl":"https://doi.org/10.1088/2632-2153/acefab","url":null,"abstract":"Graph neural networks (GNNs) have been employed in materials research to predict physical and functional properties, and have achieved superior performance in several application domains over prior machine learning approaches. Recent studies incorporate features of increasing complexity such as Gaussian radial functions, plane wave functions, and angular terms to augment the neural network models, with the expectation that these features are critical for achieving a high performance. Here, we propose a GNN that adopts edge convolution where hidden edge features evolve during training and extensive attention mechanisms, and operates on simple graphs with atoms as nodes and distances between them as edges. As a result, the same model can be used for very different tasks as no other domain-specific features are used. With a model that uses no feature engineering, we achieve performance comparable with state-of-the-art models with elaborate features for formation energy and band gap prediction with standard benchmarks; we achieve even better performance when the dataset size increases. Although some domain-specific datasets still require hand-crafted features to achieve state-of-the-art results, our selected architecture choices greatly reduce the need for elaborate feature engineering and still maintain predictive power in comparison.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44410905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-driven Dynamics Reconstruction using RBF Network","authors":"Congcong Du, X. Wang, Zhangsen Wang, Dahui Wang","doi":"10.1088/2632-2153/acec31","DOIUrl":"https://doi.org/10.1088/2632-2153/acec31","url":null,"abstract":"\u0000 Constructing the governing dynamical equations of complex systems from observational data is of great interest for both theory and applications. However, it is a difficult inverse problem to explicitly construct the dynamical equations for many real complex systems based on observational data. Here, we propose to implicitly represent the dynamical equations of a complex system using a Radial Basis Function (RBF) network trained on the observed data of the system. We show that the RBF network trained on trajectory data of the classical Lorenz and Chen system can faithfully reproduce the orbits, fixed points, and local bifurcations of the original dynamical equations. We also apply this method to electrocardiogram (ECG) data and show that the fixed points of the RBF network trained using ECG can discriminate healthy people from patients with heart disease, indicating that the method can be applied to real complex systems","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"1 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49081628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Cretu, A. Toniato, Amol Thakkar, Amin A. Debabeche, T. Laino, Alain C. Vaucher
{"title":"Standardizing chemical compounds with language models","authors":"M. Cretu, A. Toniato, Amol Thakkar, Amin A. Debabeche, T. Laino, Alain C. Vaucher","doi":"10.1088/2632-2153/ace878","DOIUrl":"https://doi.org/10.1088/2632-2153/ace878","url":null,"abstract":"With the growing amount of chemical data stored digitally, it has become crucial to represent chemical compounds accurately and consistently. Harmonized representations facilitate the extraction of insightful information from datasets, and are advantageous for machine learning applications. To achieve consistent representations throughout datasets, one relies on molecule standardization, which is typically accomplished using rule-based algorithms that modify descriptions of functional groups. Here, we present the first deep-learning model for molecular standardization. We enable custom standardization schemes based solely on data, which, as additional benefit, support standardization options that are difficult to encode into rules. Our model achieves over 98% accuracy in learning two popular rule-based standardization protocols. We then follow a transfer learning approach to standardize metal-organic compounds (for which there is currently no automated standardization practice), based on a human-curated dataset of 1512 compounds. This model predicts the expected standardized molecular format with a test accuracy of 80.7%. As standardization can be considered, more broadly, a transformation from undesired to desired representations of compounds, the same data-driven architecture can be applied to other tasks. For instance, we demonstrate the application to compound canonicalization and to the determination of major tautomers in solution, based on computed and experimental data.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45159553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The effects of topological features on convolutional neural networks—an explanatory analysis via Grad-CAM","authors":"Dongjin Lee, Seonghyeon Lee, Jae-Hun Jung","doi":"10.1088/2632-2153/ace6f3","DOIUrl":"https://doi.org/10.1088/2632-2153/ace6f3","url":null,"abstract":"Topological data analysis (TDA) characterizes the global structure of data based on topological invariants such as persistent homology, whereas convolutional neural networks (CNNs) are capable of characterizing local features in the global structure of the data. In contrast, a combined model of TDA and CNN, a family of multimodal networks, simultaneously takes the image and the corresponding topological features as the input to the network for classification, thereby significantly improving the performance of a single CNN. This innovative approach has been recently successful in various applications. However, there is a lack of explanation regarding how and why topological signatures, when combined with a CNN, improve discriminative power. In this paper, we use persistent homology to compute topological features and subsequently demonstrate both qualitatively and quantitatively the effects of topological signatures on a CNN model, for which the Grad-CAM analysis of multimodal networks and topological inverse image map are proposed and appropriately utilized. For experimental validation, we utilize two famous datasets: the transient versus bogus image dataset and the HAM10000 dataset. Using Grad-CAM analysis of multimodal networks, we demonstrate that topological features enforce the image network of a CNN to focus more on significant and meaningful regions across images rather than task-irrelevant artifacts such as background noise and texture.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45878620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive active Brownian particles searching for targets of unknown positions","authors":"Harpreet Kaur, T. Franosch, M. Caraglio","doi":"10.1088/2632-2153/ace6f4","DOIUrl":"https://doi.org/10.1088/2632-2153/ace6f4","url":null,"abstract":"Developing behavioral policies designed to efficiently solve target-search problems is a crucial issue both in nature and in the nanotechnology of the 21st century. Here, we characterize the target-search strategies of simple microswimmers in a homogeneous environment containing sparse targets of unknown positions. The microswimmers are capable of controlling their dynamics by switching between Brownian motion and an active Brownian particle and by selecting the time duration of each of the two phases. The specific conduct of a single microswimmer depends on an internal decision-making process determined by a simple neural network associated with the agent itself. Starting from a population of individuals with random behavior, we exploit the genetic algorithm NeuroEvolution of augmenting topologies to show how an evolutionary pressure based on the target-search performances of single individuals helps to find the optimal duration of the two different phases. Our findings reveal that the optimal policy strongly depends on the magnitude of the particle’s self-propulsion during the active phase and that a broad spectrum of network topology solutions exists, differing in the number of connections and hidden nodes.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42593976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of molecular field points using SE(3)-transformer model","authors":"Florian B Hinz, Amr H. Mahmoud, M. Lill","doi":"10.1088/2632-2153/ace67b","DOIUrl":"https://doi.org/10.1088/2632-2153/ace67b","url":null,"abstract":"Due to their computational efficiency, 2D fingerprints are typically used in similarity-based high-content screening. The interaction of a ligand with its target protein, however, relies on its physicochemical interactions in 3D space. Thus, ligands with different 2D scaffolds can bind to the same protein if these ligands share similar interaction patterns. Molecular fields can represent those interaction profiles. For efficiency, the extrema of those molecular fields, named field points, are used to quantify the ligand similarity in 3D. The calculation of field points involves the evaluation of the interaction energy between the ligand and a small probe shifted on a fine grid representing the molecular surface. These calculations are computationally prohibitive for large datasets of ligands, making field point representations of molecules intractable for high-content screening. Here, we overcome this roadblock by one-shot prediction of field points using generative neural networks based on the molecular structure alone. Field points are predicted by training an SE(3)-Transformer, an equivariant, attention-based graph neural network architecture, on a large set of ligands with field point data. Resulting data demonstrates the feasibility of this approach to precisely generate negative, positive and hydrophobic field points within 0.5 Å of the ground truth for a diverse set of drug-like molecules.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46691335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Harbir Antil, H. Elman, Akwum Onwunta, Deepanshu Verma
{"title":"A deep neural network approach for parameterized PDEs and Bayesian inverse problems","authors":"Harbir Antil, H. Elman, Akwum Onwunta, Deepanshu Verma","doi":"10.1088/2632-2153/ace67c","DOIUrl":"https://doi.org/10.1088/2632-2153/ace67c","url":null,"abstract":"We consider the simulation of Bayesian statistical inverse problems governed by large-scale linear and nonlinear partial differential equations (PDEs). Markov chain Monte Carlo (MCMC) algorithms are standard techniques to solve such problems. However, MCMC techniques are computationally challenging as they require a prohibitive number of forward PDE solves. The goal of this paper is to introduce a fractional deep neural network (fDNN) based approach for the forward solves within an MCMC routine. Moreover, we discuss some approximation error estimates. We illustrate the efficiency of fDNN on inverse problems governed by nonlinear elliptic PDEs and the unsteady Navier–Stokes equations. In the former case, two examples are discussed, respectively depending on two and 100 parameters, with significant observed savings. The unsteady Navier–Stokes example illustrates that fDNN can outperform existing DNNs, doing a better job of capturing essential features such as vortex shedding.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46471127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-equivariant conditional normalizing flows, with applications to target-aware molecule generation","authors":"Eyal Rozenberg, Daniel Freedman","doi":"10.1088/2632-2153/ace58c","DOIUrl":"https://doi.org/10.1088/2632-2153/ace58c","url":null,"abstract":"Learning over the domain of 3D graphs has applications in a number of scientific and engineering disciplines, including molecular chemistry, high energy physics, and computer vision. We consider a specific problem in this domain, namely: given one such 3D graph, dubbed the base graph, our goal is to learn a conditional distribution over another such graph, dubbed the complement graph. Due to the three-dimensional nature of the graphs in question, there are certain natural invariances such a distribution should satisfy: it should be invariant to rigid body transformations that act jointly on the base graph and the complement graph, and it should also be invariant to permutations of the vertices of either graph. We propose a general method for learning the conditional probabilistic model, the central part of which is a continuous normalizing flow. We establish semi-equivariance conditions on the flow which guarantee the aforementioned invariance conditions on the conditional distribution. Additionally, we propose a graph neural network architecture which implements this flow, and which is designed to learn effectively despite the typical differences in size between the base graph and the complement graph. We demonstrate the utility of our technique in the molecular setting by training a conditional generative model which, given a receptor, can generate ligands which may successfully bind to that receptor. The resulting model, which has potential applications in drug design, displays high quality performance in the key ΔBinding metric.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41520736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tobias Schanz, Klas Ove Möller, S. Rühl, D. S. Greenberg
{"title":"Robust detection of marine life with label-free image feature learning and probability calibration","authors":"Tobias Schanz, Klas Ove Möller, S. Rühl, D. S. Greenberg","doi":"10.1088/2632-2153/ace417","DOIUrl":"https://doi.org/10.1088/2632-2153/ace417","url":null,"abstract":"Advances in in situ marine life imaging have significantly increased the size and quality of available datasets, but automatic image analysis has not kept pace. Machine learning has shown promise for image processing, but its effectiveness is limited by several open challenges: the requirement for large expert-labeled training datasets, disagreement among experts, under-representation of various species and unreliable or overconfident predictions. To overcome these obstacles for automated underwater imaging, we combine and test recent developments in deep classifier networks and self-supervised feature learning. We use unlabeled images for pretraining deep neural networks to extract task-relevant image features, allowing learning algorithms to cope with scarcity in expert labels, and carefully evaluate performance in subsequent label-based tasks. Performance on rare classes is improved by applying data rebalancing together with a Bayesian correction to avoid biasing inferred in situ class frequencies. A divergence-based loss allows training on multiple, conflicting labels for the same image, leading to better estimates of uncertainty which we quantify with a novel accuracy measure. Together, these techniques can reduce the required label counts ∼100-fold while maintaining the accuracy of standard supervised training, shorten training time, cope with expert disagreement and reduce overconfidence.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43371752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gregory A. Daly, Jonathan E. Fieldsend, G. Hassall, G. Tabor
{"title":"Data-driven plasma modelling: surrogate collisional radiative models of fluorocarbon plasmas from deep generative autoencoders","authors":"Gregory A. Daly, Jonathan E. Fieldsend, G. Hassall, G. Tabor","doi":"10.1088/2632-2153/aced7f","DOIUrl":"https://doi.org/10.1088/2632-2153/aced7f","url":null,"abstract":"We have developed a deep generative model that can produce accurate optical emission spectra and colour images of an ICP plasma using only the applied coil power, electrode power, pressure and gas flows as inputs—essentially an empirical surrogate collisional radiative model. An autoencoder was trained on a dataset of 812 500 image/spectra pairs in argon, oxygen, Ar/O2, CF4/O2 and SF6/O2 plasmas in an industrial plasma etch tool, taken across the entire operating space of the tool. The autoencoder learns to encode the input data into a compressed latent representation and then decode it back to a reconstruction of the data. We learn to map the plasma tool’s inputs to the latent space and use the decoder to create a generative model. The model is very fast, taking just over 10 s to generate 10 000 measurements on a single GPU. This type of model can become a building block for a wide range of experiments and simulations. To aid this, we have released the underlying dataset of 812 500 image/spectra pairs used to train the model, the trained models and the model code for the community to accelerate the development and use of this exciting area of deep learning. Anyone can try the model, for free, on Google Colab.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42948754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}