Atsarina Larasati Anindya, Torbjörn Nur Olsson, Maja Jensen, Maria-Jose Garcia-Bonete, Sally P Wheatley, Maria I Bokarewa, Stefano A Mezzasalma and Gergely Katona
{"title":"Deciphering peptide-protein interactions via composition-based prediction: a case study with survivin/BIRC5","authors":"Atsarina Larasati Anindya, Torbjörn Nur Olsson, Maja Jensen, Maria-Jose Garcia-Bonete, Sally P Wheatley, Maria I Bokarewa, Stefano A Mezzasalma and Gergely Katona","doi":"10.1088/2632-2153/ad5784","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5784","url":null,"abstract":"In the realm of atomic physics and chemistry, composition emerges as the most powerful means of describing matter. Mendeleev’s periodic table and chemical formulas, while not entirely free from ambiguities, provide robust approximations for comprehending the properties of atoms, chemicals, and their collective behaviours, which stem from the dynamic interplay of their constituents. Our study illustrates that protein-protein interactions follow a similar paradigm, wherein the composition of peptides plays a pivotal role in predicting their interactions with the protein survivin, using an elegantly simple model. An analysis of these predictions within the context of the human proteome not only confirms the known cellular locations of survivin and its interaction partners, but also introduces novel insights into biological functionality. It becomes evident that electrostatic- and primary structure-based descriptions fall short in predictive power, leading us to speculate that protein interactions are orchestrated by the collective dynamics of functional groups.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141519229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enrico Ventura, Simona Cocco, Rémi Monasson and Francesco Zamponi
{"title":"Unlearning regularization for Boltzmann machines","authors":"Enrico Ventura, Simona Cocco, Rémi Monasson and Francesco Zamponi","doi":"10.1088/2632-2153/ad5a5f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5a5f","url":null,"abstract":"Boltzmann machines (BMs) are graphical models with interconnected binary units, employed for the unsupervised modeling of data distributions. When trained on real data, BMs show the tendency to behave like critical systems, displaying a high susceptibility of the model under a small rescaling of the inferred parameters. This behavior is not convenient for the purpose of generating data, because it slows down the sampling process, and induces the model to overfit the training-data. In this study, we introduce a regularization method for BMs to improve the robustness of the model under rescaling of the parameters. The new technique shares formal similarities with the unlearning algorithm, an iterative procedure used to improve memory associativity in Hopfield-like neural networks. We test our unlearning regularization on synthetic data generated by two simple models, the Curie–Weiss ferromagnetic model and the Sherrington–Kirkpatrick spin glass model. We show that it outperforms Lp-norm schemes and discuss the role of parameter initialization. Eventually, the method is applied to learn the activity of real neuronal cells, confirming its efficacy at shifting the inferred model away from criticality and coming out as a powerful candidate for actual scientific implementations.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141532643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unification of symmetries inside neural networks: transformer, feedforward and neural ODE","authors":"Koji Hashimoto, Yuji Hirono and Akiyoshi Sannai","doi":"10.1088/2632-2153/ad5927","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5927","url":null,"abstract":"Understanding the inner workings of neural networks, including transformers, remains one of the most challenging puzzles in machine learning. This study introduces a novel approach by applying the principles of gauge symmetries, a key concept in physics, to neural network architectures. By regarding model functions as physical observables, we find that parametric redundancies of various machine learning models can be interpreted as gauge symmetries. We mathematically formulate the parametric redundancies in neural ODEs, and find that their gauge symmetries are given by spacetime diffeomorphisms, which play a fundamental role in Einstein’s theory of gravity. Viewing neural ODEs as a continuum version of feedforward neural networks, we show that the parametric redundancies in feedforward neural networks are indeed lifted to diffeomorphisms in neural ODEs. We further extend our analysis to transformer models, finding natural correspondences with neural ODEs and their gauge symmetries. The concept of gauge symmetries sheds light on the complex behavior of deep learning models through physics and provides us with a unifying perspective for analyzing various machine learning architectures.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141504571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biling Wang, Michael Dohopolski, Ti Bai, Junjie Wu, Raquibul Hannan, Neil Desai, Aurelie Garant, Daniel Yang, Dan Nguyen, Mu-Han Lin, Robert Timmerman, Xinlei Wang and Steve B Jiang
{"title":"Performance deterioration of deep learning models after clinical deployment: a case study with auto-segmentation for definitive prostate cancer radiotherapy","authors":"Biling Wang, Michael Dohopolski, Ti Bai, Junjie Wu, Raquibul Hannan, Neil Desai, Aurelie Garant, Daniel Yang, Dan Nguyen, Mu-Han Lin, Robert Timmerman, Xinlei Wang and Steve B Jiang","doi":"10.1088/2632-2153/ad580f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad580f","url":null,"abstract":"Our study aims to explore the long-term performance patterns for deep learning (DL) models deployed in clinic and to investigate their efficacy in relation to evolving clinical practices. We conducted a retrospective study simulating the clinical implementation of our DL model involving 1328 prostate cancer patients treated between January 2006 and August 2022. We trained and validated a U-Net-based auto-segmentation model on data obtained from 2006 to 2011 and tested on data from 2012 to 2022, simulating the model’s clinical deployment starting in 2012. We visualized the trends of the model performance using exponentially weighted moving average (EMA) curves. Additionally, we performed Wilcoxon Rank Sum Test and multiple linear regression to investigate Dice similarity coefficient (DSC) variations across distinct periods and the impact of clinical factors, respectively. Initially, from 2012 to 2014, the model showed high performance in segmenting the prostate, rectum, and bladder. Post-2015, a notable decline in EMA DSC was observed for the prostate and rectum, while bladder contours remained stable. Key factors impacting the prostate contour quality included physician contouring styles, using various hydrogel spacers, CT scan slice thickness, MRI-guided contouring, and intravenous (IV) contrast (p < 0.0001, p < 0.0001, p = 0.0085, p = 0.0012, p < 0.0001, respectively). Rectum contour quality was notably influenced by factors such as slice thickness, physician contouring styles, and the use of various hydrogel spacers. The quality of the bladder contour was primarily affected by IV contrast. The deployed DL model exhibited a substantial decline in performance over time, aligning with the evolving clinical settings.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141504572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finetuning foundation models for joint analysis optimization in High Energy Physics","authors":"Matthias Vigl, Nicole Hartman and Lukas Heinrich","doi":"10.1088/2632-2153/ad55a3","DOIUrl":"https://doi.org/10.1088/2632-2153/ad55a3","url":null,"abstract":"In this work we demonstrate that significant gains in performance and data efficiency can be achieved in High Energy Physics (HEP) by moving beyond the standard paradigm of sequential optimization or reconstruction and analysis components. We conceptually connect HEP reconstruction and analysis to modern machine learning workflows such as pretraining, finetuning, domain adaptation and high-dimensional embedding spaces and quantify the gains in the example usecase of searches of heavy resonances decaying via an intermediate di-Higgs system to four b-jets. To our knowledge this is the first example of a low-level feature extraction network finetuned for a downstream HEP analysis objective.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141519230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse autoregressive neural networks for classical spin systems","authors":"Indaco Biazzo, Dian Wu and Giuseppe Carleo","doi":"10.1088/2632-2153/ad5783","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5783","url":null,"abstract":"Efficient sampling and approximation of Boltzmann distributions involving large sets of binary variables, or spins, are pivotal in diverse scientific fields even beyond physics. Recent advances in generative neural networks have significantly impacted this domain. However, these neural networks are often treated as black boxes, with architectures primarily influenced by data-driven problems in computational science. Addressing this gap, we introduce a novel autoregressive neural network architecture named TwoBo, specifically designed for sparse two-body interacting spin systems. We directly incorporate the Boltzmann distribution into its architecture and parameters, resulting in enhanced convergence speed, superior free energy accuracy, and reduced trainable parameters. We perform numerical experiments on disordered, frustrated systems with more than 1000 spins on grids and random graphs, and demonstrate its advantages compared to previous autoregressive and recurrent architectures. Our findings validate a physically informed approach and suggest potential extensions to multivalued variables and many-body interaction systems, paving the way for broader applications in scientific research.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141531276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander Luce, Rasoul Alaee, Fabian Knorr and Florian Marquardt
{"title":"Merging automatic differentiation and the adjoint method for photonic inverse design","authors":"Alexander Luce, Rasoul Alaee, Fabian Knorr and Florian Marquardt","doi":"10.1088/2632-2153/ad5411","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5411","url":null,"abstract":"Optimizing the shapes and topology of physical devices is crucial for both scientific and technological advancements, given their wide-ranging implications across numerous industries and research areas. Innovations in shape and topology optimization have been observed across a wide range of fields, notably structural mechanics, fluid mechanics, and more recently, photonics. Gradient-based inverse design techniques have been particularly successful for photonic and optical problems, resulting in integrated, miniaturized hardware that has set new standards in device performance. To calculate the gradients, there are typically two approaches: namely, either by implementing specialized solvers using automatic differentiation (AD) or by deriving analytical solutions for gradient calculation and adjoint sources by hand. In this work, we propose a middle ground and present a hybrid approach that leverages and enables the benefits of AD for handling gradient derivation while using existing, proven but black-box photonic solvers for numerical solutions. Utilizing the adjoint method, we make existing numerical solvers differentiable and seamlessly integrate them into an AD framework. Further, this enables users to integrate the optimization environment seamlessly with other autodifferentiable components such as machine learning, geometry generation, or intricate post-processing which could lead to better photonic design workflows. We illustrate the approach through two distinct photonic optimization problems: optimizing the Purcell factor of a magnetic dipole in the vicinity of an optical nanocavity and enhancing the light extraction efficiency of a µLED.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141504573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning methods for Hamiltonian parameter estimation and magnetic domain image generation in twisted van der Waals magnets","authors":"Woo Seok Lee, Taegeun Song and Kyoung-Min Kim","doi":"10.1088/2632-2153/ad56fa","DOIUrl":"https://doi.org/10.1088/2632-2153/ad56fa","url":null,"abstract":"The application of twist engineering in van der Waals magnets has opened new frontiers in the field of two-dimensional magnetism, yielding distinctive magnetic domain structures. Despite the introduction of numerous theoretical methods, limitations persist in terms of accuracy or efficiency due to the complex nature of the magnetic Hamiltonians pertinent to these systems. In this study, we introduce a deep-learning approach to tackle these challenges. Utilizing customized, fully connected networks, we develop two deep-neural-network kernels that facilitate efficient and reliable analysis of twisted van der Waals magnets. Our regression model is adept at estimating the magnetic Hamiltonian parameters of twisted bilayer CrI3 from its magnetic domain images generated through atomistic spin simulations. The ‘generative model’ excels in producing precise magnetic domain images from the provided magnetic parameters. The trained networks for these models undergo thorough validation, including statistical error analysis and assessment of robustness against noisy injections. These advancements not only extend the applicability of deep-learning methods to twisted van der Waals magnets but also streamline future investigations into these captivating yet poorly understood systems.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141519231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kevin Otto, Simon Burgis, Kristian Kersting, Reinhold Bertrand and Devendra Singh Dhami
{"title":"Machine learning meets Kepler: inverting Kepler’s equation for All vs All conjunction analysis","authors":"Kevin Otto, Simon Burgis, Kristian Kersting, Reinhold Bertrand and Devendra Singh Dhami","doi":"10.1088/2632-2153/ad51cc","DOIUrl":"https://doi.org/10.1088/2632-2153/ad51cc","url":null,"abstract":"The number of satellites in orbit around Earth is increasing rapidly, with the risk of collision rising accordingly. Trends of the global population of satellites need to be analyzed to test the viability and impact of proposed rules and laws affecting the satellite population and collision avoidance strategies. This requires large scale simulations of satellites that are propagated on long timescales to compute the large amounts of actionable close encounters (called conjunctions), which could lead to collisions. Rigorously checking for conjunctions by computing future states of orbits is computationally expensive due to the large amount of objects involved and conjunction filters are thus used to remove non-conjuncting orbit pairs from the list of possible conjunctions. In this work, we explore the possibility of machine learning (ML) based conjunction filters using several algorithms such as eXtreme Gradient Boosting, TabNet and (physics-informed) neural networks and deep operator networks. To show the viability and the potential of ML based filters, these algorithms are trained to predict the future state of orbits. For the physics-informed approaches, multiple partial differential equations are set up using the Kepler equation as a basis. The empirical results demonstrate that physics-informed deep operator networks are capable of predicting the future state of orbits using these equations (RMSE: 0.136) and outperform eXtreme Gradient Boosting (RMSE: 0.568) and TabNet (RMSE: 0.459). We also propose a filter based on the trained deep operator network which is shown to outperforms the filter capability of the commonly used perigee-apogee test and the orbit path filter on a synthetic dataset, while being on average 3.2 times faster to compute than a rigorous conjunction check.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141519160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ammar Sherif, Abubakar Abid, Mustafa Elattar and Mohamed ElHelw
{"title":"STG-MTL: scalable task grouping for multi-task learning using data maps","authors":"Ammar Sherif, Abubakar Abid, Mustafa Elattar and Mohamed ElHelw","doi":"10.1088/2632-2153/ad4e04","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4e04","url":null,"abstract":"Multi-Task Learning (MTL) is a powerful technique that has gained popularity due to its performance improvement over traditional Single-Task Learning (STL). However, MTL is often challenging because there is an exponential number of possible task groupings, which can make it difficult to choose the best one because some groupings might produce performance degradation due to negative interference between tasks. That is why existing solutions are severely suffering from scalability issues, limiting any practical application. In our paper, we propose a new data-driven method that addresses these challenges and provides a scalable and modular solution for classification task grouping based on a re-proposed data-driven features, Data Maps, which capture the training dynamics for each classification task during the MTL training. Through a theoretical comparison with other techniques, we manage to show that our approach has the superior scalability. Our experiments show a better performance and verify the method’s effectiveness, even on an unprecedented number of tasks (up to 100 tasks on CIFAR100). Being the first to work on such number of tasks, our comparisons on the resulting grouping shows similar grouping to the mentioned in the dataset, CIFAR100. Finally, we provide a modular implementation3for easier integration and testing, with examples from multiple datasets and tasks.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.8,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141519159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}