Machine Learning Science and Technology: Latest Articles

Mamba time series forecasting with uncertainty quantification.
IF 6.3 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2025-09-30 | Epub Date: 2025-07-22 | DOI: 10.1088/2632-2153/adec3b
Pedro Pessoa, Paul Campitelli, Douglas P Shepherd, S Banu Ozkan, Steve Pressé
Abstract: State space models such as Mamba have recently garnered attention in time series forecasting (TSF) due to their ability to capture sequence patterns. However, in electricity consumption benchmarks, Mamba forecasts exhibit a mean error of approximately 8%, and in traffic occupancy benchmarks the mean error reaches 18%. This raises the question of whether a prediction is simply inaccurate or falls within the expected error given the spread in the historical data. To address this limitation, we propose a method to quantify the predictive uncertainty of Mamba forecasts: a dual-network framework based on the Mamba architecture for probabilistic forecasting, in which one network generates point forecasts while the other estimates predictive uncertainty by modeling the variance. We abbreviate our tool, Mamba with probabilistic TSF, as Mamba-ProbTSF; the implementation is available on GitHub at https://github.com/PessoaP/Mamba-ProbTSF. Evaluating this approach on synthetic and real-world benchmark datasets, we find that the Kullback-Leibler divergence between the learned distributions and the data (which, in the limit of infinite data, should converge to zero if the model correctly captures the underlying probability distribution) is reduced to the order of 10^-3 for synthetic data and 10^-1 for the real-world benchmarks. In both the electricity consumption and traffic occupancy benchmarks, the true trajectory stays within the predicted uncertainty interval at the two-sigma level about 95% of the time. We further compare Mamba-ProbTSF against leading probabilistic forecasting methods, DeepAR and ARIMA, and show that our method consistently achieves lower forecast errors while offering more reliable uncertainty quantification. We end with a consideration of potential limitations, adjustments to improve performance, and considerations for applying this framework to purely or largely stochastic processes where the stochastic changes accumulate as observed, for example in pure Brownian motion or molecular dynamics trajectories.
Vol. 6(3), 035012. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12281171/pdf/
Citations: 0
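The two calibration checks mentioned in the abstract (empirical two-sigma coverage and a likelihood view of the variance-estimating network) can be sketched on toy data. This is not the authors' code; the Gaussian predictive form and the stand-in arrays below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the dual-network outputs: mu from the point-forecast
# network, sigma from the variance network (both assumed, not from the paper).
n = 100_000
truth = rng.normal(0.0, 1.0, n)   # stand-in for the true trajectory
mu = np.zeros(n)                  # point forecasts
sigma = np.ones(n)                # predicted standard deviations

# Empirical two-sigma coverage: fraction of truth inside mu +/- 2*sigma.
# A well-calibrated Gaussian forecast gives roughly 95.4%.
coverage = np.mean(np.abs(truth - mu) <= 2.0 * sigma)

# Gaussian negative log-likelihood, the usual training loss when one head
# predicts the mean and another the variance.
nll = np.mean(0.5 * (np.log(2.0 * np.pi * sigma**2)
                     + (truth - mu) ** 2 / sigma**2))

print(f"coverage = {coverage:.3f}, NLL = {nll:.3f}")
```

For a well-calibrated model the coverage should sit near 0.954, which is the check behind the "about 95% of the time" statement in the abstract.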
32 examples of LLM applications in materials science and chemistry: towards automation, assistants, agents, and accelerated scientific discovery.
IF 4.6 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2025-09-30 | Epub Date: 2025-09-29 | DOI: 10.1088/2632-2153/ae011a
Yoel Zimmermann, Adib Bazgir, Alexander Al-Feghali, Mehrad Ansari, Joshua Bocarsly, L Catherine Brinson, Yuan Chiang, Defne Circi, Min-Hsueh Chiu, Nathan Daelman, Matthew L Evans, Abhijeet S Gangan, Janine George, Hassan Harb, Ghazal Khalighinejad, Sartaaj Takrim Khan, Sascha Klawohn, Magdalena Lederbauer, Soroush Mahjoubi, Bernadette Mohr, Seyed Mohamad Moosavi, Aakash Naik, Aleyna Beste Ozhan, Dieter Plessers, Aritra Roy, Fabian Schöppach, Philippe Schwaller, Carla Terboven, Katharina Ueltzen, Yue Wu, Shang Zhu, Jan Janssen, Calvin Li, Ian Foster, Ben Blaiszik
Abstract: Large language models (LLMs) are reshaping many aspects of materials science and chemistry research, enabling advances in molecular property prediction, materials design, scientific automation, knowledge extraction, and more. Recent developments demonstrate that the latest class of models can integrate structured and unstructured data, assist in hypothesis generation, and streamline research workflows. To explore the frontier of LLM capabilities across the research lifecycle, we review applications of LLMs through 32 projects developed during the second annual LLM hackathon for applications in materials science and chemistry, a global hybrid event. These projects spanned seven key research areas: (1) molecular and material property prediction, (2) molecular and material design, (3) automation and novel interfaces, (4) scientific communication and education, (5) research data management and automation, (6) hypothesis generation and evaluation, and (7) knowledge extraction and reasoning from the scientific literature. Collectively, these applications illustrate how LLMs serve as versatile predictive models, platforms for rapid prototyping of domain-specific tools, and much more. In particular, improvements in both open-source and proprietary LLM performance through the addition of reasoning, additional training data, and new techniques have expanded their effectiveness, particularly in low-data environments and interdisciplinary research. As LLMs continue to improve, their integration into scientific workflows presents both new opportunities and new challenges, requiring ongoing exploration, continued refinement, and further research to address reliability, interpretability, and reproducibility.
Vol. 6(3), 030701. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12492978/pdf/
Citations: 0
Beyond Euclid: an illustrated guide to modern machine learning with geometric, topological, and algebraic structures.
IF 4.6 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2025-09-30 | Epub Date: 2025-08-01 | DOI: 10.1088/2632-2153/adf375
Mathilde Papillon, Sophia Sanborn, Johan Mathe, Louisa Cornelis, Abby Bertics, Domas Buracas, Hansen J Lillemark, Christian Shewmake, Fatih Dinc, Xavier Pennec, Nina Miolane
Abstract: The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been developed primarily for data lying in Euclidean space. Yet modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. Such data can exhibit intricate geometric, topological, and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing the symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We then extract insights into current challenges and highlight exciting opportunities for future development in this field.
Vol. 6(3), 031002. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12315666/pdf/
Citations: 0
Prior guided deep difference meta-learner for fast adaptation to stylized segmentation.
IF 6.3 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2025-06-30 | Epub Date: 2025-04-16 | DOI: 10.1088/2632-2153/adc970
Dan Nguyen, Anjali Balagopal, Ti Bai, Michael Dohopolski, Mu-Han Lin, Steve Jiang
Abstract: Radiotherapy treatment planning requires segmenting anatomical structures in various styles, influenced by guidelines, protocols, clinician preferences, or dose-planning needs. Deep learning-based auto-segmentation models, trained on fixed anatomical definitions, may not match local clinicians' styles at new institutions, and adapting these models can be challenging without sufficient resources. We hypothesize that consistent differences between segmentation styles and anatomical definitions can be learned from a few initial patients and applied to pre-trained models for more precise segmentation. We propose a Prior-guided deep difference meta-learner (DDL) to learn and adapt these differences. We collected data from 440 patients for model development and 30 for testing. The dataset includes contours of the prostate clinical target volume (CTV), parotid, and rectum. We developed a deep learning framework that segments new images in a matching style, using example styles as a prior, without model retraining. The pre-trained segmentation models were adapted to three different clinician styles for post-operative prostate CTV, parotid gland, and rectum segmentation. We tested the model's ability to learn unseen styles and compared its performance with transfer learning, using varying amounts of prior patient style data (0-10 patients). Performance was quantitatively evaluated using the Dice similarity coefficient (DSC) and the Hausdorff distance. With exposure to only three patients, the average DSC (%) improved from 78.6, 71.9, 63.0, 69.6, 52.2, and 46.3 to 84.4, 77.8, 73.0, 77.8, 70.5, and 68.1 for CTV_style1, CTV_style2, CTV_style3, Parotid_superficial, Rectum_superior, and Rectum_posterior, respectively. The proposed Prior-guided DDL is a fast and effortless network for adapting a structure to new styles. The improved segmentation accuracy may reduce contour-editing time, providing a more efficient and streamlined clinical workflow.
Vol. 6(2), 025016. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12001319/pdf/
Citations: 0
Quality assurance for online adaptive radiotherapy: a secondary dose verification model with geometry-encoded U-Net.
IF 6.3 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2024-12-01 | Epub Date: 2024-10-11 | DOI: 10.1088/2632-2153/ad829e
Shunyu Yan, Austen Maniscalco, Biling Wang, Dan Nguyen, Steve Jiang, Chenyang Shen
Abstract: In online adaptive radiotherapy (ART), quick computation-based secondary dose verification is crucial for ensuring the quality of ART plans while the patient is positioned on the treatment couch. However, traditional dose verification algorithms are generally time-consuming, reducing the efficiency of the ART workflow. This study aims to develop an ultra-fast deep-learning (DL) based secondary dose verification algorithm that accurately estimates dose distributions from computed tomography (CT) images and fluence maps (FMs). We integrated FMs into the CT image domain by explicitly resolving the geometry of treatment delivery. For each gantry angle, an FM was constructed from the optimized multi-leaf collimator apertures and the corresponding monitor units. To effectively encode the treatment beam configuration, the constructed FMs were back-projected to 30 cm from the isocenter with respect to the exact geometry of the treatment machines. A 3D U-Net then took the integrated CT and FM volume as input to estimate dose. Training and validation were performed on 381 prostate cancer cases, with an additional 40 testing cases for independent evaluation of model performance. The proposed model can estimate dose in about 15 ms per patient. The average γ passing rate (3%/2 mm, 10% threshold) for the estimated dose was 99.9% ± 0.15% on the testing patients. The mean dose differences for the planning target volume and the organs at risk were 0.07% ± 0.34% and 0.48% ± 0.72%, respectively. We have developed a geometry-resolved DL framework for accurate dose estimation and demonstrated its potential for real-time secondary dose verification in online ART.
Vol. 5(4), 045013. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11467776/pdf/
Citations: 0
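The γ passing rate quoted above combines a dose-difference criterion (3%) with a distance-to-agreement criterion (2 mm). A minimal 1D version of the test can be sketched as follows; this is not the authors' implementation, and `gamma_pass_rate` with its default arguments is a hypothetical helper for illustration:

```python
import numpy as np

def gamma_pass_rate(ref, ev, dd=0.03, dta=2.0, spacing=1.0, low_cut=0.10):
    """Simplified global 1D gamma analysis (illustrative sketch).

    dd      : dose-difference criterion, fraction of the max reference dose
    dta     : distance-to-agreement criterion in mm
    spacing : grid spacing in mm
    low_cut : exclude points below this fraction of the max dose
    """
    ref, ev = np.asarray(ref, float), np.asarray(ev, float)
    dmax = ref.max()
    pos = np.arange(ref.size) * spacing
    gammas = []
    for i in np.flatnonzero(ref >= low_cut * dmax):
        # gamma at i: minimum combined dose/distance mismatch over eval points
        dose_term = (ev - ref[i]) / (dd * dmax)
        dist_term = (pos - pos[i]) / dta
        gammas.append(np.sqrt(dose_term**2 + dist_term**2).min())
    # a point "passes" when its gamma is at most 1
    return 100.0 * np.mean(np.asarray(gammas) <= 1.0)

ref = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])
print(gamma_pass_rate(ref, ref))  # identical doses: prints 100.0
```

Clinical gamma analysis is 3D and interpolates between grid points, so real passing rates such as the 99.9% above come from considerably more careful implementations.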
Equivariant tensor network potentials
IF 6.8 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2024-09-18 | DOI: 10.1088/2632-2153/ad79b5
M Hodapp and A Shapeev
Abstract: Machine-learning interatomic potentials (MLIPs) have contributed significantly to recent progress in computational materials science and chemistry, owing to their ability to accurately approximate the energy landscapes of quantum-mechanical models while being orders of magnitude more computationally efficient. However, the computational cost and parameter count of many state-of-the-art MLIPs grow exponentially with the number of atomic features. Tensor (non-neural) networks, based on low-rank representations of high-dimensional tensors, offer a way to reduce the number of parameters when approximating multidimensional functions, but it is often not easy to encode model symmetries into them. In this work we develop a formalism for rank-efficient equivariant tensor networks (ETNs), i.e. tensor networks that remain invariant under actions of SO(3) upon contraction. All the key tensor network algorithms, such as orthogonalization of cores and DMRG-based algorithms, carry over to our equivariant case. Moreover, we show that many elements of modern neural network architectures, such as message passing, pooling, and attention mechanisms, can in some form be implemented in ETNs. Based on ETNs, we develop a new class of polynomial-based MLIPs that demonstrate superior performance over existing MLIPs for multicomponent systems.
Citations: 0
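The parameter savings that motivate tensor networks can be made concrete with a tensor-train (TT) parameter count; the sizes below are arbitrary, and this illustrates only the generic low-rank format, not the equivariant ETN formalism of the paper:

```python
# Parameter count of a full d-way tensor vs its tensor-train (TT) format.
# Sizes are illustrative only.
d, n, r = 10, 8, 4          # number of modes, mode size, TT-rank

full_params = n ** d        # dense tensor: exponential in d
# TT cores: one (n x r), d-2 of (r x n x r), one (r x n): linear in d
tt_params = n * r + (d - 2) * r * n * r + r * n

print(full_params, tt_params)   # 1073741824 vs 1088
```

The roughly million-fold reduction is what makes contracted low-rank formats attractive once model symmetries (here, SO(3) equivariance) can be preserved.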
Optimizing ZX-diagrams with deep reinforcement learning
IF 6.8 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2024-09-18 | DOI: 10.1088/2632-2153/ad76f7
Maximilian Nägele and Florian Marquardt
Abstract: ZX-diagrams are a powerful graphical language for describing quantum processes, with applications in fundamental quantum mechanics, quantum circuit optimization, tensor network simulation, and many more. The utility of ZX-diagrams relies on a set of local transformation rules that can be applied without changing the underlying quantum process being described. These rules can be exploited to optimize the structure of ZX-diagrams for a range of applications, but finding an optimal sequence of transformations is generally an open problem. In this work, we bring together ZX-diagrams and reinforcement learning, a machine learning technique designed to discover an optimal sequence of actions in a decision-making problem. We show that a trained reinforcement learning agent can significantly outperform other optimization techniques such as a greedy strategy, simulated annealing, and state-of-the-art hand-crafted algorithms. The use of graph neural networks to encode the agent's policy enables generalization to diagrams much larger than those seen during training.
Citations: 0
DiffLense: a conditional diffusion model for super-resolution of gravitational lensing data
IF 6.8 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2024-09-18 | DOI: 10.1088/2632-2153/ad76f8
Pranath Reddy, Michael W Toomey, Hanna Parul and Sergei Gleyzer
Abstract: Gravitational lensing data is frequently collected at low resolution due to instrumental limitations and observing conditions. Machine learning-based super-resolution techniques offer a way to enhance the resolution of these images, enabling more precise measurements of lensing effects and a better understanding of the matter distribution in the lensing system. This enhancement can significantly improve our knowledge of the mass distribution within the lensing galaxy and its environment, as well as the properties of the background source being lensed. Traditional super-resolution techniques typically learn a mapping from lower- to higher-resolution samples, but these methods are often constrained by their dependence on optimizing a fixed distance function, which can result in the loss of intricate details crucial for astrophysical analysis. In this work, we introduce DiffLense, a novel super-resolution pipeline based on a conditional diffusion model, specifically designed to enhance the resolution of gravitational lensing images obtained from the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP). Our approach adopts a generative model that leverages the detailed structural information present in Hubble Space Telescope (HST) counterparts. The diffusion model, trained to generate HST data, is conditioned on HSC data pre-processed with denoising and thresholding to significantly reduce noise and background interference, leading to a more distinct and less overlapping conditional distribution during training. We demonstrate that DiffLense outperforms existing state-of-the-art single-image super-resolution techniques, particularly in retaining the fine details necessary for astrophysical analyses.
Citations: 0
Masked particle modeling on sets: towards self-supervised high energy physics foundation models
IF 6.8 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2024-09-16 | DOI: 10.1088/2632-2153/ad64a8
Tobias Golling, Lukas Heinrich, Michael Kagan, Samuel Klein, Matthew Leigh, Margarita Osadchy and John Andrew Raine
Abstract: We propose masked particle modeling (MPM) as a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs, for use with high energy physics (HEP) scientific data. This work provides a novel scheme for masked-modeling-based pre-training that learns permutation-invariant functions on sets. More generally, it is a step towards building large foundation models for HEP that can be generically pre-trained with self-supervised learning and later fine-tuned for a variety of downstream tasks. In MPM, particles in a set are masked, and the training objective is to recover their identity, as defined by a discretized token representation from a pre-trained vector-quantized variational autoencoder. We study the efficacy of the method on samples of high energy jets at collider physics experiments, including studies on the impact of discretization, permutation invariance, and ordering. We also study the fine-tuning capability of the model, showing that it can be adapted to tasks such as supervised and weakly supervised jet classification, and that it transfers efficiently to new classes and new data domains with small fine-tuning data sets.
Citations: 0
Transforming the bootstrap: using transformers to compute scattering amplitudes in planar N =...
IF 6.8 | CAS Zone 2 | Physics and Astrophysics
Machine Learning Science and Technology | Pub Date: 2024-09-15 | DOI: 10.1088/2632-2153/ad743e
Tianji Cai, Garrett W Merz, François Charton, Niklas Nolte, Matthias Wilhelm, Kyle Cranmer and Lance J Dixon
Abstract: We pursue the use of deep learning methods to improve state-of-the-art computations in theoretical high-energy physics. Planar super-Yang-Mills theory is a close cousin of the theory that describes Higgs boson production at the Large Hadron Collider; its scattering amplitudes are large mathematical expressions containing integer coefficients. In this paper, we apply transformers to predict these coefficients. The problem can be formulated in a language-like representation amenable to standard cross-entropy training objectives. We design two related experiments and show that the model achieves high accuracy on both tasks. Our work shows that transformers can be applied successfully to problems in theoretical physics that require exact solutions.
Citations: 0