{"title":"Machine learning from limited data: Predicting biological dynamics under a time-varying external input","authors":"Hoony Kang, Keshav Srinivasan, Wolfgang Losert","doi":"arxiv-2408.07998","DOIUrl":"https://doi.org/arxiv-2408.07998","url":null,"abstract":"Reservoir computing (RC) is known as a powerful machine learning approach for learning complex dynamics from limited data. Here, we use RC to predict highly stochastic dynamics of cell shapes. We find that RC is able to predict the steady state climate from very limited data. Furthermore, the RC learns the timescale of transients from only four observations. We find that these capabilities of the RC to act as a dynamic twin allows us to also infer important statistics of cell shape dynamics of unobserved conditions.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Behavioral AI: Reinforcement Learning to Enhance Pharmacy Services","authors":"Ana Fernández del Río, Michael Brennan Leong, Paulo Saraiva, Ivan Nazarov, Aditya Rastogi, Moiz Hassan, Dexian Tang, África Periáñez","doi":"arxiv-2408.07647","DOIUrl":"https://doi.org/arxiv-2408.07647","url":null,"abstract":"Pharmacies are critical in healthcare systems, particularly in low- and middle-income countries. Procuring pharmacists with the right behavioral interventions or nudges can enhance their skills, public health awareness, and pharmacy inventory management, ensuring access to essential medicines that ultimately benefit their patients. We introduce a reinforcement learning operational system to deliver personalized behavioral interventions through mobile health applications. We illustrate its potential by discussing a series of initial experiments run with SwipeRx, an all-in-one app for pharmacists, including B2B e-commerce, in Indonesia. The proposed method has broader applications extending beyond pharmacy operations to optimize healthcare delivery.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian analysis of nucleon-nucleon scattering data in pionless effective field theory","authors":"J. M. Bub, M. Piarulli, R. J. Furnstahl, S. Pastore, D. R. Phillips","doi":"arxiv-2408.02480","DOIUrl":"https://doi.org/arxiv-2408.02480","url":null,"abstract":"We perform Bayesian model calibration of two-nucleon ($NN$) low-energy constants (LECs) appearing in an $NN$ interaction based on pionless effective field theory (EFT). The calibration is carried out for potentials constructed using naive dimensional analysis in $NN$ relative momenta ($p$) up to next-to-leading order [NLO, $O(p^2)$] and next-to-next-to-next-to-leading order [N3LO, $O(p^4)$]. We consider two classes of pionless EFT potential: one that acts in all partial waves and another that is dominated by $s$-wave physics. The two classes produce broadly similar results for calibrations to $NN$ data up to $E_{\\rm lab}=5$ MeV. Our analysis accounts for the correlated uncertainties that arise from the truncation of the pionless EFT. We simultaneously estimate both the EFT LECs and the parameters that quantify the truncation error. This permits the first quantitative estimates of the pionless EFT breakdown scale, $\\Lambda_b$: the 95% intervals are $\\Lambda_b \\in [50.11,63.03]$ MeV at NLO and $\\Lambda_b \\in [72.27, 88.54]$ MeV at N3LO. Invoking naive dimensional analysis for the $NN$ potential, therefore, does not lead to consistent results across orders in pionless EFT. This exemplifies the possible use of Bayesian tools to identify inconsistencies in a proposed EFT power counting.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"57 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141934987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example","authors":"Johannes Erdmann, Florian Mausolf, Jan Lukas Späh","doi":"arxiv-2408.02743","DOIUrl":"https://doi.org/arxiv-2408.02743","url":null,"abstract":"Recently, Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons, suggesting advantages in performance and interpretability. We study a typical binary event classification task in high-energy physics including high-level features and comment on the performance and interpretability of KANs in this context. We find that the learned activation functions of a one-layer KAN resemble the log-likelihood ratio of the input features. In deeper KANs, the activations in the first KAN layer differ from those in the one-layer KAN, which indicates that the deeper KANs learn more complex representations of the data. We study KANs with different depths and widths and we compare them to multilayer perceptrons in terms of performance and number of trainable parameters. For the chosen classification task, we do not find that KANs are more parameter efficient. However, small KANs may offer advantages in terms of interpretability that come at the cost of only a moderate loss in performance.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"112 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141935078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On marginals and profiled posteriors for cosmological parameter estimation","authors":"Martin Kerscher, Jochen Weller","doi":"arxiv-2408.02063","DOIUrl":"https://doi.org/arxiv-2408.02063","url":null,"abstract":"With several examples and in an analysis of the Pantheon+ supernova sample we discuss the properties of the marginal posterior distribution versus the profiled posterior distribution -- the profile likelihood in a Bayesian disguise. We investigate whether maximisation, as used for the profiling, or integration, as used for the marginalisation, is more appropriate. To report results we recommend the marginal posterior distribution.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141935079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TrackSorter: A Transformer-based sorting algorithm for track finding in High Energy Physics","authors":"Yash Melkani, Xiangyang Ju","doi":"arxiv-2407.21290","DOIUrl":"https://doi.org/arxiv-2407.21290","url":null,"abstract":"Track finding in particle data is a challenging pattern recognition problem in High Energy Physics. It takes as inputs a point cloud of space points and labels them so that space points created by the same particle have the same label. The list of space points with the same label is a track candidate. We argue that this pattern recognition problem can be formulated as a sorting problem, of which the inputs are a list of space points sorted by their distances away from the collision points and the outputs are the space points sorted by their labels. In this paper, we propose the TrackSorter algorithm: a Transformer-based algorithm for pattern recognition in particle data. TrackSorter uses a simple tokenization scheme to convert space points into discrete tokens. It then uses the tokenized space points as inputs and sorts the input tokens into track candidates. TrackSorter is a novel end-to-end track finding algorithm that leverages Transformer-based models to solve pattern recognition problems. It is evaluated on the TrackML dataset and has good track finding performance.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"413 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low dimensional fragment-based descriptors for property predictions in inorganic materials with machine learning","authors":"Md Mohaiminul Islam","doi":"arxiv-2407.21146","DOIUrl":"https://doi.org/arxiv-2407.21146","url":null,"abstract":"In recent times, the use of machine learning in materials design and discovery has aided to accelerate the discovery of innovative materials with extraordinary properties, which otherwise would have been driven by a laborious and time-consuming trial-and-error process. In this study, a simple yet powerful fragment-based descriptor, Low Dimensional Fragment Descriptors (LDFD), is proposed to work in conjunction with machine learning models to predict important properties of a wide range of inorganic materials such as perovskite oxides, metal halide perovskites, alloys, semiconductor, and other materials system and can also be extended to work with interfaces. To predict properties, the generation of descriptors requires only the structural formula of the materials and, in presence of identical structure in the dataset, additional system properties as input. And the generation of descriptors involves few steps, encoding the formula in binary space and reduction of dimensionality, allowing easy implementation and prediction. To evaluate descriptor performance, six known datasets with up to eight components were compared. The method was applied to properties such as band gaps of perovskites and semiconductors, lattice constant of magnetic alloys, bulk/shear modulus of superhard alloys, critical temperature of superconductors, formation enthalpy and energy above hull convex of perovskite oxides. An advanced python-based data mining tool matminer was utilized for the collection of data. The prediction accuracies are equivalent to the quality of the training data and show comparable effectiveness as previous studies. This method should be extendable to any inorganic material systems which can be subdivided into layers or crystal structures with more than one atom site, and with the progress of data mining the performance should get better with larger and unbiased datasets.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"263 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Review Generation Method Based on Large Language Models","authors":"Shican Wu, Xiao Ma, Dehui Luo, Lulu Li, Xiangcheng Shi, Xin Chang, Xiaoyun Lin, Ran Luo, Chunlei Pei, Zhi-Jian Zhao, Jinlong Gong","doi":"arxiv-2407.20906","DOIUrl":"https://doi.org/arxiv-2407.20906","url":null,"abstract":"Literature research, vital for scientific advancement, is overwhelmed by the vast ocean of available information. Addressing this, we propose an automated review generation method based on Large Language Models (LLMs) to streamline literature processing and reduce cognitive load. In case study on propane dehydrogenation (PDH) catalysts, our method swiftly generated comprehensive reviews from 343 articles, averaging seconds per article per LLM account. Extended analysis of 1041 articles provided deep insights into catalysts' composition, structure, and performance. Recognizing LLMs' hallucinations, we employed a multi-layered quality control strategy, ensuring our method's reliability and effective hallucination mitigation. Expert verification confirms the accuracy and citation integrity of generated reviews, demonstrating LLM hallucination risks reduced to below 0.5% with over 95% confidence. Released Windows application enables one-click review generation, aiding researchers in tracking advancements and recommending literature. This approach showcases LLMs' role in enhancing scientific research productivity and sets the stage for further exploration.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian technique to combine independently-trained Machine-Learning models applied to direct dark matter detection","authors":"David Cerdeno, Martin de los Rios, Andres D. Perez","doi":"arxiv-2407.21008","DOIUrl":"https://doi.org/arxiv-2407.21008","url":null,"abstract":"We carry out a Bayesian analysis of dark matter (DM) direct detection data to determine particle model parameters using the Truncated Marginal Neural Ratio Estimation (TMNRE) machine learning technique. TMNRE avoids an explicit calculation of the likelihood, which instead is estimated from simulated data, unlike in traditional Markov Chain Monte Carlo (MCMC) algorithms. This considerably speeds up, by several orders of magnitude, the computation of the posterior distributions, which allows to perform the Bayesian analysis of an otherwise computationally prohibitive number of benchmark points. In this article we demonstrate that, in the TMNRE framework, it is possible to include, combine, and remove different datasets in a modular fashion, which is fast and simple as there is no need to re-train the machine learning algorithm or to define a combined likelihood. In order to assess the performance of this method, we consider the case of WIMP DM with spin-dependent and independent interactions with protons and neutrons in a xenon experiment. After validating our results with MCMC, we employ the TMNRE procedure to determine the regions where the DM parameters can be reconstructed. Finally, we present CADDENA, a Python package that implements the modular Bayesian analysis of direct detection experiments described in this work.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Universal New Physics Latent Space","authors":"Anna Hallin, Gregor Kasieczka, Sabine Kraml, André Lessa, Louis Moureaux, Tore von Schwartz, David Shih","doi":"arxiv-2407.20315","DOIUrl":"https://doi.org/arxiv-2407.20315","url":null,"abstract":"We develop a machine learning method for mapping data originating from both Standard Model processes and various theories beyond the Standard Model into a unified representation (latent) space while conserving information about the relationship between the underlying theories. We apply our method to three examples of new physics at the LHC of increasing complexity, showing that models can be clustered according to their LHC phenomenology: different models are mapped to distinct regions in latent space, while indistinguishable models are mapped to the same region. This opens interesting new avenues on several fronts, such as model discrimination, selection of representative benchmark scenarios, and identifying gaps in the coverage of model space.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"130 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}