{"title":"Learning Generalized Hamiltonians using fully Symplectic Mappings","authors":"Harsh Choudhary, Chandan Gupta, Vyacheslav kungrutsev, Melvin Leok, Georgios Korpas","doi":"arxiv-2409.11138","DOIUrl":"https://doi.org/arxiv-2409.11138","url":null,"abstract":"Many important physical systems can be described as the evolution of a\u0000Hamiltonian system, which has the important property of being conservative,\u0000that is, energy is conserved throughout the evolution. Physics Informed Neural\u0000Networks and in particular Hamiltonian Neural Networks have emerged as a\u0000mechanism to incorporate structural inductive bias into the NN model. By\u0000ensuring physical invariances are conserved, the models exhibit significantly\u0000better sample complexity and out-of-distribution accuracy than standard NNs.\u0000Learning the Hamiltonian as a function of its canonical variables, typically\u0000position and velocity, from sample observations of the system thus becomes a\u0000critical task in system identification and long-term prediction of system\u0000behavior. However, to truly preserve the long-run physical conservation\u0000properties of Hamiltonian systems, one must use symplectic integrators for a\u0000forward pass of the system's simulation. While symplectic schemes have been\u0000used in the literature, they are thus far limited to situations when they\u0000reduce to explicit algorithms, which include the case of separable Hamiltonians\u0000or augmented non-separable Hamiltonians. We extend it to generalized\u0000non-separable Hamiltonians, and noting the self-adjoint property of symplectic\u0000integrators, we bypass computationally intensive backpropagation through an ODE\u0000solver. We show that the method is robust to noise and provides a good\u0000approximation of the system Hamiltonian when the state variables are sampled\u0000from a noisy observation. In the numerical results, we show the performance of\u0000the method concerning Hamiltonian reconstruction and conservation, indicating\u0000its particular advantage for non-separable systems.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A logical alarm for misaligned binary classifiers","authors":"Andrés Corrada-Emmanuel, Ilya Parker, Ramesh Bharadwaj","doi":"arxiv-2409.11052","DOIUrl":"https://doi.org/arxiv-2409.11052","url":null,"abstract":"If two agents disagree in their decisions, we may suspect they are not both\u0000correct. This intuition is formalized for evaluating agents that have carried\u0000out a binary classification task. Their agreements and disagreements on a joint\u0000test allow us to establish the only group evaluations logically consistent with\u0000their responses. This is done by establishing a set of axioms (algebraic\u0000relations) that must be universally obeyed by all evaluations of binary\u0000responders. A complete set of such axioms are possible for each ensemble of\u0000size N. The axioms for $N = 1, 2$ are used to construct a fully logical alarm -\u0000one that can prove that at least one ensemble member is malfunctioning using\u0000only unlabeled data. The similarities of this approach to formal software\u0000verification and its utility for recent agendas of safe guaranteed AI are\u0000discussed.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WaterQualityNeT: Prediction of Seasonal Water Quality of Nepal Using Hybrid Deep Learning Models","authors":"Biplov Paneru, Bishwash Paneru","doi":"arxiv-2409.10898","DOIUrl":"https://doi.org/arxiv-2409.10898","url":null,"abstract":"Ensuring a safe and uncontaminated water supply is contingent upon the\u0000monitoring of water quality, especially in developing countries such as Nepal,\u0000where water sources are susceptible to pollution. This paper presents a hybrid\u0000deep learning model for predicting Nepal's seasonal water quality using a small\u0000dataset with many water quality parameters. The model integrates convolutional\u0000neural networks (CNN) and recurrent neural networks (RNN) to exploit temporal\u0000and spatial patterns in the data. The results demonstrate significant\u0000improvements in forecast accuracy over traditional methods, providing a\u0000reliable tool for proactive control of water quality. The model that used WQI\u0000parameters to classify people into good, poor, and average groups performed 92%\u0000of the time in testing. Similarly, the R2 score was 0.97 and the root mean\u0000square error was 2.87 when predicting WQI values using regression analysis.\u0000Additionally, a multifunctional application that uses both a regression and a\u0000classification approach is built to predict WQI values.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chess Rating Estimation from Moves and Clock Times Using a CNN-LSTM","authors":"Michael Omori, Prasad Tadepalli","doi":"arxiv-2409.11506","DOIUrl":"https://doi.org/arxiv-2409.11506","url":null,"abstract":"Current rating systems update ratings incrementally and may not always\u0000accurately reflect a player's true strength at all times, especially for\u0000rapidly improving players or very rusty players. To overcome this, we explore a\u0000method to estimate player ratings directly from game moves and clock times. We\u0000compiled a benchmark dataset from Lichess, encompassing various time controls\u0000and including move sequences and clock times. Our model architecture comprises\u0000a CNN to learn positional features, which are then integrated with clock-time\u0000data into a bidirectional LSTM, predicting player ratings after each move. The\u0000model achieved an MAE of 182 rating points in the test data. Additionally, we\u0000applied our model to the 2024 IEEE Big Data Cup Chess Puzzle Difficulty\u0000Competition dataset, predicted puzzle ratings and achieved competitive results.\u0000This model is the first to use no hand-crafted features to estimate chess\u0000ratings and also the first to output a rating prediction for each move. Our\u0000method highlights the potential of using move-based rating estimation for\u0000enhancing rating systems and potentially other applications such as cheating\u0000detection.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"D2Vformer: A Flexible Time Series Prediction Model Based on Time Position Embedding","authors":"Xiaobao Song, Hao Wang, Liwei Deng, Yuxin He, Wenming Cao, Chi-Sing Leungc","doi":"arxiv-2409.11024","DOIUrl":"https://doi.org/arxiv-2409.11024","url":null,"abstract":"Time position embeddings capture the positional information of time steps,\u0000often serving as auxiliary inputs to enhance the predictive capabilities of\u0000time series models. However, existing models exhibit limitations in capturing\u0000intricate time positional information and effectively utilizing these\u0000embeddings. To address these limitations, this paper proposes a novel model\u0000called D2Vformer. Unlike typical prediction methods that rely on RNNs or\u0000Transformers, this approach can directly handle scenarios where the predicted\u0000sequence is not adjacent to the input sequence or where its length dynamically\u0000changes. In comparison to conventional methods, D2Vformer undoubtedly saves a\u0000significant amount of training resources. In D2Vformer, the Date2Vec module\u0000uses the timestamp information and feature sequences to generate time position\u0000embeddings. Afterward, D2Vformer introduces a new fusion block that utilizes an\u0000attention mechanism to explore the similarity in time positions between the\u0000embeddings of the input sequence and the predicted sequence, thereby generating\u0000predictions based on this similarity. Through extensive experiments on six\u0000datasets, we demonstrate that Date2Vec outperforms other time position\u0000embedding methods, and D2Vformer surpasses state-of-the-art methods in both\u0000fixed-length and variable-length prediction tasks.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cost-informed dimensionality reduction for structural digital twin technologies","authors":"Aidan J. Hughes, Keith Worden, Nikolaos Dervilis, Timothy J. Rogers","doi":"arxiv-2409.11236","DOIUrl":"https://doi.org/arxiv-2409.11236","url":null,"abstract":"Classification models are a key component of structural digital twin\u0000technologies used for supporting asset management decision-making. An important\u0000consideration when developing classification models is the dimensionality of\u0000the input, or feature space, used. If the dimensionality is too high, then the\u0000`curse of dimensionality' may rear its ugly head; manifesting as reduced\u0000predictive performance. To mitigate such effects, practitioners can employ\u0000dimensionality reduction techniques. The current paper formulates a\u0000decision-theoretic approach to dimensionality reduction for structural asset\u0000management. In this approach, the aim is to keep incurred misclassification\u0000costs to a minimum, as the dimensionality is reduced and discriminatory\u0000information may be lost. This formulation is constructed as an eigenvalue\u0000problem, with separabilities between classes weighted according to the cost of\u0000misclassifying them when considered in the context of a decision process. The\u0000approach is demonstrated using a synthetic case study.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FedNE: Surrogate-Assisted Federated Neighbor Embedding for Dimensionality Reduction","authors":"Ziwei Li, Xiaoqi Wang, Hong-You Chen, Han-Wei Shen, Wei-Lun Chao","doi":"arxiv-2409.11509","DOIUrl":"https://doi.org/arxiv-2409.11509","url":null,"abstract":"Federated learning (FL) has rapidly evolved as a promising paradigm that\u0000enables collaborative model training across distributed participants without\u0000exchanging their local data. Despite its broad applications in fields such as\u0000computer vision, graph learning, and natural language processing, the\u0000development of a data projection model that can be effectively used to\u0000visualize data in the context of FL is crucial yet remains heavily\u0000under-explored. Neighbor embedding (NE) is an essential technique for\u0000visualizing complex high-dimensional data, but collaboratively learning a joint\u0000NE model is difficult. The key challenge lies in the objective function, as\u0000effective visualization algorithms like NE require computing loss functions\u0000among pairs of data. In this paper, we introduce textsc{FedNE}, a novel\u0000approach that integrates the textsc{FedAvg} framework with the contrastive NE\u0000technique, without any requirements of shareable data. To address the lack of\u0000inter-client repulsion which is crucial for the alignment in the global\u0000embedding space, we develop a surrogate loss function that each client learns\u0000and shares with each other. Additionally, we propose a data-mixing strategy to\u0000augment the local data, aiming to relax the problems of invisible neighbors and\u0000false neighbors constructed by the local $k$NN graphs. We conduct comprehensive\u0000experiments on both synthetic and real-world datasets. The results demonstrate\u0000that our textsc{FedNE} can effectively preserve the neighborhood data\u0000structures and enhance the alignment in the global embedding space compared to\u0000several baseline methods.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advances in APPFL: A Comprehensive and Extensible Federated Learning Framework","authors":"Zilinghan Li, Shilan He, Ze Yang, Minseok Ryu, Kibaek Kim, Ravi Madduri","doi":"arxiv-2409.11585","DOIUrl":"https://doi.org/arxiv-2409.11585","url":null,"abstract":"Federated learning (FL) is a distributed machine learning paradigm enabling\u0000collaborative model training while preserving data privacy. In today's\u0000landscape, where most data is proprietary, confidential, and distributed, FL\u0000has become a promising approach to leverage such data effectively, particularly\u0000in sensitive domains such as medicine and the electric grid. Heterogeneity and\u0000security are the key challenges in FL, however; most existing FL frameworks\u0000either fail to address these challenges adequately or lack the flexibility to\u0000incorporate new solutions. To this end, we present the recent advances in\u0000developing APPFL, an extensible framework and benchmarking suite for federated\u0000learning, which offers comprehensive solutions for heterogeneity and security\u0000concerns, as well as user-friendly interfaces for integrating new algorithms or\u0000adapting to new applications. We demonstrate the capabilities of APPFL through\u0000extensive experiments evaluating various aspects of FL, including communication\u0000efficiency, privacy preservation, computational performance, and resource\u0000utilization. We further highlight the extensibility of APPFL through case\u0000studies in vertical, hierarchical, and decentralized FL. APPFL is open-sourced\u0000at https://github.com/APPFL/APPFL.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing TinyML: The Impact of Reduced Data Acquisition Rates for Time Series Classification on Microcontrollers","authors":"Riya Samanta, Bidyut Saha, Soumya K. Ghosh, Ram Babu Roy","doi":"arxiv-2409.10942","DOIUrl":"https://doi.org/arxiv-2409.10942","url":null,"abstract":"Tiny Machine Learning (TinyML) enables efficient, lowcost, and privacy\u0000preserving machine learning inference directly on microcontroller units (MCUs)\u0000connected to sensors. Optimizing models for these constrained environments is\u0000crucial. This paper investigates how reducing data acquisition rates affects\u0000TinyML models for time series classification, focusing on resource-constrained,\u0000battery operated IoT devices. By lowering data sampling frequency, we aim to\u0000reduce computational demands RAM usage, energy consumption, latency, and MAC\u0000operations by approximately fourfold while maintaining similar classification\u0000accuracies. Our experiments with six benchmark datasets (UCIHAR, WISDM, PAMAP2,\u0000MHEALTH, MITBIH, and PTB) showed that reducing data acquisition rates\u0000significantly cut energy consumption and computational load, with minimal\u0000accuracy loss. For example, a 75% reduction in acquisition rate for MITBIH and\u0000PTB datasets led to a 60% decrease in RAM usage, 75% reduction in MAC\u0000operations, 74% decrease in latency, and 70% reduction in energy consumption,\u0000without accuracy loss. These results offer valuable insights for deploying\u0000efficient TinyML models in constrained environments.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models","authors":"Divij Gupta, Anubhav Bhatti, Surajsinh Parmar","doi":"arxiv-2409.11302","DOIUrl":"https://doi.org/arxiv-2409.11302","url":null,"abstract":"Time Series Foundation Models (TSFMs) have recently garnered attention for\u0000their ability to model complex, large-scale time series data across domains\u0000such as retail, finance, and transportation. However, their application to\u0000sensitive, domain-specific fields like healthcare remains challenging,\u0000primarily due to the difficulty of fine-tuning these models for specialized,\u0000out-of-domain tasks with scarce publicly available datasets. In this work, we\u0000explore the use of Parameter-Efficient Fine-Tuning (PEFT) techniques to address\u0000these limitations, focusing on healthcare applications, particularly ICU vitals\u0000forecasting for sepsis patients. We introduce and evaluate two selective\u0000(BitFit and LayerNorm Tuning) and two additive (VeRA and FourierFT) PEFT\u0000techniques on multiple configurations of the Chronos TSFM for forecasting vital\u0000signs of sepsis patients. Our comparative analysis demonstrates that some of\u0000these PEFT methods outperform LoRA in terms of parameter efficiency and domain\u0000adaptation, establishing state-of-the-art (SOTA) results in ICU vital\u0000forecasting tasks. Interestingly, FourierFT applied to the Chronos (Tiny)\u0000variant surpasses the SOTA model while fine-tuning only 2,400 parameters\u0000compared to the 700K parameters of the benchmark.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}