Title: Constraint Guided AutoEncoders for Joint Optimization of Condition Indicator Estimation and Anomaly Detection in Machine Condition Monitoring
Authors: Maarten Meire, Quinten Van Baelen, Ted Ooijevaar, Peter Karsmakers
arXiv: 2409.11807 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: The main goal of machine condition monitoring is, as the name implies, to monitor the condition of industrial applications. This monitoring can be split into two problems: a diagnostic problem, where normal data must be distinguished from anomalous data, called Anomaly Detection (AD), and a prognostic problem, where the aim is to predict the evolution of a Condition Indicator (CI) that reflects the condition of an asset throughout its lifetime. In machine condition monitoring, this CI is expected to show monotonic behavior, as the condition of a machine gradually degrades over time. This work proposes an extension to Constraint Guided AutoEncoders (CGAE), a robust AD method, that enables building a single model for both AD and CI estimation. To improve CI estimation, the extension incorporates a constraint that enforces monotonically increasing CI predictions over time. Experimental results indicate that the proposed algorithm performs similarly to, or slightly better than, CGAE with regard to AD, while improving the monotonic behavior of the CI.
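The paper's exact constraint formulation is not reproduced here; as a hedged sketch, one common way to encode such a monotonicity requirement is a hinge penalty that fires whenever a later CI prediction drops below an earlier one (the function name and the optional margin parameter are illustrative, not from the paper):

```python
def monotonicity_penalty(ci, margin=0.0):
    """Hinge penalty over consecutive CI predictions: each decrease
    ci[t+1] < ci[t] - margin adds its magnitude to the loss, so a
    perfectly non-decreasing sequence incurs zero penalty."""
    return sum(max(0.0, prev + margin - cur)
               for prev, cur in zip(ci, ci[1:]))
```

Added to a reconstruction loss with some weight, such a term pushes the model toward non-decreasing CI trajectories without hard-constraining the optimizer.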
Title: Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes
Authors: Marco Montagna, Simone Scardapane, Lev Telyatnikov
arXiv: 2409.12033 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: Graph Neural Networks based on the message-passing (MP) mechanism are a dominant approach for handling graph-structured data. However, they are inherently limited to modeling only pairwise interactions, making it difficult to explicitly capture the complexity of systems with $n$-body relations. To address this, topological deep learning has emerged as a promising field for studying and modeling higher-order interactions using various topological domains, such as simplicial and cellular complexes. While these new domains provide powerful representations, they introduce new challenges, such as effectively modeling the interactions among higher-order structures through higher-order MP. Meanwhile, structured state-space sequence models have proven to be effective for sequence modeling and have recently been adapted for graph data by encoding the neighborhood of a node as a sequence, thereby avoiding the MP mechanism. In this work, we propose a novel architecture designed to operate with simplicial complexes, utilizing the Mamba state-space model as its backbone. Our approach generates sequences for the nodes based on the neighboring cells, enabling direct communication between all higher-order structures, regardless of their rank. We extensively validate our model, demonstrating that it achieves competitive performance compared to state-of-the-art models developed for simplicial complexes.
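The abstract's core device, replacing higher-order message passing with per-node sequences built from neighboring cells, can be sketched independently of the Mamba backbone. The helper below is hypothetical (the paper's actual cell ordering and featurization are not specified here): it lists, for each node, the cells that contain it, ordered by rank, yielding the kind of sequence a state-space model could consume.

```python
def node_sequences(simplices):
    """For each node of a simplicial complex (given as sorted vertex
    tuples), build a sequence of the cells containing it, ordered by
    rank (edges before triangles, and so on)."""
    nodes = sorted({v for s in simplices for v in s})
    seqs = {}
    for v in nodes:
        incident = [s for s in simplices if v in s and len(s) > 1]
        seqs[v] = sorted(incident, key=lambda s: (len(s), s))
    return seqs
```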
Title: An Efficient Model-Agnostic Approach for Uncertainty Estimation in Data-Restricted Pedometric Applications
Authors: Viacheslav Barkov, Jonas Schmidinger, Robin Gebbers, Martin Atzmueller
arXiv: 2409.11985 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: This paper introduces a model-agnostic approach designed to enhance uncertainty estimation in the predictive modeling of soil properties, a crucial factor for advancing pedometrics and the practice of digital soil mapping. To address the typical challenge of data scarcity in soil studies, we present an improved technique for uncertainty estimation. This method is based on the transformation of regression tasks into classification problems, which not only allows for the production of reliable uncertainty estimates but also enables the application of established machine learning algorithms with competitive performance that have not yet been utilized in pedometrics. Empirical results from datasets collected from two German agricultural fields showcase the practical application of the proposed methodology. Our findings suggest that the proposed approach has the potential to provide better uncertainty estimation than the models commonly used in pedometrics.
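The central device, turning regression into classification so that class probabilities double as an uncertainty estimate, can be sketched as follows. Assuming the target is discretized into equal-width bins and some classifier supplies per-bin probabilities (both names and the binning scheme are illustrative, not the paper's), the implied predictive mean and standard deviation are:

```python
import math

def bin_edges(y_min, y_max, k):
    """Equal-width discretization of the regression target into k bins."""
    step = (y_max - y_min) / k
    return [y_min + i * step for i in range(k + 1)]

def predictive_stats(class_probs, edges):
    """Given per-bin probabilities from a classifier trained on the
    discretized target, return the mean and standard deviation of the
    implied predictive distribution over bin centers."""
    centers = [(a + b) / 2 for a, b in zip(edges, edges[1:])]
    mean = sum(p * c for p, c in zip(class_probs, centers))
    var = sum(p * (c - mean) ** 2 for p, c in zip(class_probs, centers))
    return mean, math.sqrt(var)
```

A sharply peaked probability vector then yields a small standard deviation, while a spread-out one signals high predictive uncertainty.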
Title: Extended Deep Submodular Functions
Authors: Seyed Mohammad Hosseini, Arash Jamshid, Seyed Mahdi Noormousavi, Mahdi Jafari Siavoshani, Naeimeh Omidvar
arXiv: 2409.12053 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: We introduce a novel category of set functions called Extended Deep Submodular Functions (EDSFs), which are neural network-representable. EDSFs serve as an extension of Deep Submodular Functions (DSFs), inheriting crucial properties from DSFs while addressing innate limitations. It is known that DSFs can represent only a limited subset of submodular functions. In contrast, through an analysis of polymatroid properties, we establish that EDSFs possess the capability to represent all monotone submodular functions, a notable enhancement compared to DSFs. Furthermore, our findings demonstrate that EDSFs can represent any monotone set function, indicating that the family of EDSFs is equivalent to the family of all monotone set functions. Additionally, we prove that EDSFs maintain the concavity inherent in DSFs when the components of the input vector are non-negative real numbers, an essential feature in certain combinatorial optimization problems. Through extensive experiments, we illustrate that EDSFs exhibit significantly lower empirical generalization error than DSFs in the learning of coverage functions. This suggests that EDSFs present a promising advancement in the representation and learning of set functions with improved generalization capabilities.
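As a minimal illustration of the properties at stake, the toy function below composes a concave map (square root) with a non-negative modular function, the one-layer special case of a DSF, and verifies monotonicity and diminishing returns by brute force over a tiny ground set (the weights are invented for the example; this is not the paper's construction):

```python
import math
from itertools import combinations

weights = {"a": 1.0, "b": 2.0, "c": 3.0}

def f(S):
    # One-layer DSF-style function: a concave function (sqrt)
    # applied to a non-negative modular function of the set.
    return math.sqrt(sum(weights[e] for e in S))

def is_monotone_submodular(f, ground):
    """Brute-force check of monotonicity and diminishing returns
    over every pair of nested subsets of a small ground set."""
    subsets = [set(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    ok = True
    for S in subsets:
        for T in subsets:
            if S <= T:
                ok &= f(T) >= f(S) - 1e-12          # monotone
                for x in ground - T:
                    gain_S = f(S | {x}) - f(S)
                    gain_T = f(T | {x}) - f(T)
                    ok &= gain_S >= gain_T - 1e-12  # diminishing returns
    return ok
```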
Title: Towards Interpretable End-Stage Renal Disease (ESRD) Prediction: Utilizing Administrative Claims Data with Explainable AI Techniques
Authors: Yubo Li, Saba Al-Sayouri, Rema Padman
arXiv: 2409.12087 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: This study explores the potential of utilizing administrative claims data, combined with advanced machine learning and deep learning techniques, to predict the progression of Chronic Kidney Disease (CKD) to End-Stage Renal Disease (ESRD). We analyze a comprehensive, 10-year dataset provided by a major health insurance organization to develop prediction models for multiple observation windows, using traditional machine learning methods such as Random Forest and XGBoost as well as deep learning approaches such as Long Short-Term Memory (LSTM) networks. Our findings demonstrate that the LSTM model, particularly with a 24-month observation window, exhibits superior performance in predicting ESRD progression, outperforming existing models in the literature. We further apply SHapley Additive exPlanations (SHAP) analysis to enhance interpretability, providing insights into the impact of individual features on predictions at the individual patient level. This study underscores the value of leveraging administrative claims data for CKD management and predicting ESRD progression.
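SHAP approximates Shapley values efficiently; for intuition, with only a handful of features they can be computed exactly by averaging each feature's marginal contribution over all feature orderings. The sketch below is generic and not tied to the paper's LSTM; the baseline-substitution scheme for "absent" features is one common convention:

```python
import math
from itertools import permutations

def exact_shapley(model, x, baseline):
    """Exact Shapley values for a model with few features: average each
    feature's marginal contribution over all orderings. Present features
    take their value from x, absent ones from the baseline."""
    n = len(x)
    phi = [0.0] * n

    def value(present):
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(z)

    for order in permutations(range(n)):
        present = set()
        for i in order:
            before = value(present)
            present.add(i)
            phi[i] += value(present) - before
    return [p / math.factorial(n) for p in phi]
```

For a linear model the values recover the coefficients times the feature shifts, and they always sum to model(x) minus model(baseline), the "efficiency" property SHAP relies on.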
Title: Edge-Based Graph Component Pooling
Authors: T. Snelleman, B. M. Renting, H. H. Hoos, J. N. van Rijn
arXiv: 2409.11856 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: Graph-structured data naturally occurs in many research fields, such as chemistry and sociology. The relational information contained therein can be leveraged to statistically model graph properties through geometric deep learning. Graph neural networks employ techniques, such as message-passing layers, to propagate local features through a graph. However, message-passing layers can be computationally expensive when dealing with large and sparse graphs. Graph pooling operators offer the possibility of removing or merging nodes in such graphs, thus lowering computational costs. However, pooling operators that remove nodes cause data loss, and pooling operators that merge nodes are often computationally expensive. We propose a pooling operator that merges nodes so as not to cause data loss, yet is conceptually simple and computationally inexpensive. We empirically demonstrate that the proposed pooling operator performs statistically significantly better than edge pool on four popular benchmark datasets, while reducing time complexity and the number of trainable parameters by 70.6% on average. Compared to another maximally powerful method, the Graph Isomorphism Network, we show that our operator outperforms it on two popular benchmark datasets while reducing the number of learnable parameters by 60.9% on average.
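The paper's operator is not reproduced here, but the general idea of merging nodes along scored edges so that no node's information is discarded can be sketched with a union-find over the k highest-scoring edges, summing features within each resulting cluster (in a learned pooling layer the scores would come from the network; here they are simply given):

```python
def pool_by_edges(features, edges, scores, k):
    """Greedily contract the k highest-scoring edges; each contraction
    merges the two endpoints into one cluster whose pooled feature is
    the sum, so no node information is dropped. Returns a dict mapping
    each cluster's root node index to its pooled feature."""
    parent = list(range(len(features)))

    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    merged = 0
    for (u, v), _ in sorted(zip(edges, scores), key=lambda p: -p[1]):
        if merged == k:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru
            merged += 1

    pooled = {}
    for i, feat in enumerate(features):
        r = find(i)
        pooled[r] = pooled.get(r, 0.0) + feat
    return pooled
```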
Title: Unsupervised Domain Adaptation Via Data Pruning
Authors: Andrea Napoli, Paul White
arXiv: 2409.12076 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: The removal of carefully selected examples from training data has recently emerged as an effective way of improving the robustness of machine learning models. However, the best way to select these examples remains an open question. In this paper, we consider the problem from the perspective of unsupervised domain adaptation (UDA). We propose AdaPrune, a method for UDA whereby training examples are removed so as to align the training distribution to that of the target data. By adopting the maximum mean discrepancy (MMD) as the criterion for alignment, the problem can be neatly formulated and solved as an integer quadratic program. We evaluate our approach on a real-world domain shift task of bioacoustic event detection. As a method for UDA, we show that AdaPrune outperforms related techniques and is complementary to other UDA algorithms such as CORAL. Our analysis of the relationship between the MMD and model accuracy, along with t-SNE plots, validates the proposed method as a principled and well-founded way of performing data pruning.
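The MMD criterion at the heart of this formulation compares kernel similarity within each sample against similarity across samples: aligned distributions give a value near zero. A minimal (biased) estimator for 1-D data under an RBF kernel, shown independently of the paper's integer-program machinery:

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF (Gaussian) kernel on scalars."""
    return math.exp(-gamma * (x - y) ** 2)

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two 1-D samples:
    within-sample similarity minus cross-sample similarity."""
    def avg(a, b):
        return sum(rbf(p, q, gamma) for p in a for q in b) / (len(a) * len(b))
    return avg(xs, xs) + avg(ys, ys) - 2 * avg(xs, ys)
```

Pruning for UDA then amounts to choosing which training examples to drop so that this quantity, measured against the target sample, is minimized.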
Title: Multi-Grid Graph Neural Networks with Self-Attention for Computational Mechanics
Authors: Paul Garnier, Jonathan Viquerat, Elie Hachem
arXiv: 2409.11899 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: Advancements in finite element methods have become essential in various disciplines, in particular for Computational Fluid Dynamics (CFD), driving research efforts for improved precision and efficiency. While Convolutional Neural Networks (CNNs) have found success in CFD by mapping meshes into images, recent attention has turned to leveraging Graph Neural Networks (GNNs) for direct mesh processing. This paper introduces a novel model merging Self-Attention with Message Passing in GNNs, achieving a 15% reduction in RMSE on the well-known flow past a cylinder benchmark. Furthermore, a dynamic mesh pruning technique based on Self-Attention is proposed, leading to a robust GNN-based multigrid approach that also reduces RMSE by 15%. Additionally, a new self-supervised training method based on BERT is presented, resulting in a 25% RMSE reduction. The paper includes an ablation study and outperforms state-of-the-art models on several challenging datasets, promising advancements similar to those recently achieved in natural language and image processing. Finally, the paper introduces a dataset with meshes larger than existing ones by at least an order of magnitude. Code and datasets will be released at https://github.com/DonsetPG/multigrid-gnn.
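The merger of self-attention with message passing can be illustrated schematically: each node aggregates its neighbors' feature vectors with softmax weights derived from dot-product scores. The sketch below uses identity query/key/value projections for brevity and is not the paper's architecture:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_mp(h, neighbors):
    """One attention-weighted message-passing step: node i aggregates
    its neighbors' feature vectors with softmax(dot(h_i, h_j)) weights.
    h is a list of feature vectors; neighbors maps i -> neighbor ids."""
    out = []
    for i in range(len(h)):
        nbrs = neighbors[i]
        w = softmax([dot(h[i], h[j]) for j in nbrs])
        out.append([sum(wj * h[j][d] for wj, j in zip(w, nbrs))
                    for d in range(len(h[i]))])
    return out
```

In a real model the attention scores would also drive the mesh pruning, keeping the nodes that receive the most attention at each grid level.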
Title: Enhancing Semi-Supervised Learning via Representative and Diverse Sample Selection
Authors: Qian Shao, Jiangrui Kang, Qiyuan Chen, Zepeng Li, Hongxia Xu, Yiwen Cao, Jiajuan Liang, Jian Wu
arXiv: 2409.11653 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: Semi-Supervised Learning (SSL) has become a preferred paradigm in many deep learning tasks, as it reduces the need for human labor. Previous studies primarily focus on effectively utilising the labelled and unlabelled data to improve performance. However, we observe that how samples are selected for labelling also significantly impacts performance, particularly under extremely low-budget settings. The sample selection task in SSL has long been under-explored. To fill this gap, we propose a Representative and Diverse Sample Selection approach (RDSS). By adopting a modified Frank-Wolfe algorithm to minimise a novel criterion, $\alpha$-Maximum Mean Discrepancy ($\alpha$-MMD), RDSS samples a representative and diverse subset for annotation from the unlabelled data. We demonstrate that minimising $\alpha$-MMD enhances the generalization ability of low-budget learning. Experimental results show that RDSS consistently improves the performance of several popular SSL frameworks and outperforms the state-of-the-art sample selection approaches used in Active Learning (AL) and Semi-Supervised Active Learning (SSAL), even with constrained annotation budgets.
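RDSS minimises $\alpha$-MMD with a modified Frank-Wolfe algorithm; as a hedged stand-in for intuition only, the sketch below greedily grows a subset so that its plain MMD to the full unlabelled pool stays small (1-D data, RBF kernel, all names illustrative, not the paper's algorithm):

```python
import math

def rbf(x, y, gamma=1.0):
    return math.exp(-gamma * (x - y) ** 2)

def mmd2(xs, ys, gamma=1.0):
    """Biased squared-MMD estimate between two 1-D samples."""
    def avg(a, b):
        return sum(rbf(p, q, gamma) for p in a for q in b) / (len(a) * len(b))
    return avg(xs, xs) + avg(ys, ys) - 2 * avg(xs, ys)

def greedy_select(pool, budget):
    """Greedy stand-in for representative sample selection: repeatedly
    add the point that keeps the chosen subset closest (in MMD) to the
    full pool, naturally covering its distinct modes."""
    chosen = []
    for _ in range(budget):
        best = min((x for x in pool if x not in chosen),
                   key=lambda x: mmd2(chosen + [x], pool))
        chosen.append(best)
    return chosen
```

On a pool with two well-separated clusters and a budget of two, the selection lands one point in each cluster: representativeness and diversity fall out of the same objective.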
Title: Location based Probabilistic Load Forecasting of EV Charging Sites: Deep Transfer Learning with Multi-Quantile Temporal Convolutional Network
Authors: Mohammad Wazed Ali (Intelligent Embedded Systems), Asif bin Mustafa (School of CIT, Technical University of Munich, Munich, Germany), Md. Aukerul Moin Shuvo (Dept. of Computer Science and Engineering, Rajshahi University of Engg. & Technology, Rajshahi, Bangladesh), Bernhard Sick (Intelligent Embedded Systems)
arXiv: 2409.11862 (arXiv - CS - Machine Learning, 2024-09-18)
Abstract: Electrification of vehicles is a potential way of reducing fossil fuel usage and thus lessening environmental pollution. Electric Vehicles (EVs) of various types for different transport modes (including air, water, and land) are evolving. Moreover, different EV user groups (commuters, commercial or domestic users, drivers) may use different charging infrastructures (public, private, home, and workplace) at various times. Therefore, usage patterns and energy demand are highly stochastic. Characterizing and forecasting the charging demand of these diverse EV usage profiles is essential in preventing power outages. Previously developed data-driven load models are limited to specific use cases and locations. None of these models is simultaneously adaptive enough to transfer knowledge of day-ahead forecasting among EV charging sites of diverse locations, trained with limited data, and cost-effective. This article presents location-based load forecasting of EV charging sites using a deep Multi-Quantile Temporal Convolutional Network (MQ-TCN) to overcome the limitations of earlier models. We conducted our experiments on data from four charging sites, namely Caltech, JPL, Office-1, and NREL, which have diverse EV user types such as students, full-time and part-time employees, and random visitors. With a Prediction Interval Coverage Probability (PICP) score of 93.62%, our proposed deep MQ-TCN model exhibited a remarkable 28.93% improvement over the XGBoost model for day-ahead load forecasting at the JPL charging site. By transferring knowledge with the inductive Transfer Learning (TL) approach, the MQ-TCN model achieved a 96.88% PICP score for the load forecasting task at the NREL site using only two weeks of data.
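Two of the ingredients named here are standard and easy to state: the pinball (quantile) loss that trains a multi-quantile network, and the PICP metric used to score the resulting prediction intervals. A minimal sketch (not the paper's implementation):

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for quantile level q in (0, 1):
    under-predictions are weighted by q, over-predictions by 1 - q,
    so minimizing it drives y_pred toward the q-th quantile."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1) * diff)

def picp(y_true, lower, upper):
    """Prediction Interval Coverage Probability: fraction of targets
    falling inside their predicted [lower, upper] interval."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper)
                 if lo <= y <= hi)
    return inside / len(y_true)
```

A model trained on pinball losses at, say, q = 0.025 and q = 0.975 yields a nominal 95% interval, and PICP then measures how often the realized load actually lands inside it.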