Machine Learning, Pub Date: 2024-03-22, DOI: 10.1007/s10994-024-06532-z
{"title":"Optimal clustering from noisy binary feedback","authors":"","doi":"10.1007/s10994-024-06532-z","DOIUrl":"https://doi.org/10.1007/s10994-024-06532-z","url":null,"abstract":"<h3>Abstract</h3> <p>We study the problem of clustering a set of items from binary user feedback. Such a problem arises in crowdsourcing platforms solving large-scale labeling tasks with minimal effort put on the users. For example, in some of the recent reCAPTCHA systems, users' clicks (binary answers) can be used to efficiently label images. In our inference problem, items are grouped into initially unknown non-overlapping clusters. To recover these clusters, the learner sequentially presents to users a finite list of items together with a question with a binary answer selected from a fixed finite set. For each of these items, the user provides a noisy answer whose expectation is determined by the item cluster and the question, and by an item-specific parameter characterizing the <em>hardness</em> of classifying the item. The objective is to devise an algorithm with a minimal cluster recovery error rate. We derive problem-specific information-theoretic lower bounds on the error rate satisfied by any algorithm, for both uniform and adaptive (list, question) selection strategies. For uniform selection, we present a simple algorithm built upon K-means whose performance almost matches the fundamental limits. For adaptive selection, we develop an adaptive algorithm that is inspired by the derivation of the information-theoretic error lower bounds and, in turn, allocates the budget in an efficient way. The algorithm learns to select hard-to-cluster items and relevant questions more often. 
We numerically compare the performance of our algorithms with and without the adaptive selection strategy and illustrate the gain achieved by being adaptive.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"24 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140204495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
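The uniform-selection baseline described in this abstract (K-means on aggregated binary answers) can be sketched as a toy example. This is an illustrative sketch, not the paper's algorithm: the cluster answer probabilities, repeat counts, and farthest-point seeding below are invented for the example, and the per-item hardness parameter is omitted for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, n_questions, n_repeats = 60, 5, 40
true_cluster = rng.integers(0, 2, size=n_items)          # 2 hidden clusters
# Each cluster has its own probability of a "yes" answer per question.
p_yes = np.array([[0.2, 0.8, 0.3, 0.7, 0.5],
                  [0.8, 0.2, 0.7, 0.3, 0.5]])

# Noisy binary feedback: repeated user answers per (item, question) pair.
answers = rng.random((n_items, n_questions, n_repeats)) < p_yes[true_cluster][:, :, None]
freq = answers.mean(axis=2)                              # empirical yes-frequencies


def kmeans(X, k=2, n_iter=20):
    """Plain Lloyd's algorithm with farthest-point seeding."""
    centers = [X[0]]
    while len(centers) < k:                              # seed far from chosen centers
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels


labels = kmeans(freq, k=2)
# Recovery accuracy up to label permutation.
acc = max((labels == true_cluster).mean(), (labels != true_cluster).mean())
```

With 40 repeats per question the empirical frequencies concentrate tightly around the cluster means, so the clusters are recovered almost perfectly.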
{"title":"TOCOL: improving contextual representation of pre-trained language models via token-level contrastive learning","authors":"Keheng Wang, Chuantao Yin, Rumei Li, Sirui Wang, Yunsen Xian, Wenge Rong, Zhang Xiong","doi":"10.1007/s10994-023-06512-9","DOIUrl":"https://doi.org/10.1007/s10994-023-06512-9","url":null,"abstract":"<p>Self-attention, which allows transformers to capture deep bidirectional contexts, plays a vital role in BERT-like pre-trained language models. However, the maximum likelihood pre-training objective of BERT may produce an anisotropic word embedding space, which leads to biased attention scores for high-frequency tokens, as they are very close to each other in representation space and thus have higher similarities. This bias may ultimately affect the encoding of global contextual information. To address this issue, we propose TOCOL, a <b>TO</b>ken-Level <b>CO</b>ntrastive <b>L</b>earning framework for improving the contextual representation of pre-trained language models, which integrates a novel self-supervised objective into the attention mechanism to reshape the word representation space and encourages the PLM to capture the global semantics of sentences. Results on the GLUE Benchmark show that TOCOL brings considerable improvement over the original BERT. Furthermore, we conduct a detailed analysis and demonstrate the robustness of our approach in low-resource scenarios.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"9 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140168796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
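TOCOL's exact objective is not given in the abstract. As a hedged illustration of contrastive representation learning in general, here is a generic InfoNCE-style loss in NumPy; the batch construction, temperature, and toy embeddings are assumptions, not TOCOL's formulation.

```python
import numpy as np


def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss.

    Each anchor's positive is the matching row of `positives`;
    the other rows in the batch act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                       # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                  # pull diagonal pairs together


rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = info_nce(z, z + 0.01 * rng.normal(size=(8, 16)))  # positives near anchors
loss_random = info_nce(z, rng.normal(size=(8, 16)))              # unrelated positives
```

The loss is small when each anchor is closest to its own positive and large when positives are uncorrelated, which is the mechanism such objectives use to reshape a representation space.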
{"title":"Stress detection with encoding physiological signals and convolutional neural network","authors":"Michela Quadrini, Antonino Capuccio, Denise Falcone, Sebastian Daberdaku, Alessandro Blanda, Luca Bellanova, Gianluca Gerard","doi":"10.1007/s10994-023-06509-4","DOIUrl":"https://doi.org/10.1007/s10994-023-06509-4","url":null,"abstract":"<p>Stress is a significant and growing phenomenon in the modern world that leads to numerous health problems. Robust and non-invasive method developments for early and accurate stress detection are crucial in enhancing people’s quality of life. Previous research shows that machine learning approaches applied to physiological signals are reliable stress predictors, achieving significant results. However, they require determining features by hand. Such a selection is a challenge in this context since stress determines nonspecific human responses. This work overcomes these limitations with STREDWES, an approach for Stress Detection from Wearable Sensors Data. STREDWES encodes fragments of physiological signals into images and classifies them by a Convolutional Neural Network (CNN). We evaluate several encoding methods, including the Gramian Angular Summation/Difference Field and the Markov Transition Field, to determine the best way to encode signals into images in this domain. This evaluation is performed on the NEURO dataset. Moreover, we investigate the usefulness of STREDWES in real scenarios by considering the SWELL dataset and a personalized approach. Finally, we compare the proposed approach with its competitors by considering the WESAD dataset. 
It outperforms the others.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"8 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140152360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
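Of the encodings named in the abstract, the Gramian Angular Summation Field (GASF) is easy to sketch: rescale the series to [-1, 1], map each value to an angle, and form the matrix of cosines of angle sums. This is a generic GASF implementation, not STREDWES itself; the sine "signal fragment" is invented for the example.

```python
import numpy as np


def gasf(series):
    """Gramian Angular Summation Field: encode a 1-D signal as an image."""
    x = np.asarray(series, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1      # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))               # value -> angle
    return np.cos(phi[:, None] + phi[None, :])           # GASF[i, j] = cos(phi_i + phi_j)


signal = np.sin(np.linspace(0, 4 * np.pi, 64))           # toy signal fragment
img = gasf(signal)                                       # 64 x 64 "image" for a CNN
```

The resulting matrix is symmetric with entries in [-1, 1], so it can be fed to a CNN like a single-channel image.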
{"title":"Glacier: guided locally constrained counterfactual explanations for time series classification","authors":"Zhendong Wang, Isak Samsten, Ioanna Miliou, Rami Mochaourab, Panagiotis Papapetrou","doi":"10.1007/s10994-023-06502-x","DOIUrl":"https://doi.org/10.1007/s10994-023-06502-x","url":null,"abstract":"<p>In machine learning applications, there is a need to obtain predictive models of high performance and, most importantly, to allow end-users and practitioners to understand and act on their predictions. One way to obtain such understanding is via counterfactuals, which provide sample-based explanations in the form of recommendations on which features need to be modified from a test example so that the classification outcome of a given classifier changes from an undesired outcome to a desired one. This paper focuses on the domain of time series classification, more specifically, on defining counterfactual explanations for univariate time series. We propose <span>Glacier</span>, a model-agnostic method for generating locally-constrained counterfactual explanations for time series classification using gradient search either on the original space or on a latent space that is learned through an auto-encoder. An additional flexibility of our method is the inclusion of constraints on the counterfactual generation process that favour applying changes to particular time series points or segments while discouraging changing others. The main purpose of these constraints is to ensure more reliable counterfactuals, while increasing the efficiency of the counterfactual generation process. Two particular types of constraints are considered, i.e., example-specific constraints and global constraints. We conduct extensive experiments on 40 datasets from the UCR archive, comparing different instantiations of <span>Glacier</span> against three competitors. 
Our findings suggest that <span>Glacier</span> outperforms the three competitors in terms of two common metrics for counterfactuals, i.e., proximity and compactness. Moreover, <span>Glacier</span> achieves counterfactual validity comparable to that of the best of the three competitors. Finally, when comparing the unconstrained variant of <span>Glacier</span> to the constraint-based variants, we conclude that the inclusion of example-specific and global constraints yields good performance while demonstrating the trade-off between the different metrics.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"24 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140125822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
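A minimal, hedged illustration of gradient-search counterfactuals with a locality mask, the core mechanism this abstract describes: a toy linear classifier stands in for the time-series models Glacier targets, and the mask, learning rate, and step count are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 32
w, b = rng.normal(size=T), 0.0                 # toy linear classifier on a length-T series
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
predict = lambda x: sigmoid(x @ w + b)         # probability of the desired class

x0 = rng.normal(size=T)                        # the test example to explain
mask = np.zeros(T)
mask[10:20] = 1.0                              # constraint: only this segment may change

x = x0.copy()
for _ in range(500):
    p = predict(x)
    # Gradient ascent on log p(desired class); d/dx log sigmoid(x @ w) = (1 - p) * w.
    # The mask zeroes the update outside the permitted segment.
    x += 0.1 * (1.0 - p) * w * mask

cf_prob = predict(x)                           # counterfactual's desired-class probability
```

Only the masked segment is modified, so the counterfactual stays close to the original series everywhere else, which is the point of locally constrained generation.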
Machine Learning, Pub Date: 2024-03-05, DOI: 10.1007/s10994-024-06516-z
{"title":"Neural network relief: a pruning algorithm based on neural activity","authors":"","doi":"10.1007/s10994-024-06516-z","DOIUrl":"https://doi.org/10.1007/s10994-024-06516-z","url":null,"abstract":"<h3>Abstract</h3> <p>Current deep neural networks (DNNs) are overparameterized and use most of their neuronal connections during inference for each task. The human brain, however, developed specialized regions for different tasks and performs inference with a small fraction of its neuronal connections. We propose an iterative pruning strategy introducing a simple importance-score metric that deactivates unimportant connections, tackling overparameterization in DNNs and modulating the firing patterns. The aim is to find the smallest number of connections that is still capable of solving a given task with comparable accuracy, i.e. a simpler subnetwork. We achieve comparable performance for LeNet architectures on MNIST, and significantly higher parameter compression than state-of-the-art algorithms for VGG and ResNet architectures on CIFAR-10/100 and Tiny-ImageNet. Our approach also performs well for the two different optimizers considered—Adam and SGD. The algorithm is not designed to minimize FLOPs when considering current hardware and software implementations, although it performs reasonably when compared to the state of the art.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"26 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140044384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
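A minimal sketch of score-based connection pruning in the spirit described above. The score here (mean input-activation magnitude times weight magnitude) is a simple stand-in, not the paper's neural-activity importance metric, and the layer shapes and keep fraction are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))                     # one dense layer's weight matrix
X = rng.normal(size=(100, 16))                   # activations feeding the layer

# A simple per-connection importance score: mean |input| * |weight|.
score = np.abs(W) * np.abs(X).mean(axis=0)[:, None]


def prune(W, score, keep_frac):
    """Deactivate the lowest-scoring connections, keeping a fraction keep_frac."""
    thresh = np.quantile(score, 1 - keep_frac)
    mask = score >= thresh
    return W * mask, mask


W_pruned, mask = prune(W, score, keep_frac=0.3)  # keep only 30% of connections
sparsity = 1 - mask.mean()
```

An iterative variant would retrain between pruning rounds and tighten `keep_frac` gradually, which is what makes it possible to find much smaller subnetworks with comparable accuracy.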
Machine Learning, Pub Date: 2024-03-04, DOI: 10.1007/s10994-024-06518-x
Chenkang Zhang, Heng Huang, Bin Gu
{"title":"Tackle balancing constraints in semi-supervised ordinal regression","authors":"Chenkang Zhang, Heng Huang, Bin Gu","doi":"10.1007/s10994-024-06518-x","DOIUrl":"https://doi.org/10.1007/s10994-024-06518-x","url":null,"abstract":"<p>Semi-supervised ordinal regression (S<sup>2</sup>OR) has been recognized as a valuable technique to improve the performance of the ordinal regression (OR) model by leveraging available unlabeled samples. The balancing constraint is a useful approach for semi-supervised algorithms, as it can prevent the trivial solution of classifying a large number of unlabeled examples into a few classes. However, rapid training of the S<sup>2</sup>OR model with balancing constraints is still an open problem due to the difficulty in formulating and solving the corresponding optimization objective. To tackle this issue, we propose a novel form of balancing constraints and extend the traditional convex–concave procedure (CCCP) approach to solve our objective function. Additionally, we transform the convex inner loop (CIL) problem generated by the CCCP approach into a quadratic problem that resembles a support vector machine, where multiple equality constraints are treated as virtual samples. As a result, we can utilize an existing fast solver to efficiently solve the CIL problem. 
Experiments on several benchmark and real-world datasets not only validate the effectiveness of our proposed algorithm but also demonstrate its superior performance compared to other supervised and semi-supervised algorithms.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"11 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140035943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
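The convex–concave procedure (CCCP) mentioned above can be illustrated on a one-dimensional difference-of-convex toy problem. This is not the S<sup>2</sup>OR objective; the functions are chosen so that the convexified subproblem has a closed-form solution.

```python
import math

# Minimise f(x) = u(x) - v(x) with u(x) = x^2 and v(x) = 4*sqrt(1 + x^2),
# both convex, so f is a difference of convex functions.
# CCCP linearises the concave part -v(x) at the current iterate x_t and
# solves the convex subproblem min_x x^2 - v'(x_t) * x, i.e. x = v'(x_t) / 2.

f = lambda x: x * x - 4.0 * math.sqrt(1.0 + x * x)


def cccp(x, n_iter=100):
    for _ in range(n_iter):
        grad_v = 4.0 * x / math.sqrt(1.0 + x * x)   # v'(x_t)
        x = grad_v / 2.0                            # closed-form convex subproblem
    return x


x_star = cccp(1.0)   # converges to sqrt(3), a stationary point of f
```

Each CCCP step solves a convex surrogate that upper-bounds f, so the objective decreases monotonically; in the paper's setting the surrogate is the SVM-like CIL problem rather than a scalar quadratic.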
Machine Learning, Pub Date: 2024-02-28, DOI: 10.1007/s10994-023-06510-x
Xiaodong Wang, Fushing Hsieh
{"title":"An encoding approach for stable change point detection","authors":"Xiaodong Wang, Fushing Hsieh","doi":"10.1007/s10994-023-06510-x","DOIUrl":"https://doi.org/10.1007/s10994-023-06510-x","url":null,"abstract":"<p>Without imposing prior distributional knowledge underlying multivariate time series of interest, we propose a nonparametric change-point detection approach to estimate the number of change points and their locations along the temporal axis. We develop a structural subsampling procedure such that the observations are encoded into multiple sequences of Bernoulli variables. A maximum likelihood approach in conjunction with a newly developed searching algorithm is implemented to detect change points on each Bernoulli process separately. Then, aggregation statistics are proposed to collectively synthesize change-point results from all individual univariate time series into consistent and stable location estimates. We also study a weighting strategy to measure the degree of relevance for different subsampled groups. Simulation studies show that the proposed change-point methodology for multivariate time series performs favorably compared with currently available state-of-the-art nonparametric methods under various settings with different degrees of complexity. 
Real data analyses are finally performed on categorical, ordinal, and continuous time series taken from fields of genetics, climate, and finance.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"630 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
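A single-change-point special case of the Bernoulli maximum-likelihood idea above can be sketched directly: scan candidate split points and pick the one maximizing the two-segment log-likelihood. The sequence, change location, and scan range are invented for the example; the paper handles multiple change points and aggregates across many encoded sequences.

```python
import numpy as np

rng = np.random.default_rng(0)
# A Bernoulli sequence with one change point at t = 120 (success prob. 0.2 -> 0.7).
x = np.concatenate([rng.random(120) < 0.2, rng.random(80) < 0.7]).astype(float)


def bernoulli_loglik(seg):
    """Log-likelihood of a segment under its MLE success probability."""
    n, s = len(seg), seg.sum()
    p = s / n
    if p in (0.0, 1.0):
        return 0.0
    return s * np.log(p) + (n - s) * np.log(1 - p)


# Maximum-likelihood scan over candidate change points (segments of length >= 10).
scores = [bernoulli_loglik(x[:t]) + bernoulli_loglik(x[t:]) for t in range(10, 190)]
cp = 10 + int(np.argmax(scores))
```

With a large jump in success probability, the estimated change point lands very close to the true location; aggregating such estimates across many encoded Bernoulli sequences is what stabilizes the multivariate result.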
Machine Learning, Pub Date: 2024-02-28, DOI: 10.1007/s10994-024-06515-0
{"title":"Fair and green hyperparameter optimization via multi-objective and multiple information source Bayesian optimization","authors":"","doi":"10.1007/s10994-024-06515-0","DOIUrl":"https://doi.org/10.1007/s10994-024-06515-0","url":null,"abstract":"<h3>Abstract</h3> <p>It has been recently remarked that focusing only on accuracy in searching for optimal Machine Learning models amplifies biases contained in the data, leading to unfair predictions and decision supports. Recently, multi-objective hyperparameter optimization has been proposed to search for Machine Learning models which offer equally Pareto-efficient trade-offs between accuracy and fairness. Although these approaches proved to be more versatile than fairness-aware Machine Learning algorithms—which instead optimize accuracy constrained to some threshold on fairness—their carbon footprint could be dramatic, due to the large amount of energy required in the case of large datasets. We propose an approach named FanG-HPO: fair and green hyperparameter optimization (HPO), based on both multi-objective and multiple information source Bayesian optimization. FanG-HPO uses subsets of the large dataset to obtain cheap approximations (aka information sources) of both accuracy and fairness, and multi-objective Bayesian optimization to efficiently identify Pareto-efficient (accurate and fair) Machine Learning models. 
Experiments consider four benchmark (fairness) datasets and four Machine Learning algorithms, and provide an assessment of FanG-HPO against both fairness-aware Machine Learning approaches and two state-of-the-art Bayesian optimization tools addressing multi-objective and energy-aware optimization.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"17 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
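The Pareto-efficiency notion underlying this multi-objective search can be made concrete with a small dominance filter over (error, unfairness) pairs to minimise; the candidate scores below are invented for illustration.

```python
import numpy as np

# Candidate models scored on two objectives to minimise: (1 - accuracy, unfairness).
# A model is Pareto-efficient if no other model is at least as good on both
# objectives and strictly better on at least one.
points = np.array([[0.10, 0.30],
                   [0.12, 0.20],
                   [0.20, 0.10],
                   [0.15, 0.25],   # dominated by (0.12, 0.20)
                   [0.25, 0.12]])  # dominated by (0.20, 0.10)


def pareto_mask(pts):
    n = len(pts)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # A point dominates pts[i] if it is <= in every objective and < in one.
        dominated = (pts <= pts[i]).all(axis=1) & (pts < pts[i]).any(axis=1)
        if dominated.any():
            mask[i] = False
    return mask


front = points[pareto_mask(points)]
```

Bayesian-optimization approaches such as the one described above search for new hyperparameter configurations expected to expand this front rather than to improve a single scalar objective.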
Machine Learning, Pub Date: 2024-02-26, DOI: 10.1007/s10994-023-06511-w
Xiao-Yang Liu, Ziyi Xia, Hongyang Yang, Jiechao Gao, Daochen Zha, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo
{"title":"Dynamic datasets and market environments for financial reinforcement learning","authors":"Xiao-Yang Liu, Ziyi Xia, Hongyang Yang, Jiechao Gao, Daochen Zha, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo","doi":"10.1007/s10994-023-06511-w","DOIUrl":"https://doi.org/10.1007/s10994-023-06511-w","url":null,"abstract":"<p>The financial market is a particularly challenging playground for deep reinforcement learning due to its unique feature of dynamic datasets. Building high-quality market environments for training financial reinforcement learning (FinRL) agents is difficult due to major factors such as the low signal-to-noise ratio of financial data, survivorship bias of historical data, and model overfitting. In this paper, we present an updated version of FinRL-Meta, a data-centric and openly accessible library that processes dynamic datasets from real-world markets into gym-style market environments and has been actively maintained by the AI4Finance community. First, following a DataOps paradigm, we provide hundreds of market environments through an automatic data curation pipeline. Second, we provide homegrown examples and reproduce popular research papers as stepping stones for users to design new trading strategies. We also deploy the library on cloud platforms so that users can visualize their own results and assess the relative performance via community-wise competitions. Third, we provide dozens of Jupyter/Python demos organized into a curriculum and a documentation website to serve the rapidly growing community. 
The code is available at https://github.com/AI4Finance-Foundation/FinRL-Meta.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"2015 2","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139981410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning, Pub Date: 2024-02-22, DOI: 10.1007/s10994-023-06505-8
Wei Wei, Da Wang, Lin Li, Jiye Liang
{"title":"Re-attentive experience replay in off-policy reinforcement learning","authors":"Wei Wei, Da Wang, Lin Li, Jiye Liang","doi":"10.1007/s10994-023-06505-8","DOIUrl":"https://doi.org/10.1007/s10994-023-06505-8","url":null,"abstract":"<p>Experience replay, which stores past samples for reuse, has become a fundamental component of off-policy reinforcement learning. Some pioneering works have indicated that prioritization or reweighting of samples with on-policiness can yield significant performance improvements. However, this approach does not pay enough attention to sample diversity, which may result in instability or even long-term performance slumps. In this work, we introduce a novel Re-attention criterion to reevaluate recent experiences, thus benefiting the agent from learning about them. We call the overall algorithm Re-attentive Experience Replay (RAER). RAER employs a parameter-insensitive dynamic testing technique to enhance the attention of samples generated by policies with promising trends in overall performance. By wisely leveraging diverse samples, RAER retains the positive effects of on-policiness while avoiding its potential negative influences. Extensive experiments demonstrate the effectiveness of RAER in improving both performance and stability. Moreover, replacing the on-policiness component of the state-of-the-art approach with RAER can yield significant benefits.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"23 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139925870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
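As a hedged sketch of recency-reweighted experience replay (this is not RAER's re-attention criterion or its dynamic testing technique; the linear recency weighting, capacity, and transition format are invented for illustration):

```python
import random
from collections import deque


class ReplayBuffer:
    """A toy replay buffer that up-weights recent transitions when sampling.

    Newer transitions get a higher sampling weight (a crude stand-in for
    reweighting by on-policiness), while older ones are still drawn,
    which preserves sample diversity.
    """

    def __init__(self, capacity=1000, recency_boost=3.0):
        self.buf = deque(maxlen=capacity)
        self.recency_boost = recency_boost

    def push(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        n = len(self.buf)
        # Linear recency weights: the newest transition is sampled
        # `recency_boost` times more often than the oldest.
        weights = [1 + (self.recency_boost - 1) * i / (n - 1) for i in range(n)]
        return random.choices(list(self.buf), weights=weights, k=batch_size)


buf = ReplayBuffer(capacity=100)
for t in range(100):
    buf.push((t, "state", "action", 0.0))   # placeholder (time, s, a, r) tuples
batch = buf.sample(32)
```

The trade-off the abstract describes is visible even in this toy: pushing `recency_boost` higher concentrates learning on near-on-policy samples, while a value near 1 recovers uniform (diverse) replay.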