Proceedings of machine learning research: Latest Articles

Temporally Multi-Scale Sparse Self-Attention for Physical Activity Data Imputation.
Hui Wei, Maxwell A Xu, Colin Samplawski, James M Rehg, Santosh Kumar, Benjamin M Marlin
{"title":"Temporally Multi-Scale Sparse Self-Attention for Physical Activity Data Imputation.","authors":"Hui Wei, Maxwell A Xu, Colin Samplawski, James M Rehg, Santosh Kumar, Benjamin M Marlin","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Wearable sensors enable health researchers to continuously collect data pertaining to the physiological state of individuals in real-world settings. However, such data can be subject to extensive missingness due to a complex combination of factors. In this work, we study the problem of imputation of missing step count data, one of the most ubiquitous forms of wearable sensor data. We construct a novel and large scale data set consisting of a training set with over 3 million hourly step count observations and a test set with over 2.5 million hourly step count observations. We propose a domain knowledge-informed sparse self-attention model for this task that captures the temporal multi-scale nature of step-count data. We assess the performance of the model relative to baselines and conduct ablation studies to verify our specific model designs.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"248 ","pages":"137-154"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11421853/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142334005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
s-SuStaIn: Scaling subtype and stage inference via simultaneous clustering of subjects and biomarkers.
Raghav Tandon, James J Lah, Cassie S Mitchell
{"title":"s-SuStaIn: Scaling subtype and stage inference via simultaneous clustering of subjects and biomarkers.","authors":"Raghav Tandon, James J Lah, Cassie S Mitchell","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Event-based models (EBM) provide an important platform for modeling disease progression. This work successfully extends previous EBM approaches to work with larger sets of biomarkers while simultaneously modeling heterogeneity in disease progression trajectories. We develop and validate the s-SuStaIn method for scalable event-based modeling of disease progression subtypes using large numbers of features. s-SuStaIn is typically an order of magnitude faster than its predecessor (SuStaIn). Moreover, we perform a case study with s-SuStaIn using open access cross-sectional Alzheimer's Disease Neuroimaging Initiative (ADNI) data to stage AD patients into four subtypes based on dynamic disease progression. s-SuStaIn shows that the inferred subtypes and stages predict progression to AD among MCI subjects. The subtypes show differences in AD incidence rates and reveal clinically meaningful progression trajectories when mapped to a brain atlas.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"248 ","pages":"461-476"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11881980/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143568330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Online Calibrated and Conformal Prediction Improves Bayesian Optimization.
Shachi Deshpande, Charles Marx, Volodymyr Kuleshov
{"title":"Online Calibrated and Conformal Prediction Improves Bayesian Optimization.","authors":"Shachi Deshpande, Charles Marx, Volodymyr Kuleshov","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration, i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"1450-1458"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11482741/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On the estimation of persistence intensity functions and linear representations of persistence diagrams.
Weichen Wu, Jisu Kim, Alessandro Rinaldo
{"title":"On the estimation of persistence intensity functions and linear representations of persistence diagrams.","authors":"Weichen Wu, Jisu Kim, Alessandro Rinaldo","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Persistence diagrams are one of the most popular types of data summaries used in Topological Data Analysis. The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim at estimating the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that effect, Chazal and Divol (2019) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. In this paper, we are concerned with estimating the persistence intensity function and a novel, normalized version of it - called the persistence density function. We present a class of kernel-based estimators based on an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function delivers stronger statistical guarantees.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"3610-3618"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12083882/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144096115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Adaptive Discretization for Event PredicTion (ADEPT).
Jimmy Hickey, Ricardo Henao, Daniel Wojdyla, Michael Pencina, Matthew Engelhard
{"title":"Adaptive Discretization for Event PredicTion (ADEPT).","authors":"Jimmy Hickey, Ricardo Henao, Daniel Wojdyla, Michael Pencina, Matthew Engelhard","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Recently developed survival analysis methods improve upon existing approaches by predicting the probability of event occurrence in each of a number of pre-specified (discrete) time intervals. By avoiding placing strong parametric assumptions on the event density, this approach tends to improve prediction performance, particularly when data are plentiful. However, in clinical settings with limited available data, it is often preferable to judiciously partition the event time space into a limited number of intervals well suited to the prediction task at hand. In this work, we develop Adaptive Discretization for Event PredicTion (ADEPT) to learn from data a set of cut points defining such a partition. We show that in two simulated datasets, we are able to recover intervals that match the underlying generative model. We then demonstrate improved prediction performance on three real-world observational datasets, including a large, newly harmonized stroke risk prediction dataset. Finally, we argue that our approach facilitates clinical decision-making by suggesting time intervals that are most appropriate for each task, in the sense that they facilitate more accurate risk prediction.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"1351-1359"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11078624/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140900566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SADI: Similarity-Aware Diffusion Model-Based Imputation for Incomplete Temporal EHR Data.
Zongyu Dai, Emily Getzen, Qi Long
{"title":"SADI: Similarity-Aware Diffusion Model-Based Imputation for Incomplete Temporal EHR Data.","authors":"Zongyu Dai, Emily Getzen, Qi Long","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Missing values are prevalent in temporal electronic health records (EHR) data and are known to complicate data analysis and lead to biased results. The current state-of-the-art (SOTA) models for imputing missing values in EHR primarily leverage correlations across time points and across features, which perform well when data have strong correlation across time points, such as in ICU data where high-frequency time series data are collected. However, this is often insufficient for temporal EHR data from non-ICU settings (e.g., outpatient visits for primary care or specialty care), where data are collected at substantially sparser time points, resulting in much weaker correlation across time points. To address this methodological gap, we propose the Similarity-Aware Diffusion Model-Based Imputation (SADI), a novel imputation method that leverages the diffusion model and utilizes information across dependent variables. We apply SADI to impute incomplete temporal EHR data and propose a similarity-aware denoising function, which includes a self-attention mechanism to model the correlations between time points, features, and similar patients. To the best of our knowledge, this is the first time that the information of similar patients is directly used to construct imputation for incomplete temporal EHR data. Our extensive experiments on two datasets, the Critical Path For Alzheimer's Disease (CPAD) data and the PhysioNet Challenge 2012 data, show that SADI outperforms the current SOTA under various missing data mechanisms, including missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR).</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"4195-4203"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11391213/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142302980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On the Generalization Ability of Unsupervised Pretraining.
Yuyang Deng, Junyuan Hong, Jiayu Zhou, Mehrdad Mahdavi
{"title":"On the Generalization Ability of Unsupervised Pretraining.","authors":"Yuyang Deng, Junyuan Hong, Jiayu Zhou, Mehrdad Mahdavi","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization. However, a rigorous understanding of how the representation function learned on an unlabeled dataset affects the generalization of the fine-tuned model is lacking. Existing theoretical research does not adequately account for the heterogeneity of the distribution and tasks in the pre-training and fine-tuning stages. To bridge this gap, this paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase, ultimately affecting the generalization capabilities of the fine-tuned model on downstream tasks. We apply our theoretical framework to analyze generalization bounds for two distinct scenarios: Context Encoder pre-training with deep neural networks and Masked Autoencoder pre-training with deep transformers, followed by fine-tuning on a binary classification task. Finally, inspired by our findings, we propose a novel regularization method during pre-training to further enhance the generalization of the fine-tuned model. Overall, our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"4519-4527"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11484219/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Online learning in bandits with predicted context.
Yongyi Guo, Ziping Xu, Susan Murphy
{"title":"Online learning in bandits with predicted context.","authors":"Yongyi Guo, Ziping Xu, Susan Murphy","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We consider the contextual bandit problem where at each time, the agent only has access to a noisy version of the context and the error variance (or an estimator of this variance). This setting is motivated by a wide range of applications where the true context for decision-making is unobserved, and only a prediction of the context by a potentially complex machine learning algorithm is available. When the context error is non-vanishing, classical bandit algorithms fail to achieve sublinear regret. We propose the first online algorithm in this setting with sublinear regret guarantees under mild conditions. The key idea is to extend the measurement error model in classical statistics to the online decision-making setting, which is nontrivial due to the policy being dependent on the noisy context observations. We further demonstrate the benefits of the proposed approach in simulation environments based on synthetic and real digital intervention datasets.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"2215-2223"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11501084/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142514330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods.
Davoud Ataee Tarzanagh, Parvin Nazari, Bojian Hou, Li Shen, Laura Balzano
{"title":"Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods.","authors":"Davoud Ataee Tarzanagh, Parvin Nazari, Bojian Hou, Li Shen, Laura Balzano","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper introduces an <i>online bilevel optimization</i> setting in which a sequence of time-varying bilevel problems is revealed one after the other. We extend the known regret bounds for single-level online algorithms to the bilevel setting. Specifically, we provide new notions of <i>bilevel regret</i>, develop an online alternating time-averaged gradient method that is capable of leveraging smoothness, and give regret bounds in terms of the path-length of the inner and outer minimizer sequences.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"2854-2862"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11452163/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142382692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Optimal Sparse Survival Trees.
Rui Zhang, Rui Xin, Margo Seltzer, Cynthia Rudin
{"title":"Optimal Sparse Survival Trees.","authors":"Rui Zhang, Rui Xin, Margo Seltzer, Cynthia Rudin","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Interpretability is crucial for doctors, hospitals, pharmaceutical companies and biotechnology corporations to analyze and make decisions for high-stakes problems that involve human health. Tree-based methods have been widely adopted for <i>survival analysis</i> due to their appealing interpretability and their ability to capture complex relationships. However, most existing methods to produce survival trees rely on heuristic (or greedy) algorithms, which risk producing sub-optimal models. We present a dynamic-programming-with-bounds approach that finds provably-optimal sparse survival tree models, frequently in only a few seconds.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"238 ","pages":"352-360"},"PeriodicalIF":0.0,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11417463/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0