Proceedings of machine learning research: latest articles

Fair admission risk prediction with proportional multicalibration.
William G La Cava, Elle Lett, Guangya Wan
Abstract: Fair calibration is a widely desirable fairness criterion in risk prediction contexts. One way to measure and achieve fair calibration is with multicalibration. Multicalibration constrains calibration error among flexibly-defined subpopulations while maintaining overall calibration. However, multicalibrated models can exhibit a higher percent calibration error among groups with lower base rates than groups with higher base rates. As a result, it is possible for a decision-maker to learn to trust or distrust model predictions for specific groups. To alleviate this, we propose proportional multicalibration (PMC), a criterion that constrains the percent calibration error among groups and within prediction bins. We prove that satisfying proportional multicalibration bounds a model's multicalibration as well as its differential calibration, a fairness criterion that directly measures how closely a model approximates sufficiency. Therefore, proportionally calibrated models limit the ability of decision makers to distinguish between model performance on different patient groups, which may make the models more trustworthy in practice. We provide an efficient algorithm for post-processing risk prediction models for proportional multicalibration and evaluate it empirically. We conduct simulation studies and investigate a real-world application of PMC-postprocessing to prediction of emergency department patient admissions. We observe that proportional multicalibration is a promising criterion for controlling simultaneous measures of calibration fairness of a model over intersectional groups with virtually no cost in terms of classification performance.
Proceedings of machine learning research, vol. 209, pp. 350-378. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10417639/pdf/nihms-1917236.pdf
Citations: 0
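The contrast the abstract above draws between absolute and percent calibration error can be illustrated with a small sketch. This is not the authors' post-processing algorithm; it only measures error per (group, prediction bin) cell. The function name `calibration_errors`, the equal-width binning, and the choice to divide by the observed outcome rate in each cell are assumptions made for illustration.

```python
from collections import defaultdict


def calibration_errors(y_true, y_prob, groups, n_bins=5):
    """Return {(group, bin): (absolute_error, proportional_error)}.

    Assumption: the proportional error is the absolute error divided by
    the observed outcome rate in the cell (hypothetical definition used
    here only to illustrate "percent calibration error").
    """
    cells = defaultdict(list)
    for y, p, g in zip(y_true, y_prob, groups):
        # Equal-width bins on [0, 1]; a prediction of exactly 1.0 goes in the last bin.
        b = min(int(p * n_bins), n_bins - 1)
        cells[(g, b)].append((y, p))

    errors = {}
    for key, pairs in cells.items():
        mean_y = sum(y for y, _ in pairs) / len(pairs)  # observed rate
        mean_p = sum(p for _, p in pairs) / len(pairs)  # mean prediction
        abs_err = abs(mean_y - mean_p)
        prop_err = abs_err / mean_y if mean_y > 0 else float("inf")
        errors[key] = (abs_err, prop_err)
    return errors


# Two groups with the same absolute error but different base rates.
y_true = [1] + [0] * 9 + [1] * 5 + [0] * 5
y_prob = [0.15] * 10 + [0.55] * 10
groups = ["low_base_rate"] * 10 + ["high_base_rate"] * 10
for cell, (abs_err, prop_err) in sorted(calibration_errors(y_true, y_prob, groups).items()):
    print(cell, round(abs_err, 3), round(prop_err, 3))
```

In this toy data both groups are off by the same absolute amount, yet the group with the lower base rate has a much larger proportional error, which is the gap the abstract describes.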
Directed Graphical Models and Causal Discovery for Zero-Inflated Data.
Shiqing Yu, Mathias Drton, Ali Shojaie
Abstract: With advances in technology, gene expression measurements from single cells can be used to gain refined insights into regulatory relationships among genes. Directed graphical models are well-suited to explore such (cause-effect) relationships. However, statistical analyses of single cell data are complicated by the fact that the data often show zero-inflated expression patterns. To address this challenge, we propose directed graphical models that are based on Hurdle conditional distributions parametrized in terms of polynomials in parent variables and their 0/1 indicators of being zero or nonzero. While directed graphs for Gaussian models are only identifiable up to an equivalence class in general, we show that, under a natural and weak assumption, the exact directed acyclic graph of our zero-inflated models can be identified. We propose methods for graph recovery, apply our model to real single-cell gene expression data on T helper cells, and show simulated experiments that validate the identifiability and graph estimation methods in practice.
Proceedings of machine learning research, vol. 213, pp. 27-67. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11257027/pdf/
Citations: 0
Skill, or Style? Classification of Fetal Sonography Eye-Tracking Data.
Clare Teng, Lior Drukker, Aris T Papageorghiou, J Alison Noble
Abstract: We present a method for classifying human skill at fetal ultrasound scanning from eye-tracking and pupillary data of sonographers. Human skill characterization for this clinical task typically creates groupings of clinician skills such as expert and beginner based on the number of years of professional experience; experts typically have more than 10 years and beginners between 0 and 5 years. In some cases, they also include trainees who are not yet fully-qualified professionals. Prior work has considered eye movements, which necessitates separating eye-tracking data into eye movements, such as fixations and saccades. Our method does not use prior assumptions about the relationship between years of experience and does not require the separation of eye-tracking data. Our best performing skill classification model achieves an F1 score of 98% and 70% for expert and trainee classes respectively. We also show that years of experience, as a direct measure of skill, is significantly correlated to the expertise of a sonographer.
Proceedings of machine learning research, vol. 210, pp. 184-198. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7614578/pdf/
Citations: 0
Contrastive Representation Learning for Gaze Estimation.
Swati Jindal, Roberto Manduchi
Abstract: Self-supervised learning (SSL) has become prevalent for learning representations in computer vision. Notably, SSL exploits contrastive learning to encourage visual representations to be invariant under various image transformations. The task of gaze estimation, on the other hand, demands not just invariance to various appearances but also equivariance to the geometric transformations. In this work, we propose a simple contrastive representation learning framework for gaze estimation, named Gaze Contrastive Learning (GazeCLR). GazeCLR exploits multi-view data to promote equivariance and relies on selected data augmentation techniques that do not alter gaze directions for invariance learning. Our experiments demonstrate the effectiveness of GazeCLR for several settings of the gaze estimation task. Particularly, our results show that GazeCLR improves the performance of cross-domain gaze estimation and yields as high as 17.2% relative improvement. Moreover, the GazeCLR framework is competitive with state-of-the-art representation learning methods for few-shot evaluation. The code and pre-trained models are available at https://github.com/jswati31/gazeclr.
Proceedings of machine learning research, vol. 210, pp. 37-49. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10270367/pdf/nihms-1862058.pdf
Citations: 0
Privacy-preserving patient clustering for personalized federated learning.
Ahmed Elhussein, Gamze Gürsoy
Abstract: Federated Learning (FL) is a machine learning framework that enables multiple organizations to train a model without sharing their data with a central server. However, it experiences significant performance degradation if the data is not independent and identically distributed (non-IID). This is a problem in medical settings, where variations in the patient population contribute significantly to distribution differences across hospitals. Personalized FL addresses this issue by accounting for site-specific distribution differences. Clustered FL, a Personalized FL variant, was used to address this problem by clustering patients into groups across hospitals and training separate models on each group. However, privacy concerns remained a challenge, as the clustering process requires exchange of patient-level information. This was previously solved by forming clusters using aggregated data, which led to inaccurate groups and performance degradation. In this study, we propose Privacy-preserving Community-Based Federated machine Learning (PCBFL), a novel Clustered FL framework that can cluster patients using patient-level data while protecting privacy. PCBFL uses Secure Multiparty Computation, a cryptographic technique, to securely calculate patient-level similarity scores across hospitals. We then evaluate PCBFL by training a federated mortality prediction model using 20 sites from the eICU dataset. We compare the performance gain from PCBFL against traditional and existing Clustered FL frameworks. Our results show that PCBFL successfully forms clinically meaningful cohorts of low, medium, and high-risk patients. PCBFL outperforms traditional and existing Clustered FL frameworks with an average AUC improvement of 4.3% and AUPRC improvement of 7.8%.
Proceedings of machine learning research, vol. 219, pp. 150-166. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11376435/pdf/
Citations: 0
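For readers new to federated learning, the idea of training a shared model without pooling raw data can be sketched with plain federated averaging (FedAvg). This is a swapped-in, generic aggregation step, not the PCBFL protocol from the abstract above, and it includes no secure multiparty computation. The function name `federated_average` and the flat-list parameter representation are assumptions for illustration.

```python
def federated_average(site_weights, site_sizes):
    """Combine per-site parameter vectors into one vector.

    Each site's parameters are weighted by its number of samples, so
    only model parameters (not patient records) leave a site.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical hospitals with 1 and 3 patients in the same cluster.
print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a clustered setting, one would presumably run such an aggregation once per patient cluster, so that each cluster gets its own model.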
SRDA: Mobile Sensing based Fluid Overload Detection for End Stage Kidney Disease Patients using Sensor Relation Dual Autoencoder.
Mingyue Tang, Jiechao Gao, Guimin Dong, Carl Yang, Bradford Campbell, Brendan Bowman, Jamie Marie Zoellner, Emaad Abdel-Rahman, Mehdi Boukhechba
Abstract: Chronic kidney disease (CKD) is a life-threatening and prevalent disease. CKD patients, especially end-stage kidney disease (ESKD) patients on hemodialysis, suffer from kidney failure and are unable to remove excessive fluid, causing fluid overload and multiple morbidities including death. Current solutions for fluid overtake monitoring such as ultrasonography and biomarkers assessment are cumbersome, discontinuous, and can only be performed in the clinic. In this paper, we propose SRDA, a latent graph learning powered fluid overload detection system based on Sensor Relation Dual Autoencoder to detect excessive fluid consumption of ESKD patients based on passively collected bio-behavioral data from smartwatch sensors. Experiments using real-world mobile sensing data indicate that SRDA outperforms the state-of-the-art baselines in both F1 score and recall, and demonstrate the potential of ubiquitous sensing for ESKD fluid intake management.
Proceedings of machine learning research, vol. 209, pp. 133-146. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10873463/pdf/
Citations: 0
Temporal Supervised Contrastive Learning for Modeling Patient Risk Progression.
Shahriar Noroozizadeh, Jeremy C Weiss, George H Chen
Abstract: We consider the problem of predicting how the likelihood of an outcome of interest for a patient changes over time as we observe more of the patient's data. To solve this problem, we propose a supervised contrastive learning framework that learns an embedding representation for each time step of a patient time series. Our framework learns the embedding space to have the following properties: (1) nearby points in the embedding space have similar predicted class probabilities, (2) adjacent time steps of the same time series map to nearby points in the embedding space, and (3) time steps with very different raw feature vectors map to far apart regions of the embedding space. To achieve property (3), we employ a nearest neighbor pairing mechanism in the raw feature space. This mechanism also serves as an alternative to "data augmentation", a key ingredient of contrastive learning, which lacks a standard procedure that is adequately realistic for clinical tabular data, to our knowledge. We demonstrate that our approach outperforms state-of-the-art baselines in predicting mortality of septic patients (MIMIC-III dataset) and tracking progression of cognitive impairment (ADNI dataset). Our method also consistently recovers the correct synthetic dataset embedding structure across experiments, a feat not achieved by baselines. Our ablation experiments show the pivotal role of our nearest neighbor pairing.
Proceedings of machine learning research, vol. 225, pp. 403-427. Published 2023-01-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10976929/pdf/
Citations: 0
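The nearest neighbor pairing mentioned in the abstract above can be pictured with a brute-force sketch: each raw feature vector is paired with its closest other vector, and such pairs could then stand in for augmented views in a contrastive objective. This is only an illustration under assumptions (Euclidean distance, exhaustive search, the hypothetical name `nearest_neighbor_pairs`); the paper's actual mechanism may differ.

```python
import math


def nearest_neighbor_pairs(features):
    """Pair every feature vector with the index of its nearest other vector.

    Uses Euclidean distance and an O(n^2) scan, which is fine for a toy
    example but not for large datasets. Requires at least two vectors.
    """
    pairs = []
    for i, x in enumerate(features):
        best_j, best_d = None, math.inf
        for j, z in enumerate(features):
            if i == j:
                continue  # a point is never paired with itself
            d = math.dist(x, z)
            if d < best_d:
                best_j, best_d = j, d
        pairs.append((i, best_j))
    return pairs


# Four hypothetical time-step feature vectors forming two tight clusters.
print(nearest_neighbor_pairs([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.5]]))
```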
Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data.
Zongyu Dai, Zhiqi Bu, Qi Long
Abstract: Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample size, such as analysis of multi-omics data, present daunting challenges. Imputation is arguably the most popular method for handling missing data, though existing imputation methods have a number of limitations. Single imputation methods such as matrix completion methods do not adequately account for imputation uncertainty and hence would yield improper statistical inference. In contrast, multiple imputation (MI) methods allow for proper inference but existing methods do not perform well in high-dimensional settings. Our work aims to address these significant methodological gaps, leveraging recent advances in neural network Gaussian process (NNGP) from a Bayesian viewpoint. We propose two NNGP-based MI methods, namely MI-NNGP, that can apply multiple imputations for missing values from a joint (posterior predictive) distribution. The MI-NNGP methods are shown to significantly outperform existing state-of-the-art methods on synthetic and real datasets, in terms of imputation error, statistical inference, robustness to missing rates, and computation costs, under three missing data mechanisms, MCAR, MAR, and MNAR. Code is available in the GitHub repository https://github.com/bestadcarry/MI-NNGP.
Proceedings of machine learning research, vol. 189, pp. 265-279. Published 2022-12-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10348708/pdf/nihms-1861886.pdf
Citations: 0
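The abstract above notes that multiple imputation supports proper inference where single imputation does not. A common way this works is Rubin's rules, which pool the estimates from each imputed dataset and add the between-imputation spread to the variance. The sketch below shows that generic pooling step only; it is not the MI-NNGP method, and the function name `pool_rubin` is an assumption for illustration.

```python
def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances (m >= 2).

    Returns (pooled_estimate, total_variance) where
    total_variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                       # pooled point estimate
    within = sum(variances) / m                      # average sampling variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between           # accounts for imputation uncertainty
    return q_bar, total


# Three hypothetical imputed datasets giving slightly different estimates.
estimate, variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, variance)
```

A single imputation would report only the within-imputation variance, which is why it tends to understate uncertainty.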
Meta-analysis of individualized treatment rules via sign-coherency
Jay Jojo Cheng, J. Huling, Guanhua Chen
Abstract: Medical treatments tailored to a patient's baseline characteristics hold the potential of improving patient outcomes while reducing negative side effects. Learning individualized treatment rules (ITRs) often requires aggregation of multiple datasets (sites); however, current ITR methodology does not take between-site heterogeneity into account, which can hurt model generalizability when deploying back to each site. To address this problem, we develop a method for individual-level meta-analysis of ITRs, which jointly learns site-specific ITRs while borrowing information about feature sign-coherency via a scientifically-motivated directionality principle. We also develop an adaptive procedure for model tuning, using information criteria tailored to the ITR learning problem. We study the proposed methods through numerical experiments to understand their performance under different levels of between-site heterogeneity and apply the methodology to estimate ITRs in a large multi-center database of electronic health records. This work extends several popular methodologies for estimating ITRs (A-learning, weighted learning) to the multiple-sites setting.
Proceedings of machine learning research, vol. 193, pp. 171-198. Published 2022-11-28. Journal Article.
DOI: https://doi.org/10.48550/arXiv.2211.15476
Citations: 0
Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data
Zongyu Dai, Zhiqi Bu, Q. Long
Abstract: Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample size, such as analysis of multi-omics data, present daunting challenges. Imputation is arguably the most popular method for handling missing data, though existing imputation methods have a number of limitations. Single imputation methods such as matrix completion methods do not adequately account for imputation uncertainty and hence would yield improper statistical inference. In contrast, multiple imputation (MI) methods allow for proper inference but existing methods do not perform well in high-dimensional settings. Our work aims to address these significant methodological gaps, leveraging recent advances in neural network Gaussian process (NNGP) from a Bayesian viewpoint. We propose two NNGP-based MI methods, namely MI-NNGP, that can apply multiple imputations for missing values from a joint (posterior predictive) distribution. The MI-NNGP methods are shown to significantly outperform existing state-of-the-art methods on synthetic and real datasets, in terms of imputation error, statistical inference, robustness to missing rates, and computation costs, under three missing data mechanisms, MCAR, MAR, and MNAR. Code is available in the GitHub repository https://github.com/bestadcarry/MI-NNGP.
Proceedings of machine learning research, vol. 189, pp. 265-279. Published 2022-11-23. Journal Article.
DOI: https://doi.org/10.48550/arXiv.2211.13297
Citations: 2