Early Detection of Coronary Heart Disease Using Hybrid Quantum Machine Learning Approach
Mehroush Banday, Sherin Zafar, Parul Agarwal, M Afshar Alam, Abubeker K M
arXiv:2409.10932 · arXiv - CS - Machine Learning · 2024-09-17

Coronary heart disease (CHD) is a severe cardiac disease, and early diagnosis is essential: it improves treatment outcomes and reduces the cost of medical care. The marked rise in heart disease and death rates affects human health and economies worldwide, and reducing cardiac morbidity and mortality requires early detection. The ongoing development of quantum computing and machine learning (ML) technologies may bring practical improvements to CHD diagnosis: quantum machine learning (QML) is receiving tremendous interest across disciplines for its performance and capabilities, and QML techniques have the potential to forecast cardiac disease and help in early detection. To predict the risk of coronary heart disease, this paper presents a hybrid approach based on an ensemble of QML classifiers. By fusing quantum and classical ML algorithms in a multi-step inferential framework, the method handles multidimensional healthcare data, tackles complex problems that are not amenable to conventional machine learning algorithms, and minimizes computational expense. The proposed method was developed on the Raspberry Pi 5 Graphics Processing Unit (GPU) platform and tested on a broad dataset that integrates clinical and imaging data from patients with CHD and healthy controls. Compared to classical machine learning models, the proposed hybrid QML model achieves substantially higher accuracy, sensitivity, F1 score, and specificity.
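The fusion step of such an ensemble can be sketched, in a purely classical stand-in, as soft voting over per-model class probabilities. The function name, uniform weights, and example numbers below are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical sketch: soft voting over class-probability vectors produced by
# several base classifiers (quantum or classical stand-ins alike).
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-model class probabilities; returns (label, avg)."""
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0 / n_models] * n_models  # uniform weighting by default
    n_classes = len(prob_lists[0])
    avg = [sum(w * probs[c] for w, probs in zip(weights, prob_lists))
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Three models vote on (healthy=0, CHD=1) for one patient.
label, avg = soft_vote([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
```

Here two of the three models lean toward class 1, so the averaged distribution does too; real hybrid ensembles would typically learn the weights or stack a meta-classifier on top.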
HMF: A Hybrid Multi-Factor Framework for Dynamic Intraoperative Hypotension Prediction
Mingyue Cheng, Jintao Zhang, Zhiding Liu, Chunli Liu, Yanhu Xie
arXiv:2409.11064 · arXiv - CS - Machine Learning · 2024-09-17

Intraoperative hypotension (IOH) prediction using Mean Arterial Pressure (MAP) is a critical research area with significant implications for patient outcomes during surgery. However, existing approaches predominantly employ static modeling paradigms that overlook the dynamic nature of physiological signals. In this paper, we introduce a novel Hybrid Multi-Factor (HMF) framework that reformulates IOH prediction as a blood pressure forecasting task. Our framework leverages a Transformer encoder, specifically designed to capture the temporal evolution of MAP series through a patch-based input representation, which segments the input physiological series into informative patches for accurate analysis. To address the challenge of distribution shift in physiological series, our approach incorporates two key innovations: (1) symmetric normalization and de-normalization processes that mitigate distributional drift in statistical properties, ensuring the model's robustness across varying conditions, and (2) sequence decomposition, which disaggregates the input series into trend and seasonal components, allowing more precise modeling of inherent sequence dependencies. Extensive experiments on two real-world datasets demonstrate the superior performance of our approach over competitive baselines, particularly in capturing the nuanced variations in input series that are crucial for accurate IOH prediction.
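The two innovations named in the abstract can be illustrated with a minimal sketch: instance-style normalization of a MAP window with its inverse, and a moving-average split into trend and seasonal (residual) parts. Window size and example values are assumptions for illustration:

```python
# Minimal sketch of symmetric normalization / de-normalization of one window.
def normalize(series):
    mu = sum(series) / len(series)
    sd = (sum((x - mu) ** 2 for x in series) / len(series)) ** 0.5 or 1.0
    return [(x - mu) / sd for x in series], (mu, sd)

def denormalize(series, stats):
    mu, sd = stats
    return [z * sd + mu for z in series]

# Moving-average decomposition: trend = local mean, seasonal = residual.
def decompose(series, window=3):
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal

map_mmhg = [82.0, 85.0, 90.0, 78.0, 74.0]  # hypothetical MAP window (mmHg)
z, stats = normalize(map_mmhg)
trend, seasonal = decompose(map_mmhg)
```

A forecaster would operate on `z` (and the decomposed parts) and map predictions back through `denormalize`, so statistics shifting between patients or surgical phases do not leak into the model.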
Machine Learning on Dynamic Functional Connectivity: Promise, Pitfalls, and Interpretations
Jiaqi Ding, Tingting Dan, Ziquan Wei, Hyuna Cho, Paul J. Laurienti, Won Hwa Kim, Guorong Wu
arXiv:2409.11377 · arXiv - CS - Machine Learning · 2024-09-17

An unprecedented amount of existing functional Magnetic Resonance Imaging (fMRI) data provides a new opportunity to understand the relationship between functional fluctuation and human cognition/behavior using a data-driven approach. To that end, tremendous efforts have been made in machine learning to predict cognitive states from evolving volumetric images of blood-oxygen-level-dependent (BOLD) signals. Due to the complex nature of brain function, however, evaluations of learning performance and the resulting discoveries are often inconsistent across current state-of-the-art (SOTA) methods. By capitalizing on large-scale existing neuroimaging data (34,887 data samples from six public databases), we seek to establish a well-founded empirical guideline for designing deep models for functional neuroimages by linking the underpinning methodology with knowledge from the neuroscience domain. Specifically, we put the spotlight on three questions: (1) What is the current SOTA performance in cognitive task recognition and disease diagnosis using fMRI? (2) What are the limitations of current deep models? (3) What is the general guideline for selecting a suitable machine learning backbone for new neuroimaging applications? We have conducted a comprehensive evaluation and statistical analysis, in various settings, to answer these outstanding questions.
On the effects of similarity metrics in decentralized deep learning under distributional shift
Edvin Listo Zec, Tom Hagander, Eric Ihre-Thomason, Sarunas Girdzijauskas
arXiv:2409.10720 · arXiv - CS - Machine Learning · 2024-09-16

Decentralized Learning (DL) enables privacy-preserving collaboration among organizations or users to enhance the performance of local deep learning models. However, model aggregation becomes challenging when client data is heterogeneous, and identifying compatible collaborators without direct data exchange remains a pressing issue. In this paper, we investigate the effectiveness of various similarity metrics in DL for identifying peers for model merging, conducting an empirical analysis across multiple datasets with distribution shifts. Our research provides insights into the performance of these metrics, examining their role in facilitating effective collaboration. By exploring the strengths and limitations of these metrics, we contribute to the development of robust DL methods.
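One such similarity metric can be sketched as cosine similarity between flattened model weight vectors, used to rank candidate peers for merging. The top-k rule, names, and toy weights are assumptions for illustration, not the paper's protocol:

```python
# Cosine similarity between two flattened weight vectors.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

# Rank peers by similarity to our own model and keep the k most similar.
def top_peers(own_weights, peer_weights, k=1):
    ranked = sorted(peer_weights.items(),
                    key=lambda item: cosine(own_weights, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

own = [1.0, 0.0, 1.0]                      # toy flattened weights
peers = {"a": [0.9, 0.1, 1.1], "b": [0.0, 1.0, 0.0]}
merge_with = top_peers(own, peers, k=1)    # peer "a" is far more aligned
```

Under distribution shift, the open question the paper studies is precisely whether weight-space metrics like this (versus, say, gradient- or loss-based ones) reliably identify clients whose data is compatible.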
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang
arXiv:2409.10593 · arXiv - CS - Machine Learning · 2024-09-16

Large Language Models (LLMs) have been widely adopted to process long-context tasks. However, the large memory overhead of the key-value (KV) cache poses significant challenges in long-context scenarios. Existing training-free KV cache compression methods typically focus on quantization and token pruning, which have compression limits, and excessive sparsity can lead to severe performance degradation. Other methods design new architectures with less KV overhead but require significant training overhead. To address these two drawbacks, we explore redundancy in the channel dimension and apply an architecture-level design with minor training costs. We therefore introduce CSKV, a training-efficient Channel Shrinking technique for KV cache compression: (1) We first analyze the singular value distribution of the KV cache, revealing significant redundancy and compression potential along the channel dimension. Based on this observation, we propose using low-rank decomposition for key and value layers and storing the low-dimension features. (2) To preserve model performance, we introduce a bi-branch KV cache, including a window-based full-precision KV cache and a low-precision compressed KV cache. (3) To reduce the training costs, we minimize the layer-wise reconstruction loss for the compressed KV cache instead of retraining the entire LLM. Extensive experiments show that CSKV can reduce the memory overhead of the KV cache by 80% while maintaining the model's long-context capability. Moreover, we show that our method can be seamlessly combined with quantization to further reduce the memory overhead, achieving a compression ratio of up to 95%.
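The channel-shrinking idea can be illustrated with a toy example: store a low-dimensional projection of each cached key/value vector and reconstruct it at attention time. The fixed projection matrices below are hypothetical; CSKV obtains them via low-rank decomposition and tunes them with a layer-wise reconstruction loss:

```python
# Plain-Python matrix multiply, enough for a toy-sized demonstration.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

d, r = 4, 2                                   # 4 channels shrunk to 2
W_down = [[1, 0], [0, 1], [1, 0], [0, 1]]     # (d, r) down-projection (toy)
W_up = [[0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5]]   # (r, d) up-projection (toy)

K = [[1.0, 2.0, 1.0, 2.0]]   # one cached key vector, shape (1, d)
C = matmul(K, W_down)        # what is actually stored in the cache: (1, r)
K_hat = matmul(C, W_up)      # reconstruction used at attention time
memory_ratio = r / d         # stored channels vs. original channels
```

For this deliberately redundant key (channels repeat in pairs), reconstruction is exact while storing only half the channels; on real caches the rank `r` trades memory against reconstruction error, which is why CSKV keeps a small full-precision window branch alongside the compressed branch.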
LASERS: LAtent Space Encoding for Representations with Sparsity for Generative Modeling
Xin Li, Anand Sarwate
arXiv:2409.11184 · arXiv - CS - Machine Learning · 2024-09-16

Learning compact and meaningful latent space representations has been shown to be very useful in generative modeling tasks for visual data. One particular example is applying Vector Quantization (VQ) in variational autoencoders (VQ-VAEs, VQ-GANs, etc.), which has demonstrated state-of-the-art performance in many modern generative modeling applications. Quantizing the latent space has been justified by the assumption that the data themselves are inherently discrete in the latent space (like pixel values). In this paper, we propose an alternative representation of the latent space by relaxing the structural assumption of the VQ formulation. Specifically, we assume that the latent space can be approximated by a union-of-subspaces model corresponding to a dictionary-based representation under a sparsity constraint. The dictionary is learned and updated during the training process. We apply this approach to two models: Dictionary Learning Variational Autoencoders (DL-VAEs) and DL-VAEs with Generative Adversarial Networks (DL-GANs). We show empirically that our latent space is more expressive and leads to better representations than the VQ approach in terms of reconstruction quality, at the expense of a small computational overhead for the latent space computation. Our results thus suggest that the true benefit of the VQ approach might come not from discretization of the latent space but rather from the lossy compression of the latent space. We confirm this hypothesis by showing that our sparse representations also address the codebook collapse issue commonly found in VQ-family models.
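The dictionary-based alternative to VQ can be sketched with greedy matching pursuit: code a latent vector as a sparse combination of unit-norm atoms instead of snapping it to a single codebook entry. The two-atom dictionary below is fixed and hypothetical; in DL-VAEs the dictionary is learned during training:

```python
# Greedy matching pursuit: pick the k atoms most correlated with the residual.
def sparse_code(x, dictionary, k=1):
    residual = list(x)
    coeffs = [0.0] * len(dictionary)
    for _ in range(k):
        scores = [sum(a * r for a, r in zip(atom, residual))
                  for atom in dictionary]
        j = max(range(len(scores)), key=lambda i: abs(scores[i]))
        coeffs[j] += scores[j]           # atoms assumed unit-norm
        residual = [r - scores[j] * a
                    for r, a in zip(residual, dictionary[j])]
    return coeffs, residual

atoms = [[1.0, 0.0], [0.0, 1.0]]         # toy unit-norm dictionary
coeffs, residual = sparse_code([3.0, 0.5], atoms, k=2)
```

With `k=1` this degenerates to picking one (scaled) atom, close in spirit to VQ; allowing several active atoms is what makes the union-of-subspaces representation more expressive than a single codebook lookup.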
Toward Mitigating Sex Bias in Pilot Trainees' Stress and Fatigue Modeling
Rachel Pfeifer, Sudip Vhaduri, Mark Wilson, Julius Keller
arXiv:2409.10676 · arXiv - CS - Machine Learning · 2024-09-16

While researchers have been trying to understand stress and fatigue among pilots, especially pilot trainees, and to develop stress/fatigue models that automate the process of detecting stress/fatigue, they often do not consider biases such as sex in those models. However, in a critical profession like aviation, where the demographic distribution is disproportionately skewed toward one sex, it is urgent to mitigate biases for fair and safe model predictions. In this work, we investigate the perceived stress/fatigue of 69 college students, including 40 pilot trainees, of whom around 63% are male. We construct decision tree models, first without bias mitigation and then with bias mitigation using a threshold optimizer with demographic parity and equalized odds constraints, over 30 runs with random instances. With bias mitigation, we achieve improvements of 88.31% in demographic parity difference and 54.26% in equalized odds difference, both of which are statistically significant.
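The post-processing idea behind a demographic parity threshold optimizer can be sketched as a search for per-group decision thresholds that shrink the gap in selection rates. The grid search and toy scores below are illustrative assumptions, not the exact optimizer used in the study (which also balances accuracy, not just the parity gap):

```python
# Fraction of a group's scores at or above the decision threshold.
def selection_rate(scores, thresh):
    return sum(s >= thresh for s in scores) / len(scores)

# Grid-search a threshold per group to minimize the demographic parity
# difference, i.e. the gap between the two groups' selection rates.
def fit_thresholds(groups, candidates):
    best, best_gap = None, float("inf")
    for t_a in candidates:
        for t_b in candidates:
            gap = abs(selection_rate(groups[0], t_a)
                      - selection_rate(groups[1], t_b))
            if gap < best_gap:
                best, best_gap = (t_a, t_b), gap
    return best, best_gap

male = [0.9, 0.8, 0.2]     # hypothetical stress-model scores per group
female = [0.6, 0.4, 0.3]
# One shared threshold of 0.5 flags 2/3 of males but only 1/3 of females.
(thr_m, thr_f), gap = fit_thresholds([male, female], [0.25, 0.35, 0.5, 0.7])
```

Allowing group-specific thresholds drives the parity gap to zero on this toy data; practical optimizers add an accuracy objective so the solution is not the degenerate "select everyone" one.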
Motion Forecasting via Model-Based Risk Minimization
Aron Distelzweig, Eitan Kosman, Andreas Look, Faris Janjoš, Denesh K. Manivannan, Abhinav Valada
arXiv:2409.10585 · arXiv - CS - Machine Learning · 2024-09-16

Forecasting the future trajectories of surrounding agents is crucial for autonomous vehicles to ensure safe, efficient, and comfortable route planning. While model ensembling has improved prediction accuracy in various fields, its application in trajectory prediction is limited due to the multi-modal nature of the predictions. In this paper, we propose a novel sampling method, applicable to trajectory prediction, based on the predictions of multiple models. We first show that conventional sampling based on predicted probabilities can degrade performance due to a lack of alignment between models. To address this problem, we introduce a new method that generates optimal trajectories from a set of neural networks, framing it as a risk minimization problem with a variable loss function. By using state-of-the-art models as base learners, our approach constructs diverse and effective ensembles for optimal trajectory sampling. Extensive experiments on the nuScenes prediction dataset demonstrate that our method surpasses current state-of-the-art techniques, achieving top ranks on the leaderboard. We also provide a comprehensive empirical study of ensembling strategies, offering insights into their effectiveness. Our findings highlight the potential of advanced ensembling techniques in trajectory prediction, significantly improving predictive performance and paving the way for more reliable predicted trajectories.
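The risk-minimization framing can be sketched as follows: among an ensemble's candidate trajectories, pick the one whose expected loss against the probability-weighted predictions is smallest. The average-displacement loss and toy 2-point trajectories are one possible instantiation, assumed for illustration:

```python
# Average pointwise Euclidean displacement between two trajectories.
def avg_displacement(t1, t2):
    return sum(((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(t1, t2)) / len(t1)

# Risk of a candidate = expected displacement to the ensemble's predictions,
# weighted by each prediction's probability; return the minimizer.
def min_risk_trajectory(candidates, probs):
    def risk(cand):
        return sum(p * avg_displacement(cand, t)
                   for t, p in zip(candidates, probs))
    return min(candidates, key=risk)

straight = [(0.0, 0.0), (1.0, 0.0)]   # toy 2-waypoint candidates
mild_turn = [(0.0, 0.0), (1.0, 1.0)]
hard_turn = [(0.0, 0.0), (1.0, 2.0)]
pick = min_risk_trajectory([straight, mild_turn, hard_turn],
                           [0.6, 0.25, 0.15])
```

Unlike naively sampling the single highest-probability mode per model, this selection accounts for how the modes of different models relate to one another, which is the alignment issue the paper identifies.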
Federated Learning for Smart Grid: A Survey on Applications and Potential Vulnerabilities
Zikai Zhang, Suman Rath, Jiaohao Xu, Tingsong Xiao
arXiv:2409.10764 · arXiv - CS - Machine Learning · 2024-09-16

The Smart Grid (SG) is a critical energy infrastructure that collects real-time electricity usage data to forecast future energy demands using information and communication technologies (ICT). Due to growing concerns about data security and privacy in SGs, federated learning (FL) has emerged as a promising training framework. FL offers a balance between privacy, efficiency, and accuracy in SGs by enabling collaborative model training without sharing private data from IoT devices. In this survey, we thoroughly review recent advancements in designing FL-based SG systems across three stages: generation, transmission and distribution, and consumption. Additionally, we explore potential vulnerabilities that may arise when implementing FL in these stages. Finally, we discuss the gap between state-of-the-art FL research and its practical applications in SGs and propose future research directions. These focus on potential attack and defense strategies for FL-based SG systems and the need to build a robust FL-based SG infrastructure. Unlike traditional surveys that address security issues in centralized machine learning methods for SG systems, this survey specifically examines the applications and security concerns in FL-based SG systems for the first time. Our aim is to inspire further research into applications and improvements in the robustness of FL-based SG systems.
Offline Reinforcement Learning for Learning to Dispatch for Job Shop Scheduling
Jesse van Remmerden, Zaharah Bukhsh, Yingqian Zhang
arXiv:2409.10589 · arXiv - CS - Machine Learning · 2024-09-16

The Job Shop Scheduling Problem (JSSP) is a complex combinatorial optimization problem. There has been growing interest in using online Reinforcement Learning (RL) for JSSP. While online RL can quickly find acceptable solutions, especially for larger problems, it produces lower-quality results than traditional methods like Constraint Programming (CP). A significant downside of online RL is that it cannot learn from existing data, such as solutions generated by CP; it must train from scratch, which leads to sample inefficiency and prevents learning from more optimal examples. We introduce Offline Reinforcement Learning for Learning to Dispatch (Offline-LD), a novel approach for JSSP that addresses these limitations. Offline-LD adapts two CQL-based Q-learning methods (mQRDQN and discrete mSAC) to maskable action spaces, introduces a new entropy-bonus modification for discrete SAC, and exploits reward normalization through preprocessing. Our experiments show that Offline-LD outperforms online RL on both generated and benchmark instances. By introducing noise into the dataset, we achieve results similar to or better than those obtained from the expert dataset, indicating that a more diverse training set is preferable because it contains counterfactual information.
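The conservative (CQL-style) term adapted here to a maskable action space can be sketched as the log-sum-exp of Q-values over *valid* actions minus the Q-value of the action observed in the dataset; minimizing it keeps Q-estimates for unseen dispatching actions conservative. The toy Q-values are assumptions for illustration:

```python
import math

# CQL-style conservative penalty restricted to the valid (unmasked) actions.
def cql_penalty(q_values, data_action, mask):
    valid = [q for q, ok in zip(q_values, mask) if ok]
    top = max(valid)                                   # stabilized log-sum-exp
    lse = top + math.log(sum(math.exp(q - top) for q in valid))
    return lse - q_values[data_action]

q = [1.0, 2.0, 3.0]                  # toy Q-values for three dispatch actions
p_all = cql_penalty(q, 2, [True, True, True])
p_masked = cql_penalty(q, 2, [False, False, True])   # only action 2 is legal
```

Masking matters in JSSP because most dispatching actions are illegal at any given step; without it, the log-sum-exp would penalize Q-values of actions the agent could never take.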