A robust support vector machine approach for Raman data classification
Marco Piazza, Andrea Spinelli, Francesca Maggioni, Marzia Bedoni, Enza Messina
Decision Analytics Journal, vol. 16, Article 100595. Published 2025-06-18.
DOI: 10.1016/j.dajour.2025.100595
https://www.sciencedirect.com/science/article/pii/S2772662225000517
Abstract
Recent advances in healthcare technologies have made large numbers of biological samples available across several techniques and applications. In particular, over the last few years, Raman spectroscopy analysis of biological samples has been successfully applied to early-stage diagnosis. However, the inherent complexity and variability of spectra make manual analysis challenging, even for domain experts. For the same reason, traditional Statistical Learning and Machine Learning techniques cannot guarantee accurate and reliable results. Machine learning models, combined with robust optimization techniques, offer the possibility of improving classification accuracy and making predictive models more resilient under data uncertainty. In this paper, we investigate the performance of a novel robust formulation of the Support Vector Machine (SVM) in classifying COVID-19 samples obtained from Raman spectroscopy. Given the noisy and perturbed nature of biological samples, we protect the classification process against uncertainty by applying robust optimization techniques. Specifically, we consider the robust counterparts of deterministic SVM formulations using bounded-by-norm uncertainty sets. We explore both linear and kernel-induced classifiers, addressing binary and multiclass classification tasks. The effectiveness of our approach is evaluated on real-world COVID-19 Raman saliva samples provided by Italian hospitals. We assess the proposed method by comparing the results of our numerical experiments with those of a state-of-the-art classifier, showing the potential of robust classifiers in handling uncertain Raman data.
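To make the idea of a robust counterpart under a norm-bounded uncertainty set concrete, the following is a minimal sketch and not the formulation proposed in the paper. It assumes the simplest textbook case: a linear, binary soft-margin SVM in which every sample may be perturbed anywhere inside an l2 ball of radius rho. Under that assumption the worst-case perturbation reduces each margin by rho times the l2 norm of w, so the robust hinge loss has a closed form. The function names (robust_hinge, train_robust_svm, predict), the parameter values, and the use of plain subgradient descent in place of a conic solver are all illustrative assumptions.

```python
import math
import random


def robust_hinge(w, b, x, y, rho):
    """Worst-case hinge loss when x may move anywhere in an l2 ball of radius rho.

    The worst perturbation lowers the margin y*(w.x + b) by rho*||w||_2
    (the dual norm of l2 is l2), so no inner maximisation is needed.
    """
    margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
    w_norm = math.sqrt(sum(wj * wj for wj in w))
    return max(0.0, 1.0 - margin + rho * w_norm)


def train_robust_svm(X, Y, rho=0.1, C=1.0, lr=0.01, epochs=200, seed=0):
    """Subgradient descent on 0.5*||w||^2 + C * sum_i robust_hinge_i.

    X is a list of feature lists, Y a list of labels in {-1, +1}.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = X[i], Y[i]
            w_norm = math.sqrt(sum(wj * wj for wj in w))
            # Each sample carries a 1/n share of the regulariser gradient.
            grad_w = [wj / n for wj in w]
            grad_b = 0.0
            if robust_hinge(w, b, x, y, rho) > 0.0:
                for j in range(d):
                    # Subgradient of rho*||w||_2; 0 is a valid choice at w = 0.
                    unit = w[j] / w_norm if w_norm > 0.0 else 0.0
                    grad_w[j] += C * (-y * x[j] + rho * unit)
                grad_b = -C * y
            w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return the predicted label in {-1, +1} for a single sample x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

Setting rho to 0 recovers the ordinary (deterministic) hinge loss, while larger values of rho demand a wider safety margin around each sample. Kernel-induced and multiclass variants, which the abstract also mentions, are outside the scope of this sketch.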