Evaluation of machine learning and deep learning models for the classification of a single extracellular vesicles spectral library

IF 4.1 Q2 CHEMISTRY, ANALYTICAL
C. del Real Mata, Y. Lu, M. Jalali, A. Bocan, M. Khatami, L. Montermini, J. McCormack-Ilersich, W. W. Reisner, L. Garzia, J. Rak, D. Bzdok and S. Mahshid
Sensors & Diagnostics, vol. 10, pp. 869-883. Journal Article. Published 2025-08-15.
DOI: 10.1039/D5SD00091B
Article page: https://pubs.rsc.org/en/content/articlelanding/2025/sd/d5sd00091b
PDF: https://pubs.rsc.org/en/content/articlepdf/2025/sd/d5sd00091b?page=search
Citations: 0

Abstract

Single extracellular vesicles (EVs) carry molecular signatures from their cell of origin, making them pivotal non-invasive biomarkers for cancer diagnosis and monitoring. However, analyzing the complex data associated with single EVs, such as fingerprints generated via surface-enhanced Raman spectroscopy (SERS), remains challenging. To address this, we present a thorough comparison of machine learning model implementations and of methods for optimizing their classification accuracy. We use a comprehensive single-EV spectral library, collected with a SERS-assisted nanostructured platform, that includes cell lines, healthy controls, and cancer patient samples. The performance of different learning models (random forests, support vector machines, convolutional neural networks, and linear regression as a reference) was assessed for two cancer detection tasks: i) multi-cell line classification and ii) cancerous versus non-cancerous binary classification. To improve accuracy, we optimized spectra preprocessing, artificially increased the dataset, and implemented feature-driven classification. In sum, these methods enabled the more interpretable models to perform on par with the complex one, increasing accuracy by up to 12 percentage points, even with datasets reduced to 66% of the original size, and achieving accuracies of 83% and 91% for Task-i and Task-ii, respectively.
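
The abstract mentions artificially increasing the dataset but does not say how. As a minimal sketch only, the snippet below shows one common way to augment spectra: a small shift along the wavenumber axis plus additive Gaussian noise. The function names, the parameters `noise_sd`, `max_shift`, and `copies`, and the toy spectra are all assumptions made for illustration; they are not taken from the paper.

```python
import random


def augment_spectrum(spectrum, rng, noise_sd=0.01, max_shift=2):
    """Return a perturbed copy of one spectrum (a list of intensities).

    Assumed perturbations (not confirmed as the paper's method):
    a small index shift along the wavenumber axis and Gaussian noise.
    """
    shift = rng.randint(-max_shift, max_shift)
    n = len(spectrum)
    # Shift by re-indexing; edge values are repeated instead of wrapping around.
    shifted = [spectrum[min(max(i - shift, 0), n - 1)] for i in range(n)]
    return [x + rng.gauss(0.0, noise_sd) for x in shifted]


def augment_dataset(spectra, labels, copies=2, seed=0):
    """Keep the originals and append `copies` perturbed versions of each spectrum."""
    rng = random.Random(seed)  # fixed seed so the augmentation is reproducible
    out_x, out_y = list(spectra), list(labels)
    for spectrum, label in zip(spectra, labels):
        for _ in range(copies):
            out_x.append(augment_spectrum(spectrum, rng))
            out_y.append(label)  # an augmented copy inherits its source label
    return out_x, out_y


# Hypothetical toy data: two 5-point "spectra" with made-up class labels.
spectra = [[0.0, 0.2, 1.0, 0.2, 0.0], [0.0, 1.0, 0.2, 0.0, 0.0]]
labels = ["cancer", "healthy"]
aug_x, aug_y = augment_dataset(spectra, labels, copies=3)
print(len(aug_x), len(aug_y))  # 8 8
```

In practice, augmentation of this kind should be applied to the training split only, so that perturbed copies of a spectrum never appear in the data used to report accuracy.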
