Binary Markov Random Fields and interpretable mass spectra discrimination.

IF 0.8 | Zone 4, Mathematics | JCR Q4, Biochemistry & Molecular Biology
Ao Kong, Robert Azencott
Statistical Applications in Genetics and Molecular Biology, published 2017-02-11
DOI: 10.1515/sagmb-2016-0019 (https://doi.org/10.1515/sagmb-2016-0019)

Abstract

For mass spectra acquired from cancer patients by MALDI or SELDI techniques, automated discrimination between cancer types or stages has often been implemented with machine learning algorithms. However, these techniques typically lack interpretability in terms of biomarkers. In this paper, we propose a new mass spectra discrimination algorithm based on parameterized Markov Random Fields that automatically generates interpretable classifiers built from small groups of scored biomarkers. We tested our approach on a dataset of 238 MALDI colorectal mass spectra and on two datasets of 216 and 253 SELDI ovarian mass spectra, respectively. Our approach reaches accuracies of 81% to 100% in discriminating between patients at different colorectal and ovarian cancer stages, and performs as well as or better than previous studies on similar datasets. Moreover, it enables efficient planar displays for visualizing mass spectra discrimination and has good asymptotic performance for large datasets. Thus, our classifiers should facilitate the choice and planning of further experiments for the biological interpretation of cancer-discriminating signatures. In our experiments, the number of mass spectra for each colorectal cancer stage is roughly half of that for each ovarian cancer stage, so we reach lower discrimination accuracy for colorectal cancer than for ovarian cancer.
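The abstract does not spell out the model, so the following is only a rough sketch of how a binary pairwise Markov Random Field over a small group of biomarkers could score a spectrum. It assumes each biomarker has already been reduced to a binary indicator (peak present or absent), that one MRF is available per class, and that the group is small enough to compute the normalizing constant by enumerating all states. The function names (`energy`, `log_partition`, `log_prob`, `classify`) and all parameter values are hypothetical illustrations, not the authors' fitted classifier or fitting procedure.

```python
import itertools
import math


def energy(x, a, b):
    """Energy of binary vector x: -(sum_i a[i]*x[i] + sum_{i<j} b[i][j]*x[i]*x[j])."""
    k = len(x)
    linear = sum(a[i] * x[i] for i in range(k))
    pairwise = sum(b[i][j] * x[i] * x[j]
                   for i in range(k) for j in range(i + 1, k))
    return -(linear + pairwise)


def log_partition(a, b):
    """Exact log normalizing constant, by enumerating all 2^k binary states."""
    k = len(a)
    total = sum(math.exp(-energy(x, a, b))
                for x in itertools.product((0, 1), repeat=k))
    return math.log(total)


def log_prob(x, a, b):
    """Exact log-probability of x under the MRF with parameters (a, b)."""
    return -energy(x, a, b) - log_partition(a, b)


def classify(x, model_0, model_1):
    """Return (label, score); score is the log-likelihood ratio of class 1 vs class 0."""
    score = log_prob(x, *model_1) - log_prob(x, *model_0)
    return (1 if score > 0 else 0), score


# Hypothetical parameters for a group of three binary biomarkers.
no_coupling = [[0.0] * 3 for _ in range(3)]
model_0 = ([-1.0, -1.0, 0.0], no_coupling)        # peaks 0 and 1 are rare in class 0
model_1 = ([1.0, 1.0, 0.0],
           [[0.0, 0.5, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])  # peaks 0 and 1 co-occur

label, score = classify((1, 1, 0), model_0, model_1)
print(label, round(score, 3))
```

In a sketch like this, the linear weights `a[i]` and couplings `b[i][j]` play the role of per-biomarker and per-pair scores, which is one way a small MRF can remain readable: the sign and size of each parameter indicate how a peak, or a pair of peaks, pushes the decision toward one class.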

Source journal
Statistical Applications in Genetics and Molecular Biology
Categories: Biochemistry & Molecular Biology; Mathematical & Computational Biology
Self-citation rate: 11.10%
Articles published: 8
About the journal: Statistical Applications in Genetics and Molecular Biology seeks to publish significant research on the application of statistical ideas to problems arising from computational biology. Papers should focus on the relevant statistical issues but should contain a succinct description of the biological problem being considered. The range of topics is wide and includes linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies. Both original research and review articles will be warmly received.