Heterogeneous data fusion and selection in high-volume molecular and imaging datasets

K. Moutselos, Ilias Maglogiannis, A. Chatziioannou
Published in: 2012 IEEE 12th International Conference on Bioinformatics & Bioengineering (BIBE)
Publication date: 2012-11-11
DOI: 10.1109/BIBE.2012.6399761 (https://doi.org/10.1109/BIBE.2012.6399761)
Citations: 2

Abstract

In this work, two disparate datasets concerning the same physiological type of cutaneous melanoma, but derived from different donors, are combined into an integrative dataset with greater descriptive depth: one of imaging (dermatoscopy) origin and the other of molecular (transcriptomic expression) origin. Four different imputation methods are employed to derive the unified dataset, prior to the application of backward selection together with ensemble classifiers (random forests). The imputation schemes emulate the effect of biological noise on the unified dataset, adding realistic signal variation. They thereby protect the discovery process in the integrative dataset against false-positive artifacts that have no true differential effect. The results suggest that expanding the feature space through data integration, and using elaborate imputation schemes in general, aids the classification task and stabilizes the derivation of the putative classifiers.
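The fusion step can be illustrated with a minimal sketch. The abstract does not name the four imputation methods, so the scheme below (drawing each missing value from a Gaussian centred on the class mean of the observed values) is an assumption chosen only to show how imputation can add noise-like variation. The function name, the class labels, the feature names, and all values are hypothetical.

```python
import random
import statistics


def fuse_with_imputation(image_rows, molecular_rows, noise_scale=1.0, seed=0):
    """Stack two modality-specific datasets and impute the missing block.

    Each row is (label, {feature_name: value}). The samples come from
    different donors, so every image row lacks all molecular features and
    vice versa. Assumes every class label occurs in both datasets.
    Hypothetical scheme, not one of the paper's four methods.
    """
    rng = random.Random(seed)
    all_rows = list(image_rows) + list(molecular_rows)
    features = sorted({name for _, feats in all_rows for name in feats})

    # Class-conditional mean and standard deviation of the observed values.
    stats = {}
    for name in features:
        by_class = {}
        for label, feats in all_rows:
            if name in feats:
                by_class.setdefault(label, []).append(feats[name])
        for label, values in by_class.items():
            sd = statistics.stdev(values) if len(values) > 1 else 0.0
            stats[(name, label)] = (statistics.mean(values), sd)

    fused = []
    for label, feats in all_rows:
        row = {}
        for name in features:
            if name in feats:
                row[name] = feats[name]  # observed values are kept as-is
            else:
                mean, sd = stats[(name, label)]
                row[name] = rng.gauss(mean, noise_scale * sd)  # imputed
        fused.append((label, row))
    return features, fused


# Hypothetical toy data: two image features, one expression feature.
image_rows = [
    ("melanoma", {"asymmetry": 0.8, "border": 0.7}),
    ("melanoma", {"asymmetry": 0.9, "border": 0.6}),
    ("nevus", {"asymmetry": 0.2, "border": 0.3}),
    ("nevus", {"asymmetry": 0.1, "border": 0.2}),
]
molecular_rows = [
    ("melanoma", {"gene_a": 5.1}),
    ("melanoma", {"gene_a": 4.7}),
    ("nevus", {"gene_a": 1.9}),
    ("nevus", {"gene_a": 2.3}),
]

features, fused = fuse_with_imputation(image_rows, molecular_rows)
for label, row in fused:
    print(label, [round(row[name], 2) for name in features])
```

The fused rows could then be passed to a backward-selection loop around a random forest, for instance scikit-learn's `RFE` wrapped around `RandomForestClassifier`. That pairing is one possible choice, not necessarily the authors' setup.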