Evolving very-compact fuzzy models for gene expression data analysis

Miguel Arturo Barreto-Sanz, A. Bujard, C. Peña-Reyes
Published in: 2012 IEEE 12th International Conference on Bioinformatics & Bioengineering (BIBE)
Publication date: 2012-11-11
DOI: 10.1109/BIBE.2012.6399650
URL: https://doi.org/10.1109/BIBE.2012.6399650
Citations: 1

Abstract

Selecting predictive gene pools from thousands of gene expression values is one of the main tasks in microarray data analysis. For this purpose, multivariate techniques have proven much better than univariate techniques, in terms of predictive value and biological relevance, as they are able to capture relevant relationships and interactions between genes. An additional goal of gene-expression profiling is finding models that, besides being predictive, are also understandable, so that they can provide some insight into the underlying mechanisms. Models based on fuzzy logic might, potentially, exhibit both characteristics. However, accuracy and interpretability are usually contradictory objectives, and one must accept a trade-off between them. Indeed, the literature shows that approaches based on fuzzy logic may be divided into two groups: accurate but complex models (i.e., with many rules using many variables per rule) on the one hand, and models with only a few short rules (thus interpretable) but limited accuracy on the other. In this paper we present the application of Fuzzy CoCo, our cooperative coevolutionary fuzzy modelling approach, to deal efficiently with the accuracy-interpretability trade-off. Fuzzy CoCo is able to find very compact fuzzy models, in terms of number of rules and number of variables per rule, while still exhibiting high predictive power. To validate the performance of our approach, we tested Fuzzy CoCo on four known data sets, each addressing a form of cancer: leukemia, colon, lung, and prostate. We compared our results (in terms of maximum number of rules, number of variables per rule, and accuracy) with those of other similar works (i.e., based on fuzzy logic). Our models reached similar or better accuracy while being considerably smaller.
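To make the notion of a "very compact" fuzzy model concrete, the following Python sketch shows a classifier with a single short rule plus a default rule, together with a fitness score that rewards accuracy and penalises model size. It is only an illustration under stated assumptions, not the Fuzzy CoCo implementation: the gene names (`gene_A`, `gene_B`), the membership breakpoints, the min/weighted-average inference, the default-rule activation, and the `size_penalty` weight are all hypothetical choices that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Sample = Dict[str, float]  # gene name -> expression value


def low(x: float, p1: float, p2: float) -> float:
    """Membership in 'Low': 1 below p1, 0 above p2, linear in between."""
    if x <= p1:
        return 1.0
    if x >= p2:
        return 0.0
    return (p2 - x) / (p2 - p1)


def high(x: float, p1: float, p2: float) -> float:
    """Membership in 'High', taken here as the complement of 'Low'."""
    return 1.0 - low(x, p1, p2)


@dataclass
class Rule:
    antecedents: List[Tuple[str, str]]  # (gene, "Low" | "High"), ANDed
    output: float                       # singleton consequent, e.g. 1.0 = tumour


@dataclass
class CompactFuzzyModel:
    cuts: Dict[str, Tuple[float, float]]  # per-gene (p1, p2) breakpoints
    rules: List[Rule]
    default_output: float                 # consequent of the "else" rule

    def _truth(self, rule: Rule, sample: Sample) -> float:
        degrees = []
        for gene, label in rule.antecedents:
            p1, p2 = self.cuts[gene]
            mu = high if label == "High" else low
            degrees.append(mu(sample[gene], p1, p2))
        return min(degrees)  # fuzzy AND as minimum (assumed operator)

    def score(self, sample: Sample) -> float:
        truths = [self._truth(r, sample) for r in self.rules]
        # Assumption: the default rule fires to the extent no other rule does.
        default_truth = 1.0 - max(truths)
        num = sum(t * r.output for t, r in zip(truths, self.rules))
        num += default_truth * self.default_output
        den = sum(truths) + default_truth  # always >= 1, never zero
        return num / den

    def predict(self, sample: Sample) -> int:
        return int(self.score(sample) >= 0.5)

    def size(self) -> Tuple[int, int]:
        """(number of rules, maximum number of variables per rule)."""
        return len(self.rules), max(len(r.antecedents) for r in self.rules)


def fitness(model: CompactFuzzyModel, samples: List[Sample],
            labels: List[int], size_penalty: float = 0.01) -> float:
    """Accuracy minus a hypothetical penalty on model size."""
    correct = sum(model.predict(s) == y for s, y in zip(samples, labels))
    n_rules, max_vars = model.size()
    return correct / len(samples) - size_penalty * (n_rules + max_vars)


# One rule, two variables: IF gene_A is High AND gene_B is Low THEN tumour.
model = CompactFuzzyModel(
    cuts={"gene_A": (2.0, 6.0), "gene_B": (1.0, 3.0)},
    rules=[Rule([("gene_A", "High"), ("gene_B", "Low")], output=1.0)],
    default_output=0.0,
)
samples = [{"gene_A": 7.0, "gene_B": 0.5}, {"gene_A": 1.0, "gene_B": 2.0}]
labels = [1, 0]
print(model.size(), [model.predict(s) for s in samples],
      fitness(model, samples, labels))
```

An evolutionary search of the kind the abstract describes would vary the breakpoints and the rule set and keep candidates with high fitness; the size penalty is one simple way to express the accuracy-interpretability trade-off as a single number.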