Biclustering on gene expression data

M. Shruthi
2017 International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET)
Published: 2017-02-01
DOI: 10.1109/ICAMMAET.2017.8186750 (https://doi.org/10.1109/ICAMMAET.2017.8186750)
Citations: 0

Abstract

Microarray technology is an essential tool for observing and monitoring the genes of a living organism. Biclustering is a strategy for identifying genes that are co-regulated under a subset of conditions but not necessarily co-regulated across other conditions. The dataset takes the form of a matrix in which rows represent genes and columns represent conditions. The goal of this project is to identify groups of genes that share a common subset of regulatory units. A method is needed that selects clusters of genes and conditions simultaneously and finds distinctive clusters while generating fewer rules. Feature selection can be assessed in terms of both efficiency and effectiveness: efficiency concerns the time needed to find a subset of features, while effectiveness concerns the quality of that subset. Hence, fast clustering is used to separate the data into two categories, relevant and irrelevant. The K-means algorithm is then employed to bicluster the data, dividing the input into four classes. The maximum and minimum specificity calculated for each class indicate improved accuracy.
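The abstract does not say how K-means yields four classes or how specificity is computed per class. The following is a minimal sketch under stated assumptions, not the author's implementation: it assumes that the four classes arise from crossing two gene clusters with two condition clusters, and it uses the standard definition of specificity, TN / (TN + FP). All function names (`kmeans`, `bicluster_two_by_two`, `block_class`, `specificity`) and the toy matrix are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, max_iterations: int = 50) -> List[int]:
    """Lloyd's k-means with deterministic farthest-point seeding."""
    # Seed with the first point, then add the point farthest from all seeds so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels: List[int] = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def bicluster_two_by_two(matrix: List[List[float]]) -> Tuple[List[int], List[int]]:
    """Cluster rows (genes) and columns (conditions) into two groups each.

    Assumption: crossing the two partitions gives the "four classes".
    """
    gene_labels = kmeans(matrix, 2)
    columns = [list(col) for col in zip(*matrix)]  # transpose: one vector per condition
    condition_labels = kmeans(columns, 2)
    return gene_labels, condition_labels


def block_class(gene_label: int, condition_label: int) -> int:
    """Map a (gene cluster, condition cluster) pair to a class index in 0..3."""
    return 2 * gene_label + condition_label


def specificity(true_negatives: int, false_positives: int) -> float:
    """Standard specificity (true negative rate): TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
expression = [
    [9.0, 8.5, 1.0, 1.2],
    [8.8, 9.1, 0.9, 1.1],
    [1.0, 1.3, 9.2, 8.7],
    [0.8, 1.1, 8.9, 9.0],
]
gene_labels, condition_labels = bicluster_two_by_two(expression)
for i, g in enumerate(gene_labels):
    print(i, [block_class(g, c) for c in condition_labels])
```

Clustering rows and columns independently is the simplest way to obtain a block structure from K-means; dedicated biclustering algorithms instead optimize rows and columns jointly, so this sketch should be read only as an illustration of the idea.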