Biclustering on gene expression data

2017 International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET) Pub Date: 2017-02-01 DOI: 10.1109/ICAMMAET.2017.8186750

M. Shruthi


Citations: 0

Abstract


Biclustering on gene expression data

Microarray technology is an essential tool for observing and monitoring the genes of a living organism. Biclustering is a technique for identifying genes that are co-regulated under a subset of conditions but not necessarily co-regulated across all conditions. The dataset takes the form of a matrix in which each row represents a gene and each column represents a condition. The goal of this project is to identify groups of genes that share a common subset of regulatory units. This requires a method that selects clusters of genes and conditions simultaneously and finds distinct clusters while generating fewer rules. Feature selection can be assessed in terms of both efficiency and effectiveness: efficiency concerns the time needed to find a subset of features, while effectiveness concerns the quality of that subset. Fast clustering is therefore used to split the data into two categories, relevant and irrelevant. The K-means algorithm is employed to bicluster the data, dividing the input into four classes. The maximum and minimum specificity of each class are calculated, and the approach is reported to achieve better accuracy.
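To make the pipeline in the abstract more concrete, the following is a minimal sketch, not the author's implementation. It assumes a toy genes-by-conditions matrix, runs a plain K-means (Lloyd's algorithm) on the gene rows to form four classes, and computes a one-vs-rest specificity, TN / (TN + FP), for each class. The data, the initial centroids, the reference labels, and the function names `kmeans` and `specificity_per_class` are all hypothetical; the fast-clustering relevance filter mentioned in the abstract is not reproduced here.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows: List[Vector], init_centroids: List[Vector],
           max_iter: int = 100) -> List[int]:
    """Plain Lloyd's algorithm; returns one cluster index per row."""
    centroids = [list(c) for c in init_centroids]
    labels = [-1] * len(rows)
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(row, centroids[j]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [rows[i] for i, lab in enumerate(labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def specificity_per_class(true_labels: Sequence[int],
                          predicted_labels: Sequence[int],
                          classes: Sequence[int]) -> Dict[int, float]:
    """One-vs-rest specificity TN / (TN + FP) for each class."""
    result = {}
    for c in classes:
        pairs = list(zip(true_labels, predicted_labels))
        tn = sum(1 for t, p in pairs if t != c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        result[c] = tn / (tn + fp) if (tn + fp) else float("nan")
    return result


# Hypothetical toy matrix: 8 genes (rows) x 3 conditions (columns).
expression = [
    [0.1, 0.2, 0.1], [0.2, 0.1, 0.0],
    [5.0, 5.1, 4.9], [5.2, 4.9, 5.0],
    [0.1, 5.0, 0.2], [0.0, 5.2, 0.1],
    [5.1, 0.1, 5.0], [4.9, 0.0, 5.2],
]
# Assumption: one hand-picked starting centroid per expected class.
init = [expression[0], expression[2], expression[4], expression[6]]
labels = kmeans(expression, init)

# Hypothetical reference classes, used only to illustrate the metric.
reference = [0, 0, 1, 1, 2, 2, 3, 3]
spec = specificity_per_class(reference, labels, classes=[0, 1, 2, 3])
print("labels:", labels)
print("min/max specificity:", min(spec.values()), max(spec.values()))
```

Note that this sketch groups genes only. Grouping conditions as well, as biclustering requires, could be approximated by running the same function on the transposed matrix, though the abstract does not say how the paper combines the two.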
