S. Babichev, Mohamed Ali Taif, V. Lytvynenko, V. Osypenko
Criterial analysis of gene expression sequences to create the objective clustering inductive technology
2017 IEEE 37th International Conference on Electronics and Nanotechnology (ELNANO)
Published: 2017-04-01
DOI: 10.1109/ELNANO.2017.7939756 (https://doi.org/10.1109/ELNANO.2017.7939756)
Citations: 38
Abstract
This paper evaluates how effectively different criteria estimate the clustering quality of complex biological objects. Gene expression sequences of cancer patients served as the experimental data. The similarity of the studied objects was estimated by comparing their gene expression profiles under several proximity metrics. The studies showed that the correlation metric gives the best separating ability. In addition, the Calinski-Harabasz (CH) criterion produced the most objective clustering on simulated data. The research focuses mainly on the inductive model of objective clustering, in which the objects are clustered concurrently on two equal-power subsets, that is, subsets containing the same number of objects. The final decision about the grouping is then made from both subsets, based on the estimated internal clustering quality criteria and on the minimum value of the external criterion, which measures how similar the two clusterings are.
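
The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions: correlation distance as one minus the Pearson correlation, and the Calinski-Harabasz index as the ratio of between-cluster to within-cluster dispersion computed with squared Euclidean distances. The paper's own formulation may differ. The function names and the toy profiles are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean


def correlation_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(col) for col in zip(*points)]


def sq_euclidean(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def calinski_harabasz(points, labels):
    """Textbook CH index: (B / (k - 1)) / (W / (n - k)).

    Assumes at least two clusters and non-zero within-cluster dispersion.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * sq_euclidean(c, overall)
        within += sum(sq_euclidean(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Toy profiles, made up for illustration (not data from the paper).
profiles = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [3.0, 2.0, 1.0],
    [2.9, 2.2, 1.1],
]
good = calinski_harabasz(profiles, [0, 0, 1, 1])  # similar profiles grouped
bad = calinski_harabasz(profiles, [0, 1, 0, 1])   # dissimilar profiles grouped
```

A higher CH value indicates more compact, better separated clusters, so `good` exceeds `bad` for these toy profiles. In an inductive scheme such as the one the abstract describes, an index of this kind would be computed separately on each of the two subsets, while the external criterion compares the two resulting clusterings. That external criterion is not specified in the abstract and is not sketched here.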