Determination of Animal Kinship Based on IGF2 Protein Structure Using the K-Means and N-Gram Methods

Ruth Ema Febrita, Maghfirotul Amaniyah
Journal: Proxies Jurnal Informatika (journal article). Published: 2022-10-02. DOI: 10.31294/inf.v9i2.13808 (https://doi.org/10.31294/inf.v9i2.13808)

Abstract

Original title (Indonesian): Penentuan Kekerabatan Hewan Berdasarkan Struktur Protein IGF2 Menggunakan Metode K-Means dan N-Gram

In biology, there are several ways to determine how closely two individuals are related, such as comparing physical morphology to build a dendrogram, or constructing a phylogenetic tree to trace kinship through evolutionary history. However, these approaches are difficult to apply when the animal whose relatives are to be determined is no longer alive, because its physical characteristics are then hard to observe. This study offers a different approach to determining animal kinship: clustering IGF2 protein structures with the K-Means algorithm. Because the sequences vary in length, an N-gram technique is used to break each sequence into subsequences of equal length. K-Means gave its best results with seven clusters, with an average silhouette coefficient of 0.331, a purity index of 0.735, and a precision of 0.823, which the authors take to indicate that the clustering is quite effective.
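As an illustration only, the pipeline the abstract describes (N-gram features, then K-Means, scored by purity) might be sketched in Python as follows. The overlapping 2-gram scheme, the relative-frequency vectors, the caller-supplied starting centroids, and the toy strings are all assumptions made for this sketch; the abstract does not state the authors' value of n, their feature encoding, or their data.

```python
from collections import Counter


def ngrams(sequence, n):
    """Return every overlapping length-n substring of the sequence."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def ngram_vector(sequence, n, vocabulary):
    """Map a sequence of any length to a fixed-length relative-frequency vector."""
    counts = Counter(ngrams(sequence, n))
    total = sum(counts.values()) or 1  # avoid dividing by zero for very short input
    return [counts[gram] / total for gram in vocabulary]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, iterations=20):
    """Lloyd's algorithm from caller-supplied starting centroids (deterministic)."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(column) / len(members) for column in zip(*members)]
    return labels


def purity(cluster_labels, true_labels):
    """Share of items that belong to the majority class of their cluster."""
    per_cluster = {}
    for cluster, truth in zip(cluster_labels, true_labels):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    return sum(max(c.values()) for c in per_cluster.values()) / len(true_labels)


# Invented toy strings for illustration, NOT real IGF2 sequences.
sequences = ["AAAGAAAG", "AAAGAAAC", "WWYWWYWW", "WWYWWYWV"]
groups = ["A", "A", "W", "W"]  # hypothetical ground-truth groups

vocabulary = sorted({gram for s in sequences for gram in ngrams(s, 2)})
points = [ngram_vector(s, 2, vocabulary) for s in sequences]
labels = kmeans(points, [points[0], points[2]])  # k = 2, fixed starting centroids
print(labels, purity(labels, groups))
```

Counting N-grams over a shared vocabulary gives every sequence a vector of the same length, so sequences of different lengths can still be compared by K-Means. A full reproduction would also need the silhouette coefficient and a precision measure, which this sketch leaves out.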