Characterizing Fractal Genetic Variation in the Human Genome from the Hapmap Project

IF 6.6 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
A. Borri, A. Cerasa, P. Tonin, L. Citrigno, C. Porcaro
{"title":"Characterizing Fractal Genetic Variation in the Human Genome from the Hapmap Project","authors":"A. Borri, A. Cerasa, P. Tonin, L. Citrigno, C. Porcaro","doi":"10.1142/S0129065722500289","DOIUrl":null,"url":null,"abstract":"Over the last decades, the exuberant development of next-generation sequencing has revolutionized gene discovery. These technologies have boosted the mapping of single nucleotide polymorphisms (SNPs) across the human genome, providing a complex universe of heterogeneity characterizing individuals worldwide. Fractal dimension (FD) measures the degree of geometric irregularity, quantifying how \"complex\" a self-similar natural phenomenon is. We compared two FD algorithms, box-counting dimension (BCD) and Higuchi's fractal dimension (HFD), to characterize genome-wide patterns of SNPs extracted from the HapMap data set, which includes data from 1184 healthy subjects of eleven populations. In addition, we have used cluster and classification analysis to relate the genetic distances within chromosomes based on FD similarities to the geographical distances among the 11 global populations. We found that HFD outperformed BCD at both grand average clusterization analysis by the cophenetic correlation coefficient, in which the closest value to 1 represents the most accurate clustering solution (0.981 for the HFD and 0.956 for the BCD) and classification (79.0% accuracy, 61.7% sensitivity, and 96.4% specificity for the HFD with respect to 69.1% accuracy, 43.2% sensitivity, and 94.9% specificity for the BCD) of the 11 populations present in the HapMap data set. These results support the evidence that HFD is a reliable measure helpful in representing individual variations within all chromosomes and categorizing individuals and global populations.","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"1 1","pages":"2250028"},"PeriodicalIF":6.6000,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Neural Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1142/S0129065722500289","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 3

Abstract

Over the last decades, the exuberant development of next-generation sequencing has revolutionized gene discovery. These technologies have boosted the mapping of single nucleotide polymorphisms (SNPs) across the human genome, providing a complex universe of heterogeneity characterizing individuals worldwide. Fractal dimension (FD) measures the degree of geometric irregularity, quantifying how "complex" a self-similar natural phenomenon is. We compared two FD algorithms, box-counting dimension (BCD) and Higuchi's fractal dimension (HFD), to characterize genome-wide patterns of SNPs extracted from the HapMap data set, which includes data from 1184 healthy subjects of eleven populations. In addition, we have used cluster and classification analysis to relate the genetic distances within chromosomes based on FD similarities to the geographical distances among the 11 global populations. We found that HFD outperformed BCD at both grand average clusterization analysis by the cophenetic correlation coefficient, in which the closest value to 1 represents the most accurate clustering solution (0.981 for the HFD and 0.956 for the BCD) and classification (79.0% accuracy, 61.7% sensitivity, and 96.4% specificity for the HFD with respect to 69.1% accuracy, 43.2% sensitivity, and 94.9% specificity for the BCD) of the 11 populations present in the HapMap data set. These results support the evidence that HFD is a reliable measure helpful in representing individual variations within all chromosomes and categorizing individuals and global populations.
人类基因组分形遗传变异的特征
在过去的几十年里,下一代测序的蓬勃发展彻底改变了基因发现。这些技术促进了单核苷酸多态性(SNPs)在人类基因组中的定位,为世界各地的个体提供了一个复杂的异质性世界。分形维数(FD)测量几何不规则程度,量化自相似自然现象的“复杂程度”。我们比较了两种FD算法,盒计数维数(BCD)和Higuchi分形维数(HFD),以表征从HapMap数据集中提取的SNPs的全基因组模式,该数据集包括来自11个群体的1184名健康受试者的数据。此外,我们还使用聚类和分类分析,将基于FD相似性的染色体内遗传距离与11个全球种群之间的地理距离联系起来。我们发现,HFD在两次大平均聚类分析中均优于BCD,其中最接近1的值表示HapMap数据集中存在的11个群体的最准确的聚类解决方案(HFD为0.981,BCD为0.956)和分类(HFD的79.0%准确度、61.7%灵敏度和96.4%特异度相对于BCD的69.1%准确度、43.2%灵敏度和94.9%特异度)。这些结果支持了HFD是一种可靠的测量方法的证据,有助于表示所有染色体内的个体变异,并对个体和全球种群进行分类。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
International Journal of Neural Systems
International Journal of Neural Systems 工程技术-计算机:人工智能
CiteScore
11.30
自引率
28.80%
发文量
116
审稿时长
24 months
期刊介绍: The International Journal of Neural Systems is a monthly, rigorously peer-reviewed transdisciplinary journal focusing on information processing in both natural and artificial neural systems. Special interests include machine learning, computational neuroscience and neurology. The journal prioritizes innovative, high-impact articles spanning multiple fields, including neurosciences and computer science and engineering. It adopts an open-minded approach to this multidisciplinary field, serving as a platform for novel ideas and enhanced understanding of collective and cooperative phenomena in computationally capable systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信