GINClus: RNA structural motif clustering using graph isomorphism network.

NAR Genomics and Bioinformatics, volume 7, issue 2, article lqaf050 (Journal Article)
Impact factor: 2.8 · JCR Q1 (Genetics & Heredity)
Pub Date: 2025-04-26 · eCollection Date: 2025-06-01 · DOI: 10.1093/nargab/lqaf050
Authors: Nabila Shahnaz Khan, Md Mahfuzur Rahaman, Shaojie Zhang
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12034103/pdf/
Citations: 0

Abstract

Ribonucleic acid (RNA) structural motif identification is a crucial step in understanding RNA structure and function. Because RNA 3D structures are complex and variable, identifying RNA structural motifs is challenging and time-consuming. In particular, discovering new RNA structural motif families is a hard problem that still depends largely on manual analysis. In this paper, we propose GINClus, an RNA structural motif clustering tool that uses a semi-supervised deep learning model to cluster RNA motif candidates (RNA loop regions) based on both base interaction and 3D structure similarities. GINClus converts the base interactions and 3D structures of RNA motif candidates into graph representations, then uses a graph isomorphism network (GIN) model in combination with K-means and hierarchical agglomerative clustering to cluster the candidates by structural similarity. GINClus has a clustering accuracy of 87.88% for known internal loop motifs and 97.69% for known hairpin loop motifs. Using GINClus, we clustered motifs of the same family together and found 927 new instances of the Sarcin-ricin, Kink-turn, Tandem-shear, Hook-turn, E-loop, C-loop, T-loop, and GNRA loop motif families. We also identified 12 new RNA structural motif families with unique structures and base-pair interactions.
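To make the graph-embedding idea concrete, here is a minimal pure-Python sketch of a GIN-style update followed by a sum readout. It is an illustration only and not the GINClus implementation: the one-hot nucleotide features, the fixed weight matrices, the single linear map standing in for the MLP, and the names `gin_layer` and `embed_graph` are all assumptions made for this example. The update rule used is the standard GIN form, h_v ← MLP((1 + ε)·h_v + Σ h_u over neighbours u).

```python
import math

def gin_layer(features, adjacency, weights, eps=0.0):
    """One GIN-style update per node: ReLU(W @ ((1 + eps) * h_v + sum of neighbour states))."""
    updated = []
    for v, h_v in enumerate(features):
        # Sum aggregation over neighbours; the node's own state is scaled by (1 + eps).
        agg = [(1.0 + eps) * x for x in h_v]
        for u in adjacency[v]:
            agg = [a + b for a, b in zip(agg, features[u])]
        # A single linear map plus ReLU stands in for the MLP (simplifying assumption).
        updated.append([max(0.0, sum(w * a for w, a in zip(row, agg))) for row in weights])
    return updated

def embed_graph(features, adjacency, layers):
    """Apply each layer in turn, then sum-pool node states into one graph-level vector."""
    h = features
    for weights in layers:
        h = gin_layer(h, adjacency, weights)
    return [sum(node[i] for node in h) for i in range(len(h[0]))]

# Hypothetical one-hot nucleotide features in the order (A, C, G, U).
A, C, G = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]

# Fixed toy weights (not trained): layer 1 maps 4 -> 2 dims, layer 2 maps 2 -> 2 dims.
LAYERS = [
    [[1.0, 0.5, -0.5, 0.25], [-1.0, 0.5, 1.0, 0.0]],
    [[1.0, -0.5], [0.5, 1.0]],
]

# Graph a: chain G - A - C. Graph b: the same chain with the nodes relabelled.
emb_a = embed_graph([G, A, C], [[1], [0, 2], [1]], LAYERS)
emb_b = embed_graph([C, G, A], [[2], [2], [0, 1]], LAYERS)
# Graph c: same nucleotides, but every pair interacts (a triangle).
emb_c = embed_graph([G, A, C], [[1, 2], [0, 2], [0, 1]], LAYERS)

print("a vs b (isomorphic):", math.dist(emb_a, emb_b))
print("a vs c (different interactions):", math.dist(emb_a, emb_c))
```

Because both the neighbour aggregation and the readout are sums, relabelling the nodes leaves the embedding unchanged, while changing the interaction pattern moves it. Graph-level vectors of this kind are what a downstream K-means or hierarchical agglomerative step could group by distance.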

Source journal metrics
CiteScore: 8.00
Self-citation rate: 2.20%
Articles published: 95
Review time: 15 weeks