CHNSCDA: circRNA-disease association prediction based on strongly correlated heterogeneous neighbor sampling
Yuanyuan Lin, Nianrui Wang, Jiangyan Liu, Fangqin Zhang, Zhouchao Wei, Ming Yi
International Journal of Machine Learning and Cybernetics, published 2024-09-17
DOI: 10.1007/s13042-024-02375-1 (https://doi.org/10.1007/s13042-024-02375-1)
Impact factor: 3.1; JCR: Q2 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Circular RNAs (circRNAs) are a special class of endogenous non-coding RNA molecules with a closed circular structure. Numerous studies have demonstrated that exploring the associations between circRNAs and diseases helps reveal the pathogenesis of diseases. However, traditional biological experimental methods are time-consuming. Although some methods have explored disease-associated circRNAs from different perspectives, how to effectively integrate multi-perspective circRNA data has not been well studied, and feature aggregation between heterogeneous nodes has not been fully considered. Based on these considerations, a novel computational framework, called CHNSCDA, is proposed to efficiently predict unknown circRNA-disease associations (CDAs). Specifically, we calculate the sequence similarity and functional similarity for circRNAs, as well as the semantic similarity for diseases. The circRNA similarities and the disease similarities are then each combined with the corresponding Gaussian interaction profile (GIP) kernel similarity, and the similarities are fused by taking the maximum values. Moreover, circRNA-circRNA associations and disease-disease associations with strong correlations are selectively combined to construct a heterogeneous network. Subsequently, we predict potential CDAs using a multi-head dynamic attention mechanism and a multi-layer convolutional neural network. The experimental results show that CHNSCDA outperforms four other state-of-the-art methods and achieves an area under the ROC curve (AUC) of 0.9803 in 5-fold cross-validation (5-fold CV). In addition, extensive ablation experiments were conducted to confirm the validity of the different similarity feature aggregation methods, the feature aggregation methods, and the dynamic attention mechanism. Case studies further demonstrate the strong performance of CHNSCDA in predicting potential CDAs.
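The similarity steps described in the abstract (GIP kernel similarity followed by fusion through maximum values) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `gip_kernel` and `fuse_max`, the toy matrices, and the bandwidth normalization (the commonly used formulation with `gamma_prime = 1`) are all assumptions, since the abstract gives no such details.

```python
import numpy as np


def gip_kernel(profiles: np.ndarray, gamma_prime: float = 1.0) -> np.ndarray:
    """GIP kernel similarity between the rows of an association matrix.

    Assumes the common formulation exp(-gamma * ||p_i - p_j||^2), where gamma
    is gamma_prime divided by the mean squared norm of the row profiles.
    """
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    n = profiles.shape[0]
    if mean_sq_norm == 0:
        # No known associations at all: every profile is identical (all zeros).
        return np.ones((n, n))
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def fuse_max(*similarities: np.ndarray) -> np.ndarray:
    """Fuse same-shaped similarity matrices by taking the maximum of each entry."""
    return np.stack(similarities).max(axis=0)


# Toy data (hypothetical): 3 circRNAs x 3 diseases, 1 = known association.
associations = np.array([[1, 0, 1],
                         [1, 0, 0],
                         [0, 1, 0]], dtype=float)

circ_gip = gip_kernel(associations)       # circRNA-circRNA GIP similarity
disease_gip = gip_kernel(associations.T)  # disease-disease GIP similarity

# Hypothetical circRNA sequence similarity, made up purely for illustration.
circ_seq_sim = np.array([[1.0, 0.2, 0.6],
                         [0.2, 1.0, 0.1],
                         [0.6, 0.1, 1.0]])

circ_fused = fuse_max(circ_seq_sim, circ_gip)
print(np.round(circ_fused, 3))
```

Taking the maximum keeps the strongest evidence from any single view. One consequence is that a pair scoring highly in only one similarity measure still ends up with a high fused score.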
About the journal:
Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data.
The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC.
Key research areas to be covered by the journal include:
Machine Learning for modeling interactions between systems
Pattern Recognition technology to support discovery of system-environment interaction
Control of system-environment interactions
Biochemical interaction in biological and biologically-inspired systems
Learning for improvement of communication schemes between systems