A novel density based community detection algorithm and its application in detecting potential biomarkers of ESCC
Bikash Baruah, Manash P. Dutta, Subhasish Banerjee, Dhruba K. Bhattacharyya
Journal of Computational Science, Volume 81, Article 102344. Published 2024-06-24.
DOI: 10.1016/j.jocs.2024.102344
https://www.sciencedirect.com/science/article/pii/S1877750324001376
Abstract
The development of statistically and biologically competent Community Detection Algorithms (CDAs) is essential for extracting hidden information from massive biological datasets. This study introduces a novel community index and a CDA based on that index. To validate the effectiveness and robustness of the communities identified by the proposed CDA, we compare them with six sets of communities identified by well-known CDAs, namely FastGreedy, infomap, labelProp, leadingEigen, louvain, and walktrap. The proposed algorithm outperforms the competing algorithms on several prominent statistical and biological measures. We also implement the algorithm in hardware using Verilog, which, surprisingly, reduces the computation time for extracting the communities by 20% compared with the R implementation. Next, the communities identified by the proposed algorithm are analyzed topologically and biologically with reference to the elite genes obtained from Genecards, in order to identify potential biomarkers of Esophageal Squamous Cell Carcinoma (ESCC). Finally, we find that the genes F2RL3, CALM1, LPAR1, ARPC2, and CLDN7 show significantly high topological and biological relevance to previously established ESCC elite genes. Established wet-lab results further substantiate our claims. Hence, we affirm these genes as potential ESCC biomarkers.
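The abstract does not define the proposed community index or name the statistical measures used in the comparison. As a rough illustration of how a set of detected communities can be scored, the sketch below computes two standard quantities, Newman modularity and internal edge density, on a toy graph. Both measures, the function names, and the toy graph are assumptions made for illustration; this is not the paper's index or its evaluation protocol.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Assumed illustrative measure; NOT the community index proposed in the paper.
    """
    m = len(edges)
    # Map every node to the index of the community that contains it.
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    intra = [0] * len(communities)    # number of edges inside each community
    deg_sum = [0] * len(communities)  # sum of node degrees in each community
    for u, v in edges:
        deg_sum[label[u]] += 1
        deg_sum[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2
               for c in range(len(communities)))


def internal_density(edges, community):
    """Fraction of possible node pairs inside a community that are linked."""
    members = set(community)
    n = len(members)
    if n < 2:
        return 0.0
    inside = sum(1 for u, v in edges if u in members and v in members)
    return inside / (n * (n - 1) / 2)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = [{0, 1, 2}, {3, 4, 5}]
one_group = [{0, 1, 2, 3, 4, 5}]

q_two = modularity(edges, two_groups)   # the two-triangle split
q_one = modularity(edges, one_group)    # everything in one community
dens = internal_density(edges, two_groups[0])
```

On this toy graph the two-triangle split scores higher modularity than the trivial single community, and each triangle is fully dense. In the same spirit, candidate partitions from different CDAs can be ranked on a shared measure.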
About the journal:
Computational Science is a rapidly growing multi- and interdisciplinary field that uses advanced computing and data analysis to understand and solve complex problems. It has reached a level of predictive capability that now firmly complements the traditional pillars of experimentation and theory.
Recent advances in experimental techniques, such as detectors, on-line sensor networks, and high-resolution imaging, have opened up new windows into physical and biological processes at many levels of detail. The resulting data explosion allows for detailed data-driven modeling and simulation.
This new discipline in science combines computational thinking, modern computational methods, devices and collateral technologies to address problems far beyond the scope of traditional numerical methods.
Computational science typically unifies three distinct elements:
• Modeling, Algorithms and Simulations (e.g., numerical and non-numerical, discrete and continuous);
• Software developed to solve problems in science (e.g., biological, physical, and social), engineering, medicine, and the humanities;
• Computer and information science that develops and optimizes the advanced system hardware, software, networking, and data management components (e.g., problem solving environments).