Identification of Proteins and Genes Associated with Hedgehog Signaling Pathway Involved in Neoplasm Formation Using Text-Mining Approach

IF 7.7 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
{"title":"Identification of Proteins and Genes Associated with Hedgehog Signaling Pathway Involved in Neoplasm Formation Using Text-Mining Approach","authors":"","doi":"10.26599/BDMA.2023.9020007","DOIUrl":null,"url":null,"abstract":"Analysis of molecular mechanisms that lead to the development of various types of tumors is essential for biology and medicine, because it may help to find new therapeutic opportunities for cancer treatment and cure including personalized treatment approaches. One of the pathways known to be important for the development of neoplastic diseases and pathological processes is the Hedgehog signaling pathway that normally controls human embryonic development. Systematic accumulation of various types of biological data, including interactions between proteins, regulation of genes transcription, proteomics, and metabolomics experiments results, allows the application of computational analysis of these big data for identification of key molecular mechanisms of certain diseases and pathologies and promising therapeutic targets. The aim of this study is to develop a computational approach for revealing associations between human proteins and genes interacting with the Hedgehog pathway components, as well as for identifying their roles in the development of various types of tumors. We automatically collect sets of abstract texts from the NCBI PubMed bibliographic database. For recognition of the Hedgehog pathway proteins and genes and neoplastic diseases we use a dictionary-based named entity recognition approach, while for all other proteins and genes machine learning method is used. For association extraction, we develop a set of semantic rules. We complete the results of the text analysis with the gene set enrichment analysis. The identified key pathways that may influence the Hedgehog pathway and their roles in tumor development are then verified using the information in the literature.","PeriodicalId":52355,"journal":{"name":"Big Data Mining and Analytics","volume":"7 1","pages":"25-93"},"PeriodicalIF":7.7000,"publicationDate":"2023-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10373000","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data Mining and Analytics","FirstCategoryId":"1093","ListUrlMain":"https://ieeexplore.ieee.org/document/10373000/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Analysis of molecular mechanisms that lead to the development of various types of tumors is essential for biology and medicine, because it may help to find new therapeutic opportunities for cancer treatment and cure including personalized treatment approaches. One of the pathways known to be important for the development of neoplastic diseases and pathological processes is the Hedgehog signaling pathway that normally controls human embryonic development. Systematic accumulation of various types of biological data, including interactions between proteins, regulation of genes transcription, proteomics, and metabolomics experiments results, allows the application of computational analysis of these big data for identification of key molecular mechanisms of certain diseases and pathologies and promising therapeutic targets. The aim of this study is to develop a computational approach for revealing associations between human proteins and genes interacting with the Hedgehog pathway components, as well as for identifying their roles in the development of various types of tumors. We automatically collect sets of abstract texts from the NCBI PubMed bibliographic database. For recognition of the Hedgehog pathway proteins and genes and neoplastic diseases we use a dictionary-based named entity recognition approach, while for all other proteins and genes machine learning method is used. For association extraction, we develop a set of semantic rules. We complete the results of the text analysis with the gene set enrichment analysis. The identified key pathways that may influence the Hedgehog pathway and their roles in tumor development are then verified using the information in the literature.
利用文本挖掘方法识别与涉及肿瘤形成的刺猬信号通路相关的蛋白质和基因
分析导致各种类型肿瘤发生的分子机制对生物学和医学至关重要,因为这有助于为癌症治疗和治愈找到新的治疗机会,包括个性化治疗方法。已知对肿瘤性疾病的发展和病理过程非常重要的途径之一是刺猬信号途径,它通常控制着人类胚胎的发育。通过系统积累各类生物数据,包括蛋白质之间的相互作用、基因转录调控、蛋白质组学和代谢组学实验结果,可以应用这些大数据进行计算分析,从而确定某些疾病和病理的关键分子机制以及有希望的治疗靶点。本研究的目的是开发一种计算方法,用于揭示与刺猬通路成分相互作用的人类蛋白质和基因之间的关联,并确定它们在各类肿瘤发生发展中的作用。我们从 NCBI PubMed 书目数据库中自动收集摘要文本集。对于刺猬通路蛋白和基因以及肿瘤性疾病的识别,我们采用了基于字典的命名实体识别方法,而对于所有其他蛋白和基因则采用了机器学习方法。在关联提取方面,我们制定了一套语义规则。我们通过基因组富集分析来完善文本分析结果。然后,我们利用文献信息对已确定的可能影响刺猬蛋白通路的关键通路及其在肿瘤发生中的作用进行了验证。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Big Data Mining and Analytics
Big Data Mining and Analytics Computer Science-Computer Science Applications
CiteScore
20.90
自引率
2.20%
发文量
84
期刊介绍: Big Data Mining and Analytics, a publication by Tsinghua University Press, presents groundbreaking research in the field of big data research and its applications. This comprehensive book delves into the exploration and analysis of vast amounts of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge. Featuring the latest developments, research issues, and solutions, this book offers valuable insights into the world of big data. It provides a deep understanding of data mining techniques, data analytics, and their practical applications. Big Data Mining and Analytics has gained significant recognition and is indexed and abstracted in esteemed platforms such as ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and more. With its wealth of information and its ability to transform the way we perceive and utilize data, this book is a must-read for researchers, professionals, and anyone interested in the field of big data analytics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信